Automated questionnaire analysis with generative AI
- Status: Finished
- Project Period: June 2024 – October 2024
- Outputs: Code Repository Data Report
- Partner
The Challenge
Babylotse Frankfurt places great emphasis on family-friendliness. To gather well-founded information about the current situation and the potential for improvement, it conducted a survey, both online and on paper at selected events. Because the questions were open-ended, the collected data is partially unstructured and also contains spam responses. Babylotse Frankfurt therefore sought support in evaluating the data. The goal was to identify the most important strengths ("Tops") and weaknesses ("Flops") and, where possible, break them down by district. With these results, Babylotse aims to convey a clear and concise message to political representatives on how to improve family-friendliness in Frankfurt.
Data Basis and Structure
All survey responses were transferred into a shared Excel spreadsheet. The survey essentially consisted of three questions: "I live in the district," "What I like about having a baby in Frankfurt," and "What I don’t like about having a baby in Frankfurt." There were no guidelines for answering the questions, so the responses vary greatly in length, structure, and style. Approximately 1,000 responses were collected in total, of which about 10% were spam.
The Solution Approach

This project was implemented entirely using generative AI, particularly large language models (LLMs). First, the language model was used to identify general categories from the survey responses. In a second step, these categories were refined and revised. The final categories were determined in collaboration with Babylotse Frankfurt and then used to systematically classify all survey responses. During classification, the language model assigned each response to one or more categories. If no suitable category could be assigned, the response was labeled as "Unknown."
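The classification step can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the prompt wording, the function names `build_prompt` and `classify`, and the comma-separated answer format are all hypothetical. The model call is passed in as a plain callable (`ask_llm`), so the sketch does not depend on any particular API client; in practice that callable would wrap a request to the LLM provider.

```python
from typing import Callable, Iterable

UNKNOWN = "Unknown"  # fallback label mentioned in the text


def build_prompt(response: str, categories: Iterable[str]) -> str:
    """Assemble a classification prompt (the wording is illustrative only)."""
    options = "\n".join(f"- {c}" for c in categories)
    return (
        "Assign the survey response to one or more of these categories.\n"
        f"{options}\n"
        "Answer with a comma-separated list of category names, "
        f"or '{UNKNOWN}' if none fits.\n\n"
        f"Response: {response}"
    )


def classify(
    response: str,
    categories: list[str],
    ask_llm: Callable[[str], str],
) -> list[str]:
    """Return the known categories named in the model's answer.

    `ask_llm` is a hypothetical callable that takes a prompt and returns
    the model's text answer. Anything the model returns that is not in
    `categories` is discarded; if nothing valid remains, the response is
    labeled "Unknown".
    """
    raw = ask_llm(build_prompt(response, categories))
    lookup = {c.casefold(): c for c in categories}
    labels: list[str] = []
    for part in raw.split(","):
        key = part.strip().casefold()
        if key in lookup and lookup[key] not in labels:
            labels.append(lookup[key])
    return labels or [UNKNOWN]
```

Validating the model's answer against the agreed category list keeps invented or misspelled labels out of the later counts, at the cost of sending such responses to "Unknown".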
In the final step, the classified survey responses were analyzed to determine the most frequently mentioned points, both overall and at the district level. The language model was also used here to extract central themes from the three most frequent categories. This analysis made the categories more tangible and provided the basis for specific action areas. The entire project was implemented in Python, using the OpenAI API to access large language models. The results were aggregated and summarized in a final report for Babylotse Frankfurt.
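The frequency analysis described above could look roughly like this. It is a sketch under assumptions rather than the project's implementation: the function name `top_categories`, the `(district, labels)` input shape, and the choice to leave "Unknown" out of the counts are all hypothetical, and the districts and categories in the usage example are made-up sample data.

```python
from collections import Counter, defaultdict


def top_categories(rows, n=3):
    """Count category mentions overall and per district.

    `rows` is an iterable of (district, labels) pairs, where `labels`
    is the list of categories assigned to one survey response.
    Returns (overall_top, per_district_top), each holding the `n` most
    frequent (category, count) pairs.
    """
    overall = Counter()
    by_district = defaultdict(Counter)
    for district, labels in rows:
        for label in labels:
            if label == "Unknown":
                # Assumption: unclassified responses are not ranked.
                continue
            overall[label] += 1
            by_district[district][label] += 1
    per_district = {d: c.most_common(n) for d, c in by_district.items()}
    return overall.most_common(n), per_district


# Illustrative sample data only.
sample = [
    ("Nordend", ["Childcare", "Transportation"]),
    ("Nordend", ["Childcare"]),
    ("Bockenheim", ["Transportation"]),
    ("Bockenheim", ["Unknown"]),
    ("Sachsenhausen", ["Childcare"]),
]
overall_top, district_top = top_categories(sample)
```

Because one response can carry several categories, the counts are mentions rather than respondents, which is worth keeping in mind when comparing districts with very different response numbers.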
The Impact
The automated evaluation enabled Babylotse Frankfurt to have the results ready in time for its meeting with municipal decision-makers. Because the high degree of automation drastically reduced manual evaluation work, available resources could be focused more intensively on interpreting the identified categories and preparing the presentation.
The meeting with municipal decision-makers was a clear success. The results were presented clearly and understandably, and distinct action areas were identified. The analysis revealed that survey participants had particularly negative experiences in the areas of childcare, transportation, accessibility, and family-friendliness in public spaces, and that they wish for comprehensive changes. The childcare findings were especially surprising to the decision-makers, as measures to improve childcare had recently been implemented. It appears that these measures have not yet effectively reached the survey participants.
Originally, the survey was planned as a one-time measure. Thanks to the successful use of generative AI and large language models, as well as the cooperation with CorrelAid e.V., Babylotse Frankfurt now aims to repeat the survey, subject to the approval of financial resources, and potentially develop it into permanent monitoring. CorrelAid e.V. has already provided valuable suggestions and outlined next steps and possible adjustments to the questionnaire to achieve this goal.
CorrelAid Team
- Michael Aydinbas
- Sören Etler (he/him)
- Luke Bölling