Automated questionnaire analysis with generative AI

Babylotse Frankfurt surveys families with babies but lacks the capacity to analyze roughly 1,000 open-text responses by hand. We use LLMs to categorize the feedback and surface regional patterns.

Reporting Survey

Status: Finished
Project Period: June 2024 – October 2024
Outputs: Code Repository, Data Report
Partner: Babylotse

Babylotse Frankfurt is an initiative of the Kinderschutzbund (German Child Protection Association) Frankfurt am Main district branch. The Babylotsinnen are qualified socio-pedagogical professionals who support pregnant women and young families in Frankfurt and Bad Soden directly in maternity hospitals. Their work often begins during pregnancy, offering personal counseling and information about the available Early Support services. Contact with the Babylotsinnen is voluntary and free of charge, helping families navigate the complex system of support services.

The Challenge

Babylotse Frankfurt places great emphasis on family-friendliness. To gather well-founded information about the current situation and the potential for improvement, it conducted a survey, both online and on paper at selected events. Because the questions were open-ended, the collected data is partially unstructured and also contains spam responses. Babylotse Frankfurt therefore sought support in evaluating the data. The goal was to identify the most important strengths ("Tops") and weaknesses ("Flops") and, where possible, break them down by district. In this way, Babylotse aims to convey a clear and concise message to political representatives to improve family-friendliness in Frankfurt.

Data Basis and Structure

All survey responses were transferred into a shared Excel spreadsheet. The survey essentially consisted of three questions: "I live in the district," "What I like about having a baby in Frankfurt," and "What I don’t like about having a baby in Frankfurt." There were no guidelines for answering the questions, so the responses vary greatly in length, structure, and style. Approximately 1,000 responses were collected in total, of which about 10% were spam.

The Solution Approach

[Figure: four tiles showing the workflow: 1. Find categories, 2. Improve categories, 3. Classify comments, 4. Analyze comments]

This project was implemented entirely with generative AI, specifically large language models (LLMs). First, the language model was used to identify general categories from the survey responses. In a second step, these categories were refined and revised. The final categories were determined in collaboration with Babylotse Frankfurt and then used to systematically classify all survey responses. During classification, the language model assigned each response to one or more categories. If no category fit a response well enough, the response was labeled "Unknown."
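The classification step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the prompt wording, the function names (build_prompt, parse_labels, classify), and the example category list are assumptions made for this sketch. The language model is passed in as a plain callable so the sketch runs without network access; in the project, that role was filled by a call to the OpenAI API.

```python
from typing import Callable

UNKNOWN = "Unknown"


def build_prompt(comment: str, categories: list[str]) -> str:
    """Assemble a classification prompt that lists the allowed categories.

    The wording is hypothetical; the project's real prompts are not shown here.
    """
    options = "\n".join(f"- {c}" for c in categories)
    return (
        "Assign the survey comment to one or more of these categories:\n"
        f"{options}\n"
        "Answer with the matching category names separated by commas, "
        f"or with {UNKNOWN} if none fits.\n\n"
        f"Comment: {comment}"
    )


def parse_labels(raw: str, categories: list[str]) -> list[str]:
    """Keep only labels that match a known category (case-insensitive)."""
    known = {c.lower(): c for c in categories}
    labels: list[str] = []
    for part in raw.split(","):
        label = known.get(part.strip().lower())
        if label and label not in labels:
            labels.append(label)
    # Fall back to "Unknown" when the model named no valid category.
    return labels or [UNKNOWN]


def classify(
    comment: str, categories: list[str], ask_llm: Callable[[str], str]
) -> list[str]:
    """Classify one comment; ask_llm maps a prompt string to the model's reply."""
    return parse_labels(ask_llm(build_prompt(comment, categories)), categories)


# Illustrative categories only; the final list was agreed with Babylotse Frankfurt.
EXAMPLE_CATEGORIES = ["Childcare", "Transportation", "Accessibility"]


def fake_llm(prompt: str) -> str:
    """Stand-in for a real API call, returning a canned answer."""
    return "Childcare, transportation, Weather"


print(classify("Too few daycare spots, crowded trams", EXAMPLE_CATEGORIES, fake_llm))
```

Validating the model's answer against the fixed category list keeps invented labels (here "Weather") out of the results, and the fallback gives every response at least one label.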

In the final step, the classified survey responses were analyzed to determine the most frequently mentioned points, both overall and at the regional level. The language model was also used here to extract central themes from the three most frequent categories. This made the categories more tangible and provided the basis for specific action areas. The entire project was implemented in Python, using the OpenAI API to access large language models. The results were aggregated and summarized in a final report for Babylotse Frankfurt.
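The frequency analysis, overall and per district, could look roughly like the sketch below. It assumes each classified response is available as a (district, categories) pair. The function name top_categories and the sample rows are invented for illustration and are not survey data.

```python
from collections import Counter, defaultdict


def top_categories(rows, n=3):
    """Count categories overall and per district.

    rows: iterable of (district, list_of_categories) pairs.
    Returns (overall_top_n, {district: top_n}) as lists of (category, count).
    """
    overall = Counter()
    by_district = defaultdict(Counter)
    for district, labels in rows:
        # dict.fromkeys removes duplicates while keeping order, so each
        # category counts at most once per response.
        for label in dict.fromkeys(labels):
            if label == "Unknown":
                continue  # unclassifiable responses are left out of the ranking
            overall[label] += 1
            by_district[district][label] += 1
    per_district = {d: c.most_common(n) for d, c in by_district.items()}
    return overall.most_common(n), per_district


# Invented sample rows for illustration only (not real survey responses).
sample_rows = [
    ("Nordend", ["Childcare", "Transportation"]),
    ("Nordend", ["Childcare"]),
    ("Bornheim", ["Childcare", "Unknown"]),
]

overall_top, district_top = top_categories(sample_rows)
print(overall_top)
print(district_top)
```

Counting each category once per response means the ranking reflects how many respondents raised a topic, not how often a single long answer repeated it.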

The Impact

The automated evaluation enabled Babylotse Frankfurt to deliver the results in time for its meeting with municipal decision-makers. Because manual evaluation was drastically reduced and most of the process was automated, the available resources could be focused on interpreting the identified categories and preparing the presentation.

The meeting with municipal decision-makers was a clear success. The results were presented clearly and understandably, and distinct action areas were identified. The analysis revealed that survey participants had particularly negative experiences in the areas of childcare, transportation, accessibility, and family-friendliness in public spaces, and they wish for comprehensive changes. These findings were especially surprising to the decision-makers, as measures to improve childcare had already been implemented recently. It appears that these measures have not yet effectively reached the survey participants.

Originally, the survey was planned as a one-time measure. Thanks to the successful use of generative AI and large language models, as well as the cooperation with CorrelAid e.V., Babylotse Frankfurt now aims to repeat the survey and potentially develop it into permanent monitoring, subject to the approval of financial resources. CorrelAid e.V. has already provided suggestions, outlining next steps and possible adjustments to the questionnaire to achieve this goal.

CorrelAid Team

Michael Aydinbas
Sören Etler (he/him)
Luke Bölling