Scaling qualitative insight
An agentic workflow for analysing student voices
DOI:
https://doi.org/10.65106/apubs.2025.2728
Keywords:
Thematic analysis, large language model, text analytics, qualitative analysis
Abstract
Educators often rely on textual data from student evaluation comments and feedback survey responses to gain insights into students’ learning, understand their perceptions of educational innovations, and evaluate curricula to improve educational practice. Such nuanced data from individual students capture subjective perceptions and experiences, and are analysed through interpretive lenses in qualitative research (Denzin & Lincoln, 2011). However, large corpora make qualitative analysis difficult to scale. In this poster submission, we present a novel multi-agent architecture that uses large language models (LLMs) to analyse open-text responses as a possible solution to this problem. Building on our previous LLM-based workflow (Bakharia et al., 2025), our agentic workflow responsibly automates the inductive thematic analysis process (Lochmiller, 2021) across multiple steps, including a multi-stage validation process designed to ensure analytical rigour and reliability.
Our workflow first finds stable themes within each document by making multiple parallel calls to an LLM, generating a wide range of candidate themes. We then use semantic clustering to identify themes that recur across many runs, matching on meaning rather than on keywords alone. A verification step checks that all quoted evidence actually exists in the original text, guarding against hallucinated quotes and grounding themes in real student voices. Next, all themes go through a refine-and-review loop: a critic agent gives feedback on the quality of each theme, and a refiner agent improves its name, rationale, and keywords. Once all documents are complete, the system groups similar themes using hierarchical clustering to find broader categories. To support human interpretation, we built a user interface that includes a Sankey diagram showing how themes connect back to the original documents. Researchers can interact with the diagram to see the actual quotes behind each theme, which provides clarity and context.
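The first two safeguards described above, keeping only themes that recur across runs and discarding quotes that do not appear in the source text, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `min_runs` threshold, and the `(theme_name, quote)` input shape are all hypothetical, and exact matching on normalised theme names stands in for the semantic clustering the workflow actually uses.

```python
import re
from collections import defaultdict


def normalise(text: str) -> str:
    """Lowercase and collapse whitespace so formatting differences are ignored."""
    return re.sub(r"\s+", " ", text.strip().lower())


def quote_is_grounded(quote: str, document: str) -> bool:
    """Return True only if the quote occurs verbatim (after normalisation) in the document."""
    return normalise(quote) in normalise(document)


def stable_themes(runs, document, min_runs=2):
    """Keep themes backed by grounded quotes in at least `min_runs` separate runs.

    `runs` is a list of runs; each run is a list of (theme_name, quote) pairs,
    as might be parsed from one LLM call (hypothetical input shape).
    Assumption: themes are matched by normalised name, a simple placeholder
    for semantic clustering.
    """
    supporting_runs = defaultdict(set)  # theme -> indices of runs that proposed it
    evidence = defaultdict(set)         # theme -> verified quotes
    for run_index, run in enumerate(runs):
        for name, quote in run:
            if not quote_is_grounded(quote, document):
                continue  # drop evidence that cannot be found in the source text
            key = normalise(name)
            supporting_runs[key].add(run_index)
            evidence[key].add(quote)
    return {
        theme: sorted(evidence[theme])
        for theme, run_ids in supporting_runs.items()
        if len(run_ids) >= min_runs
    }
```

In this sketch a theme proposed in only one run, or supported only by quotes that cannot be located in the document, never reaches the later refine-and-review stage. A real pipeline would likely replace the name matching with embedding-based similarity and tolerate minor quotation differences.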
Our approach emphasises trustworthiness through built-in verification and transparency at every level of abstraction. Our workflow also incorporates human-in-the-loop processes to ensure rigour.
License
Copyright (c) 2025 Aneesha Bakharia, Antonette Shibani, Brayam Alexander Pineda Miranda, Lisa-Angelique Lim, Trish McCluskey, Simon Buckingham Shum

This work is licensed under a Creative Commons Attribution 4.0 International License.