LLMs as a Reader: Extracting Topics from Text

Abstract
This talk demonstrates two approaches to LLM-based topic extraction. First, the LLM
is used as a component within BERTopic, replacing the embedding stage with a
transformer encoder. Second, the LLM acts as the complete reader, assigning each document
topic labels together with supporting quotes. Throughout, the talk takes a
“measurement-first” perspective: the LLM is treated as a measurement instrument that
requires calibration and an explicit error budget.
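As a rough illustration of the second approach, the sketch below checks whether each supporting quote actually occurs in the document it was assigned to, and reports the share of labels whose quote cannot be found. It is not the method presented in the talk: the `TopicLabel` structure, the function names, the whitespace and case normalization, and the sample data are all assumptions made for this example. The resulting rate could serve as one simple input to an error budget.

```python
import re
from dataclasses import dataclass


@dataclass
class TopicLabel:
    """Hypothetical record: one topic assigned to a document plus its evidence quote."""
    topic: str
    quote: str


def _normalize(text: str) -> str:
    # Collapse whitespace and lowercase so line breaks or casing
    # do not produce false mismatches.
    return re.sub(r"\s+", " ", text).strip().lower()


def is_grounded(document: str, label: TopicLabel) -> bool:
    # A label counts as grounded if its non-empty quote appears in the document.
    quote = _normalize(label.quote)
    return bool(quote) and quote in _normalize(document)


def ungrounded_rate(document: str, labels: list[TopicLabel]) -> float:
    # Share of labels whose quote is not found in the document.
    if not labels:
        return 0.0
    misses = sum(not is_grounded(document, lab) for lab in labels)
    return misses / len(labels)


# Made-up document and labels, for illustration only.
doc = "Bike lanes in the city centre feel unsafe.\nBus fares went up again this year."
labels = [
    TopicLabel("cycling safety", "bike lanes in the city centre feel unsafe"),
    TopicLabel("ticket prices", "Bus fares went up   again"),
    TopicLabel("parking", "there is never any parking"),
]
rate = ungrounded_rate(doc, labels)
print(f"ungrounded quote rate: {rate:.2f}")
```

Exact substring matching is deliberately strict: a paraphrased quote counts as ungrounded, which errs on the side of flagging labels for review rather than accepting them silently.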
Date
May 19, 2026
Event
2nd Workshop on Frontiers in Measurement and Survey Methods (MeToD)
Location
University of Calabria
The slides below are rendered with Slidev and embedded directly from their own deployment. Use the arrow keys (or on-screen controls) to navigate, or open them full-screen.

Authors
Andres L. Marin
(he/him)
AI Scientist & Postdoctoral Researcher
I hold a doctorate in AI from the Universitat Politècnica de València (VRAIN Institute) and am a Scientific Project Officer at the
European Commission’s Joint Research Centre, with an associated position at the Social and Behavioral Data Science Lab
(University of Konstanz). I work on AI and social data science applied to sustainable transport, from public discourse
analysis to behavioral modeling. Background in Theoretical Physics and AI, with experience in machine learning,
deep learning, and NLP.