On Thursday, February 8, 2024, our colleague Charlotte Panušková will present her master's thesis at the semester colloquium Phänomenologie der Digital Humanities at Freie Universität Berlin. In her work, she explores methods of distant reading of literature and the possibilities of topic modeling with Top2Vec.
The Top2Vec algorithm automatically identifies the latent semantic structure and topics of the input texts. It thereby compensates for weaknesses of other models, which often require prior knowledge of the number of topics, stop-word lists (lists of words that appear in a text with high frequency without carrying semantic meaning, e.g. prepositions, conjunctions, articles), or manual lemmatization of the texts.
Charlotte Panušková applies this approach to contemporary Czech prose, examining how themes are constructed in the texts and how they correlate with the findings of literary studies.