Incorporating structural topic modeling into short text analysis

diff94/STM_shorttext

STM_shorttext

This repository provides the code and data for the paper "Incorporating structural topic modeling into short text analysis," scheduled for publication in Concentric 49:1 in June 2023.

Introduction

The past few decades have seen the rapid development of topic modeling. So far, research has been more concerned with determining the ideal number of topics or finding meaningful topic-clustering words than with applying topic modeling techniques to evaluate linguistic theories. This study proposes a Structural Topic Model (STM)-led framework to facilitate the interpretation of topic modeling results and to standardize text analysis. STM encompasses various model training mechanisms and therefore requires a systematic design to be combined properly with language studies. “Structural” in STM refers to the inclusion of metadata structure. Unlike the corpus-based keyness approach, STM can capture contextual cues and meta-information for the interpretation of topical results. In addition, STM can make cross-corpora comparisons via topical contrast, a challenging task for related corpus-driven models such as the Biterm Topic Model (BTM). Stylistic variation in song lyrics serves as an illustration of how to use the suggested framework to examine the linguistic theory proposed by Pennebaker (2013). The topical model and iterable model in the proposed paradigm can clarify how pronouns affect style distinction. We believe the proposed STM-led framework can shed light on text analysis by enabling reproducible cross-corpora comparison on short texts.
