Exploratory Topic Modeling with Distributional Semantics

Samuel Rönnqvist, Exploratory Topic Modeling with Distributional Semantics. In: Elisa Fromont, Tijl De Bie, Matthijs van Leeuwen (Eds.), Advances in Intelligent Data Analysis XIV, 241–252, Springer, 2015.

Abstract:

As we continue to collect and store textual data in a multitude of domains, we are regularly confronted with material whose largely unknown thematic structure we want to uncover. With unsupervised, exploratory analysis, no prior knowledge about the content is required and highly open-ended tasks can be supported. In the past few years, probabilistic topic modeling has emerged as a popular approach to this problem. Nevertheless, the representation of the latent topics as aggregations of semi-coherent terms limits their interpretability and level of detail.

This paper presents an alternative approach to topic modeling that maps topics as a network for exploration, based on distributional semantics using learned word vectors. From the granular level of terms and their semantic similarity relations, global topic structures emerge as clustered regions and gradients of concepts. Moreover, the paper discusses the visual interactive representation of the topic map, which plays an important role in supporting its exploration.
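
The general idea of a term network built from word-vector similarity can be illustrated with a minimal sketch. It is not the paper's method, whose details are not given in this record. It assumes hand-made two-dimensional toy vectors instead of learned ones, a hypothetical cosine-similarity threshold of 0.9 for linking terms, and connected components as a simple stand-in for the clustered regions the abstract mentions.

```python
import math
from itertools import combinations

# Toy 2-d "word vectors", hand-made for illustration only
# (assumption: real vectors would be learned from a corpus).
vectors = {
    "bank": (0.9, 0.1),
    "loan": (0.8, 0.2),
    "credit": (0.85, 0.15),
    "river": (0.1, 0.9),
    "lake": (0.2, 0.8),
}


def cosine(u, v):
    """Cosine similarity between two 2-d vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def similarity_network(vectors, threshold):
    """Link every pair of terms whose cosine similarity reaches the threshold."""
    graph = {term: set() for term in vectors}
    for a, b in combinations(vectors, 2):
        if cosine(vectors[a], vectors[b]) >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def connected_components(graph):
    """Group terms into regions: sets of terms reachable from one another."""
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        components.append(component)
    return components


# The threshold value is an assumption chosen to separate the toy data.
graph = similarity_network(vectors, threshold=0.9)
regions = connected_components(graph)
print(regions)
```

With these toy vectors the finance-related terms and the water-related terms fall into two separate regions. A hard threshold discards the gradients between concepts that the abstract describes, so a weighted graph would be closer in spirit.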

BibTeX entry:

@INPROCEEDINGS{inpRonnqvist_Samuel15a,
  title = {Exploratory Topic Modeling with Distributional Semantics},
  booktitle = {Advances in Intelligent Data Analysis XIV},
  author = {Rönnqvist, Samuel},
  editor = {Fromont, Elisa and De Bie, Tijl and van Leeuwen, Matthijs},
  publisher = {Springer},
  pages = {241--252},
  year = {2015},
  ISSN = {0302-9743},
}

Belongs to TUCS Research Unit(s): Data Mining and Knowledge Management Laboratory

Publication Forum rating of this publication: level 1
