Combining Hidden Markov Models and Latent Semantic Analysis for Topic Segmentation and Labeling: Method and Clinical Application

Filip Ginter, Hanna Suominen, Sampo Pyysalo, Tapio Salakoski, Combining Hidden Markov Models and Latent Semantic Analysis for Topic Segmentation and Labeling: Method and Clinical Application. In: Dietrich Rebholz-Schuhmann, Sampo Pyysalo, Tapio Salakoski (Eds.), Proceedings of the Third International Symposium on Semantic Mining in Biomedicine (SMBM 2008), TUCS General Publication 51, 37-44, Turku Centre for Computer Science, Turku, Finland, 2008.

Abstract:

Topic segmentation and labeling systems enable fine-grained information search.
However, previously proposed methods require annotated data
to adapt to different information needs and
have limited applicability to texts with short segment length.
We introduce an unsupervised method based on a combination of
Hidden Markov Models and latent semantic indexing
which allows the topics of interest to be defined freely,
without the need for data annotation, and can identify short segments.
The method is evaluated in an application domain of
intensive care nursing narratives.
It is shown to considerably outperform a keyword-based heuristic baseline
and to achieve a level of performance comparable to that of a related
supervised method trained on 3600 manually annotated words.

BibTeX entry:

@INPROCEEDINGS{inpGiSuPySa08a,
  title = {Combining Hidden Markov Models and Latent Semantic Analysis for Topic Segmentation and Labeling: Method and Clinical Application},
  booktitle = {Proceedings of the Third International Symposium on Semantic Mining in Biomedicine (SMBM 2008)},
  author = {Ginter, Filip and Suominen, Hanna and Pyysalo, Sampo and Salakoski, Tapio},
  number = {51},
  series = {TUCS General Publication},
  editor = {Salakoski, Tapio and Rebholz-Schuhmann, Dietrich and Pyysalo, Sampo},
  publisher = {Turku Centre for Computer Science, Turku, Finland},
  pages = {37--44},
  year = {2008},
}

Belongs to TUCS Research Unit(s): Turku BioNLP Group
