Towards Automated Processing of Clinical Finnish: Sublanguage Analysis and a Rule-Based Parser

Veronika Laippala, Filip Ginter, Sampo Pyysalo, Tapio Salakoski, Towards Automated Processing of Clinical Finnish: Sublanguage Analysis and a Rule-Based Parser. International Journal of Medical Informatics 78(12), e7–e12, 2009.

Abstract:

Introduction
In this paper, we present steps taken towards more efficient automated processing of clinical Finnish, focusing on daily nursing notes in a Finnish Intensive Care Unit (ICU). First, we analyze ICU Finnish as a sublanguage, identifying the specific features that facilitate, for example, the development of a specialized syntactic analyzer. The identified features include frequent omission of finite verbs, limitations in allowed syntactic structures, and domain-specific vocabulary. Second, we develop a formal grammar and a parser for ICU Finnish, thus providing better tools for the development of further applications in the clinical domain.
Methods
The grammar is implemented in the LKB system in a typed feature structure formalism. The lexicon is automatically generated based on the output of the FinTWOL morphological analyzer adapted to the clinical domain. As an additional experiment, we study the effect of using Finnish constraint grammar to reduce the size of the lexicon. The parser construction thus makes efficient use of existing resources for Finnish.
Results
The grammar currently covers 76.6% of ICU Finnish sentences, producing highly accurate best-parse analyses with an F-score of 91.1%. We find that building a parser for the highly specialized domain sublanguage is not only feasible, but also surprisingly efficient, given an existing morphological analyzer with broad vocabulary coverage. The resulting parser enables a deeper analysis of the text than was previously possible.

BibTeX entry:

@ARTICLE{jLaGiPySa09a,
  title = {Towards Automated Processing of Clinical Finnish: Sublanguage Analysis and a Rule-Based Parser},
  author = {Laippala, Veronika and Ginter, Filip and Pyysalo, Sampo and Salakoski, Tapio},
  journal = {International Journal of Medical Informatics},
  volume = {78},
  number = {12},
  pages = {e7--e12},
  year = {2009},
}

Belongs to TUCS Research Unit(s): Turku BioNLP Group

Publication Forum rating of this publication: level 3
