Extracting Protein-Protein Interaction Sentences by Applying Rough Set Data Analysis

Filip Ginter, Tapio Pahikkala, Sampo Pyysalo, Jorma Boberg, Jouni Järvinen, Tapio Salakoski, Extracting Protein-Protein Interaction Sentences by Applying Rough Set Data Analysis. In: Shusaku Tsumoto, Roman Slowinski, Jan Komorowski et al. (Eds.), Proceedings of the Fourth International Conference on Rough Sets and Current Trends in Computing (RSCTC 2004), Lecture Notes in Artificial Intelligence, Vol. 3066, 780-785, 2004.

Abstract:

In this paper, we introduce a way to apply rough set data analysis to the problem of extracting protein-protein interaction sentences from biomedical literature. Our approach builds on decision rules of protein names, interaction words, and their mutual positions in sentences. In order to broaden the set of potential interaction words, we develop a morphological model which generates spelling and inflection variants of the interaction words. We evaluate the performance of the proposed method on a hand-tagged dataset of 1894 sentences and show a precision-recall break-even performance of 79.8% by using leave-one-out cross-validation.

Files:

Abstract in PDF format

BibTeX entry:

@INPROCEEDINGS{inpGiPaPyBoJaSa04a,
  title = {Extracting Protein-Protein Interaction Sentences by Applying Rough Set Data Analysis},
  booktitle = {Proceedings of the Fourth International Conference on Rough Sets and Current Trends in Computing (RSCTC 2004)},
  author = {Ginter, Filip and Pahikkala, Tapio and Pyysalo, Sampo and Boberg, Jorma and Järvinen, Jouni and Salakoski, Tapio},
  series = {Lecture Notes in Artificial Intelligence},
  volume = {3066},
  editor = {Shusaku Tsumoto and Roman Slowinski and Jan Komorowski and others},
  pages = {780--785},
  year = {2004},
}

Belongs to TUCS Research Unit(s): Turku BioNLP Group
