Evaluation of Two Dependency Parsers on Biomedical Corpus Targeted at Protein–Protein Interactions

Sampo Pyysalo, Filip Ginter, Tapio Pahikkala, Jorma Boberg, Jouni Järvinen, Tapio Salakoski. Evaluation of Two Dependency Parsers on Biomedical Corpus Targeted at Protein–Protein Interactions. International Journal of Medical Informatics, special issue on Recent Advances in Natural Language Processing for Biomedical Applications 75(6), 430–442, 2006.

Abstract:

We present an evaluation of Link Grammar and Connexor Machinese Syntax, two major broad-coverage dependency parsers, on a custom hand-annotated corpus consisting of sentences regarding protein–protein interactions. In the evaluation, we apply the notion of an interaction subgraph, which is the subgraph of a dependency graph expressing a protein–protein interaction. We measure the performance of the parsers for recovery of individual dependencies, fully correct parses, and interaction subgraphs. For Link Grammar, an open system that can be inspected in detail, we further perform a comprehensive failure analysis, report specific causes of error, and suggest potential modifications to the grammar. We find that both parsers perform worse on biomedical English than previously reported on general English. While Connexor Machinese Syntax significantly outperforms Link Grammar, the failure analysis suggests specific ways in which the latter could be modified for better performance in the domain.
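
The three measures named in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: dependencies are modeled as unlabeled (head, dependent) token-index pairs, a parse counts as fully correct when every gold dependency is recovered, an interaction subgraph is given as a hand-picked subset of the gold dependencies, and the function names and the example sentence are hypothetical.

```python
from typing import Set, Tuple

# Assumption: a dependency is an unlabeled (head index, dependent index) pair.
Dependency = Tuple[int, int]


def dependency_recall(gold: Set[Dependency], parsed: Set[Dependency]) -> float:
    """Fraction of gold dependencies that also appear in the parser output."""
    if not gold:
        return 1.0
    return len(gold & parsed) / len(gold)


def is_fully_correct(gold: Set[Dependency], parsed: Set[Dependency]) -> bool:
    """Assumed criterion: every gold dependency was recovered."""
    return gold <= parsed


def subgraph_recovered(subgraph: Set[Dependency], parsed: Set[Dependency]) -> bool:
    """True if every dependency of the interaction subgraph was recovered."""
    return subgraph <= parsed


# Hypothetical sentence: "ProteinA binds ProteinB in yeast" (tokens 0..4).
gold = {(1, 0), (1, 2), (1, 3), (3, 4)}
# Assumed interaction subgraph: the dependencies linking both proteins to "binds".
interaction = {(1, 0), (1, 2)}
# Hypothetical parser output that misattaches the preposition "in".
parsed = {(1, 0), (1, 2), (2, 3), (3, 4)}

recall = dependency_recall(gold, parsed)
fully_correct = is_fully_correct(gold, parsed)
interaction_ok = subgraph_recovered(interaction, parsed)
print(recall, fully_correct, interaction_ok)
```

In this toy case the parse is not fully correct, yet the interaction subgraph is still recovered, which shows why a subgraph-level measure can differ from whole-parse accuracy when the goal is extracting protein–protein interactions.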

BibTeX entry:

@ARTICLE{jPyGiPaBoJaSa06a,
  title = {Evaluation of Two Dependency Parsers on Biomedical Corpus Targeted at Protein–Protein Interactions},
  author = {Pyysalo, Sampo and Ginter, Filip and Pahikkala, Tapio and Boberg, Jorma and Järvinen, Jouni and Salakoski, Tapio},
  journal = {International Journal of Medical Informatics, special issue on Recent Advances in Natural Language Processing for Biomedical Applications},
  volume = {75},
  number = {6},
  pages = {430--442},
  year = {2006},
  keywords = {Natural language processing; Evaluation; Parser comparison; Dependency syntax; Protein–protein interactions},
}

Belongs to TUCS Research Unit(s): Turku BioNLP Group