All-paths Graph Kernel for Protein-Protein Interaction Extraction With Evaluation of Cross-Corpus Learning

Antti Airola, Sampo Pyysalo, Jari Björne, Tapio Pahikkala, Filip Ginter, Tapio Salakoski, All-paths Graph Kernel for Protein-Protein Interaction Extraction With Evaluation of Cross-Corpus Learning. BMC Bioinformatics 9 (Suppl 11), 2008.

Abstract:

Background

Automated extraction of protein-protein interactions (PPI) is an
important and widely studied task in biomedical text mining. We propose a
graph-kernel-based approach for this task. In contrast to earlier
approaches to PPI extraction, the introduced all-paths graph kernel can
make use of full, general dependency graphs representing the sentence
structure.

Results

We evaluate the proposed method on five publicly available PPI corpora,
providing the most comprehensive evaluation done for a
machine-learning-based PPI-extraction system. We additionally perform a
detailed evaluation of the effects of training and testing on different
resources, providing insight into the challenges involved in applying a
system beyond the data it was trained on. Our method is shown to achieve
state-of-the-art performance with respect to comparable evaluations, with
an F-score of 56.4 and an AUC of 84.8 on the AImed corpus.
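
For readers unfamiliar with the two measures reported above, the following
is a minimal sketch of how an F-score and an AUC can be computed for a
binary interaction/no-interaction classifier. It is not the authors'
evaluation code: the function names, the toy labels and scores, and the 0.5
decision threshold are all assumptions made for illustration, and values are
on a 0-1 scale rather than the 0-100 scale used in the abstract.

```python
def f_score(gold, predicted):
    """F1 of the positive class from binary gold labels and binary predictions."""
    tp = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, predicted) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc(gold, scores):
    """Area under the ROC curve: the fraction of (positive, negative) pairs
    in which the positive example gets the higher score (ties count half)."""
    pos = [s for g, s in zip(gold, scores) if g == 1]
    neg = [s for g, s in zip(gold, scores) if g == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = interacting pair, 0 = non-interacting pair.
gold = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.6, 0.2, 0.1]
predicted = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold

# Same classifier behaviour, but with every negative example duplicated,
# imitating a corpus with a lower proportion of positive pairs.
gold_skewed = gold + [0, 0, 0]
scores_skewed = scores + [0.6, 0.2, 0.1]
predicted_skewed = [1 if s >= 0.5 else 0 for s in scores_skewed]

print(f_score(gold, predicted), auc(gold, scores))
print(f_score(gold_skewed, predicted_skewed), auc(gold_skewed, scores_skewed))
```

On this toy data the F-score falls when the negatives are duplicated while
the AUC stays the same, which illustrates why F-scores obtained on corpora
with different class distributions are hard to compare directly.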

Conclusion

We show that the graph kernel approach performs at a state-of-the-art
level in PPI extraction, and note the possible extension to the task of
extracting complex interactions. Cross-corpus results provide further
insight into how the learning generalizes beyond individual corpora.
Further, we identify several pitfalls that can make evaluations of
PPI-extraction systems incomparable, or even invalid. These include
incorrect cross-validation strategies and problems related to comparing
F-score results achieved on different evaluation resources.
Recommendations for avoiding these pitfalls are provided.
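
The abstract does not spell out which cross-validation strategies are
incorrect. As one plausible illustration, assumed here rather than taken
from the text, candidate pairs drawn from the same document could end up in
both the training and the test fold, which would let a learner benefit from
near-duplicate context. The sketch below shows a document-level split that
avoids this; the function name and the (document_id, example) tuple layout
are hypothetical.

```python
import random
from collections import defaultdict


def document_level_folds(examples, n_folds, seed=0):
    """Split examples into folds so that all examples from one document
    land in the same fold.

    `examples` is a list of (document_id, example) tuples (assumed layout).
    """
    by_doc = defaultdict(list)
    for doc_id, example in examples:
        by_doc[doc_id].append(example)

    # Shuffle documents, not individual examples, with a fixed seed so the
    # split is reproducible.
    doc_ids = sorted(by_doc)
    random.Random(seed).shuffle(doc_ids)

    # Deal whole documents out to the folds in round-robin order.
    folds = [[] for _ in range(n_folds)]
    for i, doc_id in enumerate(doc_ids):
        folds[i % n_folds].extend((doc_id, ex) for ex in by_doc[doc_id])
    return folds


# Toy usage: six hypothetical documents with three candidate pairs each.
examples = [(f"doc{d}", f"pair{d}-{k}") for d in range(6) for k in range(3)]
folds = document_level_folds(examples, n_folds=3)
for fold in folds:
    print(sorted({doc_id for doc_id, _ in fold}), len(fold))
```

Round-robin assignment keeps the number of documents per fold balanced but
not necessarily the number of examples; a real evaluation might also want to
balance fold sizes or class ratios.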

BibTeX entry:

@ARTICLE{jAiPyBjPaGiSa08a,
  title = {All-paths Graph Kernel for Protein-Protein Interaction Extraction With Evaluation of Cross-Corpus Learning},
  author = {Airola, Antti and Pyysalo, Sampo and Björne, Jari and Pahikkala, Tapio and Ginter, Filip and Salakoski, Tapio},
  journal = {BMC Bioinformatics},
  volume = {9 (Suppl 11)},
  year = {2008},
}

Belongs to TUCS Research Unit(s): Turku BioNLP Group

Publication Forum rating of this publication: level 2
