Properties of Object-Level Cross-Validation Schemes for Symmetric Pair-Input Data

Juho Heimonen, Tapio Salakoski, Tapio Pahikkala, Properties of Object-Level Cross-Validation Schemes for Symmetric Pair-Input Data. In: Pasi Fränti, Gavin Brown, Marco Loog, Francisco Escolano, Marcello Pelillo (Eds.), Structural, Syntactic, and Statistical Pattern Recognition, Lecture Notes in Computer Science 8621, 384–393, Springer Berlin Heidelberg, 2014.

http://dx.doi.org/10.1007/978-3-662-44415-3_39

Abstract:

In bioinformatics, many learning tasks involve pair-input data (i.e., inputs representing object pairs) where inputs are not independent. Two cross-validation schemes for symmetric pair-input data are considered. The mean and variance of cross-validation estimate deviations from respective generalization performances are examined in the situation where the learned model is applied to pairs of two previously unseen objects. In experiments with the task of learning protein functional similarities, large positive mean deviations were observed with the relaxed scheme due to training–validation dependencies while the strict scheme yielded small negative mean deviations and higher variances. The properties of the strict scheme can be explained by the reduction in cross-validation training set sizes when avoiding training–validation dependencies. The results suggest that the strict scheme is preferable in the given setting.
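
The difference between the two schemes, and the training-set reduction the abstract attributes to the strict scheme, can be illustrated with a small sketch. The abstract does not define the schemes formally, so the definitions below are assumptions: objects (not pairs) are dealt into folds, a validation pair has both objects held out, strict training keeps only pairs with no held-out object, and relaxed training also keeps pairs with exactly one held-out object. The function names `object_level_folds` and `split_pairs` are hypothetical and are not taken from the paper.

```python
from itertools import combinations
import random


def object_level_folds(objects, n_folds, seed=0):
    """Shuffle the objects and deal them into n_folds disjoint groups."""
    shuffled = list(objects)
    random.Random(seed).shuffle(shuffled)
    return [set(shuffled[i::n_folds]) for i in range(n_folds)]


def split_pairs(pairs, held_out, strict):
    """Split unordered pairs into (train, validation) for one fold.

    Assumed definitions (not confirmed by the abstract):
      validation: both objects are held out
      strict:     training pairs contain no held-out object
      relaxed:    training pairs may contain one held-out object
    """
    train, validation = [], []
    for a, b in pairs:
        n_held = (a in held_out) + (b in held_out)
        if n_held == 2:
            validation.append((a, b))
        elif n_held == 0 or not strict:
            train.append((a, b))
        # Under the strict scheme, pairs with exactly one held-out
        # object are discarded, which shrinks the training set.
    return train, validation


objects = range(12)
# Symmetric pair-input data: each unordered pair appears once.
pairs = list(combinations(objects, 2))
folds = object_level_folds(objects, n_folds=3)

for held_out in folds:
    strict_train, validation = split_pairs(pairs, held_out, strict=True)
    relaxed_train, _ = split_pairs(pairs, held_out, strict=False)
    print(len(validation), len(strict_train), len(relaxed_train))
```

With 12 objects and 3 folds, each fold yields 6 validation pairs, 28 strict training pairs, and 60 relaxed training pairs out of 66 in total. Under these assumed definitions, the 32 pairs dropped by the strict scheme are exactly those that share an object with some validation pair.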

BibTeX entry:

@INPROCEEDINGS{inpHeSaPa14a,
  title = {Properties of Object-Level Cross-Validation Schemes for Symmetric Pair-Input Data},
  booktitle = {Structural, Syntactic, and Statistical Pattern Recognition},
  author = {Heimonen, Juho and Salakoski, Tapio and Pahikkala, Tapio},
  volume = {8621},
  series = {Lecture Notes in Computer Science},
  editor = {Fränti, Pasi and Brown, Gavin and Loog, Marco and Escolano, Francisco and Pelillo, Marcello},
  publisher = {Springer Berlin Heidelberg},
  pages = {384--393},
  year = {2014},
  keywords = {cross-validation; pair-input; AUC; K-Nearest Neighbor},
}

Belongs to TUCS Research Unit(s): Turku BioNLP Group

Publication Forum rating of this publication: level 1
