A Comparison of AUC Estimators in Small-Sample Studies

Antti Airola, Tapio Pahikkala, Willem Waegeman, Bernard De Baets, Tapio Salakoski, A Comparison of AUC Estimators in Small-Sample Studies. In: Sašo Džeroski, Pierre Geurts, Juho Rousu (Eds.), Proceedings of the Third International Workshop on Machine Learning in Systems Biology (MLSB'09), 15-23, 2009.

Abstract:

Reliable estimation of the classification performance of learned predictive
models is difficult when working in the small-sample setting. When dealing with
biological data it is often the case that separate test data cannot be afforded.
Cross-validation is in this case a typical strategy for estimating the performance.
Recent results, further supported by experimental evidence presented in this article,
show that many standard approaches to cross-validation suffer from extensive bias
or variance when the area under the ROC curve (AUC) is used as the performance measure.
We advocate the use of leave-pair-out cross-validation (LPOCV) for performance
estimation, as it avoids many of these problems.
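
The leave-pair-out idea can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `lpocv_auc` and `ridge_fit`, the linear ridge-regression learner without an intercept, and the regularization value `lam` are assumptions made for illustration. For every positive-negative pair, the pair is held out, a model is trained on the remaining examples, and the estimate is the fraction of held-out pairs in which the positive example receives the higher score (ties count as one half).

```python
import itertools

import numpy as np


def ridge_fit(X, y, lam=1.0):
    """Fit a linear ridge-regression scorer (a stand-in learner, assumed here)."""
    d = X.shape[1]
    # Closed-form solution of the regularized least-squares problem.
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)


def lpocv_auc(X, y, fit=ridge_fit):
    """Leave-pair-out cross-validation estimate of AUC.

    X: (n, d) feature matrix; y: (n,) labels in {+1, -1}.
    fit: callable (X_train, y_train) -> weight vector of length d.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    pos = np.flatnonzero(y == 1)
    neg = np.flatnonzero(y == -1)
    if len(pos) == 0 or len(neg) == 0:
        raise ValueError("need at least one positive and one negative example")

    total = 0.0
    for i, j in itertools.product(pos, neg):
        # Hold out exactly one positive and one negative example.
        mask = np.ones(len(y), dtype=bool)
        mask[[i, j]] = False
        w = fit(X[mask], y[mask])
        score_pos, score_neg = X[i] @ w, X[j] @ w
        if score_pos > score_neg:
            total += 1.0
        elif score_pos == score_neg:
            total += 0.5
    return total / (len(pos) * len(neg))
```

Note that the sketch retrains the model once per positive-negative pair, so its cost grows with the product of the two class sizes; this is acceptable only because the setting is small samples.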

Files:

Full publication in PDF-format

BibTeX entry:

@INPROCEEDINGS{inpAiPaWaDeSa09a,
  title = {A Comparison of AUC Estimators in Small-Sample Studies},
  booktitle = {Proceedings of the Third International Workshop on Machine Learning in Systems Biology (MLSB'09)},
  author = {Airola, Antti and Pahikkala, Tapio and Waegeman, Willem and De Baets, Bernard and Salakoski, Tapio},
  editor = {Dzeroski, Saso and Geurts, Pierre and Rousu, Juho},
  pages = {15--23},
  year = {2009},
  keywords = {AUC, area under ROC curve, cross-validation},
}

Belongs to TUCS Research Unit(s): Turku BioNLP Group