Fast n-Fold Cross-Validation for Regularized Least-Squares
Tapio Pahikkala, Jorma Boberg, Tapio Salakoski, Fast n-Fold Cross-Validation for Regularized Least-Squares. In: Timo Honkela, Tapani Raiko, Jukka Kortela, Harri Valpola (Eds.), Proceedings of the Ninth Scandinavian Conference on Artificial Intelligence (SCAI 2006), 83-90, Otamedia Oy, 2006.
Abstract:
Kernel-based learning algorithms have recently become state-of-the-art machine learning methods, of which support vector machines are the most popular. Regularized least-squares (RLS), another kernel-based learning algorithm that is also known as the least-squares support vector machine, is shown to perform comparably to support vector machines in several machine learning tasks. In small-scale problems, RLS has several computational advantages over support vector machines. Firstly, it is possible to calculate the cross-validation (CV) performance of RLS on the training data without retraining in each CV round. We give a formal proof of this claim. Secondly, we can compute the RLS solution for several different values of the regularization parameter in parallel. Finally, several problems on the same data set can be solved in parallel, provided that the same kernel function is used for each problem. We consider a simple implementation of the RLS algorithm for small-scale machine learning problems that takes advantage of all of the above properties. The implementation is based on the eigendecomposition of the kernel matrix. The proposed CV method for RLS is a generalization of the fast leave-one-out cross-validation (LOOCV) method for RLS, which is widely known in the literature. For some tasks, LOOCV gives a poor performance estimate for the learning machine because of dependencies between the training data points. We demonstrate this by experimentally comparing the performance estimates given by LOOCV and n-fold CV in a task of ranking dependency parses generated from biomedical texts.
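The retraining-free CV idea in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation; it assumes the common RLS formulation in which the dual coefficients are a = (K + lam*I)^{-1} y, and it uses the standard block-inverse identity that the hold-out residual for a fold S equals (G_SS)^{-1} a_S, where G = (K + lam*I)^{-1}. The function names, the `folds` argument (a list of index lists), and the symbol `lam` are hypothetical choices made for this illustration.

```python
import numpy as np

def fast_cv_predictions(evals, V, y, folds, lam):
    """Hold-out predictions for every fold without retraining.

    evals, V: eigenvalues/eigenvectors of the kernel matrix K (from eigh).
    y: labels, shape (n,) or (n, k) for k problems sharing the same kernel.
    folds: list of index lists partitioning range(n) (assumed format).
    lam: regularization parameter (assumed formulation: a = (K + lam*I)^-1 y).
    """
    # G = (K + lam*I)^-1, rebuilt cheaply from the eigendecomposition.
    G = (V * (1.0 / (evals + lam))) @ V.T
    a = G @ y  # dual coefficients trained on all data
    preds = np.empty_like(y, dtype=float)
    for S in folds:
        S = np.asarray(S)
        # Hold-out residual for fold S is (G_SS)^-1 a_S.
        preds[S] = y[S] - np.linalg.solve(G[np.ix_(S, S)], a[S])
    return preds

def naive_cv_predictions(K, y, folds, lam):
    """Reference version: retrain from scratch on each training split."""
    n = K.shape[0]
    preds = np.empty_like(y, dtype=float)
    for S in folds:
        S = np.asarray(S)
        T = np.setdiff1d(np.arange(n), S)  # training indices for this round
        a_T = np.linalg.solve(K[np.ix_(T, T)] + lam * np.eye(len(T)), y[T])
        preds[S] = K[np.ix_(S, T)] @ a_T
    return preds

# Toy data (synthetic, for illustration only) with a linear kernel.
rng = np.random.default_rng(0)
X = rng.normal(size=(12, 3))
y = rng.normal(size=12)
K = X @ X.T
folds = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]

# One eigendecomposition is reused for every regularization value.
evals, V = np.linalg.eigh(K)
for lam in (0.1, 1.0, 10.0):
    fast = fast_cv_predictions(evals, V, y, folds, lam)
    slow = naive_cv_predictions(K, y, folds, lam)
    print(lam, np.allclose(fast, slow))
```

With singleton folds the same formula reduces to the familiar LOOCV shortcut, where the residual for point i is a_i / G_ii. Passing a two-dimensional `y` handles several problems on the same data set at once, since only `a` and the right-hand side of the small solve depend on the labels.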
BibTeX entry:
@INPROCEEDINGS{inpPaBoSa06a,
title = {Fast n-Fold Cross-Validation for Regularized Least-Squares},
booktitle = {Proceedings of the Ninth Scandinavian Conference on Artificial Intelligence (SCAI 2006)},
author = {Pahikkala, Tapio and Boberg, Jorma and Salakoski, Tapio},
editor = {Honkela, Timo and Raiko, Tapani and Kortela, Jukka and Valpola, Harri},
publisher = {Otamedia Oy},
pages = {83--90},
year = {2006},
keywords = {Machine learning, Regularized least-squares, Least-squares support vector machines, Cross-validation},
}
Belongs to TUCS Research Unit(s): Turku BioNLP Group