Where academic tradition
meets the exciting future

Efficient Recurrent Local Search Strategies for Semi- and Unsupervised Regularized Least-Squares Classification

Fabian Gieseke, Oliver Kramer, Antti Airola, Tapio Pahikkala, Efficient Recurrent Local Search Strategies for Semi- and Unsupervised Regularized Least-Squares Classification. Evolutionary Intelligence 5(3), 189–205, 2012.

http://dx.doi.org/10.1007/s12065-012-0068-5

Abstract:

Binary classification tasks are among the most important ones in the field of machine learning. One prominent approach to address such tasks are support vector machines which aim at finding a hyperplane separating two classes well such that the induced distance between the hyperplane and the patterns is maximized. In general, sufficient labeled data is needed for such classification settings to obtain reasonable models. However, labeled data is often rare in real-world learning scenarios while unlabeled data can be obtained easily. For this reason, the concept of support vector machines has also been extended to semi- and unsupervised settings: In the unsupervised case, one aims at finding a partition of the data into two classes such that a subsequent application of a support vector machine leads to the best overall result. Similarly, given both a labeled and an unlabeled part, semi-supervised support vector machines favor decision hyperplanes that lie in a low density area induced by the unlabeled training patterns, while still considering the labeled part of the data. The associated optimization problems for both the semi- and unsupervised case, however, are of combinatorial nature and, hence, difficult to solve. In this work, we present efficient implementations of simple local search strategies for (variants of) the both cases that are based on matrix update schemes for the intermediate candidate solutions. We evaluate the performances of the resulting approaches on a variety of artificial and real-world data sets. The results indicate that our approaches can successfully incorporate unlabeled data. The unsupervised case was originally proposed by Gieseke et al. (2009). The derivations presented in this work are new and comprehend the old ones (for the unsupervised setting) as a special case.

Files:

Full publication in PDF-format

BibTeX entry:

@ARTICLE{jGiKrAiPa12a,
  title = {Efficient Recurrent Local Search Strategies for Semi- and Unsupervised Regularized Least-Squares Classification},
  author = {Gieseke, Fabian and Kramer, Oliver and Airola, Antti and Pahikkala, Tapio},
  journal = {Evolutionary Intelligence},
  volume = {5},
  number = {3},
  publisher = {Springer},
  pages = {189–205},
  year = {2012},
  keywords = {Machine learning, Semi-Supervised Support Vector Machines, Regularized Least-Squares, Non-Convex Optimization, Nyström Approximation},
}

Belongs to TUCS Research Unit(s): Algorithmics and Computational Intelligence Group (ACI), Turku BioNLP Group

Publication Forum rating of this publication: level 1

Edit publication