Parallel Feature Selection for Regularized Least-Squares

Sebastian Okser, Antti Airola, Tapio Salakoski, Tero Aittokallio, Tapio Pahikkala, Parallel Feature Selection for Regularized Least-Squares. In: Pekka Manninen, Per Öster (Eds.), Applied Parallel and Scientific Computing, Lecture Notes in Computer Science 7782, 280–294, Springer, 2013.

Abstract:

This paper introduces a parallel version of greedy regularized least-squares (RLS), a machine learning based feature selection algorithm. The aim of such machine learning methods is to develop accurate predictive models on complex datasets. Greedy RLS is an efficient implementation of greedy forward feature selection with regularized least-squares, capable of selecting the most predictive features from large data cohorts. Through matrix algebra shortcuts, it has previously been shown to perform feature selection in only a fraction of the time required by traditional implementations. In this paper, the algorithm is adapted for efficient parallel feature selection so that the method scales to modern clusters. To demonstrate its effectiveness in practice, we applied it to a sample genome-wide association study, as well as a number of other high-dimensional datasets, scaling the method up to 128 cores.
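
For readers unfamiliar with greedy forward selection, the following is a minimal NumPy sketch of the naive procedure that greedy RLS accelerates. It is not the paper's algorithm: it refits a model for every candidate feature instead of using the matrix algebra shortcuts mentioned in the abstract, and it runs serially. The leave-one-out selection criterion, the function names, and the regularization parameter `lam` are assumptions made for illustration.

```python
import numpy as np


def loo_squared_error(X, y, lam):
    """Sum of squared leave-one-out residuals for RLS (ridge) on X.

    Uses the standard hat-matrix identity, so no model is refit per sample.
    Assumes lam > 0 and no intercept term.
    """
    d = X.shape[1]
    # Hat matrix H = X (X^T X + lam I)^{-1} X^T maps y to fitted values.
    H = X @ np.linalg.solve(X.T @ X + lam * np.eye(d), X.T)
    residuals = y - H @ y
    loo_residuals = residuals / (1.0 - np.diag(H))
    return float(np.sum(loo_residuals ** 2))


def greedy_forward_selection(X, y, k, lam=1.0):
    """Pick k feature indices, adding the best-scoring one per round."""
    selected = []
    remaining = list(range(X.shape[1]))
    for _ in range(k):
        # Each candidate is scored independently of the others, which is
        # what makes this inner loop a natural target for parallelization.
        scores = [
            loo_squared_error(X[:, selected + [j]], y, lam) for j in remaining
        ]
        best = remaining[int(np.argmin(scores))]
        selected.append(best)
        remaining.remove(best)
    return selected
```

Calling `greedy_forward_selection(X, y, k=2)` on an `n × d` data matrix `X` and target vector `y` returns the two selected column indices in the order they were chosen. The cost of refitting for every candidate in every round is what makes this naive version slow on high-dimensional data.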

BibTeX entry:

@INPROCEEDINGS{inpOkAiSaAiPa13a,
  title = {Parallel Feature Selection for Regularized Least-Squares},
  booktitle = {Applied Parallel and Scientific Computing},
  author = {Okser, Sebastian and Airola, Antti and Salakoski, Tapio and Aittokallio, Tero and Pahikkala, Tapio},
  volume = {7782},
  series = {Lecture Notes in Computer Science},
  editor = {Manninen, Pekka and Öster, Per},
  publisher = {Springer},
  pages = {280--294},
  year = {2013},
}

Belongs to TUCS Research Unit(s): Biomathematics Research Unit (BIOMATH)

Publication Forum rating of this publication: level 1
