Code Optimization of the Subroutine to Remove Near Identical Matches in the Sequence Database Homology Search Tool PSI-BLAST

Mats Aspnäs, Kimmo Mattila, Kristoffer Osowski, Jan Westerholm, Code Optimization of the Subroutine to Remove Near Identical Matches in the Sequence Database Homology Search Tool PSI-BLAST. Journal of Computational Biology 17(6), 819–823, 2010.

Abstract:

A central task in protein sequence characterization is the use of a sequence
database homology search tool to find similar protein sequences in other individuals
or species. The public domain sequence analysis toolkit provided by the National
Center for Biotechnology Information (NCBI) contains two frequently used methods
for sequence searches: BLAST (blastall program) and PSI-BLAST (blastpgp program).
For some input sequences, the blastpgp program for gapped basic local
alignment searches fails. Sometimes the program causes a segmentation fault, failing
to produce any valid results, while in other cases it prints out a large number
of warning messages "ObjMgrNextAvailEntityID failed with idx 2048" and produces
incorrect results. Many blastpgp searches are also computationally heavy
and require very long execution times, so there is a clear need for a more efficient
implementation. We present a corrected and performance-optimized version of the
blastpgp program. On our test sequences, the improved version does not suffer from
the above-mentioned problems and is clearly more efficient than the original version
of the program, in the best case by a factor of 1.8.

BibTeX entry:

@ARTICLE{jAsMaOsWe10a,
  title = {Code Optimization of the Subroutine to Remove Near Identical Matches in the Sequence Database Homology Search Tool PSI-BLAST},
  author = {Aspnäs, Mats and Mattila, Kimmo and Osowski, Kristoffer and Westerholm, Jan},
  journal = {Journal of Computational Biology},
  volume = {17},
  number = {6},
  pages = {819--823},
  year = {2010},
  keywords = {gapped sequence alignment, code optimization},
}

Belongs to TUCS Research Unit(s): High Performance Computing and Communication

Publication Forum rating of this publication: level 2
