TUCS Publication Database: Implementation and Analysis of Block Dense Matrix Decomposition on Network-on-Chips

Implementation and Analysis of Block Dense Matrix Decomposition on Network-on-Chips

Thomas Canhao Xu, Tapio Pahikkala, Antti Airola, Pasi Liljeberg, Juha Plosila, Tapio Salakoski, Hannu Tenhunen, Implementation and Analysis of Block Dense Matrix Decomposition on Network-on-Chips. In: Geyong Min, Jia Hu, Lei (Chris) Liu, Laurence Tianruo Yang, Seetharami Seelam, Laurent Lefevre (Eds.), Proceedings of the 14th IEEE International Conference on High Performance Computing and Communications (HPCC), 516–523, IEEE, 2012.

http://dx.doi.org/10.1109/HPCC.2012.76

Abstract:

The decomposition of a dense matrix into lower and upper triangular matrices is an important linear algebra kernel used in scientific and engineering applications. To decompose large matrices efficiently, the matrix is divided into blocks of sub-matrices. Block matrix decomposition was introduced for parallel hardware platforms, e.g. supercomputers, multicore processors and GPUs. Recently, the Network-on-Chip (NoC) paradigm has been proposed as a promising multicore architecture for future Chip Multiprocessors (CMPs) with hundreds or even thousands of cores. The NoC architecture alleviates the communication bottleneck of traditional bus- or crossbar-based on-chip interconnects. However, the implementation and analysis of parallel block matrix decomposition on an NoC platform has not been well addressed. We design an NoC platform based on state-of-the-art systems and implement a block matrix decomposition algorithm on it. Evaluation results are presented using a cycle-accurate full-system simulator. We achieve a parallel efficiency of 74.8% with a 64-node NoC, which outperforms three other multiprocessor systems (30.5%, 67% and 50%, respectively). We also analyze the impact of block size, cache behavior and network pressure of the platform.
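
For readers unfamiliar with the kernel, the following is a minimal sequential sketch of a right-looking block LU decomposition. It is not the authors' NoC implementation: it assumes no pivoting (so the input should be, for example, strictly diagonally dominant), runs on a single core, and the names `block_lu`, `split_lu`, `matmul` and the block-size parameter `bs` are hypothetical, chosen only for illustration. In a parallel version, the panel and trailing-update steps are typically the parts distributed across cores.

```python
def block_lu(a, bs):
    """Return a packed LU factorization of square matrix `a` (list of lists).

    The result holds U on and above the diagonal and the strictly lower
    part of unit-lower-triangular L below it. No pivoting is performed
    (an assumption of this sketch), and `a` itself is not modified.
    """
    n = len(a)
    m = [[float(x) for x in row] for row in a]  # work on a float copy
    for k in range(0, n, bs):
        e = min(k + bs, n)  # block covers rows/columns k..e-1
        # Step 1: unblocked LU of the diagonal block m[k:e, k:e].
        for j in range(k, e):
            for i in range(j + 1, e):
                m[i][j] /= m[j][j]
                for c in range(j + 1, e):
                    m[i][c] -= m[i][j] * m[j][c]
        # Step 2: row panel, solve L_kk * X = m[k:e, e:n] (forward substitution).
        for j in range(k, e):
            for i in range(j + 1, e):
                for c in range(e, n):
                    m[i][c] -= m[i][j] * m[j][c]
        # Step 3: column panel, solve X * U_kk = m[e:n, k:e].
        for r in range(e, n):
            for j in range(k, e):
                for p in range(k, j):
                    m[r][j] -= m[r][p] * m[p][j]
                m[r][j] /= m[j][j]
        # Step 4: trailing update, m[e:n, e:n] -= L_panel * U_panel.
        for r in range(e, n):
            for c in range(e, n):
                m[r][c] -= sum(m[r][p] * m[p][c] for p in range(k, e))
    return m


def split_lu(packed):
    """Split a packed factorization into explicit (L, U) matrices."""
    n = len(packed)
    lower = [[packed[i][j] if j < i else (1.0 if i == j else 0.0)
              for j in range(n)] for i in range(n)]
    upper = [[packed[i][j] if j >= i else 0.0 for j in range(n)]
             for i in range(n)]
    return lower, upper


def matmul(x, y):
    """Plain dense product of two square matrices."""
    n = len(x)
    return [[sum(x[i][p] * y[p][j] for p in range(n)) for j in range(n)]
            for i in range(n)]


if __name__ == "__main__":
    # Strictly diagonally dominant test matrix, so no pivoting is needed.
    n = 6
    a = [[10.0 if i == j else 1.0 / (1 + abs(i - j)) for j in range(n)]
         for i in range(n)]
    lower, upper = split_lu(block_lu(a, bs=2))
    prod = matmul(lower, upper)
    err = max(abs(prod[i][j] - a[i][j]) for i in range(n) for j in range(n))
    print("max reconstruction error:", err)
```

The block size `bs` does not change the mathematical result, only how the work is grouped; the abstract's block-size analysis concerns its effect on performance on the simulated platform.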

BibTeX entry:

@INPROCEEDINGS{inpXuPaAiLiPlSaTe12a,
  title = {Implementation and Analysis of Block Dense Matrix Decomposition on Network-on-Chips},
  booktitle = {Proceedings of the 14th IEEE International Conference on High Performance Computing and Communications (HPCC)},
  author = {Xu, Thomas Canhao and Pahikkala, Tapio and Airola, Antti and Liljeberg, Pasi and Plosila, Juha and Salakoski, Tapio and Tenhunen, Hannu},
  editor = {Min, Geyong and Hu, Jia and Liu, Lei (Chris) and Yang, Laurence Tianruo and Seelam, Seetharami and Lefevre, Laurent},
  publisher = {IEEE},
  pages = {516--523},
  year = {2012},
}

Belongs to TUCS Research Unit(s): Algorithmics and Computational Intelligence Group (ACI), Embedded Computer and Electronic Systems (ECES)

Publication Forum rating of this publication: level 1
