Customized MMRF: Efficient Matrix Operations on SIMD Processors

Abstract:

Wireless communication and multimedia applications involve a large number of matrix operations on matrices of different sizes. These operations require accessing matrices in column order. This paper implements a Multi-Grained Matrix Register File (MMRF) that supports multi-grained parallel row-wise and column-wise access. We implement 4×4 MIMO decoding with the help of the MMRF to illustrate efficient matrix operations on SIMD processors. Experimental results show that, compared with the TMS320C64x+, our SIMD processor achieves a performance improvement of about 5.65x to 7.71x by employing the MMRF. Using a customized design technique, we reduce the area and critical-path delay of the MMRF by 17.9% and 39.1%, respectively.
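
As a rough illustration of why column-order access matters, the sketch below models a register file holding one 4×4 matrix in software and reads it both by row and by column. It is a toy model only, not the MMRF hardware described in the paper: the class `ToyMatrixRegisterFile`, its methods, and the helper `matvec_by_columns` are hypothetical names introduced here, and the multi-grained aspect of the design is not modelled.

```python
# Toy software model of a register file that holds one 4x4 matrix.
# ToyMatrixRegisterFile, write_row, read_row and read_col are hypothetical
# names for illustration; this is not the MMRF design from the paper.

class ToyMatrixRegisterFile:
    def __init__(self, size=4):
        self.size = size
        # One list per row register, all elements initialised to zero.
        self.rows = [[0] * size for _ in range(size)]

    def write_row(self, i, values):
        if len(values) != self.size:
            raise ValueError("row length must match register file size")
        self.rows[i] = list(values)

    def read_row(self, i):
        # Row-wise access: all lanes come from a single row register.
        return list(self.rows[i])

    def read_col(self, j):
        # Column-wise access: lane k takes element j of row register k.
        return [self.rows[k][j] for k in range(self.size)]


def matvec_by_columns(rf, x):
    # y = A x accumulated as the sum over j of x[j] * (column j of A),
    # so each step is one column read plus one lane-wise multiply-add.
    y = [0] * rf.size
    for j in range(rf.size):
        col = rf.read_col(j)
        y = [acc + x[j] * c for acc, c in zip(y, col)]
    return y


rf = ToyMatrixRegisterFile()
for i in range(4):
    rf.write_row(i, [4 * i + k for k in range(4)])

print(rf.read_row(1))                       # second row of the matrix
print(rf.read_col(1))                       # second column of the matrix
print(matvec_by_columns(rf, [1, 0, 2, 0]))  # matrix-vector product
```

In this model a column read gathers one element from every row register. On a register file that only offers row access, the same data would have to be assembled with extra shuffle or transpose steps, which is the overhead a column-capable register file is meant to avoid.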

Pages: 1727-1731

Online since: August 2013

Copyright: © 2013 Trans Tech Publications Ltd. All Rights Reserved
