A Parallelization Cost Model for FPGA

Abstract:

Using FPGAs for general-purpose computing has become an important research direction in high-performance computing. However, FPGA acceleration does not guarantee a gain: because of hardware reconfiguration overhead, data transmission cost, the specific characteristics of each program, and other factors, the speedup obtained on an FPGA varies considerably from one program to another. Based on an in-depth analysis of FPGA architecture and the FPGA development process, the paper identifies the main factors that affect the performance of an FPGA implementation and proposes a parallelization cost model for FPGAs, built on static program analysis, that provides a basis for deciding whether to use an FPGA for a given general-purpose computation. Experimental results show that the proposed model accurately estimates FPGA execution performance.
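To make the trade-off named in the abstract concrete, the following is a minimal first-order sketch of an offload estimate. It is not the paper's cost model, whose terms are not given in this preview. The class `OffloadEstimate`, its fields, and the simple additive formula (reconfiguration time plus transfer time plus FPGA compute time) are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class OffloadEstimate:
    """Hypothetical inputs for a first-order FPGA offload estimate.

    All names and the additive formula are illustrative assumptions,
    not the model proposed in the paper.
    """
    cpu_time: float           # software execution time, seconds
    fpga_compute_time: float  # estimated kernel time on the FPGA, seconds
    reconfig_time: float      # time to load the configuration, seconds
    bytes_moved: float        # data moved between host and FPGA, bytes
    bandwidth: float          # sustained link bandwidth, bytes per second

    def transfer_time(self) -> float:
        # Data transmission cost: volume divided by sustained bandwidth.
        return self.bytes_moved / self.bandwidth

    def fpga_total_time(self) -> float:
        # Assumes the three costs do not overlap (a pessimistic simplification).
        return self.reconfig_time + self.transfer_time() + self.fpga_compute_time

    def speedup(self) -> float:
        return self.cpu_time / self.fpga_total_time()

    def worth_offloading(self) -> bool:
        # Offloading pays off only if the estimated speedup exceeds 1.
        return self.speedup() > 1.0


# Example with made-up numbers: a compute-heavy kernel and a transfer-heavy one.
compute_bound = OffloadEstimate(2.0, 0.2, 0.1, 1e8, 1e9)
transfer_bound = OffloadEstimate(0.5, 0.05, 0.1, 1e9, 1e9)
print(compute_bound.worth_offloading(), transfer_bound.worth_offloading())
```

With these invented inputs, the first kernel gains from offloading while the second loses time to data transfer, which is the kind of variation in speedup the abstract attributes to reconfiguration and transmission costs.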

Info:

Periodical:

Advanced Materials Research (Volumes 181-182)

Pages:

623-628

Online since:

January 2011

Copyright:

© 2011 Trans Tech Publications Ltd. All Rights Reserved
