Two-Level Parallel Alignment Based on Sequence Parallel Vectorization

Abstract:

This paper proposes a two-level parallel alignment method based on sequence parallel vectorization, accelerated on GPUs of the Fermi architecture. It integrates sequence parallel vectorization, an approximate alignment based on parallel k-means clustering, and the parallel Smith-Waterman algorithm. The method first converts sequence alignment into vector alignment. It then uses k-means alignment to divide the sequences into several groups, which reduces the amount of sequence data to be compared. Finally, the parallel Smith-Waterman algorithm produces the accurate alignment result. High-throughput mouse T-cell receptor (TCR) sequences were used to validate the proposed method. On the same hardware, the method was more efficient than both the serial Smith-Waterman algorithm and CUDASW++2.0 while maintaining high alignment accuracy.
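The three stages named in the abstract can be illustrated with a small serial sketch. It is not the authors' GPU implementation. The abstract does not say how sequences are vectorized, so the k-mer frequency encoding here is an assumption, as is the choice to align the query only against the nearest cluster. All function names (`kmer_vector`, `kmeans`, `smith_waterman`, `two_level_align`) are hypothetical, and the Smith-Waterman scoring parameters are arbitrary.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGT"  # assumed nucleotide alphabet


def kmer_vector(seq, k=2):
    """Assumed vectorization: normalized k-mer frequencies of a sequence."""
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = max(1, len(seq) - k + 1)
    return [counts[m] / total for m in kmers]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, n_clusters, n_iter=20):
    """Plain Lloyd's k-means; the first n_clusters vectors seed the centroids."""
    centroids = [list(v) for v in vectors[:n_clusters]]
    labels = [0] * len(vectors)
    for _ in range(n_iter):
        # Assignment step: each vector goes to its nearest centroid.
        labels = [min(range(n_clusters), key=lambda c: sq_dist(v, centroids[c]))
                  for v in vectors]
        # Update step: each centroid becomes the mean of its members.
        for c in range(n_clusters):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score with a linear gap penalty (score only)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            cur[j] = max(0, prev[j - 1] + s, prev[j] + gap, cur[j - 1] + gap)
            best = max(best, cur[j])
        prev = cur
    return best


def two_level_align(query, database, n_clusters=2):
    """Level 1: k-means on vectors narrows the candidates.
    Level 2: Smith-Waterman scores the query against those candidates only."""
    vectors = [kmer_vector(s) for s in database]
    labels, centroids = kmeans(vectors, n_clusters)
    q = kmer_vector(query)
    nearest = min(range(n_clusters), key=lambda c: sq_dist(q, centroids[c]))
    candidates = [s for s, lab in zip(database, labels) if lab == nearest]
    return sorted(((smith_waterman(query, s), s) for s in candidates), reverse=True)


database = ["ACGTACGTAC", "TTTTGGGGTT", "ACGTACGAAC", "TTTGGGGTTT"]
results = two_level_align("ACGTACGT", database)
```

In this toy run the clustering step leaves only the two ACGT-rich sequences as candidates, so the quadratic-time Smith-Waterman step runs on half of the database. A GPU version would parallelize each stage, which this sketch does not attempt.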

Info:

Pages:

757-762

Online since:

January 2014

Copyright:

© 2014 Trans Tech Publications Ltd. All Rights Reserved
