The Performance Assessment Strategy in DC-BTA Multiple Sequence Alignment

Abstract:

A new performance assessment model is proposed for multiple sequence alignment. The strategy builds on the beam construction step of the DC-BTA algorithm, a divide-and-conquer alignment method that uses beams. Beams are blocks of nearly identical columns, and they contribute the largest similarity weight to the sequences. A formula computes the total area of the beams covering a sequence and assigns that value to the sequence as its weight. The total beam area is a part of the whole alignment area, so a rate between 0 and 1 can be computed to assess performance. Because DC-BTA already collects the beam areas, the scheme is a simple and effective assessment policy for that algorithm.
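The abstract does not state the formula itself, so the following Python sketch only illustrates the general idea under stated assumptions. It assumes that a beam is a run of at least `min_width` consecutive columns whose majority residue reaches a `min_identity` fraction, that a sequence's weight is the number of its cells inside beams that match the column's majority residue, and that the rate is the summed weight divided by the total alignment area (rows times columns). The function names, both thresholds, and these definitions are hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import List, Optional, Tuple

GAP = "-"


def consensus(column: str, min_identity: float) -> Optional[str]:
    """Return the majority residue of a column if it is conserved enough, else None."""
    residues = Counter(ch for ch in column if ch != GAP)
    if not residues:
        return None
    residue, count = residues.most_common(1)[0]
    return residue if count / len(column) >= min_identity else None


def find_beams(alignment: List[str], min_width: int = 2,
               min_identity: float = 1.0) -> List[Tuple[int, int]]:
    """Return half-open column ranges (start, end) of runs of conserved columns.

    Assumption: this stands in for the beams that DC-BTA would already have built.
    """
    n_cols = len(alignment[0])
    beams: List[Tuple[int, int]] = []
    start = None
    for col in range(n_cols + 1):  # one extra step closes a run at the right edge
        conserved = col < n_cols and consensus(
            "".join(row[col] for row in alignment), min_identity) is not None
        if conserved:
            if start is None:
                start = col
        else:
            if start is not None and col - start >= min_width:
                beams.append((start, col))
            start = None
    return beams


def sequence_weights(alignment: List[str], beams: List[Tuple[int, int]],
                     min_identity: float = 1.0) -> List[int]:
    """Assumed weight: count of a sequence's beam cells that match the column consensus."""
    weights = [0] * len(alignment)
    for start, end in beams:
        for col in range(start, end):
            target = consensus("".join(row[col] for row in alignment), min_identity)
            for i, row in enumerate(alignment):
                if row[col] == target:
                    weights[i] += 1
    return weights


def beam_rate(alignment: List[str], min_width: int = 2,
              min_identity: float = 1.0) -> float:
    """Assumed rate in [0, 1]: total beam area divided by total alignment area."""
    beams = find_beams(alignment, min_width, min_identity)
    total_area = len(alignment) * len(alignment[0])
    return sum(sequence_weights(alignment, beams, min_identity)) / total_area
```

For the toy alignment `["ACGT-ACGTA", "ACGTTACGTA", "ACGTAACG-A"]` with the default thresholds, the sketch finds two beams (columns 0 to 3 and 5 to 7), gives each sequence a weight of 7, and yields a rate of 21/30. A rate closer to 1 means a larger share of the alignment lies inside beams.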

Info:

Periodical:

Key Engineering Materials (Volumes 439-440)

Pages:

35-40

Online since:

June 2010

Copyright:

© 2010 Trans Tech Publications Ltd. All Rights Reserved
