Identifying Abbreviations in Biomedical Literature Based on Maximum Entropy with Web Features

Abstract:

The biomedical literature is growing rapidly, and biomedical literature mining is becoming essential. This paper proposes a maximum entropy (ME) classifier for identifying abbreviations and develops two novel Web-based features that extract additional semantic information. The study shows that the Web, used as a knowledge source, can be incorporated effectively into a machine learning framework and significantly improves performance. The ME classifier achieves 95% precision and 89% recall on the gold-standard corpus “Medstract”, and 91% precision and 84% recall on a larger test set of 128 full-text articles.

Info:

Periodical:

Advanced Materials Research (Volumes 998-999)

Pages:

1024-1027

Online since:

July 2014

Copyright:

© 2014 Trans Tech Publications Ltd. All Rights Reserved

