Feature Selection Algorithm for Hyperlipidemia Classification

Qi Rui Zhang; He Xian Wang; Jiang Wei Qin

doi:10.4028/www.scientific.net/AMM.701-702.110

Abstract:

This paper reports a comparative study of feature selection algorithms on a hyperlipidemia data set. Three feature selection methods were evaluated: document frequency (DF), information gain (IG), and the χ² statistic (CHI). The classification systems represent each document as a vector and compute term weights with tfidfie (term frequency, inverse document frequency, and inverse entropy). To compare the effectiveness of the feature selection methods, we used three classifiers: Naïve Bayes (NB), k-Nearest Neighbor (kNN), and Support Vector Machines (SVM). The experimental results show that IG and CHI significantly outperform DF, and that SVM and NB are more effective than kNN when the macro-averaged F₁ measure is used. DF is suitable for large-scale text classification tasks.
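The three selection criteria named in the abstract can be illustrated with a minimal sketch. It assumes the textbook definitions of DF, IG, and χ² from the text categorization literature, which may differ in detail from the paper's implementation. The toy corpus `DOCS`, its tokens and labels, and all function names are hypothetical and are not taken from the paper's data set.

```python
import math
from collections import Counter

# Hypothetical toy corpus: (tokens, class label). Not the paper's data.
DOCS = [
    (["high", "cholesterol", "ldl"], "pos"),
    (["high", "triglyceride"], "pos"),
    (["normal", "cholesterol"], "neg"),
    (["normal", "glucose"], "neg"),
]

def document_frequency(docs):
    """DF: number of documents in which each term occurs at least once."""
    df = Counter()
    for tokens, _ in docs:
        df.update(set(tokens))
    return df

def _entropy(counts):
    """Shannon entropy (in bits) of a list of class counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def information_gain(docs, term):
    """IG: class entropy minus class entropy conditioned on term presence."""
    labels = sorted({label for _, label in docs})
    n = len(docs)
    with_t = Counter(label for tokens, label in docs if term in tokens)
    without_t = Counter(label for tokens, label in docs if term not in tokens)
    n_t = sum(with_t.values())
    h_class = _entropy([with_t[l] + without_t[l] for l in labels])
    h_cond = (n_t / n) * _entropy([with_t[l] for l in labels]) \
        + ((n - n_t) / n) * _entropy([without_t[l] for l in labels])
    return h_class - h_cond

def chi_square(docs, term, category):
    """CHI: chi-square statistic from the 2x2 term/category contingency table."""
    a = b = c = d = 0
    for tokens, label in docs:
        has_term, in_cat = term in tokens, label == category
        if has_term and in_cat:
            a += 1      # term present, document in category
        elif has_term:
            b += 1      # term present, document outside category
        elif in_cat:
            c += 1      # term absent, document in category
        else:
            d += 1      # term absent, document outside category
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return 0.0 if denom == 0 else n * (a * d - c * b) ** 2 / denom

if __name__ == "__main__":
    for term in ("high", "cholesterol"):
        print(term,
              document_frequency(DOCS)[term],
              information_gain(DOCS, term),
              chi_square(DOCS, term, "pos"))
```

In this toy corpus, "high" and "cholesterol" have the same document frequency, yet only "high" separates the two classes, so IG and CHI rank it above "cholesterol" while DF cannot tell them apart.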

Info:

Periodical:

Applied Mechanics and Materials (Volumes 701-702)

Pages:

110-113

DOI:

https://doi.org/10.4028/www.scientific.net/AMM.701-702.110

Online since:

December 2014

Authors:

Qi Rui Zhang*, He Xian Wang, Jiang Wei Qin

Keywords:

Feature Selection, Hyperlipemia, Text Categorization

* - Corresponding Author
