Efficient Feature Selection Model for Gene Expression Data

Patharawut Saengsiri; Sageemas Na Wichian; Phayung Meesad

doi:10.4028/www.scientific.net/AMM.110-116.1948

Abstract:

Finding a subset of informative genes is crucial for studying biological processes, because the number of measured genes has increased sharply and most of them are unrelated to the others. In general, a feature selection technique consists of two steps: 1) all genes are ranked by a filter approach, and 2) the ranked list is passed to a wrapper approach. Nevertheless, the resulting accuracy in recognizing informative genes is not sufficient. Therefore, this paper proposes an efficient feature selection model for gene expression data. First, two filter approaches, Correlation-based Feature Selection (Cfs) and Gain Ratio (GR), are used to define many subsets of attributes. Second, a wrapper approach based on Support Vector Machine (SVM) and Random Forest (RF) is used to evaluate each subset length. The experimental results show that CfsSVM, CfsRF, GRSVM, and GRRF, built on the proposed model, achieve higher accuracy rates of 87.10%, 90.32%, 87.10%, and 88.71%, respectively.
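The filter-then-wrapper idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no algorithmic details, so the code below assumes discretized expression values, ranks genes by gain ratio only (Cfs is omitted), and uses a leave-one-out 1-nearest-neighbour classifier as a lightweight stand-in for SVM and RF. All function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def gain_ratio(feature, labels):
    """Gain ratio of one discretized feature with respect to the class labels."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(feature)
    return info_gain / split_info if split_info > 0 else 0.0


def rank_features(columns, labels):
    """Filter step: feature indices sorted by decreasing gain ratio."""
    scores = [gain_ratio(col, labels) for col in columns]
    return sorted(range(len(columns)), key=lambda i: scores[i], reverse=True)


def loo_accuracy(columns, labels, subset):
    """Stand-in evaluator (assumption, not SVM/RF): leave-one-out accuracy of a
    1-nearest-neighbour classifier using Hamming distance on the chosen features."""
    n = len(labels)
    correct = 0
    for i in range(n):
        nearest = min(
            (j for j in range(n) if j != i),
            key=lambda j: sum(columns[f][i] != columns[f][j] for f in subset),
        )
        correct += labels[nearest] == labels[i]
    return correct / n


def select_subset(columns, labels, lengths):
    """Wrapper step: score the top-k ranked features for each k in lengths.
    Ties go to the first k listed, so pass lengths in ascending order
    to prefer smaller subsets."""
    ranking = rank_features(columns, labels)
    results = {k: loo_accuracy(columns, labels, ranking[:k]) for k in lengths}
    best_k = max(results, key=results.get)
    return ranking[:best_k], results


# Invented toy data: one list per gene, one entry per sample.
labels = ["tumor", "tumor", "tumor", "normal", "normal", "normal"]
columns = [
    ["hi", "hi", "hi", "lo", "lo", "lo"],  # gene 0: separates the classes
    ["hi", "lo", "hi", "lo", "hi", "lo"],  # gene 1: nearly uninformative
    ["hi", "hi", "lo", "lo", "lo", "lo"],  # gene 2: partly informative
]

ranking = rank_features(columns, labels)
selected, results = select_subset(columns, labels, lengths=[1, 2, 3])
print("ranking:", ranking)
print("accuracy per subset length:", results)
print("selected genes:", selected)
```

In a setting closer to the paper, the stand-in evaluator would be replaced by cross-validated SVM or RF accuracy, and the candidate lengths would span far more genes than this toy example.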

Info:

Periodical:

Applied Mechanics and Materials (Volumes 110-116)

Pages:

1948-1952

DOI:

https://doi.org/10.4028/www.scientific.net/AMM.110-116.1948

Online since:

October 2011

Authors:

Patharawut Saengsiri, Sageemas Na Wichian, Phayung Meesad

Keywords:

Feature Selection, Gene Expression Data, Random Forest (RF), Support Vector Machine (SVM)
