Improve Abstract Data with Feature Selection for Classification Techniques

Abstract:

A universal problem in text classification is the high dimensionality of the feature space, e.g. word-frequency vectors. To overcome this problem, this paper proposes a feature selection method that focuses on statistical patterns based on SVM attribute evaluation. Experiments show that determining word importance can significantly increase the speed of the classification algorithm and reduce its resource usage. The proposed method was evaluated by comparing classification performance among Decision Tree, Naïve Bayes, and Support Vector Machine classifiers. The results show that the Support Vector Machine performed best, with an F-measure of 93.6%. The feature selection was also found to reduce the dimensionality of the data significantly.
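
The abstract summarizes the approach without implementation detail. As a rough, hedged sketch of the general idea (ranking word features by SVM weights, then comparing Decision Tree, Naïve Bayes, and SVM by F-measure), the following Python example uses scikit-learn; the corpus, vectorizer settings, and selection threshold are placeholder assumptions, not the paper's actual setup.

```python
# Minimal sketch (assumptions, not the paper's pipeline): select features by
# linear-SVM weight magnitude, then compare three classifiers by F-measure.
from sklearn.datasets import fetch_20newsgroups          # placeholder corpus
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectFromModel
from sklearn.svm import LinearSVC
from sklearn.naive_bayes import MultinomialNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

data = fetch_20newsgroups(subset="train", categories=["sci.med", "sci.space"])

vectorizer = TfidfVectorizer(max_features=20000)          # word-frequency-style vectors
selector = SelectFromModel(LinearSVC(C=1.0, dual=False))  # keep features with large |SVM weight|

for name, clf in [("Decision Tree", DecisionTreeClassifier()),
                  ("Naive Bayes", MultinomialNB()),
                  ("SVM", LinearSVC(dual=False))]:
    pipe = make_pipeline(vectorizer, selector, clf)
    scores = cross_val_score(pipe, data.data, data.target, cv=5, scoring="f1_macro")
    print(f"{name}: mean F-measure = {scores.mean():.3f}")
```

With a linear SVM, SelectFromModel keeps only features whose absolute coefficient exceeds the mean, which is one simple way to realize the dimensionality reduction the abstract reports.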

Info:

Periodical:

Advanced Materials Research (Volumes 403-408)

Pages:

3699-3703

Online since:

November 2011

Copyright:

© 2012 Trans Tech Publications Ltd. All Rights Reserved
