Outlier Analysis in Large Sample and High Dimensional Data Based on Feature Weighting

Abstract:

Outlier analysis usually relies on the Anomaly Index and the Variable Contribution Measurement. For large samples of high-dimensional data, however, this approach is difficult to apply. This paper therefore presents a method that introduces weight values for outliers. The features of the outliers are weighted using the Analytic Hierarchy Process (AHP), which quantifies the importance of each outlier property to the data mining target; that is, a weight is calculated for each property. Correlation values, which represent the degree of relevance between an outlier and the data mining target, are then calculated by multiplying each weight by the corresponding property value. Once the correlation values are computed, the outliers are sorted by correlation value from high to low, which makes outlier analysis more efficient. Finally, an example is presented to demonstrate the operability and feasibility of the method.
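The pipeline described in the abstract (AHP weights, weighted correlation values, ranking from high to low) can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: the AHP weights are approximated with the row geometric mean method rather than an exact eigenvector computation, the correlation value of an outlier is taken to be the sum of weight times property value over all properties, and property values are assumed to be already scaled to [0, 1]. The function names, the pairwise comparison matrix, and the sample outliers are all hypothetical.

```python
from math import prod


def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric mean method.

    `pairwise` is a reciprocal comparison matrix: pairwise[i][j] states how much
    more important property i is than property j (assumed Saaty 1-9 scale).
    """
    n = len(pairwise)
    geo_means = [prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]  # normalized so the weights sum to 1


def correlation_values(outliers, weights):
    """Assumed scoring rule: weighted sum of each outlier's property values."""
    return [sum(w * x for w, x in zip(weights, row)) for row in outliers]


def rank_outliers(outliers, weights):
    """Return (outlier index, correlation value) pairs sorted from high to low."""
    scores = correlation_values(outliers, weights)
    order = sorted(range(len(outliers)), key=lambda i: scores[i], reverse=True)
    return [(i, scores[i]) for i in order]


# Hypothetical comparison matrix for three properties of an outlier.
pairwise = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 3.0],
    [1 / 5, 1 / 3, 1.0],
]

# Hypothetical outliers; each row holds property values scaled to [0, 1].
outliers = [
    [0.2, 0.9, 0.4],
    [0.8, 0.1, 0.3],
    [0.5, 0.5, 0.9],
]

weights = ahp_weights(pairwise)
ranking = rank_outliers(outliers, weights)
for index, score in ranking:
    print(f"outlier {index}: correlation value {score:.3f}")
```

Outliers at the top of the ranking are those most relevant to the data mining target under the chosen weights, so they would be examined first. A full AHP application would also check the consistency of the comparison matrix before trusting the weights; that step is omitted here.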

Info:

Periodical:

Applied Mechanics and Materials (Volumes 571-572)

Pages:

650-657

DOI:

https://doi.org/10.4028/www.scientific.net/AMM.571-572.650

Online since:

June 2014

Authors:

Zi Rong Yang*, Zhen Zeng

Keywords:

Analytic Hierarchy Process (AHP), Data Mining (DM), Outlier Analysis

Copyright:

© 2014 Trans Tech Publications Ltd. All Rights Reserved

* - Corresponding Author
