An Efficient Distance and Density Based Outlier Detection Approach

Abstract:

To address the low accuracy and high computational cost of density-based outlier detection, this paper proposes a variance of distance and density (VDD) measure. Building on this measure, the proposed k-means clustering and score based VDD (KSVDD) approach detects outliers efficiently. Two real-world datasets are used to demonstrate the feasibility of the approach, and the empirical results show that KSVDD achieves good detection precision.
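The abstract does not give the VDD formula, so the following is only a rough sketch of the general idea it names: cluster the data with k-means, then score each point by combining a distance term with a density term. Everything specific here is an assumption made for illustration and not the paper's definition: the function names `kmeans` and `outlier_scores`, the explicit `init_centroids` argument (used to keep the run deterministic), the use of mean k-nearest-neighbour distance as a sparsity proxy, and the min-max scaling and unweighted sum of the two terms.

```python
import numpy as np


def _assign(X, centroids):
    # Label each point with the index of its nearest centroid.
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)


def kmeans(X, init_centroids, n_iter=50):
    """Plain Lloyd's k-means starting from the given centroids."""
    centroids = np.asarray(init_centroids, dtype=float).copy()
    for _ in range(n_iter):
        labels = _assign(X, centroids)
        new = centroids.copy()
        for j in range(len(centroids)):
            members = X[labels == j]
            if len(members):  # an empty cluster keeps its old centroid
                new[j] = members.mean(axis=0)
        if np.allclose(new, centroids):
            break
        centroids = new
    return centroids, _assign(X, centroids)


def _scale(v):
    # Min-max scale to [0, 1]; a constant vector maps to all zeros.
    span = v.max() - v.min()
    return (v - v.min()) / span if span > 0 else np.zeros_like(v)


def outlier_scores(X, init_centroids, n_neighbors=3):
    """Hypothetical score: scaled centroid distance + scaled sparsity.

    This is NOT the paper's VDD measure, only a stand-in that mixes a
    distance-based and a density-based signal.
    """
    X = np.asarray(X, dtype=float)
    centroids, labels = kmeans(X, init_centroids)
    # Distance term: how far each point is from its own cluster centre.
    dist_term = np.linalg.norm(X - centroids[labels], axis=1)
    # Density term: mean distance to the nearest neighbours
    # (a larger value means a sparser neighbourhood).
    pairwise = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(pairwise, np.inf)  # ignore self-distance
    sparsity = np.sort(pairwise, axis=1)[:, :n_neighbors].mean(axis=1)
    return _scale(dist_term) + _scale(sparsity)


# Two tight groups plus one isolated point (the last row).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
              [10, 0], [10, 1], [11, 0], [11, 1],
              [5.5, 8]], dtype=float)
scores = outlier_scores(X, init_centroids=[[0, 0], [10, 0]])
print(int(scores.argmax()))  # index of the highest-scoring point
```

In this toy input the isolated point is both farthest from its cluster centre and in the sparsest neighbourhood, so it receives the highest score. A threshold or top-n cut on the scores would then flag outliers; how KSVDD actually does this is not stated in the abstract.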

Pages:

342-347

Online since:

February 2012

Copyright:

© 2012 Trans Tech Publications Ltd. All Rights Reserved
