A Granular Computing Based Decision Tree Algorithm and its Application in Intrusion Detection

Article Preview

Abstract:

Decision tree algorithms have been widely used in intrusion detection. In this paper, within the framework of granular computing (GrC), we propose a new decision tree algorithm called DTGAE and apply it to intrusion detection. First, by virtue of the GrC model using information tables, we propose a new information entropy model, which contains two basic notions: approximation entropy of granule (AEG) and GrC-based approximation entropy (GAE), where the latter is defined based on the former. In algorithm DTGAE, GAE is adopted as the heuristic information for the selection of splitting attributes. When calculating AEG and GAE, we not only utilize the concept of conditional entropy in Shannon's information theory, but also use the concept of approximation accuracy in rough sets. Second, we present a method of decision tree pre-pruning based on Düntsch's knowledge entropy. Finally, the KDDCUP99 data set is used to verify the effectiveness of our algorithm in intrusion detection.

You might also be interested in these eBooks

Info:

Periodical:

Pages:

1730-1734

Citation:

Online since:

December 2012

Export:

Price:

Permissions CCC:

Permissions PLS:

Сopyright:

© 2013 Trans Tech Publications Ltd. All Rights Reserved

Share:

Citation:

[1] J.P. Anderson: Computer Security Threat Monitoring and Surveillance. James P. Anderson Co., Fort Washington, 1980.

Google Scholar

[2] D.E. Denning: An Intrusion Detection Model. IEEE Transaction on Software Engineering, 1987, 13(2): 222-232.

Google Scholar

[3] J.R. Quinlan: Induction of Decision Trees. Machine Learning,1986, 1(1): 81-106.

Google Scholar

[4] J.R. Quinlan: C4.5 programs for Machine Learning. New York: Morgan Kauffman ,1993, 44-78.

Google Scholar

[5] H.B. Zhao, F. Jiang, C.P. Wang: An Approximation Decision Entropy Based Decision Tree Algorithm and Its Application in Intrusion Detection. RSKT2012, Chengdu, China, 2012, 101-106.

DOI: 10.1007/978-3-642-31900-6_13

Google Scholar

[6] Z. Pawlak: Rough sets. Int. J. Inf. Comput. Sci., 1982, 11(5): 341-356.

Google Scholar

[7] Z. Pawlak: Rough sets: Theoretical Aspects of Reasoning about Data. Kluwer Academic Publishers, Dordrecht, 1991.

Google Scholar

[8] L.A. Zadeh: Fuzzy sets and information granularity, Advances in Fuzzy Set Theory and Applications, North-Holland, Amsterdam, 1979, 3-18.

Google Scholar

[9] Y.Y. Yao, N. Zhong: Granular computing using information tables. Data Mining, Rough Sets and Granular Computing, T.Y. Lin Y.Y. Yao, L.A. Zadeh (Eds.). Physica-Verlag, 2002, 102-124.

DOI: 10.1007/978-3-7908-1791-1_5

Google Scholar

[10] C.E. Shannon: The mathematical theory of communication. Bell System Technical Journal, 1948, 27(3-4): 373-423.

Google Scholar

[11] I. Düntsch, G. Gediga: Uncertainty measures of rough set prediction. Artificial Intelligence, 1998, 106: 109-137.

DOI: 10.1016/s0004-3702(98)00091-5

Google Scholar

[12] G.Y. Wang, H. Yu, D.C. Yang: Decision Table Reduction Based on Conditional Information Entropy. Chinese Journal of Computers, 2002, 25(7), 759-766.

Google Scholar

[13] X.Y. Li, N. Ye: Decision tree classifiers for computer intrusion detection, Journal of Parallel and Distributed Computing Practices, 2001, 4(2): 179-190.

Google Scholar

[14] Z.Y. Xu, Z.P. Liu, B.R. Yang, W. Song: A quick attribute reduction algorithm with complexity of max(O(|C||U|), O(|C|2|U/C|)). Chinese Journal of Computers, 2006, 29(3): 391-399.

Google Scholar

[15] KDD Cup 99 Dataset, 1999, http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html.

Google Scholar

[16] I.H. Witten, E. Frank: Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations. Morgan Kaufmann, 2000.

Google Scholar

[17] A. Øhrn: Rosetta Technical Reference Manual, 1999, http://www.idi.ntnu.no/_aleks/rosetta

Google Scholar

[18] S.T. Chen, G.L. Chen, W.Z. Guo, Y.H. Liu: Feature Selection of the Intrusion Detection Data Based on Particle Swarm Optimization and Neighborhood Reduction. Journal of Computer Research and Development, 2010, 47(7): 1261-1267.

Google Scholar