Use Subsampling to Solve Imbalanced Dataset Problem for Automatic Incident Detection Algorithm

Abstract:

Because traffic incident data are scarce compared with the large volume of normal traffic state data in the real world, we propose an Automatic Incident Detection (AID) algorithm based on a subsampling method. First, an improved subsampling method based on the Edited Nearest Neighbor (ENN) rule was used to reconstruct the training set into a balanced dataset. Then, a Support Vector Machine (SVM) was adopted as the classifier to detect traffic incidents. Real traffic data collected from the I-880 freeway in the United States were used to build the model and test the performance of the proposed AID algorithm. In addition, we compared the detection performance of the AID algorithm trained on the original training set with that of the algorithm trained on the relatively balanced training set. The experimental results show that the proposed subsampling-based AID algorithm is suitable for imbalanced datasets and achieves better detection performance.
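To make the subsampling step concrete, the following is a minimal sketch of the plain Edited Nearest Neighbor rule applied only to the majority (normal-traffic) class. It is not the improved variant described in the abstract, whose details are not given here. The function name `enn_undersample`, the choice `k=3`, and the two-feature toy data are all assumptions made for illustration. In the pipeline the abstract describes, the resampled set would then be used to train an SVM classifier; that step is omitted from this sketch.

```python
import math
from collections import Counter


def enn_undersample(X, y, majority_label, k=3):
    """Plain ENN editing restricted to the majority class (illustrative only).

    A majority-class sample is dropped when the majority vote of its k
    nearest neighbours (Euclidean distance) disagrees with its own label.
    Minority-class samples are always kept.
    """
    kept_X, kept_y = [], []
    for i, (xi, yi) in enumerate(zip(X, y)):
        if yi != majority_label:
            kept_X.append(xi)
            kept_y.append(yi)
            continue
        # Indices of the k nearest other samples, closest first.
        neighbours = sorted(
            (math.dist(xi, xj), j) for j, xj in enumerate(X) if j != i
        )[:k]
        votes = Counter(y[j] for _, j in neighbours)
        if votes.most_common(1)[0][0] == yi:
            kept_X.append(xi)
            kept_y.append(yi)
    return kept_X, kept_y


# Hypothetical toy data: label 0 = normal traffic, label 1 = incident.
X = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.5),
    (5.1, 5.0),                          # normal sample lying among incidents
    (5.0, 5.0), (5.0, 6.0), (6.0, 5.0),  # incident samples
]
y = [0, 0, 0, 0, 0, 0, 1, 1, 1]

X_bal, y_bal = enn_undersample(X, y, majority_label=0, k=3)
print(Counter(y_bal))
```

In this toy set the normal sample at (5.1, 5.0) is surrounded by incident samples, so the rule removes it, while all incident samples are retained. Editing only the majority class is one common way to use ENN for rebalancing; how aggressively it balances the classes depends on how much the two classes overlap.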

Info:

Pages: 2114-2119

Online since: July 2014

Copyright: © 2014 Trans Tech Publications Ltd. All Rights Reserved
