Study on the Generation Model of Weighted Visual Codebook for Action Recognition


Abstract:

Action recognition based on local spatio-temporal features has attracted increasing attention. The visual codebook used in recognition is usually generated by k-means clustering, and the resulting code words are treated as equally important; in practice, however, this assumption does not hold. In this paper, we propose two strategies to measure the importance of different code words and use them to improve the HIK-Kmeans algorithm. Experimental results show that the improved algorithm increases the accuracy of action recognition.
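Since the preview does not include the paper's method details, the sketch below only illustrates the general bag-of-words pipeline the abstract refers to: a k-means visual codebook built from local spatio-temporal descriptors, with per-code-word weights applied when forming video histograms. The IDF-style weighting, the descriptor dimensionality, and all function names are assumptions chosen for illustration; they are not the paper's two weighting strategies or its HIK-Kmeans improvement.

    # Minimal illustrative sketch (not the paper's implementation): build a
    # k-means visual codebook from local spatio-temporal descriptors and weight
    # code words by an IDF-style importance score. The weighting scheme is an
    # assumption for illustration only.
    import numpy as np
    from sklearn.cluster import KMeans

    def build_codebook(descriptors, num_words=200, seed=0):
        """Cluster local descriptors into a visual codebook with k-means."""
        kmeans = KMeans(n_clusters=num_words, n_init=10, random_state=seed)
        kmeans.fit(descriptors)
        return kmeans

    def idf_weights(kmeans, videos):
        """IDF-style weights: code words appearing in fewer videos get larger weight."""
        doc_freq = np.zeros(kmeans.n_clusters)
        for descs in videos:
            present = np.unique(kmeans.predict(descs))
            doc_freq[present] += 1
        return np.log((len(videos) + 1) / (doc_freq + 1)) + 1.0

    def video_histogram(kmeans, video_descriptors, weights=None):
        """Quantize one video's descriptors into a (weighted) bag-of-words histogram."""
        words = kmeans.predict(video_descriptors)
        hist = np.bincount(words, minlength=kmeans.n_clusters).astype(float)
        if weights is not None:
            hist *= weights  # emphasize more informative code words
        norm = hist.sum()
        return hist / norm if norm > 0 else hist

    # Example usage with random data standing in for real descriptors:
    rng = np.random.default_rng(0)
    videos = [rng.normal(size=(rng.integers(50, 150), 162)) for _ in range(5)]
    codebook = build_codebook(np.vstack(videos), num_words=50)
    w = idf_weights(codebook, videos)
    hists = np.array([video_histogram(codebook, v, weights=w) for v in videos])
    print(hists.shape)  # (5, 50): one weighted histogram per video

The weighted histograms would then feed a classifier such as an SVM; in the weighted setting, code words that occur indiscriminately across all videos contribute less to the representation than distinctive ones.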


Info:


Pages: 944-947


Online since: July 2013


Copyright: © 2013 Trans Tech Publications Ltd. All Rights Reserved

