An Adaptive Stacked Denoising Auto-Encoder Architecture for Human Action Recognition

Abstract:

In this paper, a stacked denoising auto-encoder architecture with an adaptive learning rate is presented for human action recognition based on skeleton features. First, a Kinect sensor is used to capture skeleton images and extract skeleton features. Then, an adaptive stacked denoising auto-encoder with three hidden layers is constructed and pre-trained in an unsupervised manner, yielding a set of trained weights. Finally, a neural network is constructed for action recognition, with the pre-trained weights used as its initial values in place of random initialization. Experimental results on a Kinect dataset of human actions collected in our experiments show that the proposed method achieves better robustness and accuracy than classic classification methods.
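
The pipeline summarized in the abstract (greedy denoising pre-training of three hidden layers, followed by a supervised classifier initialized with the pre-trained weights rather than random values) can be sketched roughly as follows. This is a minimal illustrative sketch, not the authors' implementation: the layer sizes, corruption level, placeholder data, and the use of Adam to stand in for the paper's adaptive learning-rate scheme are all assumptions.

# Minimal sketch (PyTorch): denoising pre-training + supervised fine-tuning.
# Layer sizes, corruption level, and Adam as a stand-in for the paper's
# adaptive learning rate are illustrative assumptions, not the paper's values.
import torch
import torch.nn as nn

sizes = [60, 128, 64, 32]      # skeleton-feature dimension + three hidden layers (assumed)
noise, n_classes = 0.3, 8      # corruption level / number of action classes (assumed)
encoders = [nn.Linear(sizes[i], sizes[i + 1]) for i in range(3)]

def pretrain_layer(enc, data, epochs=20):
    """Greedily pre-train one layer: corrupt the input, encode, decode,
    and minimize the reconstruction error against the clean input."""
    dec = nn.Linear(enc.out_features, enc.in_features)
    opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)
    for _ in range(epochs):
        corrupted = data * (torch.rand_like(data) > noise)      # masking noise
        loss = nn.functional.mse_loss(dec(torch.sigmoid(enc(corrupted))), data)
        opt.zero_grad(); loss.backward(); opt.step()
    return torch.sigmoid(enc(data)).detach()                    # input to the next layer

# Unsupervised, layer-wise pre-training on (placeholder) skeleton features.
features = torch.rand(256, sizes[0])
h = features
for enc in encoders:
    h = pretrain_layer(enc, h)

# Supervised classifier initialized with the pre-trained weights instead of
# random values, then fine-tuned on (placeholder) action labels.
classifier = nn.Sequential(
    encoders[0], nn.Sigmoid(),
    encoders[1], nn.Sigmoid(),
    encoders[2], nn.Sigmoid(),
    nn.Linear(sizes[3], n_classes),
)
labels = torch.randint(0, n_classes, (256,))
opt = torch.optim.Adam(classifier.parameters(), lr=1e-3)
for _ in range(50):
    loss = nn.functional.cross_entropy(classifier(features), labels)
    opt.zero_grad(); loss.backward(); opt.step()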

Info:

Periodical:

Pages:

403-409

Online since:

September 2014

Copyright:

© 2014 Trans Tech Publications Ltd. All Rights Reserved
