The next generation of intelligent surveillance systems should be able to recognize spontaneous human emotional states automatically. Compared with speaker recognition, sensor signal analysis, and fingerprint or iris recognition, facial expression and body gesture processing are the two main non-intrusive vision modalities, and they provide potential action information for video surveillance. In this work, we consider only one kind of facial expression, namely anxiety, together with gesture motion. First, facial expression and body gesture features are extracted. A Particle Swarm Optimization algorithm is then used for feature subset selection and parameter optimization. Finally, the selected features are used to train and test a cascaded Support Vector Machine to obtain a high-accuracy classifier.
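
The feature subset selection step can be illustrated with a minimal sketch of a binary Particle Swarm Optimization search. It is not the implementation used in this work. Each particle is a 0/1 mask over the features, and the swarm searches for the mask with the highest fitness. The function names, the swarm parameters (`w`, `c1`, `c2`, the velocity clamp), and the toy fitness function are all assumptions made for illustration. In the pipeline described above, the fitness would presumably be the accuracy of the Support Vector Machine trained on the masked features, and the parameter optimization and cascade stages are not shown.

```python
import math
import random


def binary_pso(fitness, n_features, n_particles=12, n_iters=30,
               w=0.7, c1=1.5, c2=1.5, seed=0):
    """Search for a 0/1 feature mask that maximizes fitness(mask)."""
    rng = random.Random(seed)
    # Random initial masks and zero velocities.
    pos = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_features for _ in range(n_particles)]
    # Personal bests start at the initial positions.
    pbest = [p[:] for p in pos]
    pbest_score = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_score[i])
    gbest, gbest_score = pbest[g][:], pbest_score[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_features):
                r1, r2 = rng.random(), rng.random()
                # Standard velocity update: inertia + cognitive + social terms.
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp the velocity so the sigmoid does not saturate.
                vel[i][d] = max(-4.0, min(4.0, v))
                # Binary PSO: the sigmoid of the velocity is the
                # probability that this bit is set to 1.
                prob = 1.0 / (1.0 + math.exp(-vel[i][d]))
                pos[i][d] = 1 if rng.random() < prob else 0
            score = fitness(pos[i])
            if score > pbest_score[i]:
                pbest[i], pbest_score[i] = pos[i][:], score
                if score > gbest_score:
                    gbest, gbest_score = pos[i][:], score
    return gbest, gbest_score


# Hypothetical stand-in for classifier accuracy: features 0, 3 and 5
# are treated as informative, every other selected feature is penalized.
INFORMATIVE = {0, 3, 5}


def toy_fitness(mask):
    hits = sum(mask[i] for i in INFORMATIVE)
    extras = sum(mask) - hits
    return hits - 0.5 * extras


best_mask, best_score = binary_pso(toy_fitness, n_features=8)
print(best_mask, best_score)
```

Replacing `toy_fitness` with a function that trains a classifier on the selected columns and returns its validation accuracy would turn this sketch into a wrapper-style feature selector. Classifier parameters could be optimized in the same search by appending them to the particle encoding, which is one common design and again an assumption here.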