To address the low accuracy of face recognition and speaker recognition in noisy environments, this paper presents a multi-biometric model that fuses face features and speech features, combining normalization with support vector machine (SVM) theory and building on research into feature-level fusion. Face features and speech features are first extracted by a pulse coupled neural network and VQ-SVM, respectively. The two feature sets are then fused at the feature level, and the distance between the test subject and each template subject is calculated from the fused feature. To reduce the computational cost and improve recognition performance, the matching distances are normalized and then classified by an SVM. Experiments on the ORL database show that, even as the signal-to-noise ratio declines, the recognition rate of the fused system remains clearly higher than that of a single-modality system in noisy environments, so the goal of identity recognition is achieved.
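The fusion, distance, and normalization steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the fusion rule, the distance metric, or the normalization scheme, so concatenation, Euclidean distance, and min-max scaling are assumptions made here for illustration. The function names and the toy feature vectors are hypothetical, and the feature extraction (pulse coupled neural network, VQ-SVM) and the final SVM classifier are omitted.

```python
import math


def fuse(face_feat, speech_feat):
    """Feature-level fusion by concatenation (assumed fusion rule)."""
    return list(face_feat) + list(speech_feat)


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def matching_distances(probe, templates):
    """Distance from the fused probe vector to every fused template vector."""
    return [euclidean(probe, t) for t in templates]


def min_max_normalize(values):
    """Scale distances to [0, 1] (assumed normalization scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All distances are equal; map them to 0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Toy, made-up feature vectors standing in for real face and speech features.
templates = [
    fuse([0.0, 0.0], [0.0]),
    fuse([3.0, 4.0], [0.0]),
    fuse([6.0, 8.0], [0.0]),
]
probe = fuse([0.0, 0.0], [0.0])

distances = matching_distances(probe, templates)
normalized = min_max_normalize(distances)
print(distances)   # [0.0, 5.0, 10.0]
print(normalized)  # [0.0, 0.5, 1.0]
```

In the pipeline the abstract describes, the normalized distance vector would then be the input to the SVM classifier. Normalizing first keeps the distances on a common scale regardless of the raw feature magnitudes.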