Combining Speech Enhancement and Cepstral Mean Normalization for LPC Cepstral Coefficients
A mismatch between training and testing conditions in noisy environments often causes a drastic drop in the performance of a speech recognition system. Robust feature coefficients can reduce this sensitivity to mismatch during the recognition stage. In this paper, we investigate the noise robustness of LPC cepstral coefficients (LPCC) when speech enhancement is combined with feature post-processing. At the front end, speech enhancement in the wavelet domain removes noise components from the noisy signal. The enhancement combines the discrete wavelet transform (DWT), wavelet packet decomposition (WPD), and multi-threshold processing, among other steps, to obtain the estimated speech. The feature post-processing applies cepstral mean normalization (CMN) to compensate, in the cepstral domain, for the signal distortion and residual noise in the enhanced signal. The performance of digit speech recognition systems is evaluated under noisy conditions using the NOISEX-92 database. The experimental results show that the presented method improves recognition performance in adverse noise environments compared with previous features.
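To illustrate the CMN post-processing step, here is a minimal Python (NumPy) sketch of per-utterance cepstral mean normalization. The abstract does not say which CMN variant the paper uses, so the per-utterance form, the function name `cepstral_mean_normalization`, and the `(num_frames, num_coeffs)` array layout are assumptions made for illustration only.

```python
import numpy as np


def cepstral_mean_normalization(cepstra):
    """Subtract the per-utterance mean from every cepstral coefficient.

    cepstra: array of shape (num_frames, num_coeffs), one row per frame
             (hypothetical layout chosen for this sketch).
    Returns an array of the same shape whose columns have zero mean.
    """
    cepstra = np.asarray(cepstra, dtype=float)
    # Mean over time (frames) for each coefficient index.
    mean = cepstra.mean(axis=0, keepdims=True)
    return cepstra - mean


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in for LPCC features: 50 frames, 12 coefficients.
    clean = rng.normal(size=(50, 12))
    # A time-invariant distortion appears as a constant cepstral offset.
    offset = rng.normal(size=(1, 12))
    distorted = clean + offset

    same = np.allclose(
        cepstral_mean_normalization(clean),
        cepstral_mean_normalization(distorted),
    )
    print("constant offset removed:", same)
```

The sketch shows why CMN suits this role: any distortion that stays constant over the utterance becomes an additive constant in the cepstral domain and is cancelled by the mean subtraction. Time-varying residual noise is only partly compensated.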
J. Yang, "Combining Speech Enhancement and Cepstral Mean Normalization for LPC Cepstral Coefficients", Key Engineering Materials, Vols. 474-476, pp. 349-354, 2011