Low Bit Rate Speech Coding Using Lattice Vector Quantization and Time-Scale Modification

Abstract:

This paper presents a low bit rate speech coder based on predictive lattice vector quantization (PLVQ) and time-scale modification (TSM). The coding model of the proposed vocoder is built on MELP, and the bit rate is reduced by combining the PLVQ and TSM techniques. PLVQ is used to encode the line spectrum pair (LSP) parameters of the speech. It has lower implementation complexity than multi-stage vector quantization (MSVQ) and requires no memory for codebook storage. On our speech database, PLVQ saves up to 4 bits/frame compared with unstructured-codebook MSVQ. TSM changes the speed of a speech signal while preserving its perceptual characteristics. By adding TSM as a pre-processing and post-processing stage, speech coding at a bit rate of about 1.1 kbps can be achieved easily without modifying the vocoder structure.
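The claim that a lattice quantizer needs no stored codebook can be illustrated with a minimal Python sketch. It is not the coder described in the paper: the abstract does not name the lattice, the predictor, or the step size, so the D_n lattice, the first-order predictor coefficient `rho`, the step size `scale`, and all function names below are assumptions chosen for illustration. The nearest-point search follows the well-known rounding rule of Conway and Sloane for D_n (integer vectors with an even coordinate sum).

```python
import numpy as np

def nearest_zn(x):
    """Nearest point of Z^n: round every coordinate to the nearest integer."""
    return np.floor(np.asarray(x, dtype=float) + 0.5)

def nearest_dn(x):
    """Nearest point of D_n = {v in Z^n : sum(v) is even}.

    No codebook is searched or stored; the answer comes from rounding
    plus a single parity correction.
    """
    x = np.asarray(x, dtype=float)
    f = nearest_zn(x)
    if int(f.sum()) % 2 == 0:
        return f
    # Odd parity: take the coordinate with the largest rounding error
    # and round it the other way, which restores even parity.
    err = x - f
    k = int(np.argmax(np.abs(err)))
    g = f.copy()
    g[k] += 1.0 if err[k] >= 0 else -1.0
    return g

def plvq_encode(lsp, prev_recon, scale=0.02, rho=0.5):
    """Hypothetical predictive lattice VQ step for one frame.

    scale (lattice step size) and rho (predictor coefficient) are
    illustrative values, not taken from the paper.
    """
    lsp = np.asarray(lsp, dtype=float)
    residual = lsp - rho * np.asarray(prev_recon, dtype=float)
    point = nearest_dn(residual / scale)   # lattice point to be indexed
    recon = rho * prev_recon + scale * point
    return point.astype(int), recon

def plvq_decode(point, prev_recon, scale=0.02, rho=0.5):
    """Decoder mirror of plvq_encode; uses the same prediction."""
    return rho * np.asarray(prev_recon, dtype=float) + scale * np.asarray(point)
```

In a complete coder the lattice point would still have to be mapped to a bit index (an indexing algorithm) and the step size tuned to the target bit allocation; both are outside this sketch.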

Info:

Periodical:

Advanced Materials Research (Volumes 383-390)

Pages:

5111-5116

Online since:

November 2011

Copyright:

© 2012 Trans Tech Publications Ltd. All Rights Reserved
