Research and Implementation of a Low-Rate Speech Coding Algorithm Based on WI
Abstract
Speech coders that deliver toll quality at rates of 4 kbps and below are an active research topic, with promising applications in videophony, mobile communications, and personal communications. This work develops a 3.75 kbps speech coder based on the waveform interpolation (WI) coding scheme and simulates it on a computer in floating-point C.
    The WI coder represents the speech signal as an evolving characteristic waveform (CW). The CW is decomposed into a slowly evolving waveform (SEW), which represents the quasi-periodic component of speech, and a rapidly evolving waveform (REW), which represents the non-periodic component. The two are quantized separately.
    Informal subjective listening tests indicate that, for clean input speech, the WI coder without parameter quantization compares favorably with the 8 kbps G.729 standard. With quantization, the speech quality of the 3.75 kbps WI coder approaches that of G.723.1 at 5.3 kbps.
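The SEW/REW split described in the abstract can be illustrated with a minimal C sketch. It is not the coder's actual implementation: it assumes the characteristic waveforms have already been extracted and aligned, stores them in a flat array indexed by update instant and coefficient, and uses a simple moving average as a stand-in for the low-pass filter applied along the evolution (time) axis. The function name `wi_decompose` and all parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Sketch of a CW decomposition (illustrative assumptions throughout).
 *
 * cw  : num_cw aligned characteristic waveforms, each with cw_len
 *       coefficients, stored row by row: cw[t * cw_len + k].
 * sew : output, slowly evolving waveform (low-pass along t).
 * rew : output, rapidly evolving waveform (the remainder, cw - sew).
 * taps: odd length of the moving-average filter; edges are clamped.
 */
void wi_decompose(const double *cw, double *sew, double *rew,
                  size_t num_cw, size_t cw_len, size_t taps)
{
    assert(num_cw > 0 && taps % 2 == 1);
    const long half = (long)(taps / 2);

    for (size_t t = 0; t < num_cw; ++t) {
        for (size_t k = 0; k < cw_len; ++k) {
            double acc = 0.0;
            /* Average coefficient k over neighbouring update instants. */
            for (long d = -half; d <= half; ++d) {
                long u = (long)t + d;
                if (u < 0) u = 0;                           /* clamp at start */
                if (u >= (long)num_cw) u = (long)num_cw - 1; /* clamp at end  */
                acc += cw[(size_t)u * cw_len + k];
            }
            sew[t * cw_len + k] = acc / (double)taps;
            /* Whatever the low-pass filter removes is assigned to the REW. */
            rew[t * cw_len + k] = cw[t * cw_len + k] - sew[t * cw_len + k];
        }
    }
}
```

With this split, a coefficient that stays constant from one waveform to the next ends up entirely in the SEW, while one that flips sign at every update is mostly pushed into the REW, which is why the two parts can be quantized with different resolution and update rates.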
