Research on Optimal Codebook Search Algorithms in AMR-WB Speech Coding
Abstract
In modern communication systems, the Adaptive Multi-Rate (AMR) codec used in GSM and WCDMA is a multi-rate narrowband speech codec. With the rapid development of digital communication networks, the growing demand for audio-visual services, and users' pursuit of natural, face-to-face communication quality, its limited bandwidth can no longer satisfy the demand for high-quality speech services. Research on wideband speech compression coding is therefore necessary.
     To meet the demand for high-quality speech services, 3GPP/ETSI proposed the Adaptive Multi-Rate Wideband (AMR-WB) codec, which ITU-T subsequently selected as G.722.2, its standard for wideband speech coding at around 16 kbit/s. By extending the audio bandwidth to 7 kHz and the sampling frequency to 16 kHz, AMR-WB breaks through the bandwidth limitation of narrowband speech codecs and considerably improves speech naturalness and the handling of music. AMR-WB offers high speech quality, a low average bit rate, and good adaptability. It is the first speech coding system that can be used for both wireless and wireline services, and it has broad application prospects in both fields.
     This thesis analyzes the AMR-WB algorithm systematically. It studies in depth the encoder's vector quantization, adaptive codebook search, and fixed codebook search modules and optimizes the corresponding algorithms. It also examines high-band processing and the decoding principle, and simulates the AMR-WB algorithm. Standard speech files (16 kHz, 16 bit) from the TIMIT standard English speech database were encoded and decoded, and the decoded speech was evaluated with subjective listening tests and objective w-PESQ tests. For modes at 12.65 kbit/s and above, the waveform of the synthesized speech is essentially consistent with that of the original, the synthesized speech is hard to distinguish from the original by ear, and the w-PESQ scores all exceed 4.0. For the 6.60 kbit/s and 8.85 kbit/s modes, there is slight distortion in both waveform and perceived quality, and the w-PESQ scores are all above 3.5, which meets the communication quality standard; the synthesized speech also has good naturalness and listening comfort.
     For the wideband speech coding standard G.722.2 (AMR-WB), this thesis implements improved versions of vector quantization, adaptive codebook search, and algebraic codebook search. Each improved algorithm was tested through encoding and decoding, its speech quality was evaluated, and the results were compared with the original G.722.2 algorithm. The improved vector quantization and the improved adaptive codebook search both significantly reduce the computational complexity of codebook search and the encoding time. Both fast codeword search algorithms are effective ways to achieve fast encoding and have practical value. Compared with the original algebraic codebook search, the improved algebraic codebook search improves on the speech quality of the standard algorithm and has reference value.
