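Research on Advanced Audio Coding (AAC) in Medium-Wave Digital Amplitude Modulation Broadcasting

Original English title: Research of Advanced Audio Coding in the Digital Amplitude Modulation Broadcast
Author: Lu Yang (陆泱)
Thesis level: Master's
Discipline: Signal and Information Processing
Keywords: AAC; DAMB; psychoacoustic model algorithm; quantization and coding; fast algorithm; error protection
Degree year: 2005
Supervisor: Wu Lenan (吴乐南)
Discipline code: 081002
Degree-granting institution: Southeast University (东南大学)
Thesis submission date: 2005-03-01
Chair of the defense committee: Yu Zhaoming (余兆明)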

Abstract

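This thesis studies Advanced Audio Coding (AAC) as the audio source coder of a medium-wave Digital Amplitude Modulation Broadcasting (DAMB) system. The European standard DRM is currently the only digital broadcasting system that can serve the long-, medium-, and short-wave bands at the same time. It uses OFDM and modulates both amplitude and phase, which amounts to single-sideband modulation, so the full 10 kHz channel bandwidth can be used and the transmitted signal bandwidth can reach 10 kHz. This, however, requires modifying the transmitter. The DAMB system considered here leaves existing broadcast transmitters unchanged and keeps double-sideband amplitude modulation. Its advantages are wide-area signal coverage and a receiver that is simple, easy to implement, and inexpensive. The drawback is that only half of the bandwidth is actually usable and audio quality suffers (from limited bandwidth, noise, and channel fading), so the scheme is suitable only for medium-wave digital AM broadcasting, where channel conditions are relatively good. Compared with DRM source coding in the same channel bandwidth, the DAMB source coder must therefore deliver a much lower output bit rate at a much higher compression ratio. AAC is adopted as its audio coding scheme.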
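     From the input signal, known psychoacoustic rules yield an estimate of the current (time-dependent) masking threshold; the AAC system uses psychoacoustic model 2. The masking threshold gives the signal-to-mask ratio (SMR), an estimate of how much quantization noise the input signal can mask. In the quantization stage, the SMR is used to minimize the audible distortion of the quantized signal at any given data rate. In the coding stage, the control coefficients and the quantized coefficients are Huffman-coded to produce the coded bitstream.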
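As a rough illustration of how a masking threshold feeds the quantizer, the sketch below computes a signal-to-mask ratio in dB and checks whether quantization noise stays under the threshold. Everything in it is an assumption made for illustration: the function names are hypothetical, the energies are made-up numbers, and the uniform-quantizer noise model (noise power = step²/12) is a textbook simplification, not psychoacoustic model 2 and not the quantizer AAC actually uses.

```python
import math


def smr_db(signal_energy, mask_threshold):
    """Signal-to-mask ratio in dB for one band (energies in linear units)."""
    return 10.0 * math.log10(signal_energy / mask_threshold)


def nmr_db(noise_energy, mask_threshold):
    """Noise-to-mask ratio in dB; a negative value means the noise is masked."""
    return 10.0 * math.log10(noise_energy / mask_threshold)


def largest_masked_step(mask_threshold):
    """Largest uniform quantizer step whose noise power (step^2 / 12)
    does not exceed the masking threshold. Simplified model, see lead-in."""
    return math.sqrt(12.0 * mask_threshold)


def is_noise_masked(step, mask_threshold):
    """True if the modelled quantization noise lies at or below the threshold."""
    return step ** 2 / 12.0 <= mask_threshold


if __name__ == "__main__":
    # Hypothetical band: signal energy 1000, masking threshold 10.
    print("SMR (dB):", smr_db(1000.0, 10.0))
    print("NMR (dB) for noise energy 5:", nmr_db(5.0, 10.0))
    print("largest masked step:", largest_masked_step(10.0))
```

A band with a high SMR tolerates little noise relative to its signal and so needs a finer step (more bits); a band with a low SMR can be quantized coarsely, which is where the bit-rate saving comes from.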
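     Earlier audio coding schemes compress the raw data by traditional means such as lowering the sampling rate, narrowing the audio bandwidth, and reducing the number of quantization levels, which reaches a compression ratio of about 15:1. Overusing these methods, however, degrades the intelligibility and recognizability of the perceived audio. AAC uses a psychoacoustic model and takes perceived audio quality as its coding criterion, so it can reach a higher compression ratio while preserving perceived quality.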
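     This thesis studies the optimization of a software AAC encoder and improves its audio bandwidth control, psychoacoustic model algorithm, and quantization and coding modules. To suit transmission over medium-wave digital AM broadcasting, the AAC frame length was changed, a corresponding fast algorithm was used, and a new error-protection tool was added.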
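The abstract states that the frame length was changed and a matching fast algorithm used, but gives neither the new length nor the algorithm. As a hedged illustration only: AAC's filterbank is an MDCT, so the sketch below implements a direct O(N²) MDCT/IMDCT pair with the frame length as a parameter. It is not a fast algorithm and not the thesis's implementation; the function names and the sine window are assumptions. What it shows is that windowed overlap-add still reconstructs the signal when the frame length changes, which is the property any modified frame length has to preserve.

```python
import math


def sine_window(n_half):
    """Sine window of length 2*n_half; satisfies w[n]^2 + w[n+n_half]^2 = 1."""
    return [math.sin(math.pi * (n + 0.5) / (2 * n_half)) for n in range(2 * n_half)]


def _phase(n, k, n_half):
    # Shared cosine argument of the MDCT and IMDCT kernels.
    return math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5)


def mdct(block, n_half):
    """Direct MDCT: 2*n_half input samples -> n_half coefficients."""
    return [sum(block[n] * math.cos(_phase(n, k, n_half)) for n in range(2 * n_half))
            for k in range(n_half)]


def imdct(coeffs, n_half):
    """Direct IMDCT: n_half coefficients -> 2*n_half aliased samples.
    The 2/n_half scale is the one that matches a windowed analysis/synthesis."""
    return [2.0 / n_half * sum(coeffs[k] * math.cos(_phase(n, k, n_half))
                               for k in range(n_half))
            for n in range(2 * n_half)]


def analysis_synthesis(signal, n_half):
    """Window, transform, inverse-transform, window again, and overlap-add
    with a hop of n_half samples."""
    w = sine_window(n_half)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - 2 * n_half + 1, n_half):
        block = [signal[start + n] * w[n] for n in range(2 * n_half)]
        rec = imdct(mdct(block, n_half), n_half)
        for n in range(2 * n_half):
            out[start + n] += rec[n] * w[n]
    return out


def max_interior_error(signal, n_half):
    """Largest reconstruction error over samples covered by two blocks
    (the first and last n_half samples lack an overlap partner)."""
    out = analysis_synthesis(signal, n_half)
    last = ((len(signal) - 2 * n_half) // n_half) * n_half + n_half
    return max(abs(out[i] - signal[i]) for i in range(n_half, last))


if __name__ == "__main__":
    test_signal = [math.sin(0.3 * i) + 0.5 * math.cos(1.1 * i) for i in range(96)]
    for n_half in (8, 6):  # two hypothetical frame lengths
        print(n_half, max_interior_error(test_signal, n_half))
```

A production encoder would replace the two direct sums with an FFT-based fast MDCT sized for the chosen frame length; the reconstruction check stays the same.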
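     Using Visual C++ under Windows XP, the MPEG-4 AAC bit rate was reduced to the 10/16 kbps that this system can support. The decoded audio bandwidth is 50 Hz to 7 kHz, matching AM broadcasting, which keeps the audio quality as good as possible. On the basis of theoretical analysis, the MPEG-4 AAC parameters were adjusted to fit the modified audio bandwidth and output bit rate.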
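The bit-rate figures above can be put into perspective with a little arithmetic. The abstract gives the target rates (10 and 16 kbps) and the upper audio frequency (7 kHz) but not the source format, so the 16 kHz, 16-bit mono PCM input and the 1024-sample frame used below are assumptions chosen for illustration, and the helper names are hypothetical.

```python
def min_sample_rate_hz(audio_bandwidth_hz):
    """Nyquist: the sampling rate must be at least twice the audio bandwidth."""
    return 2 * audio_bandwidth_hz


def pcm_bit_rate_bps(sample_rate_hz, bits_per_sample, channels=1):
    """Bit rate of uncompressed PCM audio."""
    return sample_rate_hz * bits_per_sample * channels


def compression_ratio(source_bps, coded_bps):
    """Ratio of the uncompressed bit rate to the coded bit rate."""
    return source_bps / coded_bps


def bits_per_frame(coded_bps, frame_length, sample_rate_hz):
    """Average bit budget of one frame of frame_length samples."""
    return coded_bps * frame_length / sample_rate_hz


if __name__ == "__main__":
    print("minimum sampling rate for 7 kHz audio:", min_sample_rate_hz(7000), "Hz")
    source = pcm_bit_rate_bps(16000, 16)  # assumed source: 16 kHz, 16-bit, mono
    for coded in (16000, 10000):
        print(coded, "bps:",
              "ratio", compression_ratio(source, coded),
              "bits/frame", bits_per_frame(coded, 1024, 16000))
```

Under these assumptions the required ratios are 16:1 at 16 kbps and 25.6:1 at 10 kbps, both above the roughly 15:1 that the abstract attributes to the traditional methods.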
