Research on Key Technologies for High-Fidelity, Low-Bitrate Audio Coding
Abstract
Key technologies for high-quality, low-bitrate audio coding have developed considerably in recent years, but the strong growth of digital audio services urgently demands higher audio quality at lower bitrates. Audio compression coding has lagged somewhat behind this demand, so continued research in this area is of real practical significance.
     This thesis proposes a key technology for high-fidelity, low-bitrate audio coding: high-frequency reconstruction based on optimal band selection. It exploits the correlation between high-frequency and low-frequency components, combined with tonality and harmonic theory, so that only a small number of parameters need to be transmitted for the decoder to reconstruct, from the low-frequency components, high-frequency components whose characteristics closely resemble those of the original signal. Technically, it is an adaptive high-frequency reconstruction technique that strengthens the analysis and detection of audio characteristics, so that signals with different characteristics are each handled by a dedicated replication strategy. In total, three band replication strategies suited to different audio characteristics are proposed, together with one extension of these strategies for low bitrates. A maximum-correlation criterion is used to select, for the high-frequency components, the best-matching low-frequency components to replicate. In terms of implementation, high-quality reconstruction of the high-frequency signal requires only two steps: band replication and envelope adjustment.
     Testing and analysis show that, compared with existing high-frequency reconstruction techniques, the proposed technique reconstructs the harmonics of the original high-frequency components more accurately and more completely. The reconstructed audio sounds full and tonally pleasant, which makes the technique well suited to high-fidelity, low-bitrate audio coding. Some shortcomings remain at the current stage of development, but they can be addressed in follow-up research.
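The two steps named in the abstract, band replication guided by a maximum-correlation criterion followed by envelope adjustment, can be illustrated with a minimal sketch. This is not the thesis's algorithm: the abstract does not specify the three replication strategies, the tonality analysis, or the transmitted parameter format. The sketch assumes that each band is a short list of spectral magnitudes, that correlation is measured as a normalized inner product, and that the envelope is a single energy value per band. All function names are hypothetical.

```python
import math


def correlation(a, b):
    """Normalized correlation between two equal-length magnitude vectors."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den > 0 else 0.0


def band_energy(band):
    """Sum of squared magnitudes; used here as a one-value envelope (assumption)."""
    return sum(x * x for x in band)


def select_source_band(low_bands, high_band):
    """Encoder side: index of the low band most correlated with the high band."""
    scores = [correlation(band, high_band) for band in low_bands]
    return max(range(len(low_bands)), key=scores.__getitem__)


def reconstruct_high_band(low_bands, index, target_energy):
    """Decoder side: copy the chosen low band, then rescale it to the target energy."""
    source = low_bands[index]
    energy = band_energy(source)
    gain = math.sqrt(target_energy / energy) if energy > 0 else 0.0
    return [gain * x for x in source]


# Toy data: three candidate low bands and one original high band.
low_bands = [
    [1.0, 0.1, 0.1, 1.0],
    [0.1, 1.0, 0.1, 0.1],
    [0.5, 0.5, 0.5, 0.5],
]
high_band = [0.05, 0.5, 0.05, 0.05]  # same shape as low band 1, half the amplitude

# The encoder would transmit only the band index and the envelope energy.
index = select_source_band(low_bands, high_band)
rebuilt = reconstruct_high_band(low_bands, index, band_energy(high_band))
```

In this toy case the side information is one index and one energy value per high band, which is the sense in which only a few parameters need to be transmitted. A real codec would work on filter-bank or transform coefficients frame by frame and would use a finer envelope than a single energy.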
