Research on Key Technologies of Audio Information Hiding and Information Security Applications of Recognition Technology
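Abstract
Since the 1990s, the development of the Internet and multimedia technology has brought massive growth in digital works, and their distribution has expanded exponentially. This gave the ancient art of steganography a new carrier to attach to and produced a new research direction, information hiding, which is now a major focus of the information security field. Information hiding in digital audio is an important part of information hiding. Because digital audio, especially digital music and voice communication, is close to everyday life, information hiding in digital audio has good application prospects. It can be used for secret communication by confidential and intelligence departments, and also for civil purposes such as personal privacy protection, secure use of the Internet, and rights protection for digital works. Research on it therefore has practical, social and economic value as well as significance for national security.
     Speech is an important branch of audio. Research on information hiding in speech has to draw on the study of speech characteristics, so it inevitably overlaps with speech recognition. The methods and results of speech recognition can be combined with information hiding research to serve information security applications. Exploring the information security applications of recognition technology and audio information hiding, and studying their practical fields and application scenarios, is important for realizing their economic value.
     This thesis views the process of spoken communication from the perspective of information hiding and reveals an analogy between speech recognition and audio information hiding. It studies audio information hiding and its intersection with speech recognition, and obtains the following original results: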
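     1. An information hiding method that uses the redundancy after the endpoints of Chinese speech is proposed. The final phoneme of a Chinese syllable is almost always voiced. In acoustic terms, voiced sound can be understood as the output produced when a quasi-periodic pulse train excites the vocal tract. This property of Chinese speech is used for endpoint detection, to separate "sound" from "no sound". This thesis uses that endpoint detection method to locate periodic redundancy of speech in the time domain, and hides information in the redundancy.
     2. An information hiding method that uses Mel-frequency cepstral coefficients (MFCC) is proposed. MFCC is an important parameter in speech recognition. To use MFCC as the hiding location, this thesis resolves three difficulties: (1) the criterion for selecting MFCC parameters; (2) how to solve backwards for the log energies from the modified MFCC parameters; (3) how to invert the Mel-frequency filter bank. On this basis, information hiding using MFCC is implemented.
     3. A method for hiding information in Advanced Audio Coding (AAC) is proposed. AAC encoding includes a step that trial-compresses with different codebooks. Within a scalefactor band, the quantized frequency-domain values may, with some probability, reach the same optimal (shortest) bit length under more than one Huffman codebook. When this happens, the choice of codebook is used to hide a 0 or 1 bit.
     4. A Chinese speech verification code is constructed. It relies on the research finding that "the performance of the vast majority of speech recognition systems inevitably degrades sharply in noisy environments", and on characteristics of Chinese such as the short duration of its pronunciation. It offers an optional security solution for public customer login to online banking, and the work focuses on matching the synthesis speed to the needs of a web application. This is an important information security application of speech recognition results in this thesis.
     5. A combined application of audio watermarking and speech recognition is successfully tried. In an automated voice service, a watermark is embedded in the automated speech using the method of result 1. The customer's voice terminal confirms the automated speech by detecting the watermark, then invokes the speech recognition engine to carry out the interaction between the customer's speech and the automated speech.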
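Result 1 rests on voiced speech being quasi-periodic. The abstract does not describe the thesis's detector, so the following is only a minimal sketch of one common way to test a frame for that property: a normalized autocorrelation peak. The function names, the lag range (20 to 160 samples, roughly 50 to 400 Hz at an assumed 8 kHz sampling rate) and the 0.5 threshold are illustrative assumptions, not values from the thesis.

```python
import math
import random


def normalized_autocorr_peak(frame, min_lag, max_lag):
    """Largest autocorrelation value over the lag range, divided by frame energy."""
    energy = sum(s * s for s in frame)
    if energy == 0.0:
        return 0.0  # digital silence has no periodicity to measure
    best = 0.0
    for lag in range(min_lag, max_lag + 1):
        r = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        best = max(best, r / energy)
    return best


def is_voiced(frame, min_lag=20, max_lag=160, threshold=0.5):
    """Assumed decision rule: a strong autocorrelation peak means quasi-periodic."""
    return normalized_autocorr_peak(frame, min_lag, max_lag) >= threshold


# Synthetic 40 ms frames at an assumed 8 kHz rate (320 samples each).
tone = [math.sin(2 * math.pi * 200 * n / 8000) for n in range(320)]  # periodic
rng = random.Random(0)
noise = [rng.uniform(-1.0, 1.0) for _ in range(320)]                 # aperiodic
silence = [0.0] * 320
```

A frame classified as voiced by such a test repeats itself from one pitch period to the next, which is the kind of time-domain periodic redundancy the method hides data in.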
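     At present there is still a great deal of room for research on audio information hiding, especially hiding in formatted audio media, hiding combined with recognition technology, and hiding combined with low-bit-rate speech coding. In addition, research on domain-specific and integrated applications of speech recognition and audio information hiding needs to be strengthened.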
With the development of the Internet and multimedia since the 1990s, the number of digital products has grown enormously and their distribution has expanded exponentially. This has given the ancient technique of steganography a new carrier and a new life, and it has formed a new research field called information hiding. Information hiding is now an important focus of information security. Since digital audio, especially digital music and speech communication, is close to people's lives, information hiding in digital audio holds good promise for application. Audio information hiding can be used not only for secret communication by intelligence or confidential departments, but also for civil purposes such as personal privacy protection, secure use of the Internet and rights protection for digital products, so research on it has practical and economic value and matters for national security.
     Speech is an important branch of audio. Studying information hiding in speech requires studying the characteristics of speech, so the work inevitably intersects with speech recognition. Methods and results from speech recognition technology can be combined with information hiding for information security applications. Researching the information security applications of speech recognition and of information hiding in speech, together with their application fields and scenarios, is of great importance for realizing their economic value.
     This paper looks at the process of spoken conversation from the viewpoint of information hiding and identifies an analogy between speech recognition and audio information hiding. It studies audio information hiding and its combination with speech recognition, and presents the following original results.
     1. An information hiding method using the redundancy after the endpoint of Chinese speech is proposed. In Chinese, the phoneme at the end of a syllable is almost always voiced, and voiced speech can be regarded as the output of a quasi-periodic pulse sequence acting on the vocal tract. This property of Chinese speech can be used in endpoint detection to distinguish "sound" from "no sound". Using this endpoint detection method, the periodic redundancy of speech in the time domain is located, and data is hidden in that redundancy.
     2. An information hiding method using MFCC is proposed. MFCC is an important parameter for speech recognition. In order to hide data in MFCC, this paper answers the following questions: (1) the criterion for MFCC selection; (2) how to obtain the log energy from the modified MFCC; (3) how to invert the Mel-frequency filter bank. Based on these answers, data can be hidden in MFCC successfully.
     3. An information hiding method in AAC is proposed. AAC encoding includes a trial step that encodes with different codebooks. Within a scalefactor band (sfb), the quantized frequency values may reach the same shortest coded length under different Huffman codebooks; where they do, the codebook selection can be used to hide a bit 0 or 1.
     4. A Chinese speech verification code is constructed, based on the short duration of Chinese pronunciation and on the research result that the performance of most speech recognition systems inevitably drops sharply in noisy surroundings. The work mainly solves the problem of matching synthesis speed to the WEB application, and offers an optional security solution for ordinary users logging on to Internet banking. It is an important information security application of speech recognition results.
     5. A sample application combining audio watermarking with speech recognition is carried out. In an automatic speech service, an audio watermark is embedded in the automatic speech using the method of result 1. The customer's speech terminal detects the watermark to confirm the automatic speech and then calls the speech recognition engine, so that the customer's speech can interact with the automatic speech.
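Result 2 depends on getting from modified MFCC back to log energies. As a rough illustration only: if the cepstral coefficients are taken as an orthonormal DCT-II of the log filter-bank energies and all of them are kept, the DCT can be inverted exactly. The sketch below assumes that setup. The embedding rule (parity of a quantization index, with an assumed step of 0.5) is a stand-in chosen for the example, not the thesis's selection criterion, and inverting the Mel filter bank itself (question 3) is not covered.

```python
import math


def _scale(k, n):
    """Orthonormal DCT-II scaling factor for coefficient k of n."""
    return math.sqrt((1.0 if k == 0 else 2.0) / n)


def dct2(x):
    """Log filter-bank energies -> cepstral coefficients (orthonormal DCT-II)."""
    n = len(x)
    return [_scale(k, n) * sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n)
                               for i in range(n))
            for k in range(n)]


def idct2(c):
    """Cepstral coefficients -> log filter-bank energies (exact inverse of dct2)."""
    n = len(c)
    return [sum(_scale(k, n) * c[k] * math.cos(math.pi * (i + 0.5) * k / n)
                for k in range(n))
            for i in range(n)]


def embed_bit(coeff, bit, step=0.5):
    """Move coeff to the centre of a quantization cell whose index parity is bit."""
    q = math.floor(coeff / step)
    if q % 2 != bit:
        q += 1
    return (q + 0.5) * step


def extract_bit(coeff, step=0.5):
    """Read the hidden bit back as the parity of the quantization index."""
    return math.floor(coeff / step) % 2


def embed_in_log_energies(log_e, index, bit):
    """Hide one bit in cepstral coefficient `index`, then return new log energies."""
    c = dct2(log_e)
    c[index] = embed_bit(c[index], bit)
    return idct2(c)


def extract_from_log_energies(log_e, index):
    """Recompute the cepstrum from the received log energies and read the bit."""
    return extract_bit(dct2(log_e)[index])
```

Placing the modified coefficient at the centre of its cell leaves a margin of half a step on either side, so the small rounding error of the inverse and forward transforms does not flip the bit. A practical MFCC front end usually keeps only the lowest coefficients, in which case the inversion is no longer exact.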
     Audio information hiding research still has wide room to explore, especially hiding in formatted audio media, hiding combined with speech recognition and hiding combined with low-bit-rate speech coding. Moreover, research on field applications and integrated applications of speech recognition and audio information hiding should be strengthened.
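The codebook-selection idea in result 3 can be sketched as follows. This is a toy model under stated assumptions, not the thesis's implementation: each scalefactor band is represented by a hypothetical dictionary mapping a Huffman codebook index to the bit length that codebook would need for the band, and the lengths are invented for the example. A real encoder would compute them from the AAC Huffman tables. The function names are also illustrative.

```python
def tied_best_codebooks(lengths):
    """Codebook indices that reach the minimum coded length, in ascending order."""
    best = min(lengths.values())
    return sorted(cb for cb, n in lengths.items() if n == best)


def hide_bits(bands, bits):
    """Pick a codebook per band; a band with a tie carries one message bit.

    Returns the chosen codebook per band and the number of bits embedded.
    """
    chosen, pos = [], 0
    for lengths in bands:
        tied = tied_best_codebooks(lengths)
        if len(tied) >= 2 and pos < len(bits):
            chosen.append(tied[bits[pos]])  # bit 0 -> first tie, bit 1 -> second
            pos += 1
        else:
            chosen.append(tied[0])          # no tie or no message left: default choice
    return chosen, pos


def recover_bits(bands, chosen):
    """Recompute the ties and read one bit from every band that had one.

    Bands that defaulted after the message ran out read as 0, so the
    receiver is assumed to know the message length.
    """
    bits = []
    for lengths, cb in zip(bands, chosen):
        tied = tied_best_codebooks(lengths)
        if len(tied) >= 2:
            bits.append(tied.index(cb))
    return bits


# Invented coded lengths (in bits) for three scalefactor bands.
bands = [{1: 14, 2: 14, 3: 17}, {5: 20, 6: 22}, {7: 31, 8: 31}]
chosen, used = hide_bits(bands, [1, 0])
recovered = recover_bits(bands, chosen)
```

Because the hidden bit only selects among codebooks that already give the same shortest length, the coded size of each band is unchanged. The decoder can repeat the tie search because it has the same quantized values after Huffman decoding.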
