Research and Implementation on Information Hiding in Low Bit-rate Speech Codec

Chinese title: 低速率语音编码中的信息隐藏研究与实现
Author: Xiao Bo (肖博)
Degree level: Master's
Discipline: Information and Communication Engineering
Keywords: Complementary Neighbor Vertex; Low Bit-rate Speech Codec; Information Hiding; Vector Quantization; Codebook Dividing
Degree year: 2009
Advisor: Huang Yongfeng (黄永峰)
Discipline code: 081001
Degree-granting institution: Tsinghua University
Thesis submission date: 2009-05-01

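Abstract

Information hiding in low bit-rate speech coding is a difficult problem in the information hiding field and the basis for building information-hiding communication systems. A main approach embeds data by modifying the vector quantization stage of the encoder, where the quality of the codebook grouping determines how covert the hiding is. Based on the QIM method and graph theory, this thesis proposes an optimized codebook grouping algorithm, the Complementary Neighbor Vertex (CNV) algorithm. For any given codebook, the algorithm guarantees that every codeword and its nearest-neighbor codeword fall into different groups, and that the maximum local additional quantization distortion attains its minimum over all possible partitions.
     On the theoretical side, the thesis first builds a graph-theoretic model and proves the two conclusions above. It then shows by counterexample that a grouping in which every codeword and its two or three nearest neighbors pairwise belong to different groups does not exist for every codebook. To determine whether a given codebook admits such a grouping, a search method based on backtracking is proposed. In addition, applying the CNV algorithm again to each of the two groups yields an approximately optimized four-group partition, which increases the hiding capacity.
     On the practical side, the CNV algorithm is used to partition the LPC-coefficient vector quantization codebooks of three codecs, iLBC, G.729, and G.723.1, and the grouping results are presented and discussed. The backtracking search shows that these codebooks cannot be split into more groups under CNV-like constraints, but the approximately optimized four-group method works well.
     Extensive experiments show that the mean quantization error of the CNV partition is smaller than that of a random partition, and that CNV tends to give a smaller maximum quantization error. When the partitions are applied to actual information hiding, subjective evaluation shows that listeners cannot tell synthesized speech carrying secret data from ordinary synthesized speech. With mean LPC cepstral distortion as the objective criterion, the CNV partition yields better speech quality, and the approximate four-group method also outperforms methods that directly replace the quantization result.
     Finally, the CNV method is applied to the authors' earlier VoIP-based information-hiding communication system. Practical use confirms its effectiveness: the transmission rate agrees with the theoretical analysis and meets application requirements, making it the most suitable hiding method found so far.
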
Information Hiding (IH) in low bit-rate speech codecs is a difficult problem in IH research as well as the foundation for building IH communication systems. One main embedding method is to modify the Vector Quantization (VQ) procedure during speech encoding, where the covertness of IH is determined by the codebook partition. Based on the QIM method and graph theory, this paper proposes an optimized codebook partition algorithm, the Complementary Neighbor Vertex (CNV) method. For any given codebook, the CNV method guarantees that every codeword and its nearest neighbor belong to different parts. It also ensures that the maximum local extra quantization distortion due to IH attains its minimum over all partition patterns.
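
The first guarantee above can be illustrated with a small sketch. It is not the thesis's CNV procedure, whose details the abstract does not give; it only assumes Euclidean distance, breaks ties by lowest index, and 2-colors the graph that links each codeword to its nearest neighbor. The function names and the toy codebook are hypothetical.

```python
from collections import deque
from math import dist  # Euclidean distance, Python 3.8+


def nearest_neighbor(codebook, i):
    """Index of the codeword closest to codebook[i]; ties go to the lowest index."""
    return min((j for j in range(len(codebook)) if j != i),
               key=lambda j: (dist(codebook[i], codebook[j]), j))


def two_way_partition(codebook):
    """Label each codeword 0 or 1 so that it differs from its nearest neighbor."""
    n = len(codebook)
    # Undirected graph with one edge per codeword: codeword -- nearest neighbor.
    adj = [set() for _ in range(n)]
    for i in range(n):
        j = nearest_neighbor(codebook, i)
        adj[i].add(j)
        adj[j].add(i)

    label = [None] * n
    for start in range(n):
        if label[start] is not None:
            continue
        label[start] = 0
        queue = deque([start])
        while queue:  # breadth-first 2-coloring of one connected component
            u = queue.popleft()
            for v in adj[u]:
                if label[v] is None:
                    label[v] = 1 - label[u]
                    queue.append(v)
                elif label[v] == label[u]:
                    # Defensive check; with consistent tie-breaking the
                    # nearest-neighbor graph has no odd cycle.
                    raise ValueError("nearest-neighbor graph is not 2-colorable")
    return label


# Hypothetical 2-D toy codebook, for illustration only.
toy = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (6.0, 0.0), (6.0, 3.0)]
print(two_way_partition(toy))  # [0, 1, 0, 1, 0]
```

The sketch shows why a two-part partition with this property always exists: every codeword contributes exactly one edge, so each connected component is a tree once mutual nearest-neighbor pairs are merged into a single edge. It makes no attempt to reproduce the minimax distortion result stated above.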
     Theoretically, this paper first establishes an analytical model using graph theory and proves the above two conclusions. Second, a counterexample demonstrates that CNV-like partitioning, that is, giving every codeword and its two or three nearest neighbors pairwise different colors, is not generally feasible for an arbitrary codebook. To find out whether such a partition exists for a given codebook, a search algorithm based on backtracking is proposed. In addition, a higher-capacity algorithm is introduced that applies the CNV method iteratively: each of the two parts produced by CNV is divided again, yielding an approximately optimized four-part partition.
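
A backtracking feasibility check of the kind described above might look like the following sketch. It assumes that a codeword and its m nearest neighbors must receive pairwise different labels out of m + 1 available labels, and it uses Euclidean distance with lowest-index tie-breaking. The function names and the square example are illustrative and are not taken from the thesis.

```python
from math import dist  # Euclidean distance, Python 3.8+


def constraint_graph(codebook, m):
    """Link every codeword and its m nearest neighbors pairwise."""
    n = len(codebook)
    adj = [set() for _ in range(n)]
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: (dist(codebook[i], codebook[j]), j))
        group = [i] + others[:m]
        for a in group:
            for b in group:
                if a != b:
                    adj[a].add(b)
    return adj


def search_partition(codebook, m):
    """Return labels in range(m + 1) meeting the constraint, or None if none exist.

    Plain recursive backtracking: the recursion depth equals the codebook size,
    so large codebooks would need an iterative variant or a raised limit.
    """
    adj = constraint_graph(codebook, m)
    n = len(codebook)
    label = [None] * n

    def assign(i):
        if i == n:
            return True
        used = {label[j] for j in adj[i] if label[j] is not None}
        for c in range(m + 1):
            if c not in used:
                label[i] = c
                if assign(i + 1):
                    return True
        label[i] = None  # undo before backtracking to the previous codeword
        return False

    return label if assign(0) else None


# Hypothetical example: the four corners of a unit square.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
print(search_partition(square, 1))  # [0, 1, 0, 1]
print(search_partition(square, 2))  # None
```

With m = 2 the four corners end up pairwise constrained, so three labels cannot suffice and the search returns None. This is one small instance of the infeasibility the abstract reports; it is not claimed to be the thesis's own counterexample.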
     Practically, the CNV method is used to partition the VQ codebooks of LPC coefficients in the iLBC, G.729, and G.723.1 codecs. The partition results are given and discussed. The attempt to divide these codebooks into multiple parts under CNV-like constraints is not successful. Nevertheless, dividing them into four approximately optimized parts proves applicable.
     Extensive experiments on real audio material demonstrate that the mean quantization error of the CNV partition is smaller than that of a random codebook partition, and that the CNV method is more likely to yield a smaller maximum quantization error. Subjective quality assessment shows that listeners cannot distinguish the audio signal containing embedded messages from a normally reconstructed audio signal. With the mean LPC cepstrum distortion as the objective evaluation measure, the CNV partition gives better speech quality, and the approximately optimized four-part partition also outperforms the class of methods that replace the quantization result directly after encoding.
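
The quantization errors compared above arise because embedding restricts the codebook search to one part of the partition. The sketch below shows generic QIM-style embedding and extraction over a labeled codebook, together with the extra distortion for a single vector. It assumes Euclidean distance. The function names, toy codebook, and labels are hypothetical and do not reproduce the thesis's codec integration.

```python
from math import dist  # Euclidean distance, Python 3.8+


def embed(vector, codebook, label, symbol):
    """Quantize `vector` to the nearest codeword whose label equals `symbol`."""
    candidates = [i for i, g in enumerate(label) if g == symbol]
    return min(candidates, key=lambda i: dist(vector, codebook[i]))


def extract(index, label):
    """Recover the hidden symbol from a received codeword index."""
    return label[index]


def extra_distortion(vector, codebook, label, symbol):
    """Error of the constrained search minus error of the unconstrained search."""
    best = min(dist(vector, c) for c in codebook)
    chosen = codebook[embed(vector, codebook, label, symbol)]
    return dist(vector, chosen) - best


# Hypothetical toy codebook with a two-part labeling.
codebook = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (6.0, 0.0)]
label = [0, 1, 0, 1]
x = (0.2, 0.0)

idx = embed(x, codebook, label, 1)   # forced into part 1
print(idx, extract(idx, label))      # 1 1
```

With two parts each quantized vector carries one bit, and with four parts it carries two. When the nearest codeword in the other part is close, as a partition that separates nearest neighbors tends to arrange, `extra_distortion` stays small.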
     Finally, the CNV method is applied to our previous work, a covert communication system based on VoIP and IH. Practical tests confirm the feasibility of the method, and the measured transmission rate is consistent with the theoretical analysis. The CNV method satisfies the application requirements and is the most favorable IH method in the system to date.

