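Research on an English Text Digital Watermarking Algorithm Based on the Idea of Virus Fragments

English title: An English Text Digital Watermarking Algorithm Based on the Idea of Virus Debris
Author: Zhou Haiyan (周海燕)
Thesis level: Master's
Discipline: Computer Application Technology
Chinese keywords: 数字水印 (digital watermarking) ; 病毒 (virus) ; 划分 (partitioning) ; 分类 (classification)
English keywords: Digital Watermarking ; Virus ; Divide ; Sort
Degree year: 2007
Advisor: Hu Fengsong (胡峰松)
Discipline code: 081203
Degree-granting institution: Hunan University (湖南大学)
Thesis submission date: 2007-03-19
Chair of the defense committee: Kuang Jishun (邝继顺)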

Abstract

Text digital watermarking is attracting growing attention as a means of copyright protection. Digital text, however, is inherently "binary": it lacks the rich texture of other media, which makes both robustness and payload capacity much harder to achieve and has left text watermarking research well behind that for other digital media. The feature coding method proposed by N. F. Maxemchuk et al. handles the payload problem well, but no satisfactory solution to the robustness problem has yet emerged. Most existing text watermarking algorithms ultimately depend on format, either file format (DOC, PDF, and so on) or typesetting format (font, font size, text color, character spacing, line spacing, and so on), so the embedded watermark is very likely to vanish once the format of the text is modified. Weak robustness is a common failing of most current text watermarking algorithms, and it seriously constrains the maturity and development of the technology.
     This thesis proposes an English text digital watermarking algorithm based on the idea of virus fragments. The choice of embedding positions is inspired by the way computer viruses store themselves in blocks: the characters of the whole English text are divided, with certain specific letters as boundaries, into short segments (elements); the elements are then grouped into several sets according to fixed rules; and one fragment of the watermark information is embedded in each set. The embedding method relies on the fact that the Unicode character set contains characters that are "alike outside but different inside", that is, identical in shape yet different in internal code. Substituting a character with a same-shaped character that has a different code therefore embeds information. During detection, the fragment embedded in a set can be extracted as long as the carrier characters in at least one element of that set remain intact. Because the algorithm works entirely on plain TXT text, format attacks have no effect on it, so in theory its robustness is well assured.
     A complete software system based on the proposed algorithm was developed in a .NET + MS Access environment. Extensive experiments show that the robustness of the algorithm does reach the theoretically expected level. Since the algorithm can be implemented on plain TXT text, it can also be implemented in formatted files, and can therefore be widely applied to English electronic publications and web pages.
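
The scheme outlined in the abstract (split the text at chosen letters, group the resulting elements into sets, mark each set redundantly with look-alike Unicode characters) can be illustrated with a minimal Python sketch. This is not the thesis implementation, which was built on .NET + MS Access. The abstract does not specify the boundary letters, the grouping rule, or the carrier characters, so everything named here is an assumption made for illustration: the boundary letters "e" and "t", the CRC32-based grouping rule, the Latin/Cyrillic pairs a/а and o/о, and the simplification of one bit per set.

```python
import re
import zlib

# Hypothetical choices, not taken from the thesis:
DELIMS = "et"                                   # letters that end an element
CARRIERS = [("a", "\u0430"), ("o", "\u043e")]   # bit 0: a -> Cyrillic a, bit 1: o -> Cyrillic o

def split_elements(text):
    """Cut the text into elements, each ending at a boundary letter (the tail may not)."""
    return re.findall(rf"[^{DELIMS}]*[{DELIMS}]|[^{DELIMS}]+", text)

def normalize(element):
    """Map look-alike carriers back to Latin so marking never changes an element's set."""
    for latin, marked in CARRIERS:
        element = element.replace(marked, latin)
    return element

def set_index(element, n_sets):
    """Assumed grouping rule: checksum of the normalized element, modulo the set count."""
    return zlib.crc32(normalize(element).encode("utf-8")) % n_sets

def embed(text, bits):
    """Mark every element that has a suitable carrier letter with the bit of its set."""
    out = []
    for el in split_elements(text):
        latin, marked = CARRIERS[bits[set_index(el, len(bits))]]
        out.append(el.replace(latin, marked, 1))  # substitute the first carrier only
    return "".join(out)

def extract(text, n_sets):
    """Recover one bit per set; None means no marked element of that set survived."""
    bits = [None] * n_sets
    for el in split_elements(text):
        for bit, (_, marked) in enumerate(CARRIERS):
            if marked in el:
                bits[set_index(el, n_sets)] = bit
    return bits
```

Because each set carries its bit in every element that contains a suitable letter, `extract` still returns that bit after other elements of the set have been deleted or retyped, and because only character codes change, the sketch works on plain text and is unaffected by font or layout changes. Under these assumptions a set whose elements contain no suitable carrier letter yields `None`, so a practical system would need enough text per set.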
