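Research on a Cloud-Based Protocol Identification Model and Algorithm (基于云的协议识别模型与算法研究)

English title: Research of Protocol Identification Model and Algorithm Based on Cloud Computing
Author: Ma Meng (马萌)
Degree level: Master's
Discipline: Information Security
Keywords: protocol identification; cloud computing; cloud storage; mutation testing; rule revision
Degree year: 2011
Advisor: Li Zhongxian (李忠献)
Discipline code: 081001
Degree-granting institution: Beijing University of Posts and Telecommunications (北京邮电大学)
Thesis submission date: 2011-02-10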

Abstract

With the development of information technology, network environments have become increasingly diverse and complex. To keep critical services running and support their wider adoption, users across industries urgently need protocol identification technology to classify and manage network traffic. Research on protocol identification therefore has both theoretical significance and practical value.
     Constrained by their deployment architecture and processing capacity, traditional protocol identification models are increasingly unable to cope with the large and frequently changing set of protocol features. To optimize the traditional model, this thesis carries out work in the following areas:
     1. It summarizes the limitations of existing protocol identification technology and, after introducing cloud computing technology and its advantages, analyzes protocol identification based on cloud computing.
     2. It proposes a cloud-based protocol identification model to address the shortcomings of existing models: limited processing capacity, poor adaptability to the environment, and difficulty in learning rules automatically. It also describes the key component models, namely the data processing model, the rule revision model, and the cloud storage and computing model.
     3. It designs a rule tree construction algorithm that organizes rules in support of the data processing and rule revision modules, and applies mutation testing to detect deviations in the rules and correct them.
     4. It tests and analyzes the proposed model and algorithms on an example. The results show that the mutation testing method correctly detects changes in protocol features and that the optimized rule tree construction algorithm significantly improves feature matching efficiency.
