Research on Support Vector Machines and Their Application in Intrusion Detection (支持向量机及其在入侵检测中的应用研究)

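English title: Study of Support Vector Machines and Its Application in Intrusion Detection Systems
Author: 董春曦 (Dong Chunxi)
Degree level: Doctoral
Discipline: Communication and Information Systems
Chinese keywords (translated): support vector machine; algorithm improvement; support vector pre-selection; step-by-step training method; decision rule simplification; generalization ability estimation; span; linear programming; parameter selection; optimization method; trial-and-error method; intrusion detection; KDD99; system model
English keywords: Support Vector Machines (SVM); Intrusion Detection Systems (IDS); algorithm improvement; pre-extraction of support vectors; step training method; classification rule simplification; generalization performance estimation; span; linear programming; parameter selection; optimization method; trial-and-error method; KDD99; system model
Degree year: 2004
Advisor: 杨绍全 (Yang Shaoquan)
Discipline code: 081001
Degree-granting institution: Xidian University (西安电子科技大学)
Submission date: 2004-02-01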

Abstract

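With the growing popularity of computers and networks, the security of systems and networks has become an increasingly prominent problem. Intrusion detection systems complement traditional computer security mechanisms and extend the scope of protection for systems and networks. Research on intrusion detection systems began in the 1980s; despite considerable progress, they have not yet become the primary security measure for networks or systems. The main problem in current intrusion detection research is that the volume of data to be processed is extremely large, yet these data are insufficient and incomplete with respect to the required detection and false alarm rates.
     The support vector machine is a machine learning algorithm based on statistical learning theory, proposed in the 1990s. Unlike traditional statistics, which studies the law that generates the samples or the asymptotic behavior as the number of samples tends to infinity, it focuses on the information provided by the samples themselves, and is therefore particularly suited to small-sample problems. The aim of this dissertation is to study the problems involved in applying support vector machines to intrusion detection systems; taking the characteristics and requirements of intrusion detection systems into account, it carries out research on the support vector machine algorithm.
     The dissertation covers four areas of research: improved support vector machine algorithms, estimation of the generalization ability of support vector machines, selection of support vector machine parameters, and construction of an intrusion detection system based on support vector machines.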
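     1. For algorithm improvement, a support vector pre-selection method based on the tight mutual pair model and a step-by-step training method for support vector machines are presented. Because training a support vector machine means finding the optimal solution of the corresponding quadratic program over all samples, both the computation and the storage requirements are large. In the resulting support vector machine, however, only the support vectors affect classification performance, and they usually make up only a small fraction of the samples, so much of this effort is wasted. Analysis shows that tight mutual pair samples share many properties with support vectors, which are the samples closest to the separating hyperplane. By extending the tight mutual pair model to the feature space, a support vector pre-selection method based on the model is obtained. The method effectively pre-selects the support vectors in the sample set; a support vector machine trained on the tight mutual pair set performs comparably to one trained on the full training set, while computation time and storage are substantially reduced. By analyzing the KKT conditions of the quadratic program solved by the support vector machine, the training samples can be divided into different categories according to their decision values. Test samples can be divided into categories in the same way. These categories correspond not only to different positions relative to the separating hyperplane, but also to the angle between a sample and the normal vector of the hyperplane in the feature space. This makes it possible to estimate how samples of each kind would affect the decision surface if they were added to the training set, which leads to the step-by-step training method. The method can accumulate the decision results of the support vector machine for relearning, and the effect of relearning can be chosen according to the problem at hand, so that the decision surface moves in the desired direction.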
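The pre-selection idea can be sketched in a few lines of Python. The abstract does not define a tight mutual pair, so this sketch assumes the usual reading: two samples from opposite classes, each of which is the other's nearest opposite-class neighbor. Distances are taken in the kernel feature space through the identity d²(x, y) = k(x, x) − 2k(x, y) + k(y, y). The function names, the RBF kernel, and the value of `gamma` are illustrative choices, not taken from the dissertation.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel; gamma is an illustrative value."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def feature_distance_sq(x, y, kernel):
    """Squared distance between the images of x and y in the kernel feature space."""
    return kernel(x, x) - 2.0 * kernel(x, y) + kernel(y, y)


def nearest_opposite(i, samples, labels, kernel):
    """Index of the opposite-class sample closest to samples[i] in feature space.

    Assumes both classes are present in the data.
    """
    candidates = [j for j in range(len(samples)) if labels[j] != labels[i]]
    return min(candidates,
               key=lambda j: feature_distance_sq(samples[i], samples[j], kernel))


def tight_mutual_pairs(samples, labels, kernel=rbf_kernel):
    """Sorted index pairs (i, j) that are each other's nearest opposite-class neighbor.

    This definition of a tight mutual pair is an assumption of the sketch.
    """
    pairs = set()
    for i in range(len(samples)):
        j = nearest_opposite(i, samples, labels, kernel)
        if nearest_opposite(j, samples, labels, kernel) == i:
            pairs.add((min(i, j), max(i, j)))
    return sorted(pairs)


# Toy data on a line: two negative samples on the left, two positive on the right.
samples = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
labels = [-1, -1, +1, +1]
print(tight_mutual_pairs(samples, labels))
```

On this toy set only the two samples facing each other across the class boundary are selected, which matches the intuition that such pairs are candidates for support vectors. The pairwise search costs O(n²) kernel evaluations, so any saving has to come from the smaller quadratic program solved afterwards.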
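     2. For generalization ability estimation, a linear programming method for computing the span, which is used to estimate generalization ability, is presented. In the quadratic program that defines the span, the objective function defines an elliptic norm of a vector. It is proved that this objective, together with the constraints, defines a convex program. By the convergence properties of convex programs and the equivalence of norms, the elliptic norm in the objective can be replaced by the infinity norm. It is also proved that the new program is still convex, that the two optimization processes converge to each other, and that their values are numerically related. After a suitable transformation of the new program, the span can be computed by linear programming. Compared with the original method, the error in the estimated generalization ability is very small, while computation time is greatly reduced.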
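The move from an elliptic norm to a linear program rests on standard facts, written out below as a sketch. The matrix $A$, the vector $b$, and the polyhedral feasible set $\Lambda$ are placeholders: the abstract does not state which vector the infinity norm is applied to, so this shows the general mechanism rather than the dissertation's exact formulation. $K$ is assumed to be symmetric positive definite.

```latex
% Elliptic norm induced by a symmetric positive definite matrix K,
% bounded above and below by the Euclidean norm
\|v\|_K = \sqrt{v^\top K v}, \qquad
\sqrt{\lambda_{\min}(K)}\,\|v\|_2 \;\le\; \|v\|_K \;\le\; \sqrt{\lambda_{\max}(K)}\,\|v\|_2 .

% Equivalence of the Euclidean and infinity norms on R^n
\|v\|_\infty \;\le\; \|v\|_2 \;\le\; \sqrt{n}\,\|v\|_\infty .

% Minimizing an infinity norm under linear constraints is a linear program
\min_{\lambda \in \Lambda} \|A\lambda - b\|_\infty
\quad\Longleftrightarrow\quad
\min_{\lambda \in \Lambda,\; t} \; t
\quad \text{s.t.} \quad -t \le (A\lambda - b)_j \le t, \qquad j = 1, \dots, n .
```

The two chains of inequalities explain why the substituted objective stays within a constant factor of the original one; the last line is the transformation that removes the quadratic term.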
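     3. For parameter selection, an optimization-based method and a trial-and-error method are presented. By analyzing the different effects of the kernel parameter and the penalty factor on performance, the parameters are divided into sensitive and insensitive ones, so that different update rules can be used to search for the best parameters. For updating the kernel parameter, the method no longer seeks an expression for the estimated generalization ability that is continuous and differentiable in the parameters; instead it uses the easily computed support vector ratio to measure changes in generalization ability and guides the parameter update indirectly. For the insensitive penalty factor, a loose criterion is used, based on the number of bound support vectors and their proportion among all support vectors; this yields an optimization method for selecting the parameters. Because the standard forms of the most commonly used kernel functions have only one kernel parameter to adjust, the parameters can also be selected by trial and error using the same update criteria, and a flowchart of the trial-and-error procedure is given. With either the optimization method or trial and error, these criteria effectively select a good set of parameters.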
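A trial-and-error search driven by the support vector ratio can be sketched as follows. The ratio is a cheap proxy because removing a non-support vector does not change the trained classifier, so the number of leave-one-out errors cannot exceed the number of support vectors. The flowchart in the dissertation is not reproduced in the abstract; the loop below is only one plausible shape for it. `count_support_vectors` is a hypothetical callback standing in for "train an SVM with this kernel parameter and count its support vectors", and the numbers in the stub are made up for illustration.

```python
def select_kernel_parameter(candidates, count_support_vectors, n_train):
    """Return (best_value, best_ratio) minimizing #SV / n_train over the candidates."""
    best_value, best_ratio = None, float("inf")
    for value in candidates:
        ratio = count_support_vectors(value) / n_train
        if ratio < best_ratio:
            best_value, best_ratio = value, ratio
    return best_value, best_ratio


def fake_count_support_vectors(sigma):
    """Stub for training an SVM; the counts are invented, not experimental results."""
    table = {0.1: 180, 0.5: 90, 1.0: 40, 2.0: 55, 5.0: 120}
    return table[sigma]


best_sigma, best_ratio = select_kernel_parameter(
    [0.1, 0.5, 1.0, 2.0, 5.0], fake_count_support_vectors, n_train=200
)
print(best_sigma, best_ratio)
```

In practice the callback would wrap a real SVM trainer, and the penalty factor would be adjusted separately with the looser bound-support-vector criterion described above.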
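     4. For applying support vector machines to intrusion detection, the dissertation analyzes methods for implementing intrusion detection with support vector machines, the effect of data normalization and of support vector machine parameters on detection performance, the application of step-by-step training to intrusion detection, and the simplification of the decision rule. By analyzing the influence of bound and non-bound support vectors on classification performance, it is proved that the training set with the bound support vectors removed is separable. The similarities and differences between the separating hyperplane trained on this reduced set and the one obtained on the original training set, and the resulting error, are analyzed, which leads to a method for simplifying the classification rule. In intrusion detection, the simplified rule greatly reduces detection time while keeping detection performance almost the same. Finally, a model for building an intrusion detection system based on support vector machines is given; it is a network-based intrusion detection system that detects both anomalous behavior and misuse, and it includes the relearning capability of the support vector machine.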
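The distinction between bound and non-bound support vectors follows from the KKT conditions of the soft-margin problem: a multiplier of 0 marks a non-support vector, a multiplier strictly between 0 and the penalty factor C marks a non-bound support vector on the margin, and a multiplier equal to C marks a bound support vector. The sketch below partitions a training set on that basis and builds the reduced set on which the simplified classifier would be retrained. The function names, the tolerance, and the multiplier values are illustrative assumptions; the retraining step itself is not shown.

```python
def split_by_multiplier(alphas, C, tol=1e-8):
    """Partition training indices by the value of their Lagrange multiplier.

    alpha == 0     -> not a support vector
    0 < alpha < C  -> non-bound support vector (on the margin)
    alpha == C     -> bound support vector (inside the margin or misclassified)
    """
    groups = {"non_sv": [], "non_bound_sv": [], "bound_sv": []}
    for i, a in enumerate(alphas):
        if a <= tol:
            groups["non_sv"].append(i)
        elif a >= C - tol:
            groups["bound_sv"].append(i)
        else:
            groups["non_bound_sv"].append(i)
    return groups


def drop_bound_support_vectors(samples, labels, alphas, C):
    """Return the training set with all bound support vectors removed."""
    bound = set(split_by_multiplier(alphas, C)["bound_sv"])
    kept = [i for i in range(len(samples)) if i not in bound]
    return [samples[i] for i in kept], [labels[i] for i in kept]


# Made-up multipliers for six training samples, with penalty factor C = 1.0.
samples = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,), (5.0,)]
labels = [-1, -1, +1, -1, +1, +1]
alphas = [0.0, 0.3, 1.0, 0.0, 1.0, 0.7]
C = 1.0

groups = split_by_multiplier(alphas, C)
reduced_samples, reduced_labels = drop_bound_support_vectors(samples, labels, alphas, C)
print(groups)
print(reduced_samples, reduced_labels)
```

Since the cost of evaluating an SVM decision function grows with the number of support vectors, a classifier retrained on the reduced set with fewer support vectors would be cheaper at detection time, which is the effect the abstract reports.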
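     Keywords: support vector machine, algorithm improvement, support vector pre-selection, step-by-step training method, decision rule simplification, generalization ability estimation, span, linear programming, parameter selection, optimization method, trial-and-error method, intrusion detection, KDD99, system model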
With the wide application of computers and the Internet, the security of systems and networks has become more and more important. Intrusion Detection Systems (IDS) complement the security mechanisms of networks and computer systems and increase the depth of protection. Research on IDS began in the 1980s; although there has been considerable progress, IDS have not yet become the principal means of protecting networks and computer systems. The key problem of IDS is that there is too much data to process, and yet these data are neither sufficient nor complete with respect to the required detection and false alarm rates.
Support Vector Machines (SVM) are a machine learning algorithm introduced in the 1990s and based on statistical learning theory. SVM pays more attention to the information provided by the samples themselves and is well suited to small-sample problems.
The aim of the dissertation is to study the application of SVM to IDS. The dissertation contains four parts: the improvement of the SVM algorithm, the estimation of the generalization performance of SVM, the method for selecting the parameters of SVM, and the IDS based on SVM.
For the improvement of SVM, the tight mutual pair method for pre-extracting support vectors and the step method for training SVM are proposed. Since the training procedure of SVM searches for the optimal solution of a quadratic program over all the training samples, it requires a great deal of computation and memory. In fact, only a few samples, called support vectors, determine the performance of SVM. The tight mutual pair of traditional pattern recognition and the support vector have many characteristics in common. By generalizing the tight mutual pair to the feature space via Mercer's theorem, support vectors can be pre-extracted efficiently, and the performance of the SVM trained on the tight mutual pair set is comparable with that of the SVM trained on all the training samples, while the computation and memory required by the former decrease sharply. By the KKT conditions of the solution procedure, the samples in the training set can be divided into three categories. The testing samples can likewise be divided into three categories by their decision values, and the categories correspond to the position of the samples relative to the classification hyperplane and to their angle to the normal vector of the hyperplane. The effect on the hyperplane of appending samples of each category can therefore be studied, which leads to the step method for training SVM. The method not only lets the SVM relearn from new samples, but also makes the classification hyperplane change in an expected direction according to the problem at hand.
For the estimation of generalization performance, a linear programming procedure for the Span method is presented. In the quadratic program that defines the Span, the objective function defines an elliptic norm of vectors. It is proved that this function and the constraints define a convex program. From the convergence of convex programs and the equivalence of norms, it is also proved that the elliptic norm can be replaced by the infinity norm, that the new program is still convex, and that the two optimization procedures converge to each other. After a certain transformation, a linear programming method for solving the Span is obtained. The error between the two methods is very small, and the computation decreases sharply.
For the selection of the parameters, an optimization method and a trial-and-error method are proposed. According to how strongly they affect classification performance, the parameters are categorized as sensitive or insensitive, and they are updated with different rules. For the kernel parameter, the ratio of the number of support vectors to the number of training samples is used to measure the variation of generalization performance and to update the parameter indirectly. For the penalty factor, the ratio of the number of bound support vectors to the number of support vectors is used as a loose, non-strict rule. An optimization method for selecting the parameters is thus obtained. Exce

引文

[1] 中国互联网发展报告，http://www.cnnic.cn/develst/2003-7/,2003.8。
    [2] K. Ivan, Computer vulnerability analysis thesis proposal, Technical Report CSD-TR-97-026, The COAST Laboratory, Department of Computer Sciences, Pudue University, April 1997.
    [3] 安全报告：MSBlast病毒，国家计算机网络应急技术处理协调中心(CNCERT/CC), http://www.cert.org.cn/index.shtml, 2003. 9。
    [4] D. Denning, An intrusion detection model, IEEE Trans. On Software Engineering, Vol. 13, No. 2, PP. 222-232, 1987.
    [5] Bai Yuebin, H. Kobayashi, Intrusion detection systems: technology and development, Advanced Information Networking and Applications, 2003. AINA 2003. 17th International Conference on, PE 710-715, 27-29 March 2003.
    [6] R. A. Kemmerer, G. Vigna, Intrusion detection: a brief history and overview, Computer, Vol. 35, No. 4, PP. 27-30, April 2002.
    [7] K. Levitt, Intrusion detection: current capabilities and future directions, Computer Security Applications Conference, 2002. Proceedings. 18th Annual, 9-13 Dec. PE 365-367, 2002.
    [8] J. S. Sherif, T. G. Dearmond, Intrusion detection: systems and models, Enabling Technologies: Infrastructure for Collaborative Enterprises, 2002. WET ICE 2002. Proceedings. 11th IEEE International Workshops on, PP. 115-133, 10-12 June 2002.
    [9] 李鸿培，入侵检测中几个关键问题的研究，西安电子科技大学博士学位论文，2001．3。
    [10] 唐正军，不对称模型与智能化网络入侵检测技术，第二炮兵工程学院博士学位论文，2002．5。
    [11] 马恒太，基于Agent分布式入侵检测系统模型的建模及实现，中国科学院博士学位论文，2001．2。
    [12] L. Zirkle, What is host-based intrusion detection?, Virginia Tech CNS, SANS institute Resources, Intrusion Detection FAQ, 2000.
    [13] K. Ilgun, USTAT: A real-time intrusion detection system for Unix. Proceeding of IEEE Symp., Research in Security and Privacy, Oakland, California, PP. 16-28, May, 1993.
    [14] T. Lunt, A. Tamura and F. Gilham, et al, A real-time intrusion detection expert system (IDES), Final Technical Report, Computer Science Lab., SRI International, Menlo Park, CA, USA, Feb., 1992.

    [15] A.R Phillip, G.N. Peter, EMERALD: event monitoring enabling responses to anomalous live disturbances, Technical Report, Computer Lab., SRI International, Menlo Park, California, Oct. 1997.
    [16] L.T. Heberlein, G.V. Dias and K.N. Levitt, et al, A network security monitor. Proceedings of the IEEE Symposium on Research in Security and Privacy, Oakland, CA,PP. 296-304, May, 1990.
    [17] S. Staniford-Chen, S. Cheung, et al, GrIDS: a graph-based intrusion detection system for large networks, Poceedings of the 19th National Information System Security Conference, National Institute of Standards and Technology, Vol. 1, PP. 361-370, Oct, 1996.
    [18] Internet Security System, Introduction to RealSecure Ver 3.0, Jan, 1999.
    [19] Netranger Intrusion Detection Systems, Technical Information, Apr. 1999.
    [20] M. Roesch, Snort-Lightweight Intrusion Detection for Networks, Proc. Usenix Lisa'99 Conf., Usenix Assoc, Berkeley, Calif., 1999.
    [21] S.R. Snapp, J. Brentano, G.V. Dias, et al, DIDS (Distributed intrusion detection system): motivation, architecture, and an early prototype, Proceeding of 14th National Computer Security Conference, Washington, D. C, PP. 167-176, Oct, 1991.
    [22] S. Cheung, K.N. Levitt. Proteting Routing Infrastructures from denial of service using cooperative intrusion detection. Preceedings New Security Paradingms Workshop. Cumbria, U.K., 1997.
    [23] M. Mahoney and P. Chan, Detecting novel attacks by identifying anomalous network packet headers. Technical Report CS-2001-2, Florida Institute of Technology, Melbourne, Fl. 2001.
    [24] M. Mahoney. Computer security: a survey of attacks and defense, Available at http://www.docshow.net/ids.htm, 2000.
    [25] E. Eskin, A.arnold, M. Preraul, et al. A geometric framework for unsupervised anomaly detection: detecting intrusions in unlabeled data. Technical Report, CUCS, 2002.
    [26] E. Eskin. Anomaly detection over noisy data using learned probability distributions. Proceedings of ICML 2000. Menlo Park, CA, 2002.
    [27] J. Winkler, W.J. Page. Intrusion and anomaly detection in trusted systems. Proceedings of 15th Annual Computer Security Application Conference. Tucson, AZ., 1989.

    [28] D. Anderson, Th. Frivold and A. Valdes. Next generation intrusion detection expert system (NIDES): a summary. SRI-CSL-95-07, SRI International, Menlo Park, CA, May, 1995.
    [29] Jiang Ning, K.A. Hua, S. Sheu. Considering both intra-pattern and inter-pattern anomalies for intrusion detection, Data Mining, 2002. Proceedings. 2002 IEEE International Conference on , PP. 637 -640, 9-12 Dec. 2002.
    [30] J.M. Estevez-Tapiador, P. Garcia-Teodoro, J.E. Diaz-Verdejo. Stochastic protocol modeling for anomaly based network intrusion detection Information Assurance, 2003. IWIAS 2003. Proceedings. First IEEE International Workshop on, PP. 3-12, March 24, 2003.
    [31] D. Anderson, T. Lunt, H. Javitz, et al. SafeGuard final report: detecting unusual program behavior using the NIDES statistical componet. Technical Report SRI, 1994.
    [32] S. Kumar, E.H. Spafford. A pattern matching model for misuse intrusion detection. Proceedings of 17th National Computer Security Conference, October, PP. 1994-204, 1994.
    [33] S. Kumar, E.H. Spafford. Software architecture to support misuse intrusion detection. Proceedings of 18th National Information Security Conference, PP. 194-204, 1995.
    [34] K. Jackson, M.C. Neumann, D. Simmonds et al. An sutomated computer misuse detection system for UNICOS. Proceedings of the Cray Users Group Conference. Tours, France, 1994.
    [35] P.G. Neumann, D.B. Parker. A summary of computer misuse techniques. Proceedings of the 12th National Computer Security Conference, 1989.
    [36] S. Smaha, S. Snapp. Method and system for detecting intrusion into and misue of a data processing system. US555742. Patent Office, Sep. 17, 1996.
    [37] T. Corbitt. The computer misue act. Computer fraud and security workshop. Cumbria, U.K., 1997.
    [38] R.F. Erbacher, K.L. Walker. D.A. Frincke, Intrusion and misuse detection in large-scale systems. Computer Graphics and Applications, IEEE , Vol. 22, No. 1 , PP. 38 -47, Jan.-Feb. 2002.
    [39] S.A. Hofmeyr, S. Forrest, A. Somayaji. Intrusion detection suing sequences of system calls. Journal of Computer Security. Vol. 6, PP. 151-180, 1998.
    [40] C. Warrender, S. Forrest, B. Pearlmutter. Detecting Intrusions using system call: alternative data models. 1999 IEEE Symposium on Security and Privacy. IEEE Computer Society, PP. 133-145, 1999.
    [41] D. Marchette. Computer intrusion detection and network monitoring: a statistical viewpoint. Springer Verlag, 2001.

    [42] D.J. Burroughs, L.F. Wilson, G.V. Cybenko. Analysis of distributed intrusion detection systems using. Bayesian methods. Performance, Computing, and Communications Conference, 2002. 21st IEEE International, PP. 329 -334, 3-5 April 2002.
    [43] W. Fan, W. Lee, S.J. Stolfo et al. A multiple model cost-sensitive approach for intrusion detection. Proceedings of the 11th European Conference on Machine Learning, 2000.
    [44] W. Lee. A data mining frame work for constructing features and models for intrusion detection systems. PhD Thesis, Columbia University, 1999.
    [45] W. Lee, S.J. Stolfo et al, A data mining framework for building intrusion detection models. Proceeding of the 12th IEEE Symposium on Security and Privacy, Oakland, CA 1999.
    [46] Zhao Jun-Zhong, Huang Hou-Kuan. An intrusion detection system based on data mining and immune principles. Machine Learning and Cybernetics, 2002. Proceedings. 2002 International Conference on , Vol. 1, PP. 524 -528, 4-5 Nov. 2002.
    [47] Han Hong, Lu Xin-Liang, Ren Li-Yong. Using data mining to discover signatures in network-based intrusion detection. Machine Learning and Cybernetics, 2002. Proceedings. 2002 International Conference on , Vol. 1 , PP. 13 -17, 4-5 Nov. 2002.
    [48] W. Lee, S. Stolfo. Data mining approaches for intrusion detection. Proceedings 7th USENIX Security Symposium. San Antonio. TX. 1998.
    [49] W. Lee, S.J. Stolfo, K.W. Mok. Data mining in work flow environments: Experiences in intrusion detection. Proceeding of the Conference in Knowledge Discovery and Data Mining, 1999.
    [50] D.S. Bauer, F.R. Eichelman, R.M. Herrera, A.E. Irgon. Intrusion detection: an application of expert systems to computer security. Security Technology, 1989. Proceedings. 1989 International Carnahan Conference on, PP. 97-100, Oct. 3-5, 1989.
    [51] D. Denning, P. Neumann. Requirements and model for IDES- a real-time intrusion detection expert system. Final Report, Computer Science Laboratory, SRI International, 1985.
    [52] D. Denning, D.Edwards, R. Jagannathan, et al. A prototype IDES: a real-time intrusion detection expert system. Computer Science Laboratory, SRI International, 1987.
    [53] D. Anderson, T. Frivold, A. Valdes. Next-generation intrusion detection expert system (NIDES). Technical Report, SRI-CSL-95-07, SRI International, Computer Science Lab., 1995.

    [54] G. Tsudik, R. Summers. AudEs - an expert system for security auditing. Proceedings of the AAAI Conference on Innovative Applications in AI, San Jose, CA, 1990. Reprinted in Computer Security Journal, Vol. 6, No. 1, PP. 89-93, 1990.
    [55] V. Paxon. Bro: a system for detection network intruders in real-time. Proceeding of the 7th USENIX Security Symposium, San Antonio, TX, 1998.
    [56] H. Debar, M. Becker, D. Siboni. A neural network component for an intrusion detection system. Proceedings of the IEEE Symposium on Research in Security and Privacy, Oakland, CA 1992.
    [57] A. Ghosh, A. Schwartzbard. A study in using neural networks for anomaly and misuse detection. Proceedings of 8th USENIX Security Symposium, 1999.
    [58] R. Simonian. A neural network approach towards intrusion detection. Proceedings of 13th National Computer Security Conference, Washington, D.C., 1990.
    [59] S. Mukkamala, G. Janoski, A. Sung. Intrusion detection using neural networks and support vector machines. Neural Networks, 2002. IJCNN '02. Proceedings of the 2002 International Joint Conference on, Vol. 2, PP. 1702 -1707, 12-17 May 2002.
    [60] Liu Zhen, G. Florez, S.M. Bridges. A comparison of input representations in neural networks: a case study in intrusion detection. Neural Networks, 2002. IJCNN '02. Proceedings of the 2002 International Joint Conference on , Vol. 2 , PP. 1708 -1713, 12-17 May 2002.
    [61] P. Lichodzijewski, A. Nur Zincir-Heywood, M.I. Heywood. Host-based intrusion detection using self-organizing maps. Neural Networks, 2002. IJCNN '02. Proceedings of the 2002 International Joint Conference on , Vol. 2 , PP. 1714 -1719, 12-17 May 2002.
    [62] A.J. Moglund, K. Hatonen, A.S. Sorvari. A computer host-based user anomaly detection system using the self-organizing map. Neural Networks, 2000. IJCNN 2000, Proceedings of the IEEE-INNS-ENNS International Joint Conference on , Vol. 5 , PP. 411 -416, 24-27 July 2000.
    [63] S.A. Ilofmeyer. An immunological model of distributed detection and its application to computer security. PhD thesis, The University of New Mexico, 1999.
    [64] S, Forrest. S.A. Hofmeyer, A. Somayaji. Computer Immunology. Communications of the ACM, Vol. 40, No. 10, PP. 88-96, 10, Oct., 1997.
    [65] P.K. Harmer, P.D. Williams, G.H. Gunsch, G.B. Lamont. An artificial immune system architecture for computer security applications. Evolutionary Computation, IEEE Transactions on ,Vol. 6, No. 3 , PP. 252 -280. June 2002.

    [66] Zhao Junzbong, Huang Houkuan. An evolving intrusion detection system based on natural immune system. TENCON '02. Proceedings. 2002 IEEE Region 10 Conference on Computers, Communications, Control and Power Engineering , Vol. 1, PP. 129 -132, 28-31 Oct. 2002.
    [67] Yang Xiang-Rong, Shen Jun-Yi, Wang Rui. Artificial immune theory based network intrusion detection system and the algorithms design. Machine Learning and Cybernetics, 2002. Proceedings. 2002 International Conference on , Vol. 1, PP. 73 -77, 4-5 Nov. 2002.
    [68] A. Boukerche, K.R.L. Juca, J. Bosco, M. Sobral. Intrusion detection based on the immune human system. Parallel and Distributed Processing Symposium., Proceedings International, IPDPS 2002, Abstracts and CD-ROM, PP. 199 -206, 15-19 April 2002.
    [69] Gao Meimei, Zhou MengChu. Fuzzy intrusion detection based on fuzzy reasoning Petri nets. Systems, Man and Cybernetics, 2003. IEEE International Conference on, Vol. 2, PP. 1272 -1277, Oct. 5-8, 2003.
    [70] D. Frincke, D. Tobin, Y. Ho. Planning, Petri nets, and intrusion detection. Proceeding of 21st National Information System Security Conference, Crystal City, VA, 1998.
    [71] N. Ye. A markov chain model of temporal behavior for anomaly detection. Proceedings of the IEEE Systems, Man, and Cybernetics Information Assurance and Security Workshop, 2000.
    [72] Gao Fei, Sun Jizhou, Wei Zunce. The prediction role of hidden Markov model in intrusion detection Electrical and Computer Engineering, 2003. IEEE CCECE 2003. Canadian Conference on, Vol. 2, PP. 893 -896, May 4-7, 2003.
    [73] Y. Qiao, X.W. Xin, Y. Bin, S. Ge. Anomaly intrusion detection method based on HMM. Electronics Letters, Vol. 38, No. 13, PP. 663 -664, 20 Jun 2002.
    [74] Gao Bo, Ma Hui-Ye, Yang Yu-Hang. HMMs (Hidden Markov models) based on anomaly intrusion detection method. Machine Learning and Cybernetics, 2002. Proceedings. 2002 International Conference on, Vol. 1, PP. 381 -385, 4-5 Nov. 2002.
    [75] M. Crosbie, E. Spafford. Defending a computer system using autonomous agents. Proceedings of the 8th National Information Systems Security Conference, Baltimore, MD, 1995.
    [76] Autonomous agents. Technical Report CSD-TR-95-022, Department of Computer Sciences, Purdue University, 1995.
    [77] Dong Yongle, Qian Jun. Shi Meilin. A cooperative intrusion detection system based on autonomous agents. Electrical and Computer Engineering, 2003. IEEE CCECE 2003. Canadian Conference on, Vol. 2, PP. 861-863, May 4-7, 2003.
    [7

    [78] Xian Feng, Jin Hai, Liu Ke, Hart Zongfen. A mobile-agent based distributed dynamic/spl mu/Firewall architecture. Parallel and Distributed Systems, 2002. Proceedings. Ninth International Conference on, PP. 431-436, 17-20 Dec. 2002.
    [79] Chan, P. C., Wei, V. K. Preemptive distributed intrusion detection using mobile agents. Enabling Technologies: Infrastructure for Collaborative Enterprises, 2002. WET ICE 2002. Proceedings. Eleventh IEEE International Workshops on, PP. 103-108, 10-12 June 2002.
    [80] J. Frank. Machine learning and intrusion detection: current and future directions. Proceedings of the 17th National Computer Sceurity Conference, Baltimore, MD, 1994.
    [81] W. Weiss, A. Baur. Analysis of audit and protocol data using methods from artificial intelligence. Proceedings of the 13th National Computer Security Conference, Washington, D. C., 1990.
    [82] T. Lane, C. Brodley. An application of machine learning to anomaly detection. Proceedings of the 12th National Information System Security Conference, Baltimore, MD, 1994.
    [83] C. Sinclair, L. Pierce, S. Matzner. An application of machine learning to network intrusion detection. Computer Security Applications Conference, 1999.(ACSAC'99) Proceedings. 15th Annual, PP. 371-377, 6-10 Dec. 1999.
    [84] Cisco Intrusion Detection Systems. Cisco System, Inc. http://www.cisco.com.
    [85] RealSecure. Internet Security System. http://www.iss.net.
    [86] SecurityNet. Intrusion Inc. http://www.intrusion.com.
    [87] 天阗黑客入侵检测与预警系统。启明星辰信息技术有限公司。http://www.venustech.com.cn/。
    [88] “天眼”入侵检测系统。北京中科网威信息技术公司。http://www.netpower.com.cn/。
    [89] P. Porras. STAT, A state transition analysis tool for intrusion detection. Master thesisi, Computer Science Department, University of California. Santa Barbara, CA. 1992.
    [90] D. Zamboni. SAINT: A security analysis integration tool. Systems Administration, Networking and Security(SANS) Conference, Washington, D. C., 1996.
    [91] J. Hichberg, K. Jackson, C. Stallings et al. NADIR: An automated system for detecting network intrusion and misuse. Computers and Security, Vol. 12, No. 3, PP. 235-248, 1993.
    [92] C. Cortes and V. Vapnik. Support vector networks. Machine Learning, Vol. 20, PP. 273—297. 1995.

    [93] V. Vapnik. The Nature of Statistical Learning Theory. Springer, New York, 1995.
    [94] V. Vapnik, The nature of statistical learning theory, Springer, New York, 2000.
    [95] V. Vapnik, The nature of. statistical learning theory, Springer, New York, 2000.(中译本，张学工译，清华大学出版社，2000)
    [96] A. Turing. Computing machinery and intelligence. MIND, 1950, 59: 433-460.
    [97] F. Rosenblatt. The Perceptorn: A Probablistic Model for Information Storage and Organization in the Brain. Psychological Review. Vol. 65, PP. 386-406, 1958.
    [98] Machine Learning, an Artificial Intelligence Approach. Edited by R. S. Michalski, J. Garbonell, T. M. Mitchell. Springor-Verlag. 1983.(中译本，王树林等译。科学出版社，1992)
    [99] F. Rosenblatt. The Perceptron: A Theory of Statistical Separability in Cognitive Systems. Technical Report VG-1196-G-I, Cornell Aeronautical Lab., 1958.
    [100] F. Rosenblatt. Perceptual Generalization over Transformation Groups, Self Organizing Systems, Permagon Press, London, 1959.
    [101] F. Rosenblatt. Principles of Newodynamics and the Theory of Brain Mechanisms, Spartan Books, Washington, D. C., 1962
    [102] O. G. Selfridge. Pandemonium: A Paradigm for Learning, Procerdings of the Symposium on Mechanization of Thought Processes, D. Blake, A. Uttley(EDs.), HMSO, London, PP. 511-529, 1959.
    [103] B. Widrow. Generalization and Information Storage in Networks of Adelaine"Neurons", Sell Organizing System, M. C. Yovitz, G. T. Jacobi and G. D. Goklstein EDs., Spartan Books, Washington, D. C., PR 435-461, 1962.
    [104] W. R. Ashby. Design for a Brain, The origin of Adaptive Behavior, John Wiley and Sons, Inc., 1960.
    [105] M. Minsky, S. Papert, Perceptrons, MIT Press, Cambridge, Mass., 1969.
    [106] H. D. Block. The Perceptron: A model of Brain Functioning, Rev. Math. Physics, Vol. 34, No. 1, PP. 123-135, 1961.
    [107] M. C. Yovits, G. T. Jacnbi and G. D. Goldstein. Self-Organizing Systems, Spartan Books, Washington, D. C., 1962
    [108] J. T. Culberson. The Minds of Robots, University of Ilinois Press, Urbana, Ilinois, 1963.
    [109] H. Kazmierczak and K. Steinbuch. Adaptive Systems in Pattern Recognition, IEEE Transactions of Electronic Computer, Vol. EC-12, No. 5, PR 822-835, 1963.
    [110] G. S. Sebestyen. Decision-Making Processes in Pattern Recognition, Macmillan, New York. 1962.

    [111] K. S. Fu. Sequential Methods in Pattern Recognition and Machine Learning, Academic Press,New York, 1968.
    [112] K. S. Fu. Pattern Recognition and Machine Learning, Plenum Press, New York, 1971.
    [113] K. S. Fu and J. T. Tou. Learning Systems and Intelligent Robots, Plenum Press, New York, 1974.
    [114] S. Watanabe. Information Theoretic, Aspects of Inductive and Deductive Inference, IBM Jorunal of Research and Developments, Vol. 4, No. 2, PP. 208-231,1960.
    [115] A. G. Arkadev and E. M. Braverman. Learning in Pattern Classification Machines, Nauka, Moscow, 1971.
    [116] K. Fukanaga. Introduction to Statistical Pattern Recognition, Academic Press, New York, 1972.
    [117] R. O. Duda and P. E. Hart. Pattern Classification and Scene Analysis, Wiley, New York, 1973.
    [118] L. Kanal. Patterns in Pattern Recognition: 1968-1974, IEEE Transactions on Information Theory, Vol. IT-20, No. 6, PP. 697-722, 1974.
    [119] V. Vanpik and A. Chervonenkis. On the uniform convergence of relative frequencies of events to their probabilities. Doklady Akademii Nauk Ussr. 1968 (英译本: Sov. Math. Dokl.)
    [120] V. Vanpik and A. Chervonenkis. On the uniform convergence of relative frequencies of events to their probabilities. Theory Probab. Vol. 16, No. 4. PP. 264-280, 1968
    [121] V. Vapnik and A. Chervonenkis. Theory of Pattern Recognition [in Russian]. Nauka, Moscow, 1974. (German Translation: W. Wapnik & A. Tscherwonenkis, Theorie der Zeichenerkennung, Akademie-Verlag, Berlin, 1979).
    [122] V. Vapnik. Estimation of Dependences Based on Empirical Data [in Russian]. Nauka, Moscow, 1979. (English translation: Springer Verlag, New York, 1982).
    [123] V. Vapnik and A. Chervonenkis. The necessary and sufficient conditions for consistency of the method of empirical risk minimization [in Russian]. Yearbook of the Academy of Sciences of the USSR on Recognition, Classification, and Forecasting, 2, Nauka, Moscow, PP. 207-249. (English translation: The necessary and sufficient conditions for consistency of the method of empirical risk minimization. Pattern Recognition and Image Analysis, Vol. 1, No. 3, PP. 284-305, 1991.)
    [124] A. Tikhonov. On solving ill-posed problem and method of regularization. Doklady Akademii Nauk USSR, 153, PP. 501-504, 1963.
    [125] V. Ivanov. On linear Problems which are not well-posed. Soviet Math. Docl., Vol. 3, No. 4, PP. 981-983, 1962.

    [126] D. Phillips. On estimation of probability function and model. Annals of Mathematical Statistics, Vol. 33, No. 4, 1962.
    [127] V. Vapnik and A. Stefanyuk. Nonparametric methods for estimating probability densities. Autom. And Remote Contr. Vol. 8. 1978.
    [128] R. Solomonoff. A formal theory of inductive inference. Parts. 1 and 2, Inform. Contr., Vol. 7, PP. 1-22,224-254,1964.
    [129] A. Kolmogorov, Three approaches to the quantitative definitions of information, Problem of Inform. Transmission, Vol. 1,No. 1,PP. 1-7, 1964.
    [130] G. Chaitin, On the length of programs for computing finite binary sequences. J. Assoc. Comput. Mach., Vol. 4, PP. 1559-1526, 1963.
    [131] J. Rissanen, Modeling by shortest data description, Automatica, Vol. 14, PP. 465-471, 1978.
    [132] Y. LeCun, Learning processes in an saymmetric threshold network. Disordered systems and biological organizations, Les Houches, France, Springer, PP. 233-240, 1986.
    [133] D. Rumelhart, G. Hinton and R. Williams, Learning internal representations by error propagation, Parallel distributed processing: Explorations in macrostructure of recognition. Vol. 1, Badford Books, Cambridge, MA., PP. 318-362, 1986.
    [134] B. E. Boser, I. M. Guyon, and V. N. Vapnik. A training algorithm for optimal margin classifiers. In D. Haussler, editor, 5th Annual ACM Workshop on COLT, pages 144-152, Pittsburgh, PA, 1992. ACM Press.
    [135] C. Cortes and V. Vapnik. Support vector networks. Machine Learning, Vol. 20, PP. 273-297, 1995.
    [136] R. Courant and D. Hilbert, Methods of Mathematical Physics, J. Wiley, New York, 1953.
    [137] Christopher J.C.Burges, A Tutorial on Support Vector Machines for Pattern Recognition, Data Mining and Knowledge Discovery, Vol. 2, No. 2, PP. 121-167, 1998.
    [138] J. Smola and B. Scholkopf, A tutorial on support vector regression, NC2-TR-1998-030, 1998.
    [139] K.-R. Muller, S. Mika, G. Ratsch, K. Tsuda, and B. Scholkopf. An introduction to kernel-based learning algorithms. IEEE Neural Networks, Vol. 12, No. 2, PP. 181-201, 2001.
    [140] B. Scholkopf and A. Smola, Learning with Kernels Support Vector Machines, Regularization, Optimization and Beyond, MIT Press, Cambridge, MA, 2002.

    [141] B. Scholkopf. Statistical learning and kernel methods. MSR-TR 2000-23, Microsoft Research, 2000.
    [142] http://www.ieee.org/portal/index.jsp. 2003.
    [143] http://www.xa.cnki.net. 2003.
    [144] B. Scholkopf, C. Burges, and V. Vapnik. Extracting support data for a given task. In U. M. Fayyad and R. Uthurusamy, editors, Proceedings, First International Conference on Knowledge Discovery & Data Mining, Menlo Park, 1995. AAAI Press.
    [145] B. Scholkopf, C. Burges, and V. Vapnik et al. Comparing support vector machines with Gaussian kernels to radial basisi function classifiers. IEEE Trans. Sign. Processing, Vol. 45, PP. 2758-2765,1997.
    [146] B. Scholkopf, A. Smola, K.-R. Muller, et al. Support vector methods in learning and feature extraction. In 9th Australian Congress on Nwural Networks, 1998.
    [147] B. Scholkopf, C. Burges, and V. Vapnik. Incorporating invariances in support vector learning machines. In C. von der Malsburg, W. von Seelen, J. C. Vorbruggen, and B. Sendhoff, editors, Artificial Neural Networks, ICANN'96, Vol. 1112, PP. 47-52, Berlin, 1996.
    [148] B. Scholkopf, A. Smola, and V. Vapnik. Prior knowledge in support vector kernels. In M. Jordan, M. Kearns, and S. Solla, Eds. Advances in Neural Information Processing Systems 10, Cambridge, MA, Neural Computation, 1998.
    [149] Mika S., Ratsch G, Scholkopf B. and et al., Invariant feature extraction and classification in kernel spaces. In Advances in Neural Information Processing ystems 12, Cambridge, MA, MIT Press, 2000, PP. 526-532.
    [150] C. Gavin, N. L. C. Talbot, Mainipulation of prior probabilities in support vector classification, Proceedings of the International Joint Conference on Neural Networks (IJCNN-2001), Washington D.C., U.S.A, PP. 2433-2438, 2001.
    [151] Edgar Osuna, R. F. F. G, An Improved Training Algorithm for Support Vector Machines, Proc.IEEE NNSP'97, 1997.
    [152] J. Platt. Fast Training of Support Vector Machines using Sequential Minimal Optimization, Advances in Kernel Methods - Support Vector Learning, edited by B. Scholkopf, C. Burges, and A. Smola, MIT Press, Cambridge, MA, 1999, PP. 185-208.
    [153] T. Joachims. Making Large-Scale SVM Learning Practical, Advances in Kernel Methods - Support Vector Learning, edited by B. Scholkopf, C. Burges, and A. Smola, MIT Press, Cambridge, MA, 1999, PP. 41-56.
    [154] C. J. C. Burges and B. Scholkopf. Improving the accuracy and speed of support vector learning machines. In M. Mozer, M. Jordan, and T. Petsche, editors, Advances in Neural Information Processing Systems 9, PP. 375-381, Cambridge, MA, 1997. MIT Press.

    [155] C. J. C. Burges. Simplified support vector decision rules. In L. Saitta, Eds., Proc. 13th International Conference on Machine Learning, PP. 71-77, San Mateo, CA, 1996.
    [156] B. Scholkopf, P. Knirsch, A. Smola, and C. Burges. Fast approximation of support vector kernel expansions and an interpretation of clustering as approximation in feature spaces. In P. Levi, M. Schanz, R.-J. Ahlers, and F. May, Eds., Mustererkennung 1998: 20. DAGM-Symposium, Informatik aktuell, PP. 124-132, Berlin, Springer, 1998.
    [157] Edgar Osuna and Federico Girosi. Reducing the run-time complexity of Support Vector Machines. In B. Scholkopf, C. Burges, and A. Smola, Eds. Advances in Kernel Methods - Support Vector Learning. PP. 271-284, MIT Press, 1999.
    [158] J.A.K. Suykens, T. Van Gestel, J. De Brabanter, B. De Moor, J. Vandewalle, Least Squares Support Vector Machines, World Scientific, Singapore, 2002.
    [159] J.A.K. Suykens and J. Vandewalle, Least squares support vector machine classifiers, Neural Processing Letters, Vol. 9, No. 3, PP. 293-300, 1999.
    [160] Jaakkola, T., Haussler, D. Probabilistic Kernel Regression Models. Proceedings of the Seventh Workshop on AI and Statistics. San Francisco, 1999.
    [161] G. Wahba, Yi Lin, et al. Generalized Approximate Cross Validation for Support Vector Machines or Another Way to Look at Margin-like Quantities. Advances in Large Margin Classifiers, PP. 297-309, MIT Press, 2000.
    [162] M. Opper, O. Winther. Gaussian Processes and SVM: Mean Field and Leave-one-out. Advances in Large Margin Classifiers, PP. 311-326, Cambridge, MA, MIT Press, 2000.
    [163] T. Joachims. Estimating the Generalization Performance of a SVM Efficiently. LS VIII-Report 25, University of Dortmund, Germany, 1999.
    [164] V. Vapnik, O. Chapelle. Bounds on Error Expectation for Support Vector Machines. Neural Computation, Vol. 12, No. 9, PP. 2013-2036, 2000.
    [165] Ying Guo, Peter L. Bartlett, J. Shawe-Taylor, et al. Covering numbers for support vector machines. IEEE Trans. Inform. Theory, Vol. 48, No. 1, PP. 239-250, 2002.
    [166] R.C. Williamson, A.J. Smola, B. Scholkopf. Generalization performance of regularization networks and support vector machines via entropy numbers of compact operators. IEEE Trans. Inform. Theory, Vol. 47, No. 6, PP. 2516-2532, 2001.
    [167] Ding-Xuan Zhou. Capacity of reproducing kernel spaces in learning theory. IEEE Trans. Inform. Theory, Vol. 49, No. 7, PP. 1743-1752, 2003.

    [168] O. Chapelle, V. Vapnik. Choosing multiple parameters for support vector machines. Machine Learning, Vol. 46, No.1, PP. 131-159, 2002.
    [169] Y. Bengio. Gradient-based optimization of hyper-parameters. Neural Computation, Vol. 12, No. 8, 2000.
    [170] N. Cristianini, C. Campbell and J. Shawe-Taylor. Dynamically adapting kernels in support vector machines. In Advances in Neural Information Processing Systems, 1999.
    [171] J.A.K. Suykens, J. Vandewalle, Multiclass Least Squares Support Vector Machines, in Proc. of the International Joint Conference on Neural Networks (IJCNN'99), Washington DC, USA, Jul. 1999.
    [172] J. Weston, C. Watkins. Multi-class support vector machines. Technical Report CSD-TR-98-04, Royal Holloway University of London, 1998.
    [173] E. J. Bredensteiner, K. P. Bennett. Multicategory classification by support vector machines. Computational Optimization and Applications, PP. 53-79, 1999.
    [174] A. Ben-Hur, D. Horn, H.T. Siegelmann and V. Vapnik, A support vector clustering method, International Conference on Pattern Recognition, 2000.
    [175] A. Ben-Hur, D. Horn, H.T. Siegelmann and V. Vapnik, Support Vector Clustering, Journal of Machine Learning Research Vol. 2, PP. 125-137, 2001.
    [176] B. Scholkopf, A. Smola, R. Williamson, and P. Bartlett, New support vector algorithms, Neural Computation, Vol. 12, PP. 1083-1121, 2000.
    [177] K.R. Muller, S. Mika, G. Ratsch, et al. An introduction to kernel-based learning algorithms. IEEE Trans. on Neural Networks, Vol. 12, No. 2, PP. 181-201, 2001.
    [178] Jianhua Xu, Xuegong Zhang, Yanda Li, Kernel MSE algorithm: a unified framework for KFD, LS-SVM and KRR. In Proc. of IJCNN'01, PP. 1486-1491, Washington, DC, 2001.
    [179] E. Osuna, R. Freund, F. Girosi. Training support vector machines: an application to face detection. In International Conference on Computer Vision and Pattern Recognition. PP. 130-136, 1997.
    [180] M. Schmidt. Identifying speaker with support vector networks. In Interface '96 Proceedings. Sydney, 1996.
    [181] V. Blanz, B. Scholkopf, C. Burges, et al. Comparison of view-based object recognition algorithms using realistic 3D models. In C. von der Malsburg, W. von Seelen, J. C. Vorbruggen, et al., Eds., Artificial Neural Networks, ICANN'96, PP. 251-256, Berlin, 1996. Springer Lecture Notes in Computer Science, Vol. 1112.
    [182] T. Joachims. Text categorization with support vector machines. Technical Report, LS-Ⅷ No. 23, University of Dortmund, 1997.
    [183] K. Itai, A. Takasu, J. Adachi. Information extraction from HTML pages and its integration. Proceedings of the 2003 Symposium on Applications and the Internet Workshops, PP. 276-281, 2003.
    [184] Xue-wen Chen. Gene selection for cancer classification using bootstrapped genetic algorithms and support vector machines. Proceedings of the 2003 IEEE Bioinformatics Conference (CSB 2003), PP. 504-505, 2003.
    [185] Li Juan Cao, Kok Seng Chua, Lim Kian Guan. Combining KPCA with support vector machine for time series forecasting. Proceedings of the 2003 IEEE International Conference on Computational Intelligence for Financial Engineering, PP. 325-329, 2003.
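    [186] Zhang Li, Zhou Weida, Jiao Licheng. Support vector machine method for one-dimensional image recognition. Journal of Infrared and Millimeter Waves, Vol. 21, No. 2, PP. 119-123, 2002 (in Chinese).
    [187] Rao Xian, Dong Chunxi, Yang Shaoquan. An intrusion detection system based on support vector machines. Journal of Software, Vol. 14, No. 4, 2003 (in Chinese).
    [188] Bian Zhaoqi, Zhang Xuegong, et al. Pattern Recognition (2nd edition). Tsinghua University Press, 1999 (in Chinese).
    [189] Dong Chunxi. SVM training and classification programs based on the Matlab Optimization ToolBox. Available at ftp://ecm.xidian.edu.cn/pub/tools/algorithm/SVM/svm trainl.m.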
    [190] O. L. Mangasarian and W. H. Wolberg. Cancer diagnosis via linear programming, SIAM News, Vol. 23, No. 5, PP. 1-18, September 1990.
    [191] R. A. Fisher. The use of multiple measurements in taxonomic problems. Annals of Eugenics, 7, Part II, 179-188, 1936.
    [192] D. W. Aha. Incremental constructive induction: An instance-based approach. Proceedings of the Eighth International Workshop on Machine Learning. PP. 117-121, 1991.
    [193] Jiao Licheng, Zhang Li, Zhou Weida. A center distance ratio method for selecting support vectors. Acta Electronica Sinica, Vol. 29, No. 3, PP. 383-386, 2001 (in Chinese).
    [194] J. W. Smith, J. E. Everhart, W. C. Dickson, et al. Using the ADAP learning algorithm to forecast the onset of diabetes mellitus. In Proceedings of the Symposium on Computer Applications and Medical Care. PP. 261-265, IEEE Computer Society Press.
    [195] Zhou Weida, Zhang Li, Jiao Licheng. Analysis of the generalization ability of support vector machines. Acta Electronica Sinica, Vol. 29, No. 5, PP. 590-594, 2001 (in Chinese).
    [196] A. Lunts, V. Brailovskiy. Evaluation of Attributes Obtained in Statistical Decision Rules. Engineering Cybernetics, 3, PP. 98-109, 1967.

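    [197] Chen Kaizhou. Computational Methods of Optimization. Xidian University Press, 1999 (in Chinese).
    [198] Cheng Yunpeng, Ed. Matrix Theory. Northwestern Polytechnical University Press, 1989 (in Chinese).
    [199] Zhou Weida, Zhang Li, Jiao Licheng. Linear programming support vector machines. Acta Electronica Sinica, Vol. 29, No. 11, PP. 1508-1511, 2001 (in Chinese).
    [200] Ding Ailing, Liu Fang, Cao Wei. An adaptive projection algorithm for pre-extracting support vectors. Computer Engineering and Applications, Vol. 38, No. 19, PP. 116-118, 2002 (in Chinese).
    [201] Pei Jihong, Yang Ju. A Voronoi diagram method for pre-extracting support vectors. Journal of Electronics and Information Technology, Vol. 25, No. 11, PP. 1494-1498, 2003 (in Chinese).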
    [202] C. L. Blake and C. J. Merz, UCI Repository of machine learning databases. Irvine, CA: University of California, Department of Information and Computer Science, 1998 [http://www.ics.uci.edu/~mlearn/MLRepository.html].
    [203] United States Postal Service handwriting recognition database. Available at www.kernel machine.org, 1994.
    [204] T. Graepel et al. Classification on Proximity Data with LP-Machines. In Proceedings of the 9th International Conference on Artificial Neural Networks, PP. 304-309, 1999.
    [205] A. Smola, B. Scholkopf, G. Ratsch. Linear programs for automatic accuracy control in regression. In Proceedings ICANN'99, Int. Conf. on Artificial Neural Networks, Berlin, 1999. Springer.
