Research on Clustering and Text Categorization based on Support Vector Machines
Abstract
With the advent of the big-data era, the Internet carries not only the usable, valid content resources generated by diverse applications, but also a rapidly flowing and expanding flood of equally diverse information and behaviors that interfere with normal services, violate privacy, mislead the public, and even threaten social stability. From the perspective of data management, it is essential to organize, analyze, extract, and protect (at appropriate levels) useful data or sensitive information quickly and efficiently, according to the needs of users in different industries and fields. From the perspective of content security, people further expect to detect and protect sensitive information that is being, or is about to be, leaked, and to classify, filter, and analyze deceptive, malicious, or misleading content and behaviors, so that attack sources can be located and victims protected in time, while intelligent defense systems are invoked to process data, learn knowledge, and update models. Among the many machine learning methods, cluster analysis (unsupervised learning) and classification (supervised learning) are regarded as the effective approaches and key techniques for quickly and accurately discovering, locating, organizing, and analyzing usable information and behavior patterns for specific purposes, thereby maximizing the efficiency of information security protection.
     As a machine learning method grounded in statistical learning theory, the support vector machine (SVM) not only learns effectively from small samples, but also copes well with nonlinearity, high dimensionality, and local minima. It can perform unsupervised cluster analysis by constructing closed separating surfaces, and supervised classification by constructing open separating surfaces; it is particularly well suited to text data, which is high-dimensional, sparse, and exhibits strong correlations among features. The SVM is therefore an excellent tool for the data-management and content-security analysis tasks described above. However, when the samples are large in number, high in dimension, spread over many irregularly distributed classes, and contaminated by noise, traditional SVM-based clustering models suffer from slow training, parameter sensitivity, and the difficulty of finding suitable cluster prototypes to raise the efficiency and accuracy of cluster labeling. Text data, the dominant form of information on the Internet, typically exhibits exactly these characteristics and reduces data separability, degrading the performance of SVM-based text classification systems: training and classification slow down, accuracy drops, and the collected support vectors lose indicative value.
     To address these problems, the main contributions and innovations of this dissertation are summarized as follows:
     (1) Noting that support vector clustering (SVC) combines boundary-based clustering with prototype-finding clustering, this dissertation analyzes the key factors governing SVC performance and identifies feasible directions for improvement with respect to parameter selection, solving the dual problem, and the cluster labeling strategy. After analyzing the relationship between the kernel width q and the splitting/merging pattern of clusters, it proposes locating, by binary search, the value of q at which the number of clusters stabilizes, thereby obtaining the optimal parameter and the best clustering result simultaneously.
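The binary search over the kernel width can be sketched as follows. A full SVC run is replaced here by a hypothetical `n_clusters(q)` callback, under the assumption (made by the analysis above) that the cluster count is non-decreasing as q grows and larger q resolves finer structure; the step function used in the demo is purely illustrative.

```python
def find_stable_q(n_clusters, q_lo, q_hi, target, tol=1e-6):
    """Binary-search the smallest kernel width q whose cluster count
    reaches `target`, assuming n_clusters(q) is non-decreasing in q."""
    while q_hi - q_lo > tol:
        q_mid = 0.5 * (q_lo + q_hi)
        if n_clusters(q_mid) >= target:
            q_hi = q_mid          # enough clusters: shrink from above
        else:
            q_lo = q_mid          # too few clusters: grow q
    return q_hi

# Illustrative stand-in for an SVC run: clusters split at q = 0.3 and 1.2.
counts = lambda q: 1 if q < 0.3 else (2 if q < 1.2 else 4)
q_star = find_stable_q(counts, 0.01, 5.0, target=2)
```

Each bisection step invokes one clustering run, so roughly log2((q_hi - q_lo)/tol) runs suffice instead of a dense sweep over q.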
     (2) As a boundary-based method, SVC's great advantage over other algorithms is its ability to cluster data sets with arbitrarily shaped or irregular cluster contours efficiently. This strength, however, also makes it sensitive to cluster contours and vulnerable to sparsely distributed noise points that disturb the contours or the distribution structure of the data. Traditional SVC algorithms fail to delimit noise points from outliers and therefore let noise points participate in solving the dual problem, lowering training efficiency and weakening the algorithm's exploration of the data's distribution structure. This dissertation gives, for the first time, a definition of noise data in terms of distributional characteristics and cluster membership, and proposes an unsupervised noise elimination algorithm. With it, noise can be removed quickly in the input space before the dual problem is solved, avoiding part of the meaningless feature-space mapping operations, lowering the storage demanded by the kernel matrix, and improving SVC's efficiency without any negative effect on the data set's distribution structure or cluster contours.
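A minimal input-space noise filter in the spirit of this idea can be sketched as below. The dissertation's own criterion rests on distributional characteristics and cluster membership; the k-nearest-neighbour distance rule and the `alpha` threshold here are illustrative stand-ins, not the proposed definition.

```python
import numpy as np

def remove_noise(X, k=5, alpha=2.0):
    """Flag points whose mean distance to their k nearest neighbours
    exceeds alpha times the data-set average of that statistic, and
    drop them before the dual problem is ever solved."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(D, np.inf)                  # ignore self-distance
    knn_mean = np.sort(D, axis=1)[:, :k].mean(axis=1)
    keep = knn_mean <= alpha * knn_mean.mean()
    return X[keep], keep
```

Because the filter runs entirely in input space, the removed points never enter the kernel matrix, which shrinks both the optimization problem and its memory footprint.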
     (3) Finding suitable cluster prototypes is one of the main routes to a more efficient SVC. Traditional SVC algorithms either use groups of support vectors as prototypes or reduce the task to one prototype per cluster. The former is inefficient on large, high-dimensional data; the prototypes produced by the latter represent clusters with irregular structure or unevenly distributed interiors poorly, and may lower labeling accuracy. To address this, the dissertation proposes a Double Centroids Support Vector Clustering (DBC) algorithm, in which each cluster carries multiple prototypes and every prototype is represented jointly by a shape centroid and a density centroid. In principle, DBC is a compromise between the two traditional models; its distinguishing feature is that multiple prototype pairs can distribute adaptively inside an irregular cluster. Extensive experiments show that DBC not only inherits classic SVC's ability to recognize irregular cluster contours, but also reveals how evenly samples are distributed within a cluster and markedly improves labeling efficiency and accuracy; the double centroids have strong representational power, making the algorithm suitable for large-scale data analysis.
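The two centroids of a prototype might be computed along the following lines. The neighbour-counting density estimate and the `radius` parameter are illustrative assumptions, not the dissertation's exact construction; the point is that the shape centroid tracks geometry while the density centroid tracks where the mass sits.

```python
import numpy as np

def double_centroids(cluster, radius):
    """Return (shape_centroid, density_centroid) for one cluster:
    the shape centroid is the coordinate mean, the density centroid
    is the member with the most neighbours within `radius`."""
    shape_c = cluster.mean(axis=0)
    D = np.linalg.norm(cluster[:, None, :] - cluster[None, :, :], axis=2)
    counts = (D < radius).sum(axis=1) - 1        # exclude the point itself
    density_c = cluster[np.argmax(counts)]
    return shape_c, density_c
```

On a cluster with a dense core and a sparse tail the two centroids separate, and that gap is itself a signal of uneven internal distribution.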
     (4) Cluster labeling is closely tied to how cluster prototypes are found or generated. The study finds that current SVC algorithms, when judging component connectivity by sampling the line segments between pairs of prototype points, use large numbers of redundant point pairs and sample points, which severely hurts labeling efficiency without improving accuracy. To address this, the dissertation proposes a Convex Decomposition based Cluster Labeling (CDCL) algorithm, a variant of the one-cluster-multiple-prototypes scheme. Instead of using a single existing or optimized sample as a prototype, CDCL adaptively decomposes each cluster into a number of convex hulls of varying shapes and sizes and uses these as prototypes. The dissertation also analyzes and defines the key factor affecting connectivity judgments between convex hulls, the quasi-support vector, reduces cluster connectivity analysis to connectivity judgments between nearest-neighbor convex hulls, and constructs sampling segments that cross the dense regions of quasi-support vectors with maximal probability, avoiding redundant point pairs. In addition, a nonlinear sampling-sequence generation scheme matched to the convex decomposition model is proposed to minimize redundant sampling along each segment and lower the actual average sampling rate. Extensive experiments show that the proposed CDCL algorithm not only improves labeling efficiency but is also insensitive to parameter settings and significantly improves labeling accuracy.
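The segment-sampling connectivity test that CDCL optimizes can be illustrated in its classic SVC form: two points share a cluster if every sampled point on the segment joining them stays inside the enclosing hypersphere. Here `inside` stands for the support-function membership test and is mocked by a toy two-ball region; the uniform sampling sequence is the baseline that CDCL's nonlinear sequence is designed to improve on.

```python
import numpy as np

def connected(a, b, inside, n_samples=10):
    """Adjacency test: a and b belong to the same cluster iff every
    sampled interior point of the segment [a, b] satisfies the
    membership predicate `inside` (support function <= sphere radius)."""
    ts = np.arange(1, n_samples + 1) / (n_samples + 1)
    return all(inside(a + t * (b - a)) for t in ts)
```

Running this test on every prototype pair with many samples per segment is exactly the redundancy CDCL attacks: checking only nearest-neighbour convex hulls, with segments routed through quasi-support-vector regions, cuts both the number of pairs and the samples per pair.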
     (5) The study shows that for SVC, whose aim is to construct the minimal enclosing hypersphere and the support function in feature space, the samples inside cluster contours, the outliers, and the noise points are all unnecessary: they only increase storage consumption and slow down training. The dissertation therefore proposes a Fast Algorithm of Support Vector Clustering (FASVC). The algorithm first extracts cluster contour (boundary) samples directly in the input space to construct the hypersphere, extract the support vectors, and build the support function; it then adopts an adaptive cluster labeling strategy, choosing between convex-decomposition based and cone based cluster labeling according to whether the constructed hypersphere radius R exceeds 1. Because FASVC drastically reduces the scale of the optimization problem, and the adaptive labeling strategy adds no constraints to it, the algorithm greatly improves storage utilization and running time, making it well suited to large-scale analysis under memory constraints. Moreover, the algorithm is independent of the penalty factor C and insensitive to the other parameters. Experiments show that the proposed FASVC algorithm handles text clustering and P2P traffic classification efficiently.
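One common heuristic for extracting boundary samples directly in input space averages the unit vectors toward each point's nearest neighbours: interior points are surrounded, so the vectors cancel, while border points see their neighbours mostly on one side. This is an illustrative sketch of the idea, not necessarily the extraction rule used by FASVC; `k` and `thresh` are assumed parameters.

```python
import numpy as np

def boundary_points(X, k=8, thresh=0.5):
    """Score each point by the norm of the mean unit vector toward its
    k nearest neighbours; high scores mark likely border points."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(D, np.inf)
    idx = np.argsort(D, axis=1)[:, :k]
    scores = np.empty(len(X))
    for i, nbrs in enumerate(idx):
        vecs = X[nbrs] - X[i]
        units = vecs / np.linalg.norm(vecs, axis=1, keepdims=True)
        scores[i] = np.linalg.norm(units.mean(axis=0))
    return scores > thresh, scores
```

Training the hypersphere on the surviving border samples alone is what shrinks the dual problem, since interior points cannot become support vectors of the enclosing sphere anyway.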
     (6) In text categorization, the support vector machine is widely regarded as one of the best classifiers. Because it rests on the structural risk minimization principle, its classification performance is directly tied to the separability of the data, i.e. the margin between classes, so finding the text representation that best enhances separability is the key to better classification performance. The study shows that vectorizing text is essentially a compression of the textual information, so retaining as much information as possible matters greatly. However, mainstream text representation schemes lose a great deal of important information during vectorization, owing to their sole reliance on document frequency, their global strategy for quantizing feature weights, and their disregard of text structure, all of which hurt separability. Against these problems, the dissertation proposes improvements from several angles. 1) First, it defines the category contribution degree of a feature and proposes a Category Contribution Enhanced (CCE) scheme that combines category contribution with inter-class discriminating power, avoiding sole reliance on document frequency when weighting features. 2) Second, it designs an adaptive text-block partitioning algorithm, on top of which the distributional importance of text blocks is described and embedded into individual features as structural information. 3) Third, it defines the class tendency and class preference of a feature and, on that basis, proposes a scheme that fuses multiple class tendencies to strengthen inter-class discriminating power. Combining this scheme with the CCE weights and the text-block importance description yields a text vectorization algorithm fusing multiple class tendencies (co-contributions of terms on class tendency for vectorizing text, C2TCTVT). The algorithm preserves the distributional information of class tendencies that the global strategy discards, and compresses text vectors from high-dimensional and sparse to low-dimensional and dense; the resulting low-dimensional vectors retain the multi-class tendency information, improving both separability and the indicative value of the support vectors. Within this framework, classification efficiency rises markedly while accuracy remains comparable to traditional methods. 4) Finally, as an improvement on the local importance of features, the dissertation proposes two families of feature-frequency schemes that embed the importance distribution of text blocks; they can replace traditional feature frequency and, combined with the CCE scheme, significantly improve SVM-based text categorization.
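A supervised term weight in the spirit of the CCE idea might look like the following sketch. The exact CCE formula is not reproduced here: the within-class document rate and the relevance-frequency style log ratio are assumptions chosen only to show how class contribution and inter-class discrimination can jointly replace a plain document-frequency factor.

```python
import math

def supervised_weight(tf, df_pos, n_pos, df_neg, n_neg):
    """Weight a term for one category by combining how much it
    contributes to that category (its document rate inside the class)
    with how strongly it separates classes (a log ratio of positive
    to negative document frequency), instead of the global df alone."""
    contribution = df_pos / n_pos                           # class contribution
    discrimination = math.log(2 + df_pos / max(1, df_neg))  # class separation
    return tf * contribution * discrimination
```

A term concentrated in one class thus outweighs a term spread evenly across classes even when both have the same raw term and document frequencies, which is precisely the separability gain the paragraph above argues for.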
引文
[1]Jyothi B. S. and Dharanipragada J., SyMon:A Practical Approach to Defend Large Structured P2P Systems Against Sybil Attack, Peer-to-Peer Networking and Applica-tions,4 (3),2011, pp.289-308.
    [2]Command Five Pty Ltd, Advanced Persistent Threats:A Decade in Review, Com-mand Five Pty Ltd,2011, URL:http://www.commandfive.com.
    [3]Tan P.-N., Steinbach M., and Kumar V., Introduction to Data Mining, Addison Wes-ley,2006.
    [4]Xu R. and Wunsch D., Survey of Clustering Algorithms, IEEE Transactions on Neural Networks,16 (3),2005, pp.645-678.
    [5]Xu R. and Wunsch D., Clustering, Hoboken, New Jersey:A John Wiley&Sons,2008.
    [6]Tsang I. W., Kwok J. T., and Li S., Learning The Kernel in Mahalanobis One-class Support Vector Machines, In Proceedings of the 2006 International Joint Conference on Neural Networks (IJCNN 2006), IEEE,2006, pp.2148-2154.
    [7]Mary-Huard T. and Robin S., Tailored Aggregation for Classification, IEEE Transac-tions on Pattern Analysis and Machine Intelligence,31 (11),2009, pp.2098-2105.
    [8]Czarnowski I., Cluster-based Instance Selection for Machine Classification, Knowl-edge and Information Systems,2012, pp.1-21, DOI 10.1007/s10115-010-0375-z
    [9]Maggi F., Matteucci M., and Zanero S., Detecting Intrusions through System Cal-l Sequence and Argument Analysis, IEEE Transactions on Dependable and Secure Computing,7 (4),2010, pp.381-395.
    [10]Francois J., Abdelnur H., State R., et al., Machine Learning Techniques for Passive Network Inventory, IEEE Transactions on Network and Service Management,7 (4), 2010, pp.244-257.
    [11]Joseph J. F. C., Lee B.-S., Das A., et al., Cross-Layer Detection of Sinking Behavior in Wireless Ad Hoc Networks Using SVM and FDA, IEEE Transactions on Depend-able and Secure Computing,8 (2),2011, pp.233-245.
    [12]Yi Y., Wu J., and Xu W., Incremental SVM based on Reserved Set for Network Intrusion Detection, IEEE Transactions on Network and Service Management,38 (6), 2011, pp.7698-7707.
    [13]Hofmann A. and Sick B., Online Intrusion Alert Aggregation with Generative Data Stream Modeling, IEEE Transactions on Dependable and Secure Computing,8 (2), 2011, pp.282-294.
    [14]Nguyen T. T. T. and Armitage G., A Survey of Techniques for Internet Traffic Clas-sification using Machine Learnings, IEEE Communications Surveys and Tutorials,10 (4),2008, pp.56-76.
    [15]Jagannathan G., Pillaipakkamnatt P., Wright R. N., et al., Communication-Efficient Privacy-Preserving Clustering, Transactions on Data Privacy,3 (1),2010, pp.1-25.
    [16]Schroeder B. and Gibson G. A., A Large-Scale Study of Failures in High-Performance Computing Systems, IEEE Transactions on Dependable and Secure Computing,7 (4), 2010, pp.337-351.
    [17]Kalyani S. and Swarup K. S., Classification and Assessment of Power System Security Using Multiclass SVM, IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews,41 (5),2011, pp.753-758.
    [18]Qi X. and Davison B. D., Web Page Classification:Features and Algorithms, ACM Computing Surveys (CSUR),41 (2),2009, pp.12:1-31.
    [19]Yu B. and Xu Z. b., A Comparative Study for Content-based Dynamic Spam Clas-sification using Four Machine Learning Algorithms, Knowledge-Based Systems,21, 2008, pp.355-362.
    [20]Abu-Nimeh S. and Chen T., Proliferation and Detection of Blog Spam, IEEE Security & Privacy,8 (5),2010, pp.42-47.
    [21]Fuller C. M., Biros D. P., and Delen D., An Investigation of Data and Text Mining Methods for Real World Deception Detection, IEEE Transactions on Network and Service Management,38 (7),2011, pp.8392-8398.
    [22]Chen Z. Q., Zhang Y., and Chen Z. R., A Categorization Framework for Common Computer Vulnerabilities and Exposures, Computer Journal,53 (5),2010, pp.551-580.
    [23]Pham M. C., Cao Y. W., Klamma R., et al., A Clustering Approach for Collabora-tive Filtering Recommendation Using Social Network Analysis, Journal of Universal Computer Science,17 (4),2011, pp.583-604.
    [24]Cheng H., Zhou Y., and Yu J. X., Clustering Large Attributed Graphs:A Balance between Structural and Attribute Similarities, ACM Transactions on Knowledge Dis-covery from Data (ACM TKDD),5 (2),2011, pp.12:1-33.
    [25]Mitchell T. M., Machine Learning, New York:McGraw-Hill,1997.
    [26]Camastra F. and Vinciarelli A., Machine Learning for Audio, Image and Video Anal-ysis:Theory and Applications, Springer,2008.
    [27]高茂庭,文本聚类分析若干问题研究[博士论文],天津大学,天津,2006年
    [28]Mahalanobis P. C., On The Generalised Distance in Statistics, Proceedings of the National Institute of Sciences of India,2 (1),1936, pp.49-55.
    [29]Psorakis I., Damoulas T., and Girolami M. A., Multiclass Relevance Vector Ma-chines:Sparsity and Accuracy, IEEE Transactions on Neural Networks,21 (10),2010, pp.1588-1598.
    [30]MacQueen J., Some Methods for Classification and Analysis of Multivariate Observa-tions, In Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, University of California Press, Berkeley, USA,1967, pp.281-297.
    [31]Estivill-Castro V. and Yang J., A Fast and Robust General Purpose Clustering Algo-rithm, In Mizoguchi R. and Slaney J., editors, Proceedings 6th Pacific Rim Interna-tional Conference on Artificial Intelligence (PRICAI 2000), Lecture Notes in Artificial Intelligence 1886, New York, NY:Springer-Verlag,2000, pp.208-218.
    [32]Ng R. and Han J., Efficient and Effective Clustering Methods for Spatial Data Mining, In Bocca J. B., Jarke M., and Zaniolo C., editors, Proceedings of the 20th Interna-tional Conference on Very Large Data bases, Morgan Kaufmann,1994, pp.144-155.
    [33]Guan R. C., Marchese X. H., Yang M., et al., Text Clustering with Seeds Affinity Propagation, IEEE Trans on Knowledge and Data Engineering,23 (4),2011, pp.627-637.
    [34]Shamir O. and Tishby N., Stability and Model Selection in K-means Clustering, Ma-chine Learning,80 (2-3),2010, pp.213-243.
    [35]Jin R. M., Goswami A., and Agrawal G., Fast and Exact Out-of-core and Distributed K-Means Clustering, Knowledge and Information Systems,10(1),2006, pp.17-40.
    [36]Hsieh T. W. and Taur J. S., A KNN-Scoring Based Core-Growing Approach to Cluster Analysis, Journal of Signal Processing Systems,60 (1),2009, pp.105-114.
    [37]Zhang T., Ramakrishnan R., and Livny M., BIRCH:An Efficient Data Clustering Method for Very Large Databases, In Jagadish H. V. and Mumick I. S., editors, Pro-ceedings of ACM SIGMOD International Conference on Management of Data, ACM Press,1996, pp.103-114.
    [38]Guha S., Rastogi R., and Shim K., CURE:an efficient clustering algorithm for large databases, In Haas L. M. and Tiwary A., editors, Proceedings of the 1998 ACM SIGMOD international conference on Management of data, ACM Press, New York, NY, USA,1998, pp.73-84.
    [39]Chiang J. H. and Hao P. Y., A New Kernel-based Fuzzy Clustering Approach:Support Vector Clustering with Cell Growing, IEEE Transactions on Fuzzy Systems,11 (4), 2003, pp.518-527.
    [40]Ben-Hur A., Horn D., Siegelmann H. T., et al., A Support Vector Method for Hi-erarchical Clustering, In Leen T., Dietterich T., and Tresp V., editors, Advances in Neural Information Processing Systems 13, MIT Press,2001, pp.367-373.
    [41]Ester M., Kriegel H., Sander J., et al., A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise, In Simoudis E., Han J., and Fayyad U. M., editors, Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, AAAI Press,1996, pp.126-231.
    [42]Viswanath P. and Suresh B. V., Rough-DBSCAN:A fast hybrid density based cluster-ing method for large data sets, Pattern Recognition Letters,30 (16),2009, pp.1477-1488.
    [43]Ester M., Kriegel H., Sander J., et al., Density-connected Sets and Their Application for Trend Detection in Spatial Databases, In Heckerman D., Mannila H., and Pregibon D., editors, Proceedings of the 3rd International Conference on Knowledge Discovery and Data Mining (KDD'97), AAAI Press,1997, pp.10-15.
    [44]Ankerst M., Breunig M., and Kriegel H., OPTICS:Ordering Points to Identify the Clustering Structure, In Delis A., Faloutsos C., and Ghandeharizadeh S., editors, Proceedings of ACM SIGMOD'99 International Conference on Management of Data, ACM Press,1999, pp.49-60.
    [45]Wang W., Yang J., and Muntz R., STING:A Statistical Information Grid Approach to Spatial Data Mining, In Jarke M., Carey M. J., Dittrich K. R., et al., editors, Proceedings of the 23rd International Conference on VLDB, Morgan Kaufmann,1997, pp.186-195.
    [46]Agrawa R., Gehrke J., Gunopulos D., et al., Automatic Subspace Clustering of High Dimensional Data for Data Mining Applications, In Gupta A., Shmueli O., and Widom J., editors, Proceedings of the 1998 ACM SIGMOD International Conference on Management of Data, Morgan Kaufmann,1998, pp.94-105.
    [47]Sheikholeslami G., Chatterjee S., and Zhang A. D., WaveCluster:A Multi-resolution Clustering Approach for Very Large Spatial Databases, In Gupta A., Shmueli O., and Widom J., editors, Proceedings of the 24th International Conference on Very Large Data Bases, Morgan Kaufmann,1998, pp.428-439.
    [48]Jung K.-H., Lee D., and Lee J., Fast Support-based Clustering Method for Large-scale Problems, Pattern Recognition,43 (5),2010, pp.1975-1983.
    [49]Bezdek J. C., Pattern Recognition with Fuzzy Objective Function Algorithms, Plenum Press,1981.
    [50]Hruschka E., Campello R., and Castro L. D., Evolving Clusters in Gene-expression Data, Information Sciences,176 (13),2006, pp.1898-1927.
    [51]Merwe D. V. D. and Engelbrecht A. P., Data Clustering Using Particle Swarm Op-timization, In Proceedings of IEEE Congress on Evolutionary Computation, IEEE, 2003, pp.215-220.
    [52]Shelokar P. S., Jayaraman V. K., and Kulkarni B. D., An Ant Colony Approach for Clustering, Analytica Chimica Acta,509 (2),2004, pp.187-195.
    [53]Wan M., Li L. X., Xiao J. H., et al., CAS Based Clustering Algorithm for Web Users, Nonlinear Dynamics,61 (3),2010, pp.347-361.
    [54]Wan M., Li L. X., Xiao J. H., et al., Data Clustering Using Bacterial Forag-ing Optimization, Journal of Intelligent Information System,2012, pp.1-21, DOI: 10.1007/s 10844-011-0158-3.
    [55]Dhillon I. S., Guan Y., and Kulis B., Weighted Graph Cuts without Eigenvectors: A Multilevel Approach, IEEE Transactions on Pattern Analysis and Machine Intelli-gence,29 (11),2007, pp.1944-1957.
    [56]Filippone M., Camastra F., Masulli F., et al., A Survey on Spectral and Kernel Methods for Clustering, Pattern Recognition,41(1),2008, pp.176-190.
    [57]Cai D., He X. F., Han J. W., et al., Graph Regularized Nonnegative Matrix Factor-ization for Data Representation, IEEE Transactions on Pattern Analysis and Machine Intelligence,33 (8),2011, pp.1548-1560.
    [58]Wang Y., Jiang Y., Wu Y., et al., Spectral Clustering on Multiple Manifolds, IEEE Transactions on Neural Networks,22 (7),2011, pp.1149-1161.
    [59]Dhillon I. S., Guan Y., and Kulis B., A Unified View of Kernel k-means, Spectral Clustering and Graph Partitioning, Technical Report TR-04-25, UTCS,2005.
    [60]Tax D. M. J. and Duin R. P. W., Support Vector Domain Description, Pattern Recog-nition Letters,20 (11-13),1999, pp.1191-1199.
    [61]Ben-Hur A., Horn D., Siegelmann H. T., et al., Support Vecotr Clustering, Journal of Machine Learning Research,2 (mar),2001, pp.125-137.
    [62]Scholkopf B., Platt J. C., Shawe-Taylor J., et al., Estimating the Support of a High-Dimensional Distribution, Neural Computation,13 (7),2001, pp.1443-1472.
    [63]Hao P.-Y., Chiang J.-H., and Tu Y.-K., Hierarchically SVM Classification based on Support Vector Clustering Method and Its Application to Document Categorization, Expert Systems with Applications,33 (3),2007, pp.627-635.
    [64]Bilgin G., Erturk S., and Yildirim T., Segmentation of Hyperspectral Images via Sub-tractive Clustering and Cluster Validation Using One-Class Support Vector Machines, IEEE Transactions on Geoscience. and Remote Sensing,49 (8),2011, pp.2936-2944.
    [65]Jung K.-H., Kim N., and Lee J., Dynamic Pattern Denoising Method using Multi-basin System with Kernels, Pattern Recognition,44 (8),2011, pp.1698-1707.
    [66]Wang C.-D., Lai J.-H., Huang D., et al., SVStream:A Support Vector Based Algo-rithm for Clustering Data Streams, IEEE Transactions on Knowledge and Data Engi-neering,2012, pp.1-14, doi:10.1109/TKDE.2011.263.
    [67]Chicco G. and Ilie I.-S., Support Vector Clustering of Electrical Load Pattern Data, IEEE Transactions on Power Systems,24 (3),2009, pp.1619-1628.
    [68]Wang C. H., Separation of Composite Defect Patterns on Wafer Bin Map using Sup-port Vector Clustering, Expert Systems with Applications,36 (2),2009, pp.2554-2561.
    [69]Yuan T., Bae S. J., and Park J. I., Bayesian Spatial Defect Pattern Recognition in Semiconductor Fabrication using Support Vector Clustering, International Journal of Advanced Manufacturing Technology,51 (5-8),2010, pp.671-683.
    [70]Wang D. F., Shi L., Yeung D. S., et al., Support Vector Clustering for Brain Activation Detection, In Duncan J. S. and Gerig G., editors, Medical Image Computing and Computer-Assisted Intervention-Miccai 2005, Pt 1,2005, pp.572-579.
    [71]Wang D. F., Shi L., Yeung D. S., et al., Ellipsoidal Support Vector Clustering for Functional MRI Analysis, Pattern Recognition,40 (10),2007, pp.2685-2695.
    [72]Zhang G. X., Rong H. N., and Jin W. D., Intra-pulse Modulation Recognition of Unknown Radar Emitter Signals using Support Vector Clustering, In Wang L., Jiao L., Shi G., et al., editors, Proceedings on Fuzzy Systems and Knowledge Discovery, Springer,2006, pp.420-429.
    [73]Ling P. and Zhou C. G., Adaptive Support Vector Clustering for Multi-Relational Data Mining, In Wang J., Yi Z., Zurada J. M., et al., editors, Advances in Neural Networks-ISNN 2006, Pt 1, Lecture Notes in Computer Science 3971 Springer,2006, pp.1222-1230.
    [74]Jing L. P., Ng M. K., and Huang J. Z., Knowledge-based Vector Space Model for Text Clustering, Knowledge and Information Systems,25 (1),2010, pp.35-55.
    [75]张华平等,ICTCLAS汉语分词系统,北京理工大学网络搜索挖掘与安全实验室,2012年,URL:http://ictclas.org.
    [76]苏金树,张博锋,徐昕,基于机器学习的文本分类技术研究进展,软件学报,17(9),2006,pp.1848-1859.
    [77]Strzalkowski T., Natural Language Information Retrieval, The Netherlands:Kluwer Academic Publishers,1999
    [78]Scott S. and Matwin S., Text Classification using WordNet Hypernyms, In Proceed-ings of the COLING/ACL Workshop on Usage of WordNet in Natural Language Pro-cessing Systems,1998, pp.45-52.
    [79]Lewis D. D., An Evaluation of Phrasal and Clustered Representation on a Text Catego-rization Task, In Belkin N. J., Ingwersen P., and Pejtersen A. M., editors, Proceedings of the 15th Annual International ACM SIGIR Conference on Research and Develop-ment in Information Retrieval (SIGIR'92), ACM,1992, pp.37-50.
    [80]Li Y. J., Chung S. M., and Holt J. D., Text Document Clustering based on Frequent Word Meaning Sequences, Data & Knowledge Engineering,64 (1),2008, pp.381-404.
    [81]赵军,金千里,徐波,面向文本检索的语义计算,计算机学报,28(12),2005,pp.2068-2078.
    [82]Zhang W., Yoshida T., and Tang X. J., Text Classification based on Multi-word with Support Vector Machine, Knowledge-Based Systems,28 (1),2008, pp.879-886.
    [83]Zhai Z. W., Xu H., Kan B. D., et al., Exploiting Effective Features for Chinese Sentiment Classification, Expert Systems with Applications,38 (8),2011, pp.9139-9146.
    [84]Xue X. B. and Zhou Z. H., Distributional Features for Text Categorization, IEEE Transactions on Knowledge and Data Engineering,21 (3),2009, pp.428-442.
    [85]Salton G. and Buckley C., Term-weighting Approaches in Automatic Text Retrieval, Information Processing and Management,24 (5),1988, pp.513-523.
    [86]Joachims T., Learning to Classify Text Using Support Vector Machines:Methods, Theory and Algorithms, Norwell, MA, USA:Kluwer Academic Publishers,2002.
    [87]Lan M., Tan C. L., Su J., et al., Supervised and Traditional Term Weighting Methods for Automatic Text Categorization, IEEE Transactions on Pattern Analysis and Machine Intelligence,31 (4),2009, pp.721-735.
    [88]Sebastiani F., Machine Learning in Automated Text Categorization, ACM Computing Surveys,34 (1),2002, pp.1-47.
    [89]Liu Y., Loh H. T., and Sun A. X., Imbalanced Text Classification:A Term Weighting Approach, Expert Systems with Applications,36 (1),2009, pp.690-701.
    [90]Altincay H. and Erenel Z., Analytical Evaluation of Term Weighting Schemes for Text Categorization, Pattern Recognition Letters,31 (11),2010, pp.1310-1323.
    [91]Joachims T., A Probabilistic Analysis of The Rocchio Algorithm with TFIDF for Text Categorization, In Fisher D. H., editor, Proceedings of the 14th International Conference on Machine Learning, Morgan Kaufmann,1997, pp.143-151.
    [92]Miao Y.-Q. and Kamel M., Pairwise Optimized Rocchio Algorithm for Text Categorization, Pattern Recognition Letters,32 (2),2011, pp.375-382.
    [93]Chen J. N., Huang H. K., Tian S. F., et al., Feature Selection for Text Classification with Naive Bayes, Expert Systems with Applications,36 (3),2009, pp.5432-5435.
    [94]Pernkopf F., Wohlmayr M., and Tschiatschek S., Maximum Margin Bayesian Network Classifiers, IEEE Transactions on Pattern Analysis and Machine Intelligence,34 (3),2012, pp.521-532.
    [95]Clark J., Koprinska I., and Poon J., A Neural Network based Approach to Automated E-Mail Classification, In Proc. IEEE/WIC International Conference on Web Intelligence (WI), IEEE Computer Society,2003, pp.702-705.
    [96]Selamat A. and Omatu S., Web Page Feature Selection and Classification using Neural Networks, Information Sciences,158,2004, pp.69-88.
    [97]Tan S. B., Neighbor-weighted K-Nearest Neighbor for Unbalanced Text Corpus, Expert Systems with Applications,28 (4),2005, pp.667-671.
    [98]Tan S. B., Wang Y. F., and Wu G. W., Adapting Centroid Classifier for Document Categorization, Expert Systems with Applications,38 (8),2011, pp.10264-10273.
    [99]Scholkopf B., Burges C. J. C., and Smola A. J., Introduction to Support Vector Learning, In Scholkopf B., Burges C. J. C., and Smola A. J., editors, Advances in Kernel Methods:Support Vector Learning, MIT Press,1999, pp.1-15.
    [100]Zhou T. Y., Tao D. C., and Wu X. D., Compressed Labeling on Distilled Labelsets for Multi-Label Learning, Machine Learning,2012, pp.1-58, DOI 10.1007/s10994-011-5276-1.
    [101]Osuna E., Freund R., and Girosi F., Training Support Vector Machines:An Application to Face Detection, In IEEE Conference on Computer Vision and Pattern Recognition, IEEE Computer Society, Los Alamitos, CA, US,1997, pp.130-136.
    [102]Joachims T., Text Categorization with Support Vector Machines:Learning with Many Relevant Features, In Nedellec C. and Rouveirol C., editors, Proceedings of the 10th European Conference on Machine Learning, Springer-Verlag, London, UK,1998, pp.137-142.
    [103]Brown M. P. S., Grundy W. N., Lin D., et al., Knowledge-based Analysis of Microarray Gene Expression Data by Using Support Vector Machines, In Proceedings of the National Academy of Sciences,2000, pp.262-267.
    [104]Sebald D. J. and Bucklew J. A., Support Vector Machine Techniques for Nonlinear Equalization, IEEE Transactions on Signal Processing,48 (11),2000, pp.3217-3226.
    [105]Navia-Vasquez A., Perez-Cruz F., and Artes-Rodriguez A., Weighted Least Squares Training of Support Vector Classifiers Leading to Compact and Adaptive Schemes, IEEE Transactions on Neural Networks,12 (5),2001, pp.1047-1059.
    [106]El-Naqa I., Yang Y., Wernick M., et al., A Support Vector Machine Approach for Detection of Microcalcifications, IEEE Transactions on Medical Imaging,21 (12),2002, pp.1552-1563.
    [107]Fan R.-E., Chang K.-W., Hsieh C.-J., et al., LIBLINEAR:A Library for Large Linear Classification, Journal of Machine Learning Research,9 (Aug),2008, pp.1871-1874.
    [108]Hsieh C.-J., Chang K.-W., Lin C.-J., et al., A Dual Coordinate Descent Method for Large-scale Linear SVM, Journal of Machine Learning Research,9 (Aug),2008, pp.1871-1874.
    [109]Yu H.-F., Hsieh C.-J., Chang K.-W., et al., Large Linear Classification When Data Cannot Fit in Memory, ACM Transactions on Knowledge Discovery From Data,5 (4),2012, pp.23:1-23:23.
    [110]Burges C. J. C., A Tutorial on Support Vector Machines for Pattern Recognition, Data Mining and Knowledge Discovery,2 (2),1998, pp.121-167.
    [111]Leopold E. and Kindermann J., Text Categorization with Support Vector Machines: How to Represent Texts in Input Space?, Machine Learning,46,2002, pp.423-444.
    [112]Vapnik V. N. (translated by Xu J. H. and Zhang X. G.), Statistical Learning Theory, Publishing House of Electronics Industry,2009.
    [113]Tian Y. J., Support Vector Regression Machines and Their Applications [PhD Dissertation], China Agricultural University, Beijing,2005.
    [114]Boyd S. and Vandenberghe L., Convex Optimization,2nd Edition, Cambridge University Press, Cambridge, New York,2009.
    [115]Deng N. Y. and Tian Y. J., Support Vector Machines:Theory, Algorithms and Extensions, Science Press,2009.
    [116]Deng N. Y., Tian Y. J., and Zhang C. H., Support Vector Machines:Theory, Algorithms, and Extensions, Springer,2012.
    [117]Abe S., Support Vector Machines for Pattern Classification,2nd Edition, Springer, 2010.
    [118]Hsieh C.-J., Chang K.-W., Lin C.-J., et al., A Dual Coordinate Descent Method for Large-scale Linear SVM, In McCallum A. and Roweis S., editors, Proceedings of the 25th International Conference on Machine Learning (ICML'08), ACM, New York, NY, USA,2008, pp.408-415.
    [119]Lin C.-J., Weng R. C., and Keerthi S. S., Trust Region Newton Method for Large-Scale Logistic Regression, Journal of Machine Learning Research,9 (Aug),2008, pp. 627-650.
    [120]Bennett K. P., Combining Support Vector and Mathematical Programming Methods for Classification, In Scholkopf B., Burges C. J. C., and Smola A. J., editors, Advances in Kernel Methods:Support Vector Learning, MIT Press, Cambridge, MA,1999, pp.307-326.
    [121]Mayoraz E. and Alpaydin E., Support Vector Machines for Multi-class Classification, In Mira J. and Sanchez-Andres J. V., editors, Engineering Applications of Bio-Inspired Artificial Neural Networks—Proceedings of International Work-Conference on Artificial and Natural Neural Networks (IWANN'99), volume 1607, Lecture Notes in Computer Science,1999, pp.833-842.
    [122]Kikuchi T. and Abe S., Comparison between Error Correcting Output Codes and Fuzzy Support Vector Machines, Pattern Recognition Letters,26 (12),2005, pp.1937-1945.
    [123]Juneja A. and Espy-Wilson C., Speech Segmentation using Probabilistic Phonetic Feature Hierarchy and Support Vector Machines, In Wunsch D. C., Hasselmo M., Venayagamoorthy G. K., et al., editors, Proceedings of International Joint Conference on Neural Networks (IJCNN 2003), volume 1, Elsevier Science Ltd,2003, pp.675-679.
    [124]Schwenker F., Solving Multi-class Pattern Recognition Problems with Tree Structured Support Vector Machines, In Radig B. and Florczyk S., editors, Pattern Recognition: Proceedings of Twenty-Third DAGM Symposium, Springer-Verlag, Berlin, Germany, 2001, pp.283-290.
    [125]Fan R.-E., Chen P.-H., and Lin C.-J., Working Set Selection using Second Order Information for Training SVM, Journal of Machine Learning Research,6 (Dec),2005, pp.1889-1918.
    [126]Chang F., Guo C.-Y., Lin X.-R., et al., Tree Decomposition for Large-Scale SVM Problems, Journal of Machine Learning Research,11 (Oct),2010, pp.2935-2972.
    [127]Smola A. J. and Scholkopf B., A Tutorial on Support Vector Regression, Statistics and Computing,14 (3),2004, pp.199-222.
    [128]Cortes C. and Vapnik V., Support-Vector Networks, Machine Learning,20 (3),1995, pp.273-297.
    [129]Boser B. E., Guyon I., and Vapnik V., A Training Algorithm for Optimal Margin Classifier, In Haussler D., editor, Proceedings of the 5th annual ACM workshop on Computational Learning Theory, ACM, New York, NY, USA,1992, pp.144-152.
    [130]Platt J. C., Fast Training of Support Vector Machines using Sequential Minimal Optimization, In Scholkopf B., Burges C. J. C., and Smola A. J., editors, Advances in Kernel Methods:Support Vector Learning, MIT Press, Cambridge, MA, USA,1999, pp.185-208.
    [131]Osuna E., Freund R., and Girosi F., Improved Training Algorithm for Support Vector Machines, In Proceedings of the IEEE NNSP, IEEE, Piscataway, NJ, US,1997, pp.276-285.
    [132]Dong J. X., Krzyzak A., and Suen C. Y., Fast SVM Training Algorithm with Decomposition on Very Large Data Sets, IEEE Transactions on Pattern Analysis and Machine Intelligence,27 (4),2005, pp.603-618.
    [133]Zapien K., Fehr J., and Burkhardt H., Fast Support Vector Machine Classification using linear SVMs, In Proceedings of the 2006 International Conference on Pattern Recognition, Institute of Electrical and Electronics Engineers Inc., Piscataway, NJ, US, August 20-24,2006, pp.366-369.
    [134]Fehr J., Arreola K., and Burkhardt H., Fast Support Vector Machine Classification of Very Large Datasets, Data Analysis, Machine Learning and Applications, (I),2008, pp.11-18.
    [135]Lin H.-J. and Yeh J., A Hybrid Optimization Strategy for Simplifying The Solutions of Support Vector Machines, Pattern Recognition Letters,31 (7),2010, pp.563-571.
    [136]Li Y. J., Liu B., Yang X. W., et al., Multiconlitron:A General Piecewise Linear Classifier, IEEE Transactions on Neural Networks,22 (2),2011, pp.276-289.
    [137]Silva C., Lotric U., Ribeiro B., et al., Distributed Text Classification With an Ensemble Kernel-Based Learning Approach, IEEE Transactions on Systems, Man, and Cybernetics-Part C:Applications and Reviews,40 (3),2010, pp.287-297.
    [138]Yu H.-F., Hsieh C.-J., Chang K.-W., et al., Large Linear Classification When Data Cannot Fit in Memory, In Rao B., Krishnapuram B., Tomkins A., et al., editors, Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Dis-covery and Data Mining, ACM, New York, NY, USA,2010, pp.833-842.
    [139]Yu H.-F., Hsieh C.-J., Chang K.-W., et al., Large Linear Classification When Data Cannot Fit in Memory, In Walsh T., editor, Proceedings of the International Joint Conference on Artificial Intelligence, IJCAI/AAAI, NY, US,2011, pp.2777-2782.
    [140]Scholkopf B. and Smola A., Learning with Kernels:Support Vector Machines, Regu-larization, Optimization and Beyond, MIT Press, Cambridge, MA,2002.
    [141]Keerthi S. S., Chapelle O., and DeCoste D., Building Support Vector Machines with Reduced Classifier Complexity, Journal of Machine Learning Research,7 (Dec),2006, pp.1493-1515.
    [142]Burges C. J. C., Simplified Support Vector Decision Rules, In Saitta L., editor, Proceedings of the 13th International Conference on Machine Learning (ICML'96), Morgan Kaufmann, San Francisco,1996, pp.71-77.
    [143]Collobert R., Sinz F., Weston J., et al., Trading Convexity for Scalability, In Cohen W. W. and Moore A., editors, Proceedings of the 23rd International Conference on Machine Learning, ACM, June 25-29,2006, pp.201-208.
    [144]Wu M., Scholkopf B., and Bakir G., A Direct Method for Building Sparse Kernel Learning Algorithms, Journal of Machine Learning Research,7 (Dec),2006, pp.603-624.
    [145]Yu J., Vishwanathan S. V. N., Gunter S., et al., A quasi-Newton Approach to Nonsmooth Convex Optimization, In Cohen W. W., McCallum A., and Roweis S. T., editors, Proceedings of the 25th International Conference on Machine Learning, ACM, June 5-9,2008, pp.1216-1223.
    [146]Roux N. L. and Fitzgibbon A., A Fast Natural Newton Method, In Furnkranz J. and Joachims T., editors, Proceedings of the 27th International Conference on Machine Learning, Omnipress, June 21-24,2010, pp.623-630.
    [147]Bottou L. and Bousquet O., The Tradeoffs of Large Scale Learning, In Koller D., Schuurmans D., Bengio Y., et al., editors, Advances in Neural Information Processing Systems 20, MIT Press,2008, pp.161-168.
    [148]Khalil H. K., Nonlinear Systems,3rd Edition, Prentice Hall,2002.
    [149]Joachims T., Training Linear SVMs in Linear Time, In Eliassi-Rad T., Ungar L. H., Craven M., et al., editors, Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, ACM,2006, pp.217-226.
    [150]Teo C. H., Smola A., Vishwanathan S. V. N., et al., A Scalable Modular Convex Solver for Regularized Risk Minimization, In Berkhin P., Caruana R., and Wu X. D., editors, Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining, ACM, August 12-15,2007, pp.727-736.
    [151]Franc V. and Sonnenburg S., Optimized Cutting Plane Algorithm for Large-scale Risk Minimization, Journal of Machine Learning Research,10 (Dec),2009, pp.2157-2192.
    [152]Sonnenburg S. and Franc V., COFFIN:A Computational Framework for Linear SVMs, In Furnkranz J. and Joachims T., editors, Proceedings of the 27th International Conference on Machine Learning, Omnipress, June 21-24,2010, pp.999-1006.
    [153]Bennett K. and Demiriz A., Semi-supervised Support Vector Machines, In Kearns M. S., Solla S. A., and Cohn D. A., editors, Advances in Neural Information Processing Systems 11, MIT Press, Cambridge, MA,1998, pp.368-374.
    [154]Li Y.-F. and Zhou Z.-H., Improving Semi-supervised Support Vector Machines Through Unlabeled Instances Selection, In Burgard W. and Roth D., editors, Proceedings of the 25th AAAI Conference on Artificial Intelligence (AAAI'11), AAAI Press,2011, pp.386-391.
    [155]Suykens J. A. K. and Vandewalle J., Least Squares Support Vector Machine Classifiers, Neural Processing Letters,9 (3),1999, pp.293-300.
    [156]Suykens J. A. K. and De Brabanter J., Weighted Least Squares Support Vector Machines: Robustness and Sparse Approximation, Neurocomputing,48,2002, pp.85-105.
    [157]Abe S., Sparse Least Squares Support Vector Training in The Reduced Empirical Feature Space, Pattern Analysis and Applications,10 (3),2007, pp.203-214.
    [158]Abe S., Sparse Least Squares Support Vector Machines by Forward Selection based on Linear Discriminant Analysis, In Prevost L., Marinai S., and Schwenker F., editors, Proceedings of Third IAPR Workshop on Artificial Neural Networks in Pattern Recognition, Springer-Verlag, Berlin, Germany,2008, pp.54-65.
    [159]Iwamura K. and Abe S., Sparse Support Vector Machines Trained in The Reduced Empirical Feature Space, In Proceedings of the 2008 International Joint Conference on Neural Networks (IJCNN 2008), Springer-Verlag,2008, pp.2399-2405.
    [160]Scholkopf B., Smola A. J., Williamson R. C., et al., New Support Vector Algorithms, Neural Computation,12 (5),2000, pp.1207-1245.
    [161]Zhu J., Rosset S., Hastie T., et al.,1-norm Support Vector Machines, In Thrun S., Lawrence K. S., and Scholkopf B., editors, Advances in Neural Information Processing Systems 16, MIT Press,2003, pp.49-56.
    [162]Tipping M., Sparse Bayesian Learning and The Relevance Vector Machine, Journal of Machine Learning Research,1 (Sep),2001, pp.211-244.
    [163]Zhu J. and Hastie T., Kernel Logistic Regression and The Import Vector Machine, Journal of Computational and Graphical Statistics,14 (1),2005, pp.185-205.
    [164]Fung G. and Mangasarian O. L., Proximal Support Vector Machine Classifiers, In Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM,2001, pp.77-86.
    [165]Mangasarian O. L. and Wild E. W., Multisurface Proximal Support Vector Classification via Generalized Eigenvalues, IEEE Transactions on Pattern Analysis and Machine Intelligence,28 (1),2006, pp.69-74.
    [166]Jayadeva, Khemchandani R., and Chandra S., Twin Support Vector Machines for Pattern Classification, IEEE Transactions on Pattern Analysis and Machine Intelligence,29 (5),2007, pp.905-910.
    [167]Shao Y.-H., Zhang C.-H., Wang X.-B., et al., Improvements on Twin Support Vector Machines, IEEE Transactions on Neural Networks,22 (6),2011, pp.962-968.
    [168]Ghorai S., Mukherjee A., and Dutta P. K., Nonparallel Plane Proximal Classifier, Signal Processing,89 (4),2009, pp.510-522.
    [169]Bennett K. P. and Bredensteiner E., Duality and Geometry in SVM Classifiers, In Langley P., editor, Proceedings of the 17th International Conference on Machine Learning (ICML'00), Morgan Kaufmann,2000, pp.57-64.
    [170]Zhang J. P., Wang X. D., Kruger U., et al., Principal Curve Algorithms for Partitioning High-Dimensional Data Spaces, IEEE Transactions on Neural Networks,22 (3),2011, pp.367-380.
    [171]Segata N. and Blanzieri E., Fast and Scalable Local Kernel Machines, Journal of Machine Learning Research,11 (Aug),2010, pp.1883-1926.
    [172]Pujol O. and Masip D., Geometry-Based Ensembles:Toward a Structural Characterization of the Classification Boundary, IEEE Transactions on Pattern Analysis and Machine Intelligence,31 (6),2009, pp.1140-1146.
    [173]Chen H. H., Tino P., and Yao X., Probabilistic Classification Vector Machines, IEEE Transactions on Neural Networks,20 (6),2009, pp.901-914.
    [174]Belkin M., Niyogi P., and Sindhwani V., Manifold Regularization:A Geometric Framework for Learning from Examples, Univ. Chicago, Chicago, IL, Tech. Rep. TR-2004-06,2004.
    [175]Yeung D. S., Wang D. F., Ng W. W. Y., et al., Structured Large Margin Machines: Sensitive to Data Distributions, Machine Learning,68 (2),2007, pp.171-200.
    [176]Xue H., Chen S. C., and Yang Q., Structural Regularized Support Vector Machine: A Framework for Structural Large Margin Classifier, IEEE Transactions on Neural Networks,22 (4),2011, pp.573-587.
    [177]Li B., Chi M. M., Fan J. P., et al., Support Cluster Machine, In Ghahramani Z., editor, Proceedings of the 24th International Conference on Machine Learning (ICML), ACM, 2007, pp.505-512.
    [178]Platt J. C., Probabilistic Outputs for Support Vector Machines and Comparisons to Regularized Likelihood Methods, In Advances in Large Margin Classifiers, MIT Press,1999, pp.61-74.
    [179]Blanzieri E. and Melgani F., An Adaptive SVM Nearest Neighbor Classifier for Remotely Sensed Imagery, In IEEE International Conference on Geoscience and Remote Sensing Symposium (IGARSS'06), IEEE,2006, pp.3931-3934.
    [180]Blanzieri E. and Melgani F., Nearest Neighbor Classification of Remote Sensing Images with The Maximal Margin Principle, IEEE Transactions on Geoscience and Remote Sensing,46 (6),2008, pp.1804-1811.
    [181]Lin C.-F. and Wang S.-D., Fuzzy Support Vector Machines, IEEE Transactions on Neural Networks,13 (2),2002, pp.464-471.
    [182]Heo G. and Gader P., Fuzzy SVM for Noisy Data:A Robust Membership Calculation Method, In IEEE International Conference on Fuzzy Systems, IEEE,2009, pp.431-436.
    [183]Batuwita R. and Palade V., FSVM-CIL:Fuzzy Support Vector Machines for Class Imbalance Learning, IEEE Transactions on Fuzzy Systems,18 (3),2010, pp.558-571.
    [184]Cauwenberghs G. and Poggio T., Incremental and Decremental Support Vector Machine Learning, In Lafferty J. D., Williams C. K. I., Shawe-Taylor J., et al., editors, Advances in Neural Information Processing Systems 2000, Curran Associates, Inc.,2000, pp.409-416.
    [185]Li Z. Y., Zhang J. F., and Hu S. S., Incremental Support Vector Machine Algorithm based on Multi-kernel Learning, Journal of Systems Engineering and Electronics,22 (4),2011, pp.702-706.
    [186]Karasuyama M. and Takeuchi I., Multiple Incremental Decremental Learning of Support Vector Machines, IEEE Transactions on Neural Networks,21 (7),2010, pp.1048-1059.
    [187]Tang Y. C., Zhang Y.-Q., Chawla N. V., et al., SVMs Modeling for Highly Imbalanced Classification, IEEE Transactions on Systems Man and Cybernetics Part B-Cybernetics,39 (1),2009, pp.281-288.
    [188]Inoue T. and Abe S., Improvement of Generalization Ability of Multiclass Support Vector Machines by Introducing Fuzzy Logic and Bayes Theory, Transactions of the Institute of Systems, Control and Information Engineers,15 (12),2002, pp.643-651.
    [189]Lee D., Jung K.-H., and Lee J., Constructing Sparse Kernel Machines Using Attractors, IEEE Transactions on Neural Networks,20 (4),2009, pp.721-729.
    [190]Sahbi H., Audibert J.-Y., and Keriven R., Context-Dependent Kernels for Object Classification, IEEE Transactions on Pattern Analysis and Machine Intelligence,33 (4),2011, pp.699-708.
    [191]Ertekin S., Bottou L., and Giles C. L., Nonconvex Online Support Vector Machines, IEEE Transactions on Pattern Analysis and Machine Intelligence,33 (2),2011, pp.368-381.
    [192]Wang Z., Chen S. C., and Sun T. K., MultiK-MHKS:A Novel Multiple Kernel Learning Algorithm, IEEE Transactions on Pattern Analysis and Machine Intelligence,30 (2),2008, pp.348-353.
    [193]Sonnenburg S., Ratsch G., Schafer C., et al., Large Scale Multiple Kernel Learning, Journal of Machine Learning Research,7 (Dec),2006, pp.1531-1565.
    [194]Rakotomamonjy A., Bach F. R., Canu S., et al., SimpleMKL, Journal of Machine Learning Research,9 (Nov),2008, pp.2491-2521.
    [195]Yang H. Q., Xu Z. L., Ye J. P., et al., Efficient Sparse Generalized Multiple Kernel Learning, IEEE Transactions on Neural Networks,22 (3),2011, pp.433-446.
    [196]Sarinnapakorn K. and Kubat M., Combining Subclassifiers in Text Categorization:A DST-based Solution and A Case Study, IEEE Transactions on Knowledge and Data Engineering,19 (12),2007, pp.1638-1651.
    [197]Nanni L. and Lumini A., Ensemble of Multiple Pedestrian Representations, IEEE Transactions on Intelligent Transportation Systems,9 (2),2008, pp.365-369.
    [198]Zhang L. and Zhou W.-D., Sparse Ensembles Using Weighted Combination Methods based on Linear Programming, Pattern Recognition,44 (1),2011, pp.97-106.
    [199]Shi Z. W. and Min H., Support Vector Echo-State Machine for Chaotic Time-Series Prediction, IEEE Transactions on Neural Networks,18 (2),2007, pp.359-372.
    [200]Saha I., Maulik U., Bandyopadhyay S., et al., SVMeFC:SVM Ensemble Fuzzy Clustering for Satellite Image Segmentation, IEEE Geoscience and Remote Sensing Letters,9 (1),2012, pp.52-55.
    [201]Munoz A. and Gonzalez J., Representing Functional Data using Support Vector Machines, Pattern Recognition Letters,31 (6),2010, pp.511-516.
    [202]Law E., Settles B., and Mitchell T., Learning to Tag using Noisy Labels, In Balcazar J. L., Bonchi F., Gionis A., et al., editors, European Conference on Machine Learning and Knowledge Discovery in Databases, Springer,2010, pp.211-226.
    [203]Joseph J. F. C., Das A., Lee B.-S., et al., CARRADS:Cross layer based adaptive real-time routing attack detection system for MANETS, Computer Networks,54 (7), 2010, pp.1126-1141.
    [204]Tax D. M. J. and Duin R. P. W., Outliers and Data Descriptions, In Proceedings of the Seventh Annual Conference of the Advanced School for Computing and Imaging (ASCI 2001), Heijen, The Netherlands,2001, pp.234-241.
    [205]Tax D. M. J. and Juszczak P., Kernel Whitening for One-Class Classification, In Lee S.-W. and Verri A., editors, Proceedings of First International Workshop on Pattern Recognition with Support Vector Machines, Springer-Verlag, Berlin, Germany,2002, pp.40-52.
    [206]Tax D. M. J. and Duin R. P. W., Support Vector Domain Description, Machine Learning,54 (1),2004, pp.45-66.
    [207]Lee J. and Lee D., An Improved Cluster Labeling Method for Support Vector Clustering, IEEE Trans. Pattern Analysis and Machine Intelligence,27 (3),2005, pp.461-464.
    [208]Lee S.-H. and Daniels K. M., Gaussian Kernel Width Generator for Support Vector Clustering, In Leen T., Dietterich T., and Tresp V., editors, Proceedings of the International Conference on Bioinformatics and its Applications, World Scientific Pub Co Inc, Hackensack, USA,2005, pp.151-162.
    [209]Lee S.-H., Gaussian Kernel Width Selection and Fast Cluster Labeling for Support Vector Clustering, Department of Computer Science, University of Massachusetts Lowell, USA,2005.
    [210]Forsythe G. E., Malcolm M. A., and Moler C. B., Computer Methods for Mathematical Computations, Prentice Hall,1977.
    [211]Lee J. and Lee D., Dynamic Characterization of Cluster Structures for Robust and Inductive Support Vector Clustering, IEEE Trans. Pattern Analysis and Machine Intelligence,28 (11),2006, pp.1869-1874.
    [212]Lee D. and Lee J., Equilibrium-based Support Vector Machine for Semisupervised Classification, IEEE Transactions on Neural Networks,18 (2),2007, pp.578-583.
    [213]Elzinga D. J. and Hearn D. W., The Minimum Covering Sphere Problem, Management Science,19 (1),1972, pp.96-104.
    [214]Scholkopf B., Burges C., and Vapnik V., Extracting Support Data for A Given Task, In Fayyad U. M. and Uthurusamy R., editors, First International Conference on Knowledge Discovery & Data Mining, AAAI Press, August 20-21,1995, pp.252-257.
    [215]Guo C. H., Lu M. Y., Sun J. T., et al., A New Algorithm for Computing The Minimal Enclosing Sphere in Feature Space, Lecture Notes in Artificial Intelligence,3614 (2), 2005, pp.196-204.
    [216]Guo C. H. and Li F., An Improved Algorithm For Support Vector Clustering based on Maximum Entropy Principle and Kernel Matrix, Expert Systems with Applications,38 (7),2011, pp.8138-8143.
    [217]Chu C. S., Tsang I. W., and Kwok J. T., Scaling up Support Vector Data Description by Using Core-Sets, In Proceedings of the International Joint Conference on Neural Networks (IJCNN 2004), ACM Press,2004, pp.425-430.
    [218]Tsang I. W., Kwok J. T., and Cheung P.-M., Core Vector Machines:Fast SVM Training on Very Large Data Sets, Journal of Machine Learning Research,6 (Dec), 2005, pp.363-392.
    [219]Tsang I. W., Kwok J. T., and Zurada J. M., Generalized Core Vector Machines, IEEE Transactions on Neural Networks,17 (5),2006, pp.1126-1140.
    [220]Wang D., Zhang B., Zhang P., et al., An Online Core Vector Machine with Adaptive MEB Adjustment, Pattern Recognition,43 (10),2010, pp.3468-3482.
    [221]Wang J.-S. and Chiang J.-C., An Efficient Data Preprocessing Procedure for Support Vector Clustering, Journal of Universal Computer Science,15 (4),2009, pp.705-721.
    [222]Ling P., Zhou C.-G., and Zhou X., Improved Support Vector Clustering, Engineering Applications of Artificial Intelligence,23 (4),2010, pp.552-559.
    [223]Yang J. H., Estivill-Castro V., and Chalup S. K., Support Vector Clustering Through Proximity Graph Modelling, In Proceedings of the 9th International Conference on Neural Information Processing (ICONIP'02),2002, pp.898-903.
    [224]Nocedal J. and Wright S. J., Numerical Optimization,2nd Edition, Springer,2006.
    [225]Peng L., Yang B., Chen Y. H., et al., Data Gravitation based Classification, Information Sciences,179 (6),2009, pp.809-819.
    [226]Yin M. H., Hu Y. M., Yang F. Q., et al., A Novel Hybrid K-harmonic Means and Gravitational Search Algorithm Approach for Clustering, Expert Systems with Applications,38 (8),2011, pp.9319-9324.
    [227]Rashedi E., Nezamabadi-pour H., and Saryazdi S., Filter Modeling Using Gravitational Search Algorithm, Engineering Applications of Artificial Intelligence,24 (1),2011, pp.117-122.
    [228]Hsieh T. W., Taur J. S., Tao C. W., et al., A Kernel-Based Core Growing Clustering Method, International Journal of Intelligent Systems,24 (4),2009, pp.441-458.
    [229]Hocking T. D., Joulin A., Bach F., and Vert J.-P., Clusterpath:An Algorithm for Clustering using Convex Fusion Penalties, In Getoor L. and Scheffer T., editors, Proceedings of the 28th International Conference on Machine Learning (ICML), ACM, New York, NY, USA,2011, pp.745-752.
    [230]Lee S.-H. and Daniels K. M., Gaussian Kernel Width Generator for Support Vector Clustering, In Advances in Bioinformatics and Its Applications, Series in Mathemati-cal Biology and Medicine,2004, pp.151-162.
    [231]Lee S.-H. and Daniels K. M., Cone Cluster Labeling for Support Vector Clustering, In Ghosh J., Lambert D., Skillicorn D. B., et al., editors, Proceedings of 6th SIAM Conference on Data Mining, SIAM,2006, pp.484-488.
    [232]Yeh C.-Y., Huang C.-W., and Lee S.-J., Multi-Kernel Support Vector Clustering for Multi-Class Classification, International Journal of Innovative Computing Information and Control,6 (5),2010, pp.2245-2262.
    [233]Lee C.-H. and Yang H.-C., Construction of Supervised and Unsupervised Learning Systems for Multilingual Text Categorization, Expert Systems with Applications,36 (2, Part 1),2009, pp.2400-2410.
    [234]Wang J.-S. and Chiang J.-C., A Cluster Validity Measure With Outlier Detection for Support Vector Clustering, IEEE Transactions on Systems, Man, and Cybernetics, Part B:Cybernetics,38 (1),2008, pp.78-89.
    [235]Wang J.-S. and Chiang J.-C., A Cluster Validity Measure with A Hybrid Parameter Search Method for The Support Vector Clustering Algorithm, Pattern Recognition,41 (2),2008, pp.506-520.
    [236]Nath J. S. and Shevade S. K., An Efficient Clustering Scheme using Support Vector Methods, Pattern Recognition,39 (8),2006, pp.1473-1480.
    [237]Jarvis R. A. and Patrick E. A., Clustering Using a Similarity Measure Based on Shared Nearest Neighbors, IEEE Transactions on Computers,11 (C-22),1973, pp.1025-1034.
    [238]Ertoz L., Steinbach M., and Kumar V., Finding Clusters of Different Sizes, Shapes, and Densities in Noisy, High Dimensional Data, In Barbara D. and Kamath C., editors, Proceedings of the Third SIAM International Conference on Data Mining, SIAM, May 1-3,2003, pp.1-10.
    [239]Karypis G., Han E.-H., and Kumar V., Chameleon:Hierarchical Clustering Using Dynamic Modeling, Computer,32 (8),1999, pp.68-75.
    [240]Asuncion A. and Newman D. J., UCI Machine Learning Repository, University of California, Irvine, School of Information and Computer Sciences, Available on: http://www.ics.uci.edu/~mlearn/MLRepository.html,2007.
    [241]Zhu X. J. and Goldberg A. B., Introduction to Semi-Supervised Learning, Morgan & Claypool Publishers,2009.
    [242]Ben-Hur A., Horn D., Siegelmann H. T., et al., A Support Vector Clustering Method, In Proc. the 15th International Conference on Pattern Recognition, IEEE Computer Society, Sep.3-7,2000, pp.724-727.
    [243]Ban T. and Abe S., Spatially Chunking Support Vector Clustering Algorithm, In Proc. International Joint Conference on Neural Networks, July 25-29,2004, pp.413-418.
    [244]Puma-Villanueva W. J., Bezerra G. B., Lima C. A. M., et al., Improving Support Vector Clustering with Ensembles, In Proc. International Joint Conference on Neural Networks, Jul.31-Aug.4,2005, pp.13-15.
    [245]Wang F., Zhao B., and Zhang C. S., Linear Time Maximum Margin Clustering, IEEE Transactions on Neural Networks,21 (2),2010, pp.319-332.
    [246]Peng J. M., Mukherjee L., Singh V., et al., An Efficient Algorithm for Maximal Margin Clustering, Journal of Global Optimization,2,2011, pp.1-15, doi:10.1007/s10898-011-9691-4.
    [247]Russo V., State-of-the-art Clustering Techniques—Support Vector Methods and Minimum Bregman Information Principle [Dissertation], University of Naples, Italy,2006.
    [248]Li Y. H. and Maguire L., Selecting Critical Patterns based on Local Geometrical and Statistical Information, IEEE Transactions on Pattern Analysis and Machine Intelligence,33 (6),2011, pp.1189-1201.
    [249]Ping Y., Zhou Y. J., and Yang Y. X., A Novel Scheme for Accelerating Support Vector Clustering, Computing and Informatics,31 (2),2012, pp.1-24.
    [250]Kpotufe S. and Luxburg U. V., Pruning Nearest Neighbor Cluster Trees, In Getoor L. and Scheffer T., editors, Proceedings of the 28th International Conference on Machine Learning (ICML 2011), Omnipress, Jun 28-Jul 2,2011, pp.225-232.
    [251]Hubert L. and Arabie P., Comparing Partitions, Journal of Classification,2 (1),1985, pp.193-218.
    [252]Kim H. C. and Lee J., Clustering based on Gaussian Processes, Neural Computation, 19(11),2007, pp.3088-3107.
    [253]Camastra F. and Verri A., A Novel Kernel Method for Clustering, IEEE Transactions on Pattern Analysis and Machine Intelligence,27 (5),2005, pp.801-805.
    [254]Wu M. and Scholkopf B., A Local Learning Approach For Clustering, In Platt J. C., Koller D., Singer Y., et al., editors, Proc. the 20th Annual Conference on Advances in Neural Information Processing Systems (NIPS 2007), Curran Associates, Inc., Dec. 4-7,2006, pp.1529-1536.
    [255]Hao P.-Y., Chiang J.-H., and Lin Y.-H., A New Maximal-Margin Spherical-Structured Multi-class Support Vector Machine, Applied Intelligence,30 (2),2009, pp.98-111.
    [256]Marchiori E., Hit Miss Networks with Applications to Instance Selection, Journal of Machine Learning Research,9 (Jun),2008, pp.997-1017.
    [257]Angiulli F., Fast Nearest Neighbor Condensation for Large Data Sets Classification, IEEE Transactions on Knowledge and Data Engineering,19 (11),2007, pp.1450-1464.
    [258]Wilson D. R. and Martinez T. R., Reduction Techniques for Instance-Based Learning Algorithms, Machine Learning,38 (3),2000, pp.257-286.
    [259]Shin H. and Cho S., Neighborhood Property-based Pattern Selection for Support Vector Machines, Neural Computation,19 (3),2007, pp.816-855.
    [260]Li Y. H., Selecting Training Points for One-class Support Vector Machines, Pattern Recognition Letters,32 (11),2011, pp.1517-1522.
    [261]Mehrotra S., On The Implementation of A Primal-Dual Interior Point Method, SIAM Journal on Optimization,2,1992, pp.575-601.
    [262]Ping Y., Tian Y. J., Zhou Y. J., et al., Convex Decomposition based Cluster Labeling Method for Support Vector Clustering, Journal of Computer Science and Technology, 27 (2),2012, pp.428-442.
    [263]Lang K., NewsWeeder: Learning to Filter Netnews, In Prieditis A. and Russell S. J., editors, Proceedings of the 12th International Conference on Machine Learning (ICML'95), San Francisco, CA, USA: Morgan Kaufmann, Jul 9-12,1995, pp.331-339.
    [264]Su M. Y., Using Clustering to Improve the KNN-based Classifiers for Online Anomaly Network Traffic Identification, Journal of Network and Computer Applications,34 (2), 2011, pp.722-730.
    [265]Hurley J., Garcia-Palacios E., and Sezer S., Classifying Network Protocols: A 'Two-Way' Flow Approach, IET Communications,5 (1),2011, pp.79-89.
    [266]UNIBS, The UNIBS Anonymized 2009 Internet Traces, The Telecommunication Networks group @ UniBS, Mar 18,2010, URL http://www.ing.unibs.it/ntw/tools/traces
    [267]Peng J. F., Zhou Y. J., Wang C., et al., Early TCP Traffic Classification, Journal of Applied Sciences-Electronics and Information Engineering,29 (1),2011, pp.13-11.
    [268]Maron M. E. and Kuhns J. L., On Relevance, Probabilistic Indexing and Information Retrieval, Journal of the ACM,7 (3),1960, pp.216-244.
    [269]Cooper W., Getting Beyond Boole, Information Processing and Management,24 (3), 1988, pp.243-248.
    [270]Quan X. J., Liu W. Y., and Qiu B. T., Term Weighting Schemes for Question Categorization, IEEE Transactions on Pattern Analysis and Machine Intelligence,33 (5), 2011, pp.1009-1021.
    [271]Kim J. and Kim M. J., An Evaluation of Passage-based Text Categorization, Journal of Intelligent Information Systems,23 (1),2004, pp.47-65.
    [272]Hearst M. A., Multi-paragraph Segmentation of Expository Texts, In Pustejovsky J., editor, Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, Morgan Kaufmann, June 27-30,1994, pp.9-16.
    [273]Debole F. and Sebastiani F., Supervised Term Weighting for Automated Text Categorization, In Proceedings of the 20th Annual ACM Symposium on Applied Computing (SAC'03), ACM, New York, NY, USA, Mar 9-12,2003, pp.784-788.
    [274]Ping Y., Zhou Y. J., Yang Y. X., et al., A Novel Term Weighting Scheme with Distributional Coefficient for Text Categorization with Support Vector Machine, In Proceedings of the IEEE 2nd Youth Conference on Information, Computing and Telecommunications (YCICT'10), Piscataway, NJ, USA: IEEE, Nov 28-30,2010, pp. 182-185.
    [275]Isa D., Lee L. H., and Kallimani V. P., A Polychotomizer for Case-based Reasoning Beyond The Traditional Bayesian Classification Approach, Journal of Computer and Information Science,1(1),2008, pp.57-68.
    [276]Isa D., Lee L. H., Kallimani V. P., et al., Text Document Preprocessing with The Bayes Formula for Classification using The Support Vector Machine, IEEE Transactions on Knowledge and Data Engineering,20 (9),2008, pp.1264-1272.
    [277]Isa D., Kallimani V. P., and Lee L. H., Using The Self Organizing Map for Clustering of Text Documents, Expert Systems with Applications,36 (5),2009, pp.9584-9591.
    [278]Ko Y., Park J., and Seo J., Improving Text Categorization Using The Importance of Sentences, Information Processing and Management,40 (1),2004, pp.65-79.
    [279]Tseng C. Y., Sung P. C., and Chen M. S., Cosdes: A Collaborative Spam Detection System with A Novel E-mail Abstraction Scheme, IEEE Transactions on Knowledge and Data Engineering,23 (5),2011, pp.669-682.
    [280]Shanks V. and Williams H. E., Fast Categorisation of Large Document Collections, In Proceedings of the Eighth International Symposium on String Processing and Information Retrieval (SPIRE'01),2001, pp.194-204.
    [281]Wibowo W. and Williams H. E., Simple and Accurate Feature Selection for Hierarchical Categorisation, In Proceedings of the Eleventh International Conference on Information and Knowledge Management (CIKM), ACM Press, New York, NY,2002, pp.111-118.
    [282]Ping Y., Zhou Y. J., Li H. N., et al., Efficient Text Representation via Weighted Co-Contributions of Terms on Class Tendency, ICIC Express Letters,5 (12),2011, pp. 4329-4336.
    [283]Ping Y., Zhou Y. J., Li H. N., et al., Efficient Representation of Text with Multiple Perspectives, Journal of China Universities of Posts and Telecommunications,19 (1), 2012, pp.101-111.
    [284]Callan J. P., Passage Retrieval Evidence in Document Retrieval, In Proceedings of the 17th Annual International ACM/SIGIR Conference on Research and Development in Information Retrieval, ACM Press, New York, NY,1994, pp.302-310.
    [285]Lertnattee V. and Theeramunkong T., Effect of Term Distributions on Centroid-based Text Categorization, Information Sciences,158 (1),2004, pp.89-115.
    [286]Guan H., Zhou J. Y., and Guo M. Y., A Class-Feature-Centroid Classifier for Text Categorization, In Proceedings of the 18th International Conference on World Wide Web (WWW 09), ACM Press, New York, NY, USA, Apr 20-24,2009, pp.201-210.
    [287]Soucy P. and Mineau G. W., Beyond TFIDF Weighting for Text Categorization in The Vector Space Model, In Proceedings of the 19th International Joint Conference on Artificial Intelligence (IJCAI'05), AAAI Press, Menlo Park, CA, USA, Jul 30-Aug 5, 2005, pp.1130-1135.
    [288]Dietterich T. G., Approximate Statistical Tests for Comparing Supervised Classification Learning Algorithms, Neural Computation,10 (7),1998, pp.1895-1923.
    [289]Lewis D. D., Reuters-21578 Text Categorization Collection, Available on: http://kdd.ics.uci.edu/databases/reuters21578/,1999.
    [290]Craven M., DiPasquo D., Freitag D., et al., Learning to Extract Symbolic Knowledge From The World Wide Web, In Mostow J. and Rich C., editors, Proceedings of the 15th National Conference for Artificial Intelligence (AAAI'98), Cambridge, MA, USA: MIT Press, Jul 26-30,1998, pp.509-516.
    [291]Hersh W., Buckley C., Leone T., et al., OHSUMED: An Interactive Retrieval Evaluation and New Large Test Collection for Research, In Croft W. B. and Rijsbergen C. J. v., editors, Proceedings of the 17th Annual ACM SIGIR Conference, ACM/Springer, 1994, pp.192-201.
    [292]Porter M., An Algorithm for Suffix Stripping, Program,14 (3),1980, pp.130-137.
    [293]Zhang Y. W. and Ghaoui L. E., Large-Scale Sparse Principal Component Analysis with Application to Text Data, In Shawe-Taylor J., Zemel R., Bartlett P., et al., editors, Advances in Neural Information Processing Systems 24, Neural Information Processing System Foundation, December 12,2011, pp.532-539.
    [294]Kalil T., Big Data is a Big Deal, Office of Science and Technology Policy,2012, URL http://www.whitehouse.gov/blog/2012/03/29/big-data-big-deal.