Clustering and Classification Algorithms over Data Streams
Abstract
In modern society, more and more data arises in the form of streams. Data streams differ from traditional static data in two respects: they grow without bound, and the concepts they carry evolve over time. These characteristics make many data mining algorithms designed for the static data model inapplicable, so data mining algorithms tailored to streams have become an important research direction. This thesis studies clustering and classification over evolving data streams. Its main contributions are as follows:
     1. A clustering algorithm for data streams with mixed attributes. The algorithm models the arrival of stream samples as a Poisson process, treats the continuous and categorical attributes of the samples jointly, and on that basis defines a distance between samples under mixed attributes. Using this definition, a two-phase stream clustering algorithm with an online and an offline stage is implemented; one plausible form of the mixed distance is sketched after this list.
     2. A generative-model algorithm for converting Support Vector Machine (SVM) outputs into probabilities. The algorithm fits a univariate normal distribution to the class-conditional density of the raw SVM outputs and, based on this model, adjusts the classifier's outputs on the test set of a batch classification problem to compensate for the mismatch between the class priors of the training and test sets. Experiments show that the algorithm is better suited to classifier output adjustment than the existing classic algorithm; see the second sketch below.
     3. A classifier output adjustment algorithm for data streams whose class priors evolve. The algorithm forecasts the class priors along the stream using exponential smoothing and an AR model from time series analysis, and adjusts the classifier's outputs according to the forecasts. Experiments show that it handles this special form of concept drift well. In addition, an improved prior forecasting algorithm is proposed for periodic class prior evolution and successfully applied to vehicle classification in a smart video traffic surveillance system; the adjustment rule is sketched below.
     4. An incremental updating algorithm for linear classifiers aimed at the general concept drift problem. For the logistic regression model, under a self-training framework, a second-order Taylor expansion is used to approximate the log conditional likelihood of the stream, so that the approximated log conditional likelihood can be updated incrementally and the classifier parameters re-solved on that basis. Compared with self-training based on gradient descent that optimizes the log conditional likelihood directly, the proposed algorithm is more robust when handling sophisticated concept drift; see the final sketch below.
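A minimal sketch of the mixed-attribute distance in contribution 1, assuming a k-prototypes-style form: Euclidean distance on the continuous attributes plus a weighted mismatch count on the categorical ones. The function name and the balancing weight gamma are illustrative assumptions, not details taken from the thesis.

    import numpy as np

    # Assumed form: Euclidean distance on the continuous part plus gamma
    # times the number of mismatched categorical attributes.
    def mixed_distance(x_num, x_cat, y_num, y_cat, gamma=1.0):
        d_num = np.linalg.norm(np.asarray(x_num) - np.asarray(y_num))
        d_cat = sum(a != b for a, b in zip(x_cat, y_cat))
        return d_num + gamma * d_cat

    # Two samples with two continuous and two categorical attributes.
    print(mixed_distance([1.0, 2.0], ["red", "A"], [1.5, 2.5], ["red", "B"]))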
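For contribution 2, the sketch below fits one univariate normal per class to the raw SVM decision values on the training set and combines the densities with (possibly new) class priors via Bayes' rule. All function names and the toy data are assumptions for illustration.

    import numpy as np
    from scipy.stats import norm

    # Fit a normal distribution to the unthresholded SVM scores of each class.
    def fit_score_model(scores, labels):
        params = {}
        for c in np.unique(labels):
            s = scores[labels == c]
            params[c] = (s.mean(), s.std(ddof=1))
        return params

    # Posterior P(c | score) under the fitted densities and supplied priors;
    # passing the test-set priors realizes the output adjustment.
    def posterior(score, params, priors):
        like = {c: priors[c] * norm.pdf(score, mu, sd)
                for c, (mu, sd) in params.items()}
        z = sum(like.values())
        return {c: v / z for c, v in like.items()}

    scores = np.array([-1.2, -0.8, 0.9, 1.4])
    labels = np.array([0, 0, 1, 1])
    model = fit_score_model(scores, labels)
    print(posterior(0.2, model, {0: 0.7, 1: 0.3}))  # adjusted prior on the test set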
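For contribution 3, the sketch below forecasts the positive-class prior with simple exponential smoothing (the AR model part is omitted) and then rescales a binary classifier's posterior by the ratio of new to old priors, the standard prior-correction rule. The smoothing constant and all variable names are assumptions.

    # Simple exponential smoothing over the observed class rates of recent
    # blocks; the final level serves as the one-step-ahead prior forecast.
    def smooth_priors(observed, alpha=0.3):
        level = observed[0]
        for p in observed[1:]:
            level = alpha * p + (1 - alpha) * level
        return level

    # Reweight P(c|x) by the ratio of new to old priors, then renormalize.
    def adjust_posterior(p_old, prior_old, prior_new):
        w_pos = p_old * prior_new / prior_old
        w_neg = (1 - p_old) * (1 - prior_new) / (1 - prior_old)
        return w_pos / (w_pos + w_neg)

    history = [0.50, 0.46, 0.41, 0.38]   # positive-class rate per recent block
    forecast = smooth_priors(history)
    print(adjust_posterior(0.62, prior_old=0.50, prior_new=forecast))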
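For contribution 4, the sketch below keeps the past data's log conditional likelihood as a quadratic in the parameters (a running gradient and negated Hessian), folds each new batch in, and re-solves with a Newton step. It is a plausible reconstruction of the second-order Taylor idea, not the thesis's exact update rule; under self-training the batch labels would be the classifier's own predictions, but here they are given explicitly for clarity.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    class IncrementalLogReg:
        def __init__(self, dim, reg=1e-2):
            self.w = np.zeros(dim)
            self.H = reg * np.eye(dim)   # carried-over curvature (plus a ridge term)
            self.g = np.zeros(dim)       # carried-over gradient

        def update(self, X, y):          # y in {0, 1}
            p = sigmoid(X @ self.w)
            self.g += X.T @ (y - p)                       # batch gradient
            self.H += X.T @ (X * (p * (1 - p))[:, None])  # batch negated Hessian
            step = np.linalg.solve(self.H, self.g)        # Newton step
            self.w += step
            self.g -= self.H @ step      # re-center the quadratic at the new w

    X = np.array([[1.0, 0.2], [0.9, -0.1], [-1.1, 0.4], [-0.8, 0.1]])
    y = np.array([1, 1, 0, 0])
    clf = IncrementalLogReg(dim=2)
    clf.update(X, y)
    print(sigmoid(X @ clf.w))            # updated posteriors on the batch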