Knowledge Discovery in Time Series

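English title: Knowledge Discovery in Time Series
Subtitle: Research on Classification and Clustering Methods Based on Frequent Pattern Discovery
English subtitle: Researches on Classification and Clustering Based on Frequent Pattern Discovery
Author: Wan Li (万里)
Thesis level: Doctoral
Discipline: Computer Science and Technology
Keywords: Time series; Data mining; Frequent pattern discovery; Complex network; Community identification
Degree year: 2009
Advisor: Liao Jianxin (廖建新)
Discipline code: 081202
Degree-granting institution: Beijing University of Posts and Telecommunications (北京邮电大学)
Thesis submission date: 2009-08-16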

Abstract

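With the rapid development of information technologies such as databases, the Internet, and telecommunications, time series data has become pervasive across production and daily life (for example in telecom operations, financial markets, industrial processes, scientific experiments, medicine, meteorology, and bioinformatics), and the volume stored is growing explosively. Discovering, from massive time series data, patterns, information, and knowledge that support decision making and that were previously unknown or hard to obtain is today's most pressing need and the core problem of time series data mining. Research in this area is still at an early stage: many problems remain highly challenging, and many mining algorithms still need to be extended and refined.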
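     This thesis surveys the state of research in time series data mining from five perspectives (prediction, classification, clustering, search, and frequent pattern discovery), summarizing and evaluating the main methods in each direction, and then studies two topics in depth: frequent pattern discovery and community identification in dynamic complex networks. It concludes by summarizing the work and pointing out problems that call for further study. The main original contributions are as follows: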
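     (1) A frequent pattern mining algorithm, FPM (Frequent Pattern Mining), is proposed. It takes into account both the number of occurrences of a frequent pattern and its distribution over the time series, so it can mine both "anomalous" event sequences that occur frequently only within a particular time segment and "sequential" events that occur frequently across the whole series. Based on these differently distributed frequent patterns, the MAMC (Mixed memory Aggregation Markov Chain) model is extended to the FMAMC (Frequent pattern based Mixed memory Aggregation Markov Chain) model, which describes the temporal correlations among different types of frequent patterns in a time series. Experiments show that FPM runs faster than PrefixSpan (Prefix-projected Sequential pattern mining) and WinMiner, and that FMAMC predicts events in time series more accurately than MAMC.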
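     (2) A frequent-pattern-based framework for time series classification is proposed, with three stages: feature extraction, feature selection, and classifier training. First, the proposed MNOE (Mining Non-Overlap Episode) algorithm mines non-overlapping frequent patterns from the time series, and the EGMAMC (Episode Generated Mixed memory Aggregation Markov Chain) model, built on these patterns, describes the series. Then, using the likelihood ratio test, the relationship between the number of occurrences of a frequent pattern and whether the EGMAMC model describes the series significantly is derived theoretically. Finally, using the definition of information gain, frequent patterns that describe the series significantly and whose information gain exceeds a given threshold are selected as features and fed to conventional classification algorithms to train the classifier. Experiments show that classification with frequent patterns selected as features outperforms classification without them.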
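     (3) The concept of a feature flow is proposed to describe how instances of a frequent pattern are distributed over a time series. According to the spectral characteristics of their feature flows, frequent patterns are divided into three categories, each corresponding to a different type of event hidden in the series. The EDPA (Event Detection by Pattern Aggregation) algorithm is proposed to aggregate closely correlated frequent patterns, so that the frequent patterns in each cluster constitute one event. Experiments show that event detection is more accurate when significant non-overlapping frequent patterns are fed to EDPA than when other types of frequent patterns are used.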
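     (4) A family of static and dynamic community identification methods for complex networks, based on maximal cliques, is proposed. First, CLIM (CLIque Mining), an algorithm for mining maximal cliques in complex networks, is proposed; its pruning strategy exploits the high clustering coefficients of vertices in complex networks. Experiments show that CLIM finds maximal cliques markedly faster than the Improved BK (Bron-Kerbosch) algorithm on large complex networks and on random networks with high vertex clustering coefficients. A community is defined as a community core, built from maximal cliques and their overlaps, together with peripheral vertices. An overlapping community identification algorithm, CDPM (Clique Directed Percolation Method), is then proposed. CDPM measures the quality of a community partition by the author's Structure Silhouette Coefficient (SSC): the larger the SSC, the better the partition, and the algorithm outputs the partition that maximizes SSC. Experiments show that CDPM identifies communities more accurately than CPM (Clique Percolation Method) and GN (Girvan-Newman), with higher F-measure and VAD (Vertex Average Degree) values. Taking both the link structure and the vertex attribute information of a complex network into account, JCCM (Joint Clustering Coefficient Method), a community identification algorithm for informative graphs, is proposed. It first uses a heuristic to compute community cores formed by overlapping maximal cliques, then uses the proposed JCC (Joint Clustering Coefficient) as the objective function to aggregate community cores and peripheral vertices, applying different distance functions to measure the distance from a peripheral vertex to core vertices of different standing. Experiments show that JCCM identifies communities in informative graphs better than algorithms that consider only network structure or only vertex attributes. Building on the static algorithms, evolution relationships between static communities at adjacent time steps are defined, and a new dynamic community identification algorithm, DCI (Dynamic Community Identification), is proposed. DCI first partitions each static network into communities using a proposed algorithm based on the Minimum Description Length (MDL) principle. It then abstracts the static communities at adjacent time steps, and the overlap among their members, as a bipartite graph, which turns community evolution into a bipartite graph partitioning problem; an MDL-based algorithm is proposed to partition that graph. Experiments show that DCI identifies dynamic communities in dynamic networks more accurately than GS (GraphScope) and GC (GroupColoring), with good running time.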
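     (5) The propagation of information through a social network is abstracted as a semi-Markov process in which vertices appear in turn according to given transition probabilities. The concept of an "idle state" (literally "empty state") is introduced to extend the multi-dimensional semi-Markov process, and a social network information flow model is built on it. Members of the social network correspond one-to-one to states in the model's state space; the probability of information exchange between members is the transition probability between states, whose magnitude is determined by the members' own characteristics and by the social network substructures they belong to. A reward function based on social network structure is proposed, and a reward semi-Markov model is constructed to compute user value. On the basis of the information flow model and the reward semi-Markov model, and considering jointly the influence of the social network and of users' own preferences on resource selection, the collaborative filtering algorithm SMRR (Semi-Markov and Reward Renewal) is proposed. Experiments show that SMRR predicts more accurately than the traditional cosine algorithm and the RIF (Rate Information Flow) algorithm.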
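Contribution (2) rests on two standard building blocks: counting non-overlapping occurrences of a pattern in an event sequence, and ranking patterns by information gain. The abstract does not spell out MNOE or its selection procedure, so the following is only a minimal sketch of those two generic ideas, not the thesis's algorithm. The function names, the toy series, and the `MIN_COUNT` threshold are all assumptions made for illustration.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def count_non_overlapping(events: Sequence[Hashable], episode: Sequence[Hashable]) -> int:
    """Count non-overlapping occurrences of a serial episode.

    An occurrence matches the episode symbols in order, with gaps allowed.
    The greedy scan starts looking for a new occurrence only after the
    previous one has completed, so counted occurrences never overlap.
    """
    if not episode:
        return 0
    count, pos = 0, 0  # pos: index of the episode symbol awaited next
    for event in events:
        if event == episode[pos]:
            pos += 1
            if pos == len(episode):  # one full occurrence completed
                count += 1
                pos = 0
    return count


def entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(has_pattern: Sequence[bool], labels: Sequence[Hashable]) -> float:
    """Information gain of the binary feature 'series contains the pattern'."""
    n = len(labels)
    gain = entropy(labels)
    for flag in (True, False):
        subset = [y for h, y in zip(has_pattern, labels) if h == flag]
        if subset:
            gain -= len(subset) / n * entropy(subset)
    return gain


# Hypothetical toy data: three symbolised series with class labels.
series = {"s1": "ABCABAACB", "s2": "ABAB", "s3": "CCCC"}
labels = ["x", "x", "y"]
MIN_COUNT = 2  # assumed frequency threshold, not taken from the thesis
has_ab = [count_non_overlapping(s, "AB") >= MIN_COUNT for s in series.values()]
print(has_ab, round(information_gain(has_ab, labels), 3))
```

In a pipeline of the kind the abstract describes, patterns whose gain exceeds a chosen threshold would then become the feature columns passed to an ordinary classifier. The significance test based on the likelihood ratio is not sketched here because the abstract gives no formula for it.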
With the rapid development of database, Internet, and telecommunication technologies, time series data now appears widely in real-world settings such as telecommunication services, financial markets, manufacturing processes, scientific experiments, medical care, meteorology, and bioinformatics, and its volume is huge and growing explosively. Helping users discover hidden patterns, information, and knowledge in large time series data to support decision making is a pressing demand and the core research topic of time series data mining. Research on time series data mining has only recently begun: many topics remain challenging, and many algorithms need to be extended and refined.
     This thesis reviews time series data mining from five aspects: prediction, classification, clustering, search, and frequent pattern discovery. The main methods in each area are summarized and evaluated, and the thesis then focuses on frequent pattern discovery and on community identification in dynamic complex networks. Finally, building on its conclusions, it proposes directions for future research. The contributions are as follows:
     (1) We propose a frequent pattern discovery algorithm, FPM (Frequent Pattern Mining), which considers not only the supports but also the distributions of frequent patterns. It can discover both "exceptions" that appear frequently within a particular time segment and sequential events that appear frequently across the whole time series. Based on these frequent patterns with different distributions, we extend the MAMC (Mixed memory Aggregation Markov Chain) model to the FMAMC (Frequent pattern based Mixed memory Aggregation Markov Chain) model, which describes the temporal correlations among different types of frequent patterns in time series. Experiments demonstrate that FPM runs faster than the PrefixSpan (Prefix-projected Sequential pattern mining) and WinMiner algorithms, and that FMAMC predicts time series more accurately than MAMC.
     (2) We propose a frequent-pattern-based time series classification framework with three phases: feature extraction, feature selection, and classification. The proposed MNOE (Mining Non-Overlap Episode) algorithm extracts non-overlapping frequent patterns from time series, and the EGMAMC (Episode Generated Mixed memory Aggregation Markov Chain) model, built on these non-overlapping episodes, describes the time series. Then, using the likelihood ratio test, we derive the support that guarantees the significance of a non-overlapping episode. Finally, according to the definition of entropy, we select the significant non-overlapping episodes whose distribution entropy exceeds a specified threshold to train the classification model. Experiments demonstrate that selecting frequent patterns as features improves classification models effectively.
     (3) We propose the concept of a feature flow to describe the distribution of frequent patterns in time series. We categorize frequent patterns into three types according to the spectra of their feature flows and map the different types of frequent patterns to different types of events hidden in the time series. We propose the EDPA (Event Detection by Pattern Aggregation) algorithm to aggregate tightly correlated frequent patterns, so that each cluster of frequent patterns represents an event. Experiments demonstrate that feeding maximal significant non-overlapping frequent patterns into EDPA yields more accurate event detection than selecting other types of frequent patterns.
     (4) We propose a family of static community identification algorithms based on maximal cliques, together with a dynamic community identification algorithm. First, we propose CLIM (CLIque Mining), an algorithm for discovering maximal cliques in complex networks, whose pruning strategy rests on the observation that most vertices in a complex network have a large clustering coefficient. Experiments demonstrate that CLIM performs better than the Improved BK (Bron-Kerbosch) algorithm on complex networks and on random graphs with high clustering coefficients. Based on the maximal cliques found by CLIM, we define a community as a community core plus peripheral vertices and propose CDPM (Clique Directed Percolation Method), a community identification algorithm based on overlapping maximal cliques. CDPM uses the proposed SSC (Structure Silhouette Coefficient) as its objective function: the larger the SSC, the better the corresponding community identification. Experiments demonstrate that CDPM performs better than the CPM (Clique Percolation Method) and GN (Girvan-Newman) algorithms as measured by F-measure and VAD (Vertex Average Degree). Considering both the link structure and the attributes attached to vertices, we propose JCCM (Joint Clustering Coefficient Method) to identify communities in informative graphs. It first uses a heuristic to compute community cores formed by overlapping maximal cliques, then uses JCC (Joint Clustering Coefficient) as the objective function to aggregate community cores and peripheral vertices, with different distance functions measuring the distance between vertices in different positions. Experiments demonstrate that JCCM performs better than existing algorithms that do not consider structure information and vertex attributes jointly.
Building on the static community identification algorithms, and by defining evolution relationships among static communities at adjacent time points, we propose a dynamic community identification algorithm, DCI (Dynamic Community Identification). DCI uses an MDL (Minimum Description Length) based method to identify communities in static complex networks. It then abstracts the communities in adjacent networks and their overlapping relationships as bipartite graphs, reducing the evolution of communities across adjacent static networks to a bipartite graph partitioning problem; an MDL-based method is also proposed to partition the bipartite graphs. Experiments demonstrate that DCI identifies communities more accurately and runs faster than the GS (GraphScope) and GC (GroupColoring) algorithms.
     (5) We abstract the spread of information in a social network as a semi-Markov process in which vertices appear according to transition probabilities. We introduce an "idle state" to extend the multi-dimensional semi-Markov process and build a social network information flow model on it. Each state in the model's state space represents an actor in the social network, and a transition probability is the probability that two actors exchange information; it is determined by the characteristics of the actors and by the substructures of the social network in which they are involved. We propose a reward function based on social network structure and construct a reward semi-Markov model to calculate customer value. Based on the information flow model and the reward model, we propose a collaborative filtering algorithm, SMRR (Semi-Markov and Reward Renewal), that jointly considers actors' personalized behavior patterns and the influence of interacting social actors. Experiments demonstrate that SMRR achieves higher accuracy than the traditional cosine algorithm and the RIF (Rate Information Flow) algorithm.
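Contribution (4) builds communities from maximal cliques and compares CLIM against an improved Bron-Kerbosch baseline, but the abstract gives neither algorithm. As background only, the sketch below shows the textbook Bron-Kerbosch enumeration with pivoting. It is not CLIM, and it may differ from the "Improved BK" variant used in the thesis's experiments. The function name `maximal_cliques` and the adjacency-dictionary representation are assumptions; the graph is taken to be undirected, with symmetric neighbour sets and no self-loops.

```python
from typing import Dict, FrozenSet, Hashable, Iterator, Set

Graph = Dict[Hashable, Set[Hashable]]


def maximal_cliques(graph: Graph) -> Iterator[FrozenSet[Hashable]]:
    """Yield every maximal clique of an undirected graph (Bron-Kerbosch with pivoting)."""

    def expand(r: Set[Hashable], p: Set[Hashable], x: Set[Hashable]):
        # r: current clique, p: candidates that can extend it,
        # x: vertices already explored (used to reject non-maximal cliques)
        if not p and not x:
            yield frozenset(r)
            return
        # Pivot on the vertex covering the most candidates; neighbours of
        # the pivot can be skipped at this level without missing a clique.
        pivot = max(p | x, key=lambda v: len(graph[v] & p))
        for v in list(p - graph[pivot]):
            yield from expand(r | {v}, p & graph[v], x & graph[v])
            p.remove(v)
            x.add(v)

    if graph:  # an empty graph has no cliques to report
        yield from expand(set(), set(graph), set())


# Hypothetical toy graph: a triangle 1-2-3, a pendant edge 3-4, and an isolated vertex 5.
toy = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}, 5: set()}
cliques = set(maximal_cliques(toy))
print(sorted(sorted(c) for c in cliques))
```

Clique-based community methods such as CPM then group cliques that share enough vertices. A pruning strategy driven by clustering coefficients, as the abstract describes for CLIM, would replace or augment the pivot rule shown here.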

引文

[1]朱明.数据挖掘[M].合肥：中国科技大学出版社,2002：35-36.
    [2]曲文龙.复杂时间序列知识发现模型与算法研究[D].北京：北京科技大学,2006.
    [3]Han J, Kamber M. Data Mining:Concepts and Techniques[M]. San Fransisco:Morgan Kaufinann Publishers,2001:30-33.
    [4]安鸿志,陈兆国,杜金观,等.时间序列的分析与应用[M].北京：科学出版社,1983：8-10.
    [5]Jenkins G M, Reinsel G C.时间序列分析：预测与控制[M].北京：中国统计出版社,1997：125-131.
    [6]Bozkaya T, Yazdani N, Ozsoyoglu Z M. Matching and indexing sequences of different lengths[C]. Proceeding of the 6th International Conference on Information and Knowledge Management. New York, NY. USA:ACM,1997:128-135.
    [7]肖辉.时间序列的相似性查询与异常检测[D].上海：复旦大学,2005.
    [8]Morchen F. Unsupervised pattern mining from symbolic temporal data[J]. SIGKDD Explorations,2007,9(1):41-55.
    [9]Yule G. On a method of investigating periodicity in distributed series with special reference to Wolfer's sunspot numbers[J]. Philos. Trans. R. Soc. London A226,1927,267-298.
    [10]Box G E P, Jenkins G M, Reinsel G C. Time series analysis:Forecasting and control[M].3rd edition. New Jersey, USA:Prentice Hall,1994:147-156.
    [11]Chatfield C. The analysis of time series[M].5th edition, New York, NY, USA:Chapman and Hall,1996:91-102.
    [12]Hastie T, Tibshirani R, Friedman J. The elements of statistical learning:Data mining, inference and prediction[M]. New York, NY, USA:Springer-Verlag,2001:12-22.
    [13]Sutton R S. Learning to predict by method of temporal differences[J]. Machine Learning, 1988,3(1):9-44.
    [14]Wan E A. Temporal backpropagation for FIR neural networks[C]. Proceeding of International Joint Conference on Neural Networks. Hillsdale, NJ. USA:Lawrence Erlbaum Associates,1990,1:575-580
    [15]Haykin S. Neural networks:A comprehensive foundation[M]. New York, NY, USA: Macmillan,1992:33-47.
    [16]Koskela T, Lehtokangas M, Saarinen J, et al. Time series prediction with multilayer perceptron, FIR and Elman neural networks[C]. Proceeding of World Congress on Neural Networks. Hillsdale, NJ. USA:Lawrence Erlbaum Associates,1996:491-496.
    [17]Dietterich T G, Michalski R S. Discovering patterns in sequences of events[J]. Artificial Intelligence.1985,25:187-232
    [18]Juang B H, Rabiner L. Fundamentals of speech recognition[M]. Englewood Cliffs, NJ: Prentice Hall,1993:50-78.
    [19]O'Shaughnessy D. Speech communications:Human and machine[M]. Piscataway, NJ:IEEE Press,2003:21-34.
    [20]Gold B, Morgan N. Speech and audio signal processing:Processing and perception of speech and music[M]. New York:John Wiley & Sons,2000:78-99.
    [21]Darrell T. Pentland A. Space-time gestures[C]. Proceeding of 1993 IEEE Comput. Soc. Conf. on Computer Vision and Pattern Recognition (CVPR'93). Washington, DC:Optical Society of America,1993:335-340.
    [22]Yamato J, Ohya J, Ishii K Recognizing human action in time-sequential images using Hidden Markov Model[C]. Proceeding of 1992 IEEE Comput. Soc. Conf. on Computer Vision and Pattern Recognition (CVPR'92), Champaign, IL:Optical Society of America,1992: 379-385.
    [23]Roverso D. Multivariate temporal classification by windowed wavelet decomposition and recurrent neural networks[C]. Proceeding of the 3rd ANS International Topical Meeting on Nuclear Plant Instrumentation, Control and Human Machine Interface Technologies, NPIC and HMIT 2000. ANS,2000:219-228.
    [24]Kadous W. Learning comprehensible descriptions of multivariate time series[C]. Proceedings of the 16th International Conference on Machine Learning. San Fransisco:Morgan Kaufmann Publishers,1999:454-463
    [25]Kadous W. Temporal Classification:Extending the Classification Paradigm to Multivariate Time Series[D]. Australia:University of New South Wales,2003.
    [26]Kadous W, Sammut C. Constructive induction for classifying time series[C]. Proceedings of the 15th European Conference on Machine Learning (ECML'04). Berlin Heidelberg:Springer, 2004:192-204.
    [27]Kadous W, Sammut C. Classification of multivariate time series and structured data using constructive induction[J]. Machine Learning,2005,58:179-216.
    [28]Corpet F. Multiple sequence alignment with hierarchical clustering[J]. Nucleic Acids Research,1988,16:10881-10890
    [29]Miller R T, Christoffels A G, Gopalakrishnan C, et al. A comprehensive approach to clustering of expressed human gene sequence:The sequence tag alignment and consensus knowledge base[J]. Genome Res.,1999,9:1143-1155
    [30]Osata N, Itoh M, konno H, et al. A computer-based method of selecting clones for a full-length cDNA project:Simultaneous collection of negligibly redundant and variant cDNAs[J]. Genome Res.,2002,12:1127-1134
    [31]Smyth P. Clustering sequences with hidden Markov models[J]. Adv. Neural Inf. Process. 1997,9:648-655
    [32]Sebastiani P, Ramoni M, Cohen P, et al. Discovering dynamics using Bayesian clustering. In Lecture notes in computer science[C]. Adv. in Intelligent Data Analysis:3rd Int. Symp., IDA-99. Heidelberg:Springer-Verlag,1999,1642:199
    [33]Law M H, Kwok J T. Rival penalized competitive learning for model-based sequence clustering[C]. Proceeding of IEEE Int. Conf. on Pattern Recognition (ICPR00). Washington, USA:IEEE Computer Society Press,2000,2:2195.
    [34]Xong Y, Yeung D Y. Mixtures of ARMA models for model-based time series clustering[C]. Proceeding of 2002 IEEE Int. Conf. on Data Mining. Washington, USA:IEEE Computer Society Press,2002:717-720
    [35]Cadez I, Heckerman D, Meek C, et al. Model-based clustering and visualization of navigation patterns on a web site[J]. Data Mining and Knowledge Discovery,2003,7(4): 399-424.
    [36]Alon J, Sclaroff S, Kollios G, et al. Discovering clusters in motion time series data[C]. Proceeding of 2003 IEEE Comput. Soc. Conf. on Computer Vision and Pattern Recognition. Washington, USA:IEEE Computer Society Press,2003:375-381.
    [37]Wu S, Manber U. Fast text searching allowing errors[J]. Commun. ACM,1992,35(10): 83-91.
    [38]Ghias A, Logan J, Chamberlin D, et al. Query by humming-musical information retrieval in an audio database[C]. Proceeding of ACM Multimedia 95. New York, NY, USA:ACM, 1995:231-236.
    [39]Gray R M, Buzo A, Gray J, et al. Distortion measures for speech processing[J]. IEEE Trans. Acoust., Speech Signal Process,1980,28:367-376
    [40]董晓莉,时间序列数据挖掘相似性度量和周期模式挖掘研究[D].天津：天津大学,2006.
    [41]Perng C S, Wang H, Zhang S R, et al. Landmarks:A new model for similarity-based pattern querying in time series databases[C]. Proceeding of 16th Int. Conf. on Data Engineering (ICDE00). Washington, USA:IEEE Computer Society Press,2000:33-42.
    [42]Gusfield D. Algorithms on strings, trees and subsequences[M]. New York:University of Cambridge Press.1997:78-93.
    [43]Ewens W J, Grant G R. Statistical:methods in bioinformatics:An introduction[M]. New York.Springer-Verlag,2001:73-78.
    [44]Kruskal J B. An overview of sequence comparison:Time warps, string edits and macromolecules[J].SIAM Rev.1983,21:201-237.
    [45]Cohen J. Bioinformatics-an introduction for computer scientists[J]. ACM Comput. Surv. 2004,36(2):122-158.
    [46]Agrawal R, Lin K I, Sawhney H S, et al. Fast similarity search in the presence of noise, scaling and translation in time series databases.[C]. Proceeding of 21st Int. Conf. on Very Large Data Bases (VLDB95). New York, NY, USA:ACM,1995:490-501.
    [47]Agrawal R, Psaila G, Wimmers E L, et al. Querying shapes of histories[C]. Proceeding of 21st Int. Conf. on Very Large Databases. New York, NY, USA:ACM,1995:502-514.
    [48]Keogh E J, Pazzani M J. Scaling up dynamic time warping for data mining applications[C]. Proceeding of 6th ACM SIGKDD Int. Conf. on Knowledge Discovery and Data mining. New York, NY, USA:ACM,2000:285-289.
    [49]Agrawal R, Imielinski T, Swami A. Mining association rules between sets of items in large databases[C]. Proceedings of the 1993 ACM-SIGMOD international conference on management of data (SIGMOD'93), New York, NY, USA:ACM,1993:207-216.
    [50]Agrawal R, Srikant R. Fast algorithms for mining association rules[C]. Proceeding of the 1994 international conference on very large data bases (VLDB'94), New York, NY, USA: ACM,1994:487-499.
    [51]Mannila H, Toivonen H, Verkamo A I. Efficient algorithms for discovering association rules[C]. Proceeding of the AAAI'94 workshop knowledge discovery in databases (KDD'94). New York, NY. USA:Oxford University Press,1994:181-192.
    [52]Park J S, Chen M S, Yu P S. An effective hash-based algorithm for mining association rules. Proceeding of the 1995 ACM-SIGMOD international conference on management of data (SIGMOD'95).New York,NY,USA:ACM,1995:175-186
    [53]Savasere A, Omiecinski E, Navathe S. An efficient algorithm for mining association rules in large databases[C]. Proceeding of the 1995 international conference on very large data bases (VLDB'95), New York, NY, USA:ACM,1995:432-443
    [54]Toivonen H. Sampling large databases for association rules[C]. Proceeding of the 1996 international conference on very large data bases (VLDB'96). New York, NY, USA:ACM, 1996:134-145
    [55]Brin S, Motwani R, Ullman J D, et al. Dynamic itemset counting and implication rules for market basket analysis[C]. Proceeding of the 1997 ACM-SIGMOD international conference on Management of data (SIGMOD'97). New York, NY, USA:ACM,1997:255-264.
    [56]Cheung D W, Han J, Ng V, et al. Maintenance of discovered association rules in large an incremental updating technique[C]. Proceeding of the 1996 international conference on data engineering (ICDE'96). Washington, USA:IEEE Computer Society Press,1996:106-114.
    [57]Agrawal R, Shafer J C. Parallel mining of association rules:design, implementation, and experience[J]. IEEE Trans Knowl Data Eng.,1996,8:962-969.
    [58]Zaki M J, Parthasarathy S, Ogihara M, et al. Parallel algorithm for discovery of association rules[J]. Data mining knowledge discovery,1997,1:343-374.
    [59]Sarawagi S, Thomas S, Agrawal R. Integrating association rule mining with relational database systems:alternatives and implications[C]. Proceeding of the 1998 ACM-SIGMOD international conference on management of data (SIGMOD'98). New York, NY, USA: ACM,1998:343-354.
    [60]Geerts F, Goethals B, Bussche J. A tight upper bound on the number of candidate patterns[C]. Proceeding of the 2001 international conference on data mining (ICDM'01). Washington, USA:IEEE Computer Society Press,2001:155-162.
    [61]Han J, Pei J, Yin Y. Mining frequent patterns without candidate generation[C]. Proceeding of the 2000 ACM-SIGMOD international conference on management of data (SIGMOD'00). New York, NY, USA:ACM,2000:1-12.
    [62]Agarwal R, Aggarwal C C, Prasad V V. A tree projection algorithm for generation of frequent itemsets[J]. J Parallel Distribute Comput,2001,61:350-371
    [63]Pei J, Han J, Mortazavi A B, et al. PrefixSpan:mining sequential patterns efficiently by prefix-projected pattern growth[C]. Proceeding of the 2001 international conference on data engineering (ICDE'01). Washington, USA:IEEE Computer Society Press,2001:215-224.
    [64]Liu J, Pan Y, Wang K, et al. Mining frequent item sets by opportunistic projection[C]. Proceeding of the 2002 ACM SIGKDD international conference on knowledge discovery in databases (KDD'02). New York, NY, USA:ACM,2002:239-248.
    [65]Liu G, Lu H, Lou W, et al. On computing, storing and querying frequent patterns[C]. Proceeding of the 2003 ACM SIGKDD international conference on knowledge discovery and data mining (KDD'03). New York, NY, USA:ACM,2003:607-612.
    [66]Grahne G, Zhu J. Efficiently using prefix-trees in mining frequent itemsets[C]. Proceeding of the ICDM'03 international workshop on frequent itemset mining implementations (FIMI'03). New York, NY, USA:ACM,2003:123-132.
    [67]Agrawal R, Srikant R. Mining sequential patterns[C]. Proceeding of the 1995 international conference on data engineering (ICDE'95). Washington, USA:IEEE Computer Society Press, 1995:3-14.
    [68]Srikant R, Agrawal R. Mining sequential patterns:generalizations and performance improvements[C]. Proceeding of the 5th international conference on extending database technology (EDBT'96). New York, NY, USA:ACM,1996:3-17.
    [69]Zaki M. SPADE:an efficient algorithm for mining frequent sequences[J]. Machine Learning, 2001,40:31-60.
    [70]Zaki M J. Scalable algorithms for association mining[J]. IEEE Trans. Knowl. Data Eng.2000, 12:372-390.
    [71]Zaki M J, Hsiao C J. CHARM:an efficient algorithm for closed itemset mining[C]. Proceeding of the 2002 SIAM international conference on data mining (SDM'02). Philadelphia, PA, USA:SIAM,2002:457-473
    [72]Pei J, Han J, Mortazavi A B, et al. Mining sequential patterns by pattern-growth:the prefixspan approach[J]. IEEE Trans Knowl Data Eng,2004,16:1424-1440.
    [73]Pei J, Han J, Wang W. Constraint-based sequential pattern mining in large databases[C]. Proceeding of the 2002 international conference on information and knowledge management (CIKM'02). New York, NY. USA:ACM,2002:18-25.
    [74]Mannila H, Toivonen H, Verkamo I. Discovery of frequent episodes in event sequences[C]. Proceeding of the 1st International Conference on Knowledge Discovery and Data Mining (KDD'95). AAAI Press,1995:210-215.
    [75]Mannila H, Toivonen H. Discovering generalized episodes using minimal occurrences[C]. Proceeding of the 2nd International Conference on Knowledge Discovery and Data Mining (KDD'96). AAAI Press,1996:146-151.
    [76]Baixeries J, Garriga C G, Balcazar J L. Mining unbounded episodes from sequential data. Technical Report NC-TR-01-091, NeuroCOLT, Royal Holloway University of London, UK, 2001.
    [77]Garriga C G. Discovering unbounded episodes in sequential data[C]. Proceeding of the 7th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD'03). Berlin Heidelberg:Springer,2003:83-94.
    [78]Meger N, Rigotti C. Constraint-based mining of episode rules and optimal window sizes[C]. Proceeding of the 8th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD'04). Berlin Heidelberg:Springer,2004:313-324.
    [79]Manku G, Motwani R. Approximate frequency counts over data streams[C]. Proceeding of the 2002 international conference on very large data bases (VLDB'02). New York, NY. USA: ACM,2002:346-357
    [80]Chang J, Lee W. Finding recent frequent itemsets adaptively over online data streams[C]. Proceeding of the 2003 international conference on knowledge discovery and data mining (KDD'03). New York, NY. USA:ACM,2003:487-492.
    [81]Lin C, Chiu D, Wu Y, et al. Mining frequent itemsets from data streams with a time-sensitive sliding window[C]. Proceeding of the 2005 SIAM international conference on data mining (SDM'05). Philadelphia, PA, USA:SIAM,2005:68-79
    [82]Karp R M, Papadimitriou C H, Shenker S. A simple algorithm for finding frequent elements in streams and bags[J]. ACM Transaction Database System,2003,28:51-55.
    [83]Yu J X, Chong Z, Lu H, et al. False positive or false negative:mining frequent itemsets from high speed transactional data streams[C]. Proceeding of the 2004 international conference on very large data bases (VLDB'04). New York, NY. USA:ACM,2004:204-215
    [84]Chi Y, Wang H, Yu P S, et al. Moment:maintaining closed frequent itemsets over a stream sliding window[C]. Proceeding of the 2004 international conference on data mining (ICDM'04). Washington, USA:IEEE Computer Society Press,2004:59-66.
    [85]Metwally A, Agrawal D, Abbadi A. Efficient computation of frequent and top-k elements in data streams[C]. Proceeding of the 2005 international conference on database theory (ICDT'05). London, UK:Springer-Verlag,2005:398-412.
    [86]Jin R, Agrawal G. An algorithm for in-core frequent itemset mining on streaming data[C]. Proceeding of the 2005 international conference on data mining (ICDM'05). Washington, USA:IEEE Computer Society Press,2005:210-217.
    [87]Han J W, Cheng H, Dong X, et al. Frequent pattern mining:current status and future directions[J]. Data Mining Knowledge Discovery,2007,15:55-86.
    [88]Liu B, Hsu W, Ma Y. Integrating classification and association rule mining[C]. Proceeding of the 1998 international conference on knowledge discovery and data mining (KDD'98). New York, NY. USA:ACM,1998:80-86.
    [89]Dong G, Li J. Efficient mining of emerging patterns:discovering trends and differences[C]. Proceeding of the 1999 international conference on knowledge discovery and data mining (KDD'99). New York, NY. USA:ACM,1999:43-52.
    [90]Li J, Dong G, Ramamohanrarao K. Making use of the most expressive jumping emerging patterns for classification[C]. Proceeding of the 2000 Pacific-Asia conference on knowledge discovery and data mining (PAKDD'00). London, UK:Springer-Verlag,2000:220-232.
    [91]Yin X, Han J. CPAR:classification based on predictive association rules[C]. Proceeding of the 2003 SIAM international conference on data mining (SDM'03). Philadelphia, PA, USA: SIAM,2003:331-335.
    [92]Gao C, Tan K L, Tung A K H et. al. Mining.top-K covering rule groups for gene expression data[C]. Proceeding of the 2005 ACM SIGKMOD international conference on management of data (SIGMOD'05). New York, NY. USA:ACM,2005:670-681.
    [93]Wang J, Karypis G. HARMONY:efficiently mining the best rules for classification. Proceeding of the 2005 SIAM conference on data mining (SDM'05). Philadelphia, PA, USA: SIAM,2005:205-216.
    [94]Cheng H, Yan X, Han J, et al. Discriminative frequent pattern analysis for effective classification[C]. Proceeding of the 2007 international conference on data engineering (ICDE'07), Washington, USA:IEEE Computer Society Press,2007:716-725.
    [95]Agrawal R, Gehrke J, Gunopulos D, et al. Automatic subspace clustering of high dimensional data for data mining applications[C], Proceedings of the 1998 ACM-SIGMOD international conference on management of data (SIGMOD'98). New York, NY. USA:ACM,1998: 94-105.
    [96]Cheng C H, Fu A W, Zhang Y. Entropy-based subspace clustering for mining numerical data[C]. Proceeding of the 1999 international conference on knowledge discovery and data mining (KDD'99). New York, NY. USA:ACM,1999:84-93.
    [97]Beil F, Ester M, Xu X. Frequent term-based text clustering[C]. Proceeding of the 2002 ACM SIGKDD international conference on knowledge discovery in databases (KDD'02), New York, NY. USA:ACM,2002:436-442.
    [98]Wang W, Lu H, Feng J, et al. Condensed cube:an effective approach to reducing data cube size[C]. Proceeding of the 2002 international conference on data engineering (ICDE'02). Washington, USA:IEEE Computer Society Press,2002:155-165.
    [99]Watts D J, Strogatz S H. Collective dynamics of "small-world" networks[J]. Nature,1998, 393:440-442.
    [100]Bollobas B. The diameter of random graphs[J]. Transaction American Mathematics Society, 1981,267:41-52.
    [101]Bollobas B. Random Graphs,2nd edition. New York, NY. USA:Academic Press, 2001:32-53.
    [102]Chung F, Lu L. The average distances in random graphs with given expected degrees[J]. Proc. Natl. Acad. Sci. USA,2002,99:15879-15882.
    [103]Dorogovtsev S N, Mendes J F F, Samukhin A N. Metric Structure of Random Networks[J]. Nuclear Physics B.2003,653(3):307-338.
    [104]Fronczak P, Holyst J A.Average Path Length in Random Graphs[J].Physical Review E, 2004,70(5):1-7.
    [105]Bollobas B, Riordan O. The Diameter of a Scale-Free Random Graph[J]. Combinatorica, 2002,24(1):5-34.
    [106]Bordens M, Gomez I. Collaboration networks in science[J]. The Web of Knowledge:A Festschrift in Honor of Eugene Garfield. Medford, NJ:Information Today,2000:197-215.
    [107]Getoor L, Christopher P D. Link Mining:A survey[J]. SIGKDD Explorations,2005,7(2): 3-12.
    [108]Wasserman S, Faust K. Social Network Analysis:Methods and Applications[M]. Cambridge:Cambridge University Press,1994:108-132.
    [109]Ding C H Q. A tutorial on spectral clustering[C]. International Conference on Machine Learning (ICML'04),2004, URL:http://crd.lbl.gov/cding/Spectral/.
    [110]Newman M E J. Detecting community structure in networks[J]. European Physical Journal B,2004,38:321-330.
    [111]Gibson D, Kleinberg J, Raghavan P. Inferring web communities from link topology[C]. Proceeding of ACM Conference on Hypertext and Hypermedia. New York, NY. USA: ACM,1998:225-234.
    [112]Tyler J R., Wilkinson D M, Huberman B A. Email as Spectroscopy:Automated Discovery of Community Structure within Organizations[J]. Communities and technologies, The Netherlands:Kluwer, B.V. Deventer, The Netherlands.2003:81-96.
    [113]Freeman L. Centrality in social networks:Conceptual clarifications[J]. Social Networks, 1979,1:215-239.
    [114]Nowicki K, Snijders T A B. Estimation and prediction for stochastic blockstructures[J]. Journal of the American Statistical Association,2001,96(455):1077-1087.
    [115]Kemp C, Griths T L, Tenenbaum J B. Discovering latent classes in relational data[OL]. Technical Report AI Memo 2004-019, Massachusetts Institute of Technology,2004, http://hdl.handle.net/1721.1/30489.
    [116]Wolfe P, Jensen D. Playing multiple roles:discovering overlapping roles in social networks[C]. Proceeding of ICML04 Workshop on Statistical Relational Learning and its Connections to Other Fields,2004.
    [117]Adibi J, Chalupsky H, Melz E, et al. The KOJAK group Finder:Connecting the dots via integrated knowledge-based and statistical reasoning[C]. Proceeding of Innovative Applications of Artificial Intelligence Conference. Menlo Park, CA. USA:American Association for Artificial Intelligence,2004:800-808.
    [118]Kubica J, Moore A, Schneider J, et al. Stochastic link and group detection[C]. Proceeding of Eighteenth National Conference on Artificial Intelligence. Menlo Park, CA. USA: American Association for Artificial Intelligence,2002:798-804.
    [119]Kubica J, Moore A, Schneider J. Tractable group detection on large link data sets[C]. Proceeding of The Third IEEE International Conference on Data Mining. Washington, USA:IEEE Computer Society Press,2003:573-576.
    [120]Wang X, Mohanty N, McCallum A. Group and topic discovery from relations and text[C]. Proceeding of KDD Workshop on Link Discovery. New York, NY. USA:ACM,2005:46-55.
    [121]Ester M, Ge R, Gao B J, et al. Joint cluster analysis of attribute data and relationship data: the connected k-center problem[C]. Proceeding of the 2006 SIAM international conference on data mining (SDM'06). Philadelphia, PA, USA:SIAM,2006:246-258.
    [122]Moser F, Ge R, Ester M. Joint cluster analysis of attribute and relationship data without a-priori specification of the number of clusters[C]. Proceedings of Knowledge Discovery in Databases:KDD. New York, NY, USA:ACM,2007:510-519.
    [123]Palla G, Derenyi I, Farkas I, et al. Uncovering the overlapping community structure of complex networks in nature and society[J]. Nature,2005,435:814-818.
    [124]Inokuchi A, Washio T, Motoda H. An Apriori-based algorithm for mining frequent substructures from graph data[C]. Proceeding of European Conference on Principles and Practice of Knowledge Discovery and Data Mining. London, UK:Springer Verlag,2000:13-23.
    [125]Kuramochi M, Karypis G. Frequent subgraph discovery[C]. Proceeding of IEEE International Conference on Data Mining. Washington, USA:IEEE Computer Society Press,2001:313-320.
    [126]Yan X, Han J. gSpan:Graph-based substructure pattern mining[C]. Proceeding of IEEE International Conference on Data Mining. Washington, USA:IEEE Computer Society Press,2002:721-725.
    [127]Cook D J, Holder L B. Substructure discovery using minimum description length and background knowledge[J]. Journal of Artificial Intelligence Research,1994,1:231-255.
    [128]Yoshida K, Motoda H, Indurkhya N. Graph based induction as a unified learning framework[J]. Journal of Applied Intelligence,1994,4(3):297-316.
    [129]Bron C, Kerbosch J. Algorithm 457:Finding all cliques of an undirected graph[J]. Communications of the ACM,1973,16(9):575-577.
    [130]Tsukiyama S, Ariyoshi H, Shirakawa I. A new algorithm for generating all the maximal independent sets[J]. SIAM J. COMPUT,1977,6(3):505-517.
    [131]Makino K, Uno T. New algorithms for enumerating all maximal cliques[C]. Proceeding of Scandinavian Workshop on Algorithm Theory (SWAT 2004). Heidelberg:Springer Berlin, 2004:260-272.
    [132]Kose F, Weckwerth W, Linke T, et al. Visualizing plant metabolomic correlation networks using clique-metabolite matrices[J]. Bioinformatics,2001,17:1198-1208.
    [133]Crane D. Invisible colleges:Diffusion of knowledge in scientific communities[M]. Chicago: University of Chicago Press,1972:96-112.
    [134]Barbour D, Reinert G. Small worlds[J]. Random Structures Algorithms,2001,19:54-74.
    [135]Newman M E J, Girvan M. Finding and evaluating community structure in networks[J]. Phys. Rev. E,2004,69(026113):56-68.
    [136]Newman M E J. Fast algorithm for detecting community structure in networks[J]. Phys. Rev. E,2004,69(6):1-5.
    [137]Dhillon I S, Mallela S, Modha D S. Information-theoretic co-clustering[C]. Proceeding of Knowledge Discovery in Databases:KDD. New York, NY, USA:ACM,2003:89-98.
    [138]Chakrabarti D, Papadimitriou S, Modha D S, et al. Fully automatic cross-associations[C]. Proceeding of Knowledge Discovery in Databases:KDD. New York, NY, USA:ACM, 2004:79-88.
    [139]Noble C C, Cook D J. Graph-based anomaly detection[C]. Proceeding of Knowledge Discovery in Databases:KDD. New York, NY, USA:ACM,2003:631-636.
    [140]Keogh E, Lonardi S, Ratanamahatana C A. Towards parameter-free data mining[C]. Proceeding of Knowledge Discovery in Databases:KDD. New York, NY, USA:ACM, 2004:206-215.
    [141]Derenyi I, Palla G, Vicsek T. Clique percolation in random networks[J]. Physical Review Letters,2005,94(16):76-85.
    [142]Palla G, Barabasi A L, Vicsek T. Quantifying social group evolution[J]. Nature,2007,446:664-667.
    [143]Adamcsek B, Palla G, Farkas I, et al. CFinder:locating cliques and overlapping modules in biological networks[J]. Bioinformatics,2006,22:1021-1023.
    [144]Gregory S. An algorithm to find overlapping community structure in networks[C]. Proceedings of Knowledge Discovery in Databases:PKDD 2007. Berlin Heidelberg:Springer,2007,4213:593-600.
    [145]Li X, Liu B, Yu P S. Discovering overlapping communities of named entities[C]. Proceedings of Knowledge Discovery in Databases:PKDD 2006. Berlin Heidelberg:Springer,2006,4213:593-600.
    [146]Sun J M, Yu P S, Faloutsos C, et al. GraphScope:Parameter-free mining of large time-evolving graphs[C]. Proceedings of Knowledge Discovery in Databases:KDD. New York, NY, USA:ACM,2007:687-696.
    [147]Tantipathananandh C, Berger-Wolf T, Kempe D. A framework for community identification in dynamic social networks[C]. Proceedings of Knowledge Discovery in Databases:KDD. New York, NY, USA:ACM,2007:717-726.
    [148]Asur S, Parthasarathy S, Ucar D. An event-based framework for characterizing the evolutionary behavior of interaction graphs[C]. Proceedings of Knowledge Discovery in Databases:KDD. New York, NY, USA:ACM,2007:913-921.
    [149]Guha S, Gunopulos D, Koudas N. Correlating synchronous and asynchronous data streams[C]. Proceedings of Knowledge Discovery in Databases:KDD. New York, NY, USA:ACM,2003:529-534.
    [150]Papadimitriou S, Sun J, Faloutsos C. Streaming pattern discovery in multiple time-series[C]. Proceeding of 31st Int. Conf. on Very Large Data Bases (VLDB'05). New York, NY, USA:ACM,2005:697-708.
    [151]Sun J, Tao D, Faloutsos C. Beyond streams and graphs:dynamic tensor analysis[C]. Proceedings of Knowledge Discovery in Databases:KDD. New York, NY, USA:ACM, 2006:374-383.
    [152]Ning H, Xu W, Chi Y, et al. Incremental spectral clustering with application to monitoring of evolving blog communities[C]. Proceedings of SIAM International Conference on Data Mining. Philadelphia, PA, USA:SIAM,2007:261-271.
    [153]Tong H H, Papadimitriou S, et al. Colibri:Fast mining of large static and dynamic graphs[C]. Proceedings of Knowledge Discovery in Databases:KDD. New York, NY, USA:ACM,2008:686-694.
    [154]Boukerche A. Handbook of algorithms for wireless networking and mobile computing[M]. Chapman & Hall/CRC,2005:13-32.
    [155]Boukerche A, Samarah S. A novel algorithm for mining association rules in wireless Ad Hoc sensor networks[J]. IEEE Transactions on parallel and distributed systems,2008,19(7): 865-877.
    [156]Laxman S. Stream prediction using a generative model based on frequent episodes in event sequences[C]. Proceedings of Knowledge Discovery in Databases:KDD. New York, NY, USA:ACM,2008:453-461.
    [157]Laxman S, Sastry P S, Unnikrishnan K P. Discovering frequent episodes and learning Hidden Markov Models:A formal connection[J]. IEEE Transactions on Knowledge and Data Engineering,2005,17(11):1505-1517.
    [158]Giannella C, Han J, Pei J, et al. Mining frequent patterns in data streams at multiple time granularities[M]. Next Generation Data Mining, AAAI/MIT,2003:chapter 3.
    [159]Cormode G, Muthukrishnan S. What's hot and what's not:tracking most frequent items dynamically[C]. Proceeding of PODS international conference, New York, NY, USA: ACM,2003:296-306.
    [160]Gaber M M, Krishnaswamy S, Zaslavsky A. On-board mining of data streams in sensor networks[M]. Advanced Methods of Knowledge Discovery from Complex Data, London UK:Springer,2005:307-335.
    [161]Mannila H, Salmenkivi M. Finding simple intensity descriptions from event sequence data[C]. Proceedings of Knowledge Discovery in Databases:KDD. New York, NY, USA: ACM,2001:341-346.
    [162]Kiernan J, Terzi E. Constructing comprehensive summaries of large event sequences[C]. Proceedings of Knowledge Discovery in Databases:KDD. New York, NY, USA:ACM, 2008:417-425.
    [163]Koivisto M, Perola M, Varilo T, et al. An MDL method for finding haplotype blocks and for estimating the strength of haplotype block boundaries[C]. Proceeding of Pacific Symposium on Biocomputing, World Scientific,2003:502-513.
    [164]Chudova D, Smyth P. Pattern discovery in sequences under a Markovian assumption[C]. Proceedings of Knowledge Discovery in Databases:KDD. New York, NY, USA:ACM, 2002:153-162.
    [165]Wang X, Kaban A. A dynamic bibliometric model for identifying online communities[J]. Journal of Data Mining Knowledge Discovery,2008,16(1):67-107.
    [166]URL:db.csail.mit.edu/labdata/labdata.html
    [167]Aach J, Church G. Aligning gene expression time series with time warping algorithms[J]. Bioinformatics,2001,17(6):495-508.
    [168]Vapnik V N. The nature of statistical learning theory[M]. New York, NY. USA:Springer-Verlag,1999:68-89.
    [169]Dasarathy B V. Nearest Neighbor (NN) norms:NN pattern classification techniques[M]. Los Alamitos:IEEE Computer Society Press,1990:57-78.
    [170]Lin J, Keogh E, Lonardi S, et al. A symbolic representation of time series with implications for streaming algorithms[C]. Proceeding of the 8th ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery. New York, NY, USA:ACM,2003:2-11.
    [171]Patel D, Hsu W, Lee M L. Mining relationships among interval-based events for classification[C]. Proceedings of the 2008 ACM-SIGMOD international conference on management of data (SIGMOD'08), New York, NY, USA:ACM,2008:393-404.
    [172]Yang Y, Pedersen J O. A comparative study on feature selection in text categorization[C]. Proceeding of International Conference on Machine Learning. San Francisco:Morgan Kaufmann Publishers,1997:412-420.
    [173]URL:http://archive.ics.uci.edu/ml/
    [174]Chandrasekaran S, Cooper O, Deshpande A, et al. TelegraphCQ:Continuous dataflow processing for an uncertain world[C]. Proceedings of the Conference on Innovative Data Systems Research (CIDR 2003), New York, NY, USA:ACM,2003:668-668.
    [175]Abadi D, Carney D, Cetintemel U, et al. Aurora:A new model and architecture for data stream management[J]. VLDB Journal,2003,12(2):120-139.
    [176]Abadi D, Carney D, Cetintemel U, et al. Aurora:A data stream management system (Demonstration)[C]. Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD'03), New York, NY, USA:ACM,2003:666-666.
    [177]Abadi D J, Ahmad Y, Balazinska M, et al. The design of the Borealis stream processing engine[C]. 2nd Biennial Conference on Innovative Data Systems Research (CIDR'05), New York, NY, USA:ACM,2005:277-289.
    [178]Motwani R, Widom J, Arasu A, et al. Query processing, resource management, and approximation in a data stream management system[C]. Proceedings of the Conference on Innovative Data Systems Research (CIDR 2003), New York, NY, USA:ACM,2003:245-256.
    [179]Weiss G M, Hirsh H. Learning to predict rare events in event sequences[C]. Proceeding of the 4th International Conference on Knowledge Discovery and Data Mining (KDD 98), New York City, NY, USA:ACM,1998:359-363.
    [180]Wren C R, Ivanov Y A, Leigh D, et al. The MERL Motion Detector Dataset[C]. Workshop on Massive Datasets (MD),2007:10-14.
    [181]Johnson D S, Yannakakis M, Papadimitriou C H. On generating all maximal independent sets[J]. Information Processing Letters,1988,27(3):119-123.
    [182]Read R C, Tarjan R E. Bounds on backtrack algorithms for listing cycles, paths, and spanning trees[J]. Networks,1975,5:237-252.
    [183]Tosic P T, Agha G A. Maximal clique based distributed coalition formation for task allocation in large-scale multi-agent systems[C]. First International Workshop on Massively Multi-Agent Systems (MMAS 2004), Heidelberg:Springer Berlin,2004:104-120.
    [184]Eiter T, Makino K. On computing all abductive explanations[C]. Proceeding of Association for the Advancement of Artificial Intelligence (AAAI'02), AAAI Press,2002:62-67.
    [185]Selman B, Levesque H J. Support set selection for abductive and default reasoning[J]. Artificial Intelligence,1996,82(1-2):259-272.
    [186]Agrawal R, Mannila H, Srikant R, et al. Fast discovery of association rules[M]. Advances in Knowledge Discovery and Data Mining, MIT Press,1996:307-328.
    [187]Abu-Khzam F N, Baldwin N E, et al. On the relative efficiency of maximal clique enumeration algorithms, with application to High-Throughput computational biology[C]. Proceeding of International Conference on Research Trends in Science and Technology,2005.
    [188]Kaufman L, Rousseeuw P. Finding groups in data:An introduction to cluster analysis[M]. New York, NY. USA:Wiley,1990:67-99.
    [189]Zachary W W. An information flow model for conflict and fission in small groups[J]. Journal of Anthropological Research,1977,33(4):452-473.
    [190]Lusseau D, Schneider K, Boisseau O J, et al. The bottlenose dolphin community of Doubtful Sound features a large proportion of long lasting associations[J]. Behavioral Ecology and Sociobiology,2003,54(4):396-405.
    [191]Girvan M, Newman M E J. Community structure in social and biological networks[J]. Proc. Natl. Acad. Sci. USA,2002,99(12):7821-7826.
    [192]Baumes J, Goldberg M, Ismail M. Efficient identification of overlapping communities[C]. International Conference on Intelligence and Security Informatics, ISI'05. Heidelberg: Springer,2005:27-36.
    [193]Liu J. Introduction to Social Network Analysis[M]. Beijing:Social Sciences Academic Press,2004:20-68. (in Chinese)
    [194]Salton G, McGill M J. Introduction to modern information retrieval[M]. New York, NY. USA:McGraw-Hill,1983:213-234.
    [195]Pelleg D, Moore A. X-means:Extending k-means with efficient estimation of the number of clusters[C]. Proceeding of International Conference on Machine Learning (ICML'00). San Francisco:Morgan Kaufmann Publishers,2000:727-734.
    [196]Rissanen J. Modeling by shortest data description[J]. Automatica,1978,14:465-471.
    [197]Rissanen J. Stochastic complexity in statistical inquiry Theory[M]. River Edge, NJ, USA: World Scientific,1989:92-110.
    [198]Rissanen J. A universal prior for integers and estimation by minimum description length[J]. Annals of Statistics,1983, 11(2):416-431.
    [199]Strehl A, Ghosh J. Cluster ensembles — a knowledge reuse framework for combining multiple partitions[J]. Journal of Machine Learning Research,2003,3:583-617.
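    [200]Xing C X, Gao F R, Zhan S N, et al. A collaborative filtering recommendation algorithm incorporated with user interest change[J]. Journal of Computer Research and Development,2007,44(2):296-301. (in Chinese)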
    [201]Song X D, Chi Y, Hino K, et al. Information flow modeling based on diffusion rate for prediction and ranking[C]. Proceeding of the 16th International World Wide Web Conference (WWW 2007),2007:191-200.
    [202]Rogers E M. Diffusion of innovations[M]. New York, NY. USA:The Free Press,1995: 78-96.
    [203]Kempe D, Kleinberg J, Tardos E. Maximizing the spread of influence through a social network[C]. Proceeding of 10th ACM SIGKDD Int. Conf. on Knowledge Discovery and Data mining. New York, NY, USA:ACM,2003:301-311.
    [204]Sarwar B, Karypis G, Konstan J, et al. Item-based collaborative filtering recommendation algorithms[C]. Proceeding of the 10th International World Wide Web Conference (WWW 2001),2001:285-295.
    [205]Bass F. A new product growth for model consumer durables[J]. Management Science,1969, 15(5):215-227.
    [206]Richardson M, Domingos P. Mining knowledge-sharing sites for viral marketing[C]. Proceeding of 9th ACM SIGKDD Int. Conf. on Knowledge Discovery and Data mining. New York, NY, USA:ACM,2002:108-118.
    [207]Oliveira J G, Barabasi A L. Human dynamics:Darwin and Einstein correspondence patterns[J]. Nature,2005,437:1251.
    [208]Ching W K, Ng M K. Markov Chains:Models, algorithms and applications[M]. New York, NY. USA:Springer,2006:32-68.
    [209]Yao D D. First-Passage time moments of Markov processes[J]. Journal of Applied Probability,1985,22(4):939-945.
    [210]Ross S M. Introduction to probability models[M]. 8th edition. Beijing, China:Posts and Telecom Press,2007:10-129.
    [211]Ley M. DBLP Library[OL]. URL:http://www.informatik.uni-trier.de/~ley/db/,2007.
