Research on Several Key Problems in Social Network Analysis and Mining
Abstract
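A social network is a network system formed by individual members of society through social relationships. In recent years, with the development of information technology, emerging social network applications such as online social networking and microblogging have grown rapidly. They provide a mature communication platform for human interaction, knowledge sharing, and information dissemination, and they have had a marked influence on people's daily lives and behavior. In this era of online social networking, research on social networks therefore has both theoretical and practical value.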
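     The analysis and mining of online social networks also poses great challenges. Compared with general information networks, social networks have distinctive characteristics. For example, individuals and groups in a social network show strong social traits in their behavior and depend strongly on other nodes and on the network context; the varied application types of the online social network era bring varied network attributes, so individuals and groups can be described from many aspects, which makes the data multidimensional; an individual's behavior in the network and the network's topology influence each other, so content and structure are clearly correlated; and because individuals publish content and form relationships on the network, evolution is also an important property of social networks. Building on an analysis of related work, this dissertation studies each of four characteristics: sociality, multidimensionality, correlation, and evolution. The main research content and results are as follows: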
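     (1) On the social aspect of social networks, this dissertation focuses on mutual dependence and support among nodes. By analyzing the relationship between the network's link structure and individual influence, we extract the dependence relationships among individuals from ordinary social relationships and build a dependence network model, which converts a general social network structure into a dependence network. On this basis, combining nearest-neighbor and k-nearest-neighbor methods over dependence distance, we give a voting-like method for computing an individual's supportiveness. Supportiveness can be used to rank individuals in a social network. Unlike other link-based measures of node importance, it emphasizes an individual's contribution to other groups in the network and reflects how heavily the network relies on that individual. We also extend the dependence model to community detection: with it, one can find groups whose members depend closely on one another and cannot be separated, that is, dependence-based communities. We give a definition of dependence-based communities, a method for finding them, and several criteria for measuring community quality.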
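     (2) On multidimensionality, this dissertation focuses on community analysis and mining under multiple attribute constraints. On weighted graphs, we propose a community definition based on nearest-neighbor relationships and quasi-complete graphs (quasi-cliques), with the goal of identifying communities that both meet a size requirement and are tightly connected. Based on this model, we analyze the mutual constraints between community tightness and community size and give a mining method based on multi-criteria constraints. The algorithm can identify communities in regions of the network with different edge densities, and its results reflect the trade-off between community size and tightness. We analyze how various constraints affect community search and, exploiting the way the constraints restrict one another, give efficient local search algorithms and pruning strategies.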
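     (3) On the correlation between content and structure, this dissertation focuses on hierarchical extraction of featured clusters. Through hierarchical extraction, we map the social network structure, which lies in a warped space, onto a spatial surface, forming a hierarchy of density distributions that resembles a contour map; the aim is to analyze how distinctive the content of featured groups in the social network is. We give a label density estimation method based on network topology: the discrete label information on the network is abstracted into a label density function through a model resembling disease spreading. The characteristic indicators of labels at each place in the link structure are then estimated by aggregating results over clusters of different granularity in the hierarchy. We design a mining algorithm for featured clusters: while extracting the network hierarchy, it estimates upper and lower bounds of each part's characteristic function to prune from the bottom up, which yields an efficient and exact computation of featured clusters.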
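     (4) On evolution, this dissertation focuses on early prediction of individual social influence. We first model the propagation trend of a topic as a set of time-stamped sequences. Events that have already occurred serve as training samples: by training on the factual data from before the current time, we generate classification and prediction rules in order to predict topic trends after the current time. Combining the evolution of social network content with the network's inherent structural features, we design a time series model and give a time series matching method based on structural and content similarity. By comparing newly occurring facts with earlier predictions, the classifier can be evaluated, tuned, or retrained.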
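     In summary, targeting four characteristics of social networks (sociality, multidimensionality, correlation, and evolution), this dissertation studies key techniques including individual ranking, community detection, feature extraction, and trend prediction, which are of theoretical significance and practical value for social network analysis and mining.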
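     To make the quasi-clique notion in item (2) concrete, the following is a minimal sketch in Python. It assumes the common degree-based definition (every member is adjacent to at least gamma * (|S| - 1) other members) on an unweighted graph; the dissertation works on weighted graphs with nearest-neighbor constraints, so this is an illustration of the size/tightness trade-off, not the author's model. The names `is_quasi_clique`, `largest_quasi_clique`, `gamma`, and `min_size` are hypothetical, and the exhaustive search is only meant for tiny graphs.

```python
from itertools import combinations


def is_quasi_clique(adj, members, gamma):
    """True if each vertex in `members` is adjacent to at least
    gamma * (len(members) - 1) of the other members (assumed definition)."""
    members = set(members)
    if len(members) < 2:
        return True
    need = gamma * (len(members) - 1)
    return all(len(adj[v] & (members - {v})) >= need for v in members)


def largest_quasi_clique(adj, gamma, min_size):
    """Exhaustively search for a largest vertex set of at least `min_size`
    vertices that satisfies the tightness threshold `gamma`.
    Exponential time: an illustration only, with no pruning."""
    vertices = sorted(adj)
    for size in range(len(vertices), min_size - 1, -1):
        for candidate in combinations(vertices, size):
            if is_quasi_clique(adj, candidate, gamma):
                return set(candidate)
    return None


# Toy undirected graph stored as adjacency sets (hypothetical data).
adj = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c", "e"},
    "e": {"d"},
}

tight = largest_quasi_clique(adj, gamma=1.0, min_size=3)   # strict cliques only
loose = largest_quasi_clique(adj, gamma=0.6, min_size=3)   # relaxed tightness
```

     Lowering `gamma` admits a larger but looser group, which is the conflict between community size and tightness that the constraint-based mining method is meant to balance.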
A social network is a social structure made up of individuals who are tied by social links. In recent years, with the rapid development of information technology, online social networking and microblogging services have received a lot of attention. Social networks provide people with a comprehensive communication platform for interaction, knowledge sharing, information dissemination, and so on, and they have had a significant impact on people's daily lives and behavior. In this age of online social networking, we therefore consider social networks to have important theoretical and practical value for research and applications.
     Analyzing and mining online social networks also poses great challenges. Unlike traditional information networks, social networks have unique characteristics. For example, individuals and groups in a social network show strong social characteristics in their behavior and strong interdependence with other nodes and the network context; diverse online social network applications bring various network attributes, so descriptions of individuals and groups are multidimensional; individuals and the network topology interact continuously, so content and structure are correlated; and individuals publish new content and form new connections, so social networks evolve over time. In this dissertation, we treat sociality, multidimensionality, correlation, and evolution as four important features and conduct corresponding research on each. The content and contributions include:
     (1) In analyzing the sociability of individuals, we study interdependence and supportiveness among vertices. We extract the interdependence relationship from ordinary social connections and propose a way to measure it. We study the relation between topological structure and individuals' social impact and propose a formal definition of interdependence, so that a social network can be converted into an interdependence network. We also develop an efficient algorithm for calculating an individual's supportiveness, which mimics a voting process. Supportiveness can be adopted for ranking social entities. Unlike ordinary scoring functions, our supportiveness measure reflects an individual's contribution to others and how heavily others rely on that individual. Our interdependence model can also be used to discover tightly connected, mutually dependent communities, for which we propose a corresponding definition and quality measures.
     (2) In the multidimensional context, we focus on discovering communities under multiple constraints. In social networks, community size and tightness are often two conflicting goals. To detect large and tightly connected subgraphs in weighted graphs, we introduce a novel community model based on nearest neighbors and quasi-cliques. We analyze the constraints between tightness and community size and introduce a mining algorithm based on these constraints. Our method can detect communities in different parts of a large graph, and the results can meet the requirements of both goals. We analyze and exploit the impact of multiple constraints and develop efficient search algorithms and pruning techniques.
     (3) For the correlation between user-generated content and topological structure, we focus on extracting featured vertex clusters hierarchically. To analyze the relevance between clustered vertices and their rich label information, we propose a hierarchical structure extraction approach based on agglomerative clustering and design a density estimation method based on topological structure. By aggregating over the layers of the hierarchical structure, the characteristics of clusters can be measured. We design efficient algorithms for estimating the particularity score. By pruning in a bottom-up way, the featured clusters can be computed exactly.
     (4) Social networks evolve over time. We focus on the problem of predicting the trends of topics' social impact. We consider both the structural similarity and the topical properties of social networks and introduce a novel time series model. Existing user-generated content can be summarized as a set of valued sequences. Moreover, we introduce a novel hybrid similarity measure, and a supervised learning process based on best matching is conducted to train on the time series. The events before the current timestamp are adopted as a training set, and an early predictor is generated by learning rules from that set. Newly arriving events are used to verify the predictor, or to assess and tune it.
     In summary, this dissertation addresses four key characteristics of social networks. Several key techniques, including entity ranking, community discovery, feature extraction, and trend prediction, are studied. These techniques are useful and show promising prospects for social network analysis and data mining.
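     The best-matching idea in item (4) can be illustrated with a minimal Python sketch. It assumes that each past topic is a numeric sequence with a known trend label and that the early observations of a new topic are compared against the corresponding prefixes of past sequences. Plain Euclidean distance stands in for the hybrid structure-and-content similarity measure, which the abstract does not specify; the function names, labels, and data are hypothetical.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length numeric sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def predict_trend(history, observed):
    """Predict the trend label of a new topic from its early observations.

    history  -- list of (sequence, label) pairs from already finished topics
    observed -- the first few values of the new topic's sequence
    Returns the label of the past sequence whose prefix of the same length
    is closest to `observed` (a 1-nearest-neighbour, best-match rule).
    """
    n = len(observed)
    candidates = [
        (distance(seq[:n], observed), label)
        for seq, label in history
        if len(seq) >= n
    ]
    if not candidates:
        raise ValueError("no past sequence is long enough to compare")
    return min(candidates, key=lambda pair: pair[0])[1]


# Hypothetical training data: activity counts per time step, with labels.
history = [
    ([1, 2, 4, 8, 16], "rising"),
    ([5, 5, 4, 4, 3], "fading"),
    ([2, 2, 2, 2, 2], "flat"),
]

early_a = predict_trend(history, [1, 2, 5])
early_b = predict_trend(history, [4, 5, 4])
```

     Once the later values of a new topic become known, its true label can be compared with the prediction and the pair added to `history`, which mirrors the verify-and-tune loop described in item (4).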
