Research on Interest Discovery and Information Organization in Social Networks
Abstract
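An online social network is a class of online services, platforms, or websites that help users establish online friendships with one another so that people can share interests and activities among friends [1]. Because they offer rich facilities for freely publishing information and sharing it with friends, online social networks have attracted a large number of users in little more than a decade, and many types of online social network applications have emerged. Alongside this commercial success, these applications have accumulated massive amounts of user information and exponentially growing user activity data, giving academic researchers a valuable platform for analysis and application research. Completed and ongoing academic work includes analysis of the structure of online friendship networks, analysis and mining of user activity and other dynamic information, and application research built on online social network platforms.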
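     Building on this body of domestic and international research, this thesis starts from the problems of information organization and dissemination in current online social network platforms and carries out the following studies on system architecture, the organization and maintenance of friendships, user activity and interest, and topic information organization: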
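     1) An online community system architecture with dynamic interest groups built by a hierarchical clustering algorithm is proposed. Using more than 2 million posts and 18 million post views collected from an online community with more than 63,000 users, this thesis analyzes the system structure of online social network platforms, in particular how well content organization matches user activity, and finds a mismatch between flexible user interests and the static board structure. To resolve this mismatch and improve content organization, the thesis proposes a system architecture that builds dynamic interest groups with a hierarchical clustering algorithm, together with a flexible subscription mechanism that lets users subscribe to content of interest along both the content and the user dimensions, so that the content a user cares about is organized more efficiently into a single unified view.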
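     2) The concept of a user online session interval and the idea of recommending within interest groups are proposed to improve the design of content push systems in online communities. Content push systems are an important class of applications in online social networks because they make content dissemination more targeted. By analyzing which types of user activity best reflect user interest, this thesis refines how user interest is defined in content push systems; by analyzing users' characteristic temporal behavior patterns, it helps determine each user's online session intervals and thereby avoids the "false error" problem in user interest analysis; and by analyzing the tendencies of user interest, it settles on the design strategy of recommending content only within interest groups. A content recommendation system designed and deployed on a real online community further confirms the improvement in push quality.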
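     3) A framework for recommending potential friendships based on common interests is proposed, and a friend recommendation system is designed and implemented in a specific application domain to validate the framework. By analyzing the similarities and differences among friendship, interaction, and co-browsing activity, this thesis proposes recommending potential friendships based on common interests in order to make content dissemination in online social networks more targeted. In addition, to address the low recommendation accuracy of earlier work, it proposes a potential-friendship recommendation framework that fuses interests across multiple dimensions and incorporates domain knowledge. To validate the framework, the thesis builds a real community for the biology domain. It comprises a data center for publishing and sharing data on mutant mouse strains, content organization for the mouse mutants and gene information that interest biologists, and a friend recommendation system, built on user access data and domain knowledge from the Gene Ontology, that helps biologists find other researchers with shared research interests.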
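     4) The idea of hierarchically organizing topic information from online social media networks is proposed, and an incremental hierarchical clustering algorithm is designed to keep the topic structure model up to date in real time. Real-time discussion of real-world social hot topics is an important class of content that interests users of online social media networks. Individual users, however, usually cannot obtain a complete yet concise structured description of the topics they care about from content that is entirely unstructured and highly redundant. To address this problem, the thesis proposes hierarchical organization of topic information from online social media networks: hot-topic information of interest to the user is organized into a hierarchy that presents different aspects of the topic, with the topic content summarized at each level of granularity. It also proposes an incremental hierarchical clustering algorithm that dynamically absorbs newly generated information about a hot topic, so the topic structure model is updated quickly. Using 3.5 million social hot-topic discussion messages collected from Twitter, the thesis verifies the feasibility and effectiveness of hierarchically organizing user-interested topic information.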
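     The session-interval idea in item 2 can be illustrated with a minimal sketch that splits one user's activity timestamps into online sessions wherever the inactivity gap grows too large. The 30-minute threshold, the function name `split_sessions`, and the sample data are assumptions made for illustration only; the abstract does not state how the thesis actually derives session boundaries from observed behavior patterns.

```python
from datetime import datetime, timedelta


def split_sessions(timestamps, max_gap=timedelta(minutes=30)):
    """Group activity timestamps into sessions.

    A new session starts whenever the gap to the previous activity
    exceeds max_gap (an assumed threshold, not a value from the thesis).
    """
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] <= max_gap:
            # Close enough to the previous activity: same session.
            sessions[-1].append(ts)
        else:
            # First activity, or the gap was too long: open a new session.
            sessions.append([ts])
    return sessions


# Hypothetical post-view times for a single user.
views = [
    datetime(2010, 5, 1, 9, 0),
    datetime(2010, 5, 1, 9, 10),
    datetime(2010, 5, 1, 14, 0),
    datetime(2010, 5, 1, 14, 5),
    datetime(2010, 5, 1, 14, 50),
]
print([len(s) for s in split_sessions(views)])  # prints [2, 2, 1]
```

     Under this reading, content published while a user was outside every session could be treated as unseen rather than counted as evidence of disinterest.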
An online social network service is an online service, platform, or site that focuses on building social networks or social relations among people to help users share interests and/or activities with each other [1]. With comprehensive functionality for flexibly publishing and sharing information, online social networks have enjoyed explosive growth in recent years and have evolved into many application forms (from online social communities to online social media networks). Meanwhile, the huge success of these applications has accumulated massive amounts of user profile and activity data, which offers researchers a great opportunity to conduct data analysis and application design. Completed and ongoing academic research includes analysis of online friendship network structure, mining of user activity and other dynamic information in online social networks, and development of applications on online social network platforms.
     In this work, we focus on information organization and diffusion in online social networks, and conduct research on system architecture, friendship organization and maintenance, utilization of user activity and interest, and event modeling:
     1) A novel online social community structure based on dynamic interest groups is proposed. Through a detailed analysis of a real, large-scale online social community with over 63,000 users, 2 million posts, and 18 million views, this work identifies a mismatch between flexible user interests and the fixed social group structure. To address this problem, we design a novel online social community structure with dynamic interest groups and a flexible subscription mechanism, which automatically provides dynamic, individualized content organization for every user.
     2) The concepts of user session and content recommendation within interest groups are proposed to improve the design of content recommendation systems. Fast-growing user-generated content poses great challenges for the design of efficient content management and delivery in online social network systems. Through a comprehensive study of online user activities, including their content, social, and time characteristics, this work aims to characterize user interest and user context accurately. Experimental results obtained in a real online social community further confirm that our user interest analysis and context analysis help improve the quality and efficiency of content management and delivery.
     3) A general friend recommendation framework based on common interests and a real friend recommendation system for a biology community are designed. By leveraging interest-based features, we design a general friend recommendation framework that characterizes user interest in two dimensions, context (location, time) and content, and combines domain knowledge to improve the quality of friend recommendation. To demonstrate the effectiveness of the proposed framework, we built the Mutagenesis Information CEnter (MICE), which lets biologists publish, share, and search discovered mouse mutagenesis information. Based on the user access requests collected from this data center, we design a friend recommendation system that helps biologists find other researchers with shared interests.
     4) A solution for organizing user-interested topic information is proposed. Discussion generated in real time about social events is among the most important content in online social media networks. However, in current online social media networks, users are constantly swamped by long streams of unstructured, redundant, and sometimes irrelevant messages, while lacking a comprehensive and well-organized view of social events. This work proposes an effective and efficient topic information organization solution (called ETree), which uses an incremental, hierarchical modeling technique to identify and construct event theme structures at different granularities. Detailed evaluation on 20 real social events with 3.5 million tweets demonstrates that ETree can efficiently and incrementally generate high-quality topic structures.
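     As a rough illustration of the common-interest idea in item 3, the sketch below ranks candidate friends by how much their accessed content overlaps with a target user's. Jaccard similarity is an assumed stand-in for the similarity measure, and `recommend_friends`, the user names, and the gene identifiers are hypothetical. The framework described above also uses context (location, time) and domain knowledge, neither of which is modeled here.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b|."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend_friends(target, views, existing_friends=(), top_k=3):
    """Rank other users by content overlap with the target user.

    views maps each user to the set of items that user accessed.
    Users with no overlap, and users who are already friends, are skipped.
    """
    scores = [
        (other, jaccard(views[target], items))
        for other, items in views.items()
        if other != target and other not in existing_friends
    ]
    scores = [(user, score) for user, score in scores if score > 0]
    # Highest similarity first; ties broken by user name for stable output.
    scores.sort(key=lambda pair: (-pair[1], pair[0]))
    return scores[:top_k]


# Hypothetical access records: user -> set of viewed gene entries.
views = {
    "alice": {"gene_a", "gene_b", "gene_c"},
    "bob": {"gene_b", "gene_c", "gene_d"},
    "carol": {"gene_x"},
    "dave": {"gene_a"},
}
print(recommend_friends("alice", views))
```

     In this toy data, bob shares two of four distinct items with alice (similarity 0.5) and ranks ahead of dave, while carol has no overlap and is not recommended.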
References
[1]Wiki-Social network service: http://en.wikipedia.org/wiki/Social_network_service
    [2]Danah M. Boyd, Nicole B. Ellison. Social network sites:Definition, history, and scholarship [J]. Journal of Computer-Mediated Communication.2008,13(1): 210-230.
    [3]TheGlobe.com to Cut Staff, Fold Sites [N]. news.com,2001,8(3).
    [4]Beverly Hills Internet, Builder of Web Communities, Changes Name to GeoCities; Monthly Page [N]. Business Wire,1995,12(14).
    [5]Twitter:http://www.twitter.com.
    [6]D. Zhao and M. B. Rosson, How and why people twitter:the role that micro-blogging plays in informal communication at work [A]. In:Proceedings of the ACM 2009 international conference on Supporting group work[C],2009: 243-252.
    [7]Eric Eldon. Friendster raises $20 million, nabs a Googler to be CEO [N]. VentureBeat,2008,8(4)
    [8]Notable social networking web sites. Searcher,2007:36-37.
    [9]Y.-Y. Ahn, S. Han, H. Kwak, S. Moon, and H. Jeong. Analysis of topological characteristics of huge online social networking services [A]. In:Proc. of the 16th intl. conf. on World Wide Web[C],2007:835-844.
    [10]L. Backstrom, D. Huttenlocher, J. Kleinberg, and X. Lan. Group formation in large social networks:Membership, growth, and evolution [A]. In:Proc. of the 12th ACM SIGKDD intl. conf. on Knowledge discovery and data mining[C],2006:44-54.
    [11]Alan Mislove, Massimiliano Marcon, Krishna P. Gummadi, Peter Druschel, Bobby Bhattacharjee. Measurement and analysis of online social networks [A]. In: Proceedings of the 7th ACM SIGCOMM conference on Internet measurement [C], 2007:29-42.
    [12]M. Cha, A. Mislove, and K. P. Gummadi. A measurement-driven analysis of information propagation in the Flickr social network [A]. In:Proc. of the 18th intl. conf. on World Wide Web[C], Madrid, Spain,2009:721-730.
    [13]J. Leskovec, L. Backstrom, R. Kumar, and A. Tomkins. Microscopic evolution of social networks [A]. In:Proc. of the 14th ACM SIGKDD intl. conf. on Knowledge discovery and data mining[C], Las Vegas, Nevada, USA,2008:462-470.
    [14]Vicenc Gomez, Andreas Kaltenbrunner, Vicente Lopez. Statistical analysis of the social network and discussion threads in Slashdot [A]. In:Proceeding of the 17th international conference on World Wide Web [C],2008:645-654.
    [15]Kou Zhongbao, Zhang Changshui. Reply networks on a bulletin board system [J]. Physical Review E.2003,67(3).
    [16]Lars Backstrom, Dan Huttenlocher, Jon Kleinberg. Group formation in large social networks:membership, growth, and evolution [A]. In:Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining [C],2006:44-54.
    [17]K.-I. Goh, Y.-H. Eom, H. Jeong, B. Kahng, D. Kim. Structure and evolution of online social relationships:Heterogeneity in unrestricted discussions [J]. Physical Review E.2006,73(6).
    [18]Jordi Duch, Alex Arenas. Community detection in complex networks using extremal optimization [J]. Physical Review E.2005,72(2).
    [19]Leon Danon, Albert Diaz-Guilera, Jordi Duch, Alex Arenas. Comparing community structure identification [J]. Journal of Statistical Mechanics:Theory and Experiment,2005.
    [20]Danyel Fisher, Marc Smith, Howard T. Welser. You are who you talk to:Detecting roles in usenet newsgroups [A]. In:Proceedings of the 39th Annual Hawaii International Conference on System Sciences [C],2006.
    [21]Blaz Fortuna, Eduarda Mendes Rodrigues, Natasa Milic-Frayling. Improving the classification of newsgroup messages through social network analysis [A]. In: Proceedings of the sixteenth ACM conference on Conference on information and knowledge management [C],2007:877-880.
    [22]Hyunwoo Chun, Haewoon Kwak, Young-Ho Eom. Comparison of online social relations in volume vs interaction:a case study of cyworld [A]. In:Proceedings of the 8th ACM SIGCOMM conference on Internet measurement [C],2008:57-70.
    [23]Ravi Kumar, Jasmine Novak, Andrew Tomkins. Structure and evolution of online social networks [J]. Link Mining:Models, Algorithms, and Applications.2010: 337-357.
    [24]Ken Wakita, Toshiyuki Tsurumi. Finding community structure in mega-scale social networks [A]. In:Proceedings of the 16th international conference on World Wide Web [C],2007.
    [25]Knapp, E. (2006). A Parent's Guide to Myspace. DayDream Publishers.
    [26]Steve Rosenbush (2005). News Corp.'s Place in MySpace, BusinessWeek, July 19, 2005.
    [27]"Social graph-iti":Facebook's social network graphing:article from The Economist's website. Retrieved on January 19,2008.
    [28]Facebook statistics: www.facebook.com/press/info.php?statistics
    [29]Milgram, Stanley. The Small World Problem [J]. Psychology Today.1967,1(1):60-67.
    [30]Jure Leskovec, Eric Horvitz. Planetary-scale views on a large instant-messaging network [A]. In:Proc. of the 17th intl. conf. on World Wide Web[C],2008: 915-924.
    [31]Przemyslaw Kazienko and Katarzyna Musial. Recommendation Framework for Online Social Networks [J]. Advances in Web Intelligence and Data Mining.2006, 23:111-120.
    [32]Vikas Bahirwani, Doina Caragea, Waleed Aljandal and William H. Hsu.Ontology engineering and feature construction for predicting friendship links in the live journal social network [A]. In:Proceedings of the 2nd SNA-KDD Workshop [C], 2008.
    [33]William H. HSU, Tim Weninger, Martin S.R. Paradesi. Predicting links and link change in friends networks:supervised time series learning with imbalanced data [J]. Artificial Neural Networks in Engineering,2008.
    [34]William H. Hsu, Andrew L. King, Martin S. R. Paradesi, Tejaswi Pydimarri, Tim Weninger. Collaborative and Structural Recommendation of Friends using Weblog-based Social Network Analysis [A]. In:Proceedings of Computational Approaches to Analyzing Weblogs,(AAAI) [C],2006.
    [35]Doina Caragea, Vikas Bahirwani, Waleed Aljandal and William H. Hsu. Ontology based Link-prediction in the Live Journal Social Network [A]. In:Proceedings of Association for the Advancement of Artificial Intelligence [C],2009.
    [36]Waleed Aljandal, Vikas Bahirwani, Doina Caragea, William H. Hsu. Ontology-Aware Classification and Association Rule Mining for Interest and Link Prediction in Social Networks [J]. AAAI Spring Symposium on Social Semantic Web.2009.
    [37]Indika Kahanda and Jennifer Neville. Using Transactional Information to Predict Link Strength in Online Social Networks [A]. In:Proceedings of International Conference on Weblogs and Social Media [C],2009.
    [38]Ben Taskar, Ming-fai Wong, Pieter Abbeel and Daphne Koller. Link Prediction in Relational Data [J]. Neural Information Processing Systems,2003.
    [39]David Liben-Nowell, Jon Kleinberg. The link-prediction problem for social networks [J]. Journal of the American Society for Information Science and Technology.2007,58(7):1019-1031.
    [40]David Liben-Nowell, Jon Kleinberg. The link prediction problem for social networks [A]. In:Proceedings of the twelfth international conference on Information and knowledge management [C],2003:556-559.
    [41]Jennifer Golbeck, James Hendler. FilmTrust:Movie Recommendations using Trust in Web-based Social Networks [A]. In:Proceedings of the IEEE Consumer communications and networking conference [C],2006,42:43-47.
    [42]Jennifer Golbeck. Trust and Nuanced Profile Similarity in Online Social Networks [J]. ACM Transactions on the Web (TWEB).2009,3(4):1-33.
    [43]Wan-Shiou Yang, Jia-Ben Dia, Hung-Chi Cheng, Hsing-Tzu Lin. Mining Social Networks for Targeted Advertising [A]. In:Proceedings of the 39th Hawaii International Conference on System Sciences [C],2006.
    [44]Nielsen Online Report. Social networks & blogs now 4th most popular online activity [N],2009,3.
    [45]Marcelo Maia, Jussara Almeida, Virgilio Almeida. Identifying user behavior in online social networks [A]. In:Proceedings of the 1st workshop on Social network systems [C],2008:1-6.
    [46]Scott Golder, Dennis Wilkinson, and Bernardo Huberman. Rhythms of social interaction:messaging within a massive online network [J]. Communities and Technologies.2007:41-66.
    [47]Yadong Zhou, Xiaohong Guan, Zhefei Zhang, Beibei Zhang. Predicting the tendency of topic discussion on the online social networks using a dynamic probability model [A]. In:Proceedings of the hypertext 2008 workshop on Collaboration and collective intelligence [C],2008:7-11.
    [48]Xiaolin Shi, Jun Zhu, Rui Cai, Lei Zhang. User grouping behavior in online forums [A]. In:Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining [C],2009:777-786.
    [49]Judith B. Pena-Shaff, Craig Nicholls. Analyzing student interactions and meaning construction in computer bulletin board discussions [J]. Computers & Education. 2004,42(3):243-265.
    [50]B. A. Williamson. Social network marketing:ad spending and usage. EMarketer Report,2007.
    [51]Moira Burke, Cameron Marlow, Thomas Lento. Feed Me:Motivating Newcomer Contribution in Social Network Sites [A]. In:Proceedings of the 27th international conference on Human factors in computing systems [C],2009: 945-954.
    [52]Pedro Domingos, Matt Richardson. Mining the Network Value of Customers [A]. In:Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining [C],2001:57-66.
    [53]Matthew Richardson, Pedro Domingos. Mining Knowledge-Sharing Sites for Viral Marketing [A]. In:Proceedings of the 8th ACM SIGKDD international conference on Knowledge discovery and data mining [C],2002:70-79.
    [54]Pedro Domingos. Mining Social Networks for Viral Marketing [J]. IEEE Intelligent Systems.2005,20(1):80-83.
    [55]Duncan J. Watts, Jonah Peretti, Michael Frumin. Viral Marketing for the Real World [J]. Harvard Business Review.2007,85(5):22-30.
    [56]Jason Hartline, Vahab S. Mirrokni, Mukund Sundararajan. Optimal Marketing Strategies over Social Networks [A]. In:Proceeding of the 17th international conference on World Wide Web [C],2008:189-198.
    [57]Mani R. Subramani, Balaji Rajagopalan. Knowledge-Sharing and Influence in Online Social Networks via Viral Marketing [J]. Communications of the ACM. 2003,46(12):300-307.
    [58]P. Rodriguez. Web infrastructure for the 21st century [A]. In:Proc. of the 18th intl. conf. on World Wide Web, Keynote [C],2009.
    [59]Fabricio Benevenuto, Tiago Rodrigues, Meeyoung Cha, Virgilio Almeida. Characterizing User Behavior in Online Social Networks [A]. In:Proceedings of the 9th ACM SIGCOMM conference on Internet measurement conference [C], 2009:49-62.
    [60]Meeyoung Cha, Haewoon Kwak, Pablo Rodriguez, Yong-Yeol Ahn, Sue Moon. I tube, you tube, everybody tubes:analyzing the world's largest user generated content video system [A]. In:Proceedings of the 7th ACM SIGCOMM conference on Internet measurement [C],2007:1-14.
    [61]Douban: www.douban.com
    [62]David Kempe, Jon Kleinberg, Eva Tardos. Maximizing the Spread of Influence through a Social Network [A]. In:Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining [C],2003: 137-146.
    [63]Jure Leskovec,Ajit Singh, Jon Kleinberg. Patterns of Influence in a Recommendation Network [J]. Advances in Knowledge Discovery and Data Mining.2006:380-389.
    [64]Jon Kleinberg. Cascading Behavior in Networks:Algorithmic and Economic Issues [J]. Algorithmic game theory.2007:613-632.
    [65]Nicole Immorlica, Jon Kleinberg, Mohammad Mahdian. The role of compatibility in the diffusion of technologies through social networks [A]. In:Proceedings of the 8th ACM conference on Electronic commerce [C],2007:75-83.
    [66]David Kempe, Jon Kleinberg, Eva Tardos. Influential nodes in a diffusion model for social networks [J]. Automata, Languages and Programming.2005: 1127-1138.
    [67]Eyal Even-Dar, Asaf Shapira. A Note on Maximizing the Spread of Influence in Social Networks [A]. In:Proceedings of the 3rd international conference on Internet and network economics [C],2007:281-286.
    [68]Aram Galstyan, Vahe Musoyan, Paul Cohen. Maximizing Influence Propagation in Networks with Community Structure [J]. Physical Review E.2009,79(5):56-62.
    [69]Wei Chen, Yifei Yuan, Li Zhang. Scalable influence maximization in social networks under the linear threshold model [A]. In:Proceedings of the 10th IEEE International Conference on Data Mining [C],2010.
    [70]Aram Galstyan, Vahe Musoyan, Paul Cohen. Maximizing influence propagation in networks with community structure [J]. Physical Review E.2009,79(5).
    [71]Elchanan Mossel, Sebastien Roch. On the Submodularity of Influence in Social Networks [A]. In:Proceedings of the thirty-ninth annual ACM symposium on Theory of computing [C],2007:128-134.
    [72]H. Peyton Young. The Diffusion of Innovations in Social Networks [J]. Economy as an evolving complex system 3.2006:267-286.
    [73]Sinan Aral, Erik Brynjolfsson, Marshall van Alstyne. Productivity Effects of Information Diffusion in E-Mail Networks [A]. In:Proceedings of International Conference on Information Systems [C],2007.
    [74]Jochen Wirtz, Patricia Chew. The effects of incentives, deal proneness, satisfaction and tie strength on word-of-mouth behavior [J]. International Journal of Service Industry Management.2002,13(2):141-162.
    [75]Robin Cowan, Nicolas Jonard. Network structure and the diffusion of knowledge [J]. Journal of economic Dynamics and Control.2004,28(8):1557-1575.
    [76]C. de Kerchove, G.Krings, R.Lambiotte, P.VanDooren, V.D.Blondel. Role of second trials in cascades of information over networks [J]. Physical Review E.2009, 79(1).
    [77]Marc Lelarge. Efficient control of epidemics over random networks [A]. In: Proceedings of the eleventh international joint conference on Measurement and modeling of computer systems [C],2009:1-12.
    [78]Eytan Adar, Lada A. Adamic. Tracking Information Epidemics in Blogspace [A]. In: Proceedings of the International Conference on Web Intelligence [C],2005: 207-214.
    [79]D. Gruhl, R. Guha, David Liben-Nowell, A. Tomkins. Information diffusion through blogspace [A]. In:Proceedings of the 13th international conference on World Wide Web [C],2004:491-501.
    [80]Jiang Bian, Yandong Liu, Ding Zhou, Eugene Agichtein, Hongyuan Zha. Learning to recognize reliable users and content in social media with coupled mutual reinforcement [A]. In:Proceedings of the 18th international conference on World wide web [C],2009:51-60.
    [81]Lada A. Adamic, Jun Zhang, Eytan Bakshy, Mark S. Ackerman. Knowledge sharing and yahoo answers:everyone knows something [A]. In:Proceeding of the 17th international conference on World Wide Web [C],2008:665-674.
    [82]Lada Adamic, Eytan Adar. How to search a social network [J]. Social Networks. 2005,27(3):187-203.
    [83]Wil M.P. van der Aalst, Minseok Song. Mining Social Networks:Uncovering interaction patterns in business processes [J]. Business Process Management. 2004:244-260.
    [84]David Jensen, Jennifer Neville. Data mining in social networks [J]. Dynamic Social Network Modeling and Analysis:workshop summary and papers.2003:287-302.
    [85]Naohiro Matsumura, David E. Goldberg, Xavier Llora. Mining directed social network from message board [A]. In:Special interest tracks and posters of the 14th international conference on World Wide Web [C],2005:1092-1093.
    [86]Cliff Lampe, Nicole Ellison, Charles Steinfield. A Face (book) in the crowd:Social searching vs. social browsing [A]. In:Proceedings of the 2006 20th anniversary conference on Computer supported cooperative work [C],2006:167-170.
    [87]Claudio Baccigalupo, Enric Plaza. Mining Music Social Networks for Automating Social Music Services [A]. In:Workshop Notes of the ECML/PKDD 2007 Workshop on Web Mining [C],2007:123-134.
    [88]Ralph Gross, Alessandro Acquisti. Information Revelation and Privacy in Online Social Networks [A]. In:Proceedings of the 2005 ACM workshop on Privacy in the electronic society [C],2005:71-80.
    [89]Alessandro Acquisti, Ralph Gross. Imagined communities:Awareness, information sharing, and privacy on the Facebook [A]. In:Proceedings of the 2006 Privacy Enhancing Technologies [C],2006:36-58.
    [90]Renren: www.renren.com
    [91]Michael Mitzenmacher. A brief history of generative models for power law and lognormal distributions [J]. Internet mathematics.2004,1(2):226-251.
    [92]Lun Li, David Alderson, John C. Doyle, Walter Willinger. Towards a theory of scale-free graphs:Definition, properties, and implications [J].2005,2(4): 431-523.
    [93]Gene Ontology database: www.geneontology.org
    [94]Fudan BBS (Riyue Guanghua community): http://bbs.fudan.edu.cn/
    [95]Wiki-Cumulative distribution function: http://en.wikipedia.org/wiki/Cumulative_distribution_function
    [96]E. Forgey. Cluster Analysis of Multivariate Data:Efficiency vs. Interpretability of Classification [J]. Biometrics,1965,21:768.
    [97]J. C. Dunn. A Fuzzy Relative of the ISODATA Process and its Use in Detecting Compact Well Separated Clusters [J]. Cybernetics and Systems,1973,3(3):32-57.
    [98]Michael Steinbach, George Karypis, Vipin Kumar. A Comparison of Document Clustering Techniques [A]. In:KDD workshop on text mining [C],2000:34-35.
    [99]G. Salton, E. A. Fox, H. Wu. Extended Boolean information retrieval [J]. Communications of the ACM,1983,26(11):1022-1036.
    [100]K. Zhang, J. Zi, L. G. Wu. New event detection based on indexing tree and named entity [A]. In:Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval [C],2007:215-222.
    [101]Youtube: www.youtube.com
    [102]Orkut:www.orkut.com
    [103]Xi'an Jiaotong University BBS: http://bbs.xjtu.edu.cn/
    [104]Sina Blog: http://blog.sina.com.cn
    [105]Dwi H. Widyantoro, Thomas R. Ioerger, John Yen. An Incremental Approach to Building a Cluster Hierarchy [A]. In:Proceedings of IEEE International Conference on Data Mining [C],2002:705-708.
    [106]Wiki-Content delivery network: http://en.wikipedia.org/wiki/Content_delivery_network
    [107]Mark Allman, Vern Paxson. Issues and etiquette concerning use of shared measurement data [A]. In:Proc. of the 7th ACM SIGCOMM conf. on Internet measurement [C],2007:135-140.
    [108]Bruce Schneier, John Kelsey, Doug Whiting, David Wagner, Chris Hall, Niels Ferguson. Twofish:A 128-bit block cipher [A]. In First Advanced Encryption Standard (AES) Conf. [C],1998.
    [109]Greg Linden, Brent Smith, Jeremy York. Amazon.com recommendations: Item-to-item collaborative filtering [J]. IEEE Internet Computing.2003,7(1): 76-80.
    [110]Abhinandan S. Das, Mayur Datar, Ashutosh Garg, Shyam Rajaram. Google news personalization:Scalable online collaborative filtering [A]. In:Proc. of the 16th intl. conf. on World Wide Web [C],2007:271-280.
    [111]Browser Helper Object (BHO) technology: http://baike.baidu.com/view/362533.htm
    [112]Wiki-Collaborative filtering: http://en.wikipedia.org/wiki/Collaborative_filtering
    [113]Paul Jaccard. Étude comparative de la distribution florale dans une portion des Alpes et des Jura [J]. Bulletin de la Société Vaudoise des Sciences Naturelles. 1901,37:547-579.
    [114]P. Indyk, R. Motwani. Approximate Nearest Neighbor:Towards Removing the Curse of Dimensionality [A]. In Proc. of the 30th Annual ACM Symposium on Theory of Computing [C],1998:604-613.
    [115]E. Cohen. Size-Estimation Framework with Applications to Transitive Closure and Reachability [J]. Journal of Computer and System Sciences,1997,55:441-453.
    [116]A. Broder. On the resemblance and containment of documents [J]. Compression and Complexity of Sequences,1998:21-29.
    [117]Li, X., Wang, Y.Y., Acero, A. Learning query intent from regularized click graphs [A]. In:Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval [C],2008: 339-346.
    [118]R. Cooley, B. Mobasher, J. Srivastava. Data preparation for mining world wide web browsing patterns [J]. Knowledge and Information Systems,1999,1:5-32.
    [119]M. Abrams, C. R. Standridge, G. Abdulla, S. Williams, E. A. Fox. Caching proxies:Limitations and potentials [A]. In Proceedings of the Fourth International Conference on World Wide Web [C], Boston, MA, December 1995.
    [120]J. Pitkow, H. Schutze, T. Cass, R. Cooley, D. Turnbull, A. Edmonds, E. Adar, T. Breuel. Personalized search [J]. Commun. ACM,2002,45(9):50-55.
    [121]D. Goldberg, D. Nichols, B. M. Oki, and D. Terry. Using collaborative filtering to weave an information tapestry [J]. Commun. ACM,1992,35:61-70.
    [122]J. L. Herlocker, J. A. Konstan, A. Borchers, J. Riedl. An algorithmic framework for performing collaborative filtering [A]. In:Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval [C],1999:230-237.
    [123]P. Resnick, N. Iacovou, M. Suchak, P. Bergstrom, J. Riedl. Grouplens:an open architecture for collaborative filtering of netnews [A]. In:Proceedings of the 1994 ACM conference on Computer supported cooperative work [C],1994: 175-186.
    [124]J. S. Breese, D. Heckerman, C. Kadie. Empirical analysis of predictive algorithms for collaborative filtering [A]. In:Proceedings of the 14th conference on Uncertainty in Artificial Intelligence [C],1998:43-52.
    [125]G. Adomavicius and E. Tuzhilin. Toward the next generation of recommender systems:A survey of the state-of-the-art and possible extensions [J]. IEEE Transactions on Knowledge and Data Engineering,2005,17:734-749.
    [126]T. Hofmann. Latent semantic models for collaborative filtering [J]. ACM Trans. Inf. Syst.,2004,22(1):89-115.
    [127]Facebook:www.facebook.com
    [128]Livejournal:www.livejournal.com
    [129]G. Salton, M. J. McGill. Introduction to Modern Information Retrieval [M]. New York: McGraw-Hill, 1983.
    [130]Mouse Mutagenesis Information Center: http://www.idmshanghai.cn/PBmice/
    [131]Gene Ontology: http://www.geneontology.org/
    [132]Christopher P Austin, James F Battey, Allan Bradley, Maja Bucan, Mario Capecchi, Francis S Collins, William Dove, Geoffrey Duyk, Susan Dymecki, Janan T Eppig, Franziska B Grieder, Nathaniel Heintz, et al. The knockout mouse project [J]. Nat. Genet.,2004,36:921-924.
    [133]Amander T. Clark, Daniel Goldowitz, Joseph S. Takahashi, Martha Hotz Vitaterna, Sandra M. Siepka, Luanne L. Peters, Wayne N. Frankel, George A. Carlson, Janet Rossant, Joseph H. Nadeau, Monica J. Justice. Implementing large-scale ENU mutagenesis screens in North America [J]. Genetica,2004,122: 51-64.
    [134]Aude M. Fahrer, J. Fernando Bazan, Peter Papathanasiou, Keats A. Nelms, Christopher C. Goodnow. A genomic view of immunology [J]. Nature,2001,409: 836-838.
    [135]Hrabe de Angelis, M.H., Flaswinkel,H., Fuchs,H., Rathkolb,B., Soewarto,D., Marschall,S., Heffner,S., Pargent,W., Wuensch,K. et al. Genome-wide, large-scale production of mutant mice by ENU mutagenesis [J]. Nat. Genet.,2000,25,444-447.
    [136]Nolan,P.M., Peters,J., Strivens,M., Rogers,D., Hagan,J., Spurr,N., Gray,I.C., Vizor,L., Brooker,D. et al. A systematic, genome-wide, phenotype-driven mutagenesis programme for gene function studies in the mouse [J]. Nat. Genet., 2000,25,440-443.
    [137]Cooley,L., Kelley,R. and Spradling,A. Insertional mutagenesis of the Drosophila genome with single P elements [J]. Science,1988,239,1121-1128.
    [138]Ping Zhang, Allan C. Spradling. Insertional mutagenesis of Drosophila heterochromatin with single P elements [A]. Proc. Natl. Acad. Sci [C]. USA,1994, 91,3539-3543.
    [139]Amsterdam,A., Burgess,S., Golling,G., Chen,W., Sun,Z., Townsend,K., Farrington,S., Haldi,M. and Hopkins,N. A large-scale insertional mutagenesis screen in zebrafish [J]. Genes Dev.,1999,13,2713-2724.
    [140]Amsterdam,A., Yoon,C., Allende,M., Becker,T., Kawakami,K., Burgess,S., Gaiano,N. and Hopkins,N. Retrovirus-mediated insertional mutagenesis in zebrafish and identification of a molecular marker for embryonic germ cells [J]. Cold Spring Harb. Symp. Quant. Biol.,1997,62,437-450.
    [141]Rodrigo Lopez, Ville Silventoinen, Stephen Robinson, Asif Kibria, Warren Gish. WU-Blast2 server at the European Bioinformatics Institute [J]. Nucleic Acids Res., 2003,31,3795-3798.
    [142]Ensembl gene sequence database: http://www.ensembl.org/Mus_musculus/index.html
    [143]Mouse Genome Informatics (MGI):http://www.informatics.jax.org/
    [144]ExPASy:http://us.expasy.org/
    [145]GFF format: http://www.sanger.ac.uk/Software/formats/GFF/
    [146]GBrowse tool: http://www.gmod.org/wiki/index.php/Gbrowse
    [147]IP address geolocation database: http://ipinfodb.com/
    [148]P. Ganesan, H. Garcia-Molina, and J. Widom. Exploiting hierarchical domain structure to compute similarity [J]. ACM Trans. Inf. Syst.,2003,21(1):64-93.
    [149]Ding S., Wu X., Li G., Han M., Zhuang Y. and Xu T. Efficient transposition of the piggyBac (PB) transposon in mammalian cells and mice [J]. Cell 2005, 122(3):473-483.
    [150]Wu S., Ying G., Wu Q., Capecchi MR:Toward simpler and faster genome-wide mutagenesis in mice [J]. Nat Genet 2007,39(7):922-930.
    [151]FilmTrust:http://trust.mindswap.org/FilmTrust/
    [152]O. Phelan, K. McCarthy, and B. Smyth. Using twitter to recommend real-time topical news [A]. In:Proceedings of the third ACM conference on Recommender systems [C],2009:385-388.
    [153]L. Adamic,O. Buyukkokten, E. Adar. A social network caught in the web. http://www.hpl.hp.com/shl/papers/social/,2002.
    [154]B. Taskar, P. Abbeel, and D. Koller. Discriminative probabilistic models for relational data [A]. In Proc. UAI [C],2002.
    [155]arXiv:www.arxiv.org
    [156]Masuya H., Nakai Y., Motegi H., Niinaya N., Kida Y., Kaneko Y., Aritake H., Suzuki N., Ishii J., Koorikawa K., et al:Development and implementation of a database system to manage a large-scale mouse ENU-mutagenesis program [J]. Mamm Genome 2004,15(5):404-411.
    [157]Donofrio N., Rajagopalon R., Brown D., Diener S., Windham D., Nolin S., Floyd A., Mitchell T., Galadima N., Tucker S., et al:'PACLIMS':a component LIM system for high-throughput functional genomic analysis [J]. BMC Bioinformatics 2005,6:94.
    [158]G. P. C. Fung, J. X. Yu, H. Liu, P. S. Yu. Time-dependent event hierarchy construction [A]. In:Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining [C],2007:300-309.
    [159]D. Trieschnigg, W. Kraaij. Hierarchical topic detection in large digital news archives [A]. In:Proceedings of the 5th Dutch Belgian Information Retrieval workshop [C],2005:55-62.
    [160]G. Carenini, R. Ng, X. Zhou, E. Zwart. Discovery and regeneration of hidden emails [A]. In:Proceedings of the 2005 ACM symposium on Applied computing [C],2005:503-510.
    [161]G. Carenini, R. T. Ng, X. Zhou. Scalable discovery of hidden emails from large folders [A]. In:Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining [C],2005:544-549.
    [162]Giuseppe Carenini, Raymond T. Ng, Xiaodong Zhou. Summarizing email conversations with clue words [A]. In:Proceedings of the 16th international conference on World Wide Web [C],2007:91-100.
    [163]C. Macdonald, I. Ounis. Key blog distillation:ranking aggregates [A]. In: Proceeding of the 17th ACM conference on Information and knowledge management [C],2008:1043-1052.
    [164]D. Metzler, S. Dumais, C. Meek. Similarity measures for short segments of text [J]. Lecture Notes in Computer Science,2007,4425:16.
    [165]R. Nallapati, A. Feng, F. Peng, J. Allan. Event threading within news topics [A]. In:Proceedings of the thirteenth ACM international conference on Information and knowledge management [C],2004:446-453.
    [166]J. Allan, R. Gupta, V. Khandelwal. Temporal summaries of new topics [A]. In: Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval [C],2001:10-18.
    [167]D. R. Radev, H. Jing, M. Stys, D. Tam. Centroid-based summarization of multiple documents [J]. Information Processing & Management,2004,40(6): 919-938.
    [168]X. Wu, G.-Q. Wu, F. Xie, Z. Zhu, X.-G. Hu. News filtering and summarization on the web [J]. IEEE Intelligent Systems,2010,25(5):68-76.
    [169]J. Leskovec, L. Backstrom, J. Kleinberg. Meme-tracking and the dynamics of the news cycle [A]. In:Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining [C],2009:497-506.
    [170]W. Cavnar and J. M. Trenkle. N-gram-based text categorization [A]. In SDAIR'94:Proceedings of the 3rd Annual Symposium on Document Analysis and Information Retrieval [C],1994:161-175.
    [171]J. E. Mason, M. Shepherd, J. Duffy, V. Keselj, and C. Watters. An n-gram based approach to multi-labeled web page genre classification [A]. In:Proceedings of the Hawaii International Conference on System Sciences [C],2010:1-10.
    [172]C. C. Yang and X. Shi. Discovering event evolution graphs from newswires [A]. In WWW'06:Proceedings of the 15th international conference on World Wide Web [C],2006:945-946.
    [173]C. C. Chen and M. C. Chen. TSCAN:a novel method for topic summarization and content anatomy [A]. In SIGIR'08:Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval [C],2008:579-586.
    [174]Y. Yang, T. Pierce, and J. Carbonell. A study of retrospective and online event detection [A]. In SIGIR'98:Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval [C], 1998:28-36.
    [175]Twitter API:http://twitter4j.org/en/index.jsp
    [176]J. Kleinberg. Bursty and hierarchical structure in streams [J]. Data Mining and Knowledge Discovery,2003,7(4):373-397.
    [177]I. Subasic and B. Berendt. Web mining for understanding stories through graph visualization [A]. In ICDM'08:Proceedings of the 8th IEEE International Conference on Data Mining [C],2008:570-579.
    [178]J. Allan, R. Papka, and V. Lavrenko. On-line new event detection and tracking [A]. In SIGIR'98:Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval [C],1998:37-45.
    [179]T. Brants, F. Chen, and A. Farahat. A system for new event detection [A]. In SIGIR'03:Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval [C],2003:330-337.
    [180]M. Connell, A. Feng, G. Kumaran, H. Raghavan, C. Shah, and J. Allan. Umass at TDT 2004 [A]. In Working Notes of the TDT-2004 Evaluation,2004.
    [181]C. Wang, M. Zhang, S. Ma, and L. Ru. Automatic online news issue construction in web environment [A]. In WWW'08:Proceeding of the 17th international conference on World Wide Web [C],2008:457-466.
    [182]K. Starbird, L. Palen, A. L. Hughes, and S. Vieweg. Chatter on the red:what hazards threat reveals about the social life of microblogged information [A]. In CSCW'10:Proceedings of the 2010 ACM conference on Computer supported cooperative work [C],2010:241-250.
    [183]H. Kwak, C. Lee, H. Park, and S. Moon. What is twitter, a social network or a news media? [A]. In WWW'10:Proceedings of the 19th international conference on World Wide Web [C],2010:591-600.
    [184]A. Java, X. Song, T. Finin, and B. Tseng. Why we twitter:understanding microblogging usage and communities [A]. In WebKDD/SNA-KDD'07: Proceedings of the 9th WebKDD and 1st SNA-KDD 2007 workshop on Web mining and social network analysis [C],2007:56-65.
    [185]N. Banerjee, D. Chakraborty, K. Dasgupta, S. Mittal, A. Joshi, S. Nagar, A. Rai, and S. Madan. User interests in social media sites:an exploration with micro-blogs [A]. In CIKM'09:Proceeding of the 18th ACM conference on Information and knowledge management [C],2009:1823-1826.
    [186]UCenter Home:http://u.discuz.net/
    [187]MySpace:http://en.wikipedia.org/wiki/Myspace
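    [188]Riyue Guanghua (日月光华) online community content push system:http://cscw.fudan.edu.cn/projects/ibbs/ibbs.html
    [189]Institute of Developmental Biology, Fudan University:http://idm.fudan.edu.cn/
    [190]N-gram-based pattern extraction algorithm:http://www.codeproject.com/KB/recipes/Patterns.aspx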
    [191]Flickr:www.flickr.com
    [192]MITBBS (未名空间) community:www.mitbbs.com
    [193]MITBBS (未名空间) community traffic statistics:http://www.yjbys.com/mqzl/243485.html
