Research on Related Technologies in Personal Knowledge Network Based on Semantics

Original title: 基于语义构建个人知识网络相关技术研究
Author: Liu Mingyang (刘名扬)
Degree level: Doctoral
Discipline: Computer System Architecture
Keywords: knowledge units; semantic recommendation; mining; tags; search; ranking; clustering; discovery
Degree year: 2013
Advisor: Liu Shufen (刘淑芬)
Discipline code: 081201
Degree-granting institution: Jilin University (吉林大学)
Thesis submission date: 2013-06-01
Defense committee chair: Zhang Bin (张斌)

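Abstract

This thesis analyzes, and proposes improvements to, the technologies involved in building a semantics-based personal knowledge network. It approaches the problem from three perspectives: the discovery of knowledge units, the search and ranking of knowledge units, and the recommendation of knowledge units.
     For knowledge unit discovery, we propose a dynamic topic correlation model that analyzes knowledge units as they change over time. The model maps the term sets in the high-dimensional observation space to a low-dimensional latent topic space, which improves space utilization and narrows the user's target space. The knowledge units, decomposed into term sets, come from knowledge bases such as conference or journal proceedings. The dynamics of topics and correlations are obtained from temporal prior distributions, and a hierarchy is built through the latent correlation space. All variables, including terms, topics and correlations, exist dynamically across time periods. The proposed model is non-parametric and exhibits a faster convergence rate. The dynamic topic correlation model performs well at discovering the term probability distribution within a given topic and at predicting the trends of topics and their correlations. The reduced topic space also helps considerably in clustering knowledge units.
     For search and ranking, we propose collaborative search over knowledge units based on semantic context. The semantic context is used to annotate the user ontology and the knowledge domain ontology with concepts, and we show that this approach markedly improves search and ranking quality. The method also takes users' search habits into account in order to build semantic relations between users and to analyze interest similarity. We further propose clustering users by role, separating users who tag knowledge units from users who query them, to balance processing time against space usage.
     For recommendation, we propose integrating the domain knowledge ontology into the usage mining and recommendation process; this integration increases the ability to interpret the details of domain knowledge. The domain ontology is combined with user-provided tags to produce top-n recommendations. We also propose a semantics-based sequential pattern mining algorithm that reduces execution time and memory usage, and a Markov transition probability matrix that incorporates semantic information to predict knowledge units.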
The rapid development of the World Wide Web has brought an explosion of information. As the amount of information grows day by day, managing it effectively becomes essential. Using intelligent computational algorithms to discover new and useful information and knowledge has therefore attracted wide attention, especially in information retrieval and data mining, and has become an active research topic.
     This thesis concentrates on the issues that arise in constructing a personal knowledge network, covering the discovery, search, ranking and recommendation of knowledge in a specific domain. Knowledge sources include text, audio and video data; we focus on text. The research problems can be regarded as text mining, a branch of information retrieval that includes many interesting and challenging problems and applications. In general, text mining means discovering useful patterns, structures and other valuable information in unstructured natural language text. For knowledge discovery, we are mainly concerned with topic discovery. Since the emergence of latent semantic analysis, topic analysis has become a popular research area among scholars in computer science and statistics. The basic idea of topic analysis is to work with collections of topics instead of collections of knowledge units. Each topic consists of terms that follow a probability distribution, so a large, high-dimensional collection of terms can be transformed into a lower-dimensional, representative collection of topics. We propose a dynamic topic correlation model to analyze the topics of knowledge units over time. The model is inspired by the hierarchical Gaussian process latent variable model. It maps the high-dimensional observed space of terms to a lower-dimensional latent space of topics. The model assumes that terms are not exchangeable, and all variables exist dynamically at different time points. This non-parametric model converges faster than other models. Posterior inference over topics and correlations in the dynamic topic correlation model helps to discover how term frequencies within a given topic change over time and to predict trends in the topics and the relations among them. The lower-dimensional topic space also improves the classification performance of knowledge units.
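The idea of replacing a long term vector with a short vector of topic weights can be illustrated with a minimal sketch. This is not the dynamic topic correlation model itself: it assumes the topic-term distributions are already known and fixed, and it estimates the topic proportions of one knowledge unit with a simple EM loop in the style of PLSA folding-in. The function name `topic_proportions`, the smoothing constant and the toy topics are all hypothetical.

```python
def topic_proportions(term_counts, topics, iterations=50):
    """Estimate topic mixture weights for one knowledge unit.

    term_counts: dict mapping term -> count in the knowledge unit.
    topics: list of dicts, each mapping term -> probability under that topic
            (assumed known and fixed; an assumption of this sketch).
    Returns a list of weights, one per topic, summing to 1.
    """
    k = len(topics)
    total = sum(term_counts.values())
    if total == 0:
        return [1.0 / k] * k  # nothing observed: fall back to uniform weights
    theta = [1.0 / k] * k
    for _ in range(iterations):
        new_theta = [0.0] * k
        for term, count in term_counts.items():
            # E-step: share of this term attributed to each topic.
            # 1e-12 is an illustrative floor for terms a topic does not contain.
            weights = [theta[j] * topics[j].get(term, 1e-12) for j in range(k)]
            norm = sum(weights)
            for j in range(k):
                new_theta[j] += count * weights[j] / norm
        # M-step: renormalize the expected counts into proportions.
        theta = [value / total for value in new_theta]
    return theta


# Toy example: a four-term vocabulary reduced to two topic weights.
toy_topics = [
    {"gaussian": 0.5, "latent": 0.4, "query": 0.1},
    {"query": 0.6, "ranking": 0.3, "latent": 0.1},
]
print(topic_proportions({"gaussian": 3, "latent": 2}, toy_topics))
```

However large the vocabulary, the knowledge unit is then represented by as many numbers as there are topics, which is the dimensionality reduction that clustering and classification benefit from.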
     In a personal knowledge network, users can build their own information base, establish relationships with each other, and store personal preferences in individual profiles. Users interact with each other collaboratively within the knowledge network. Basic personal information, behavioural preferences and information about correlated similar users are represented as concepts. Relations among concepts are described by ontologies in order to strengthen the semantic relations among users. The ontologies are further divided into personal ontologies and knowledge domain ontologies. For knowledge domain ontologies we mostly rely on existing ontology bases. Personal ontologies are constructed and annotated mainly by knowledge experts and knowledge workers. As a result, rich semantic information about users either already exists or can be derived, and it can be used collaboratively to improve users' online experience. We also need a context-aware data management mechanism to support user-centric data analysis. We describe the goals and challenges of a collaborative knowledge network and propose knowledge-based collaborative search, which assigns scores to knowledge units. The scoring method is derived from the collaborative knowledge network. In addition, we describe a top-k processing algorithm and consider how to balance query time against space usage. Our approach applies tags to knowledge units to further improve the efficiency of search and ranking.
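A small sketch may help make tag-based, collaborative top-k search concrete. The abstract does not give the scoring function, so the one below is an assumption: a knowledge unit scores one point for every user in the seeker's network who tagged it with a query tag. The names `top_k_units`, `taggings` and `network` are hypothetical.

```python
import heapq
from collections import defaultdict


def top_k_units(query_tags, taggings, network, k):
    """Return the k highest-scoring (unit, score) pairs for a seeker.

    taggings: iterable of (user, unit, tag) triples.
    network: set of users related to the seeker.
    Score (assumed for illustration): number of taggings of the unit with a
    query tag made by users inside the seeker's network.
    """
    wanted = set(query_tags)
    scores = defaultdict(int)
    for user, unit, tag in taggings:
        if user in network and tag in wanted:
            scores[unit] += 1
    # heapq.nlargest selects the top k without fully sorting all candidates.
    # Ties on score are broken by unit name so the result is deterministic.
    return heapq.nlargest(k, scores.items(), key=lambda item: (item[1], item[0]))


taggings = [
    ("ann", "u1", "lda"),
    ("bob", "u1", "lda"),
    ("bob", "u2", "lda"),
    ("eve", "u3", "lda"),  # eve is outside the seeker's network
    ("ann", "u2", "gp"),   # tag does not match the query
]
print(top_k_units(["lda"], taggings, {"ann", "bob"}, k=2))
# prints [('u1', 2), ('u2', 1)]
```

Computing scores at query time, as here, keeps storage low but costs time per query; precomputing scores per user or per cluster of users shifts the cost to storage, which is the time versus space trade-off the paragraph above refers to.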
     For the recommendation of knowledge units, we propose a semantic recommendation method that integrates the domain ontology with usage mining. It substantially increases the efficiency of the search process and saves time when locating knowledge units in the knowledge network. We model users' latent interests by mining their daily usage logs and calculating the proportion of the knowledge unit collection taken up by the knowledge units of interest, and then recommend the next target knowledge units to save users' time. Semantic recommendation uses semantic distance, which combines semantic information to enrich the usage log. The semantic distance matrix works together with the transition probability matrix obtained from a Markov model, and semantic sequential pattern mining is combined with the Markov model in the recommendation process. Finally, we propose using a vector space model to construct a knowledge unit probability and correlation matrix, which is combined with user-provided tags to produce top-n recommendations.
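The interplay of a Markov transition matrix and a semantic distance matrix can be sketched as follows. The abstract does not state how the two are combined, so the rule used here is an assumption: each first-order transition probability is multiplied by (1 - semantic distance), with distances in [0, 1] and a missing distance treated as 0.5. The function names and the toy usage log are hypothetical.

```python
from collections import Counter, defaultdict


def transition_probabilities(sessions):
    """Estimate first-order Markov transition probabilities from usage sessions."""
    counts = defaultdict(Counter)
    for session in sessions:
        for current, following in zip(session, session[1:]):
            counts[current][following] += 1
    return {
        state: {nxt: c / sum(row.values()) for nxt, c in row.items()}
        for state, row in counts.items()
    }


def recommend(current, transitions, distance, n):
    """Rank candidate next knowledge units for the current one.

    distance: dict mapping (unit, unit) -> semantic distance in [0, 1].
    Combination rule (assumed): probability * (1 - distance);
    unknown distances default to 0.5.
    """
    row = transitions.get(current, {})
    scored = {
        nxt: p * (1.0 - distance.get((current, nxt), 0.5))
        for nxt, p in row.items()
    }
    # Sort by descending score, then by name for deterministic ties.
    return sorted(scored, key=lambda unit: (-scored[unit], unit))[:n]


sessions = [["a", "b", "c"], ["a", "b", "d"], ["a", "c"]]
transitions = transition_probabilities(sessions)
semantic_distance = {("b", "c"): 0.8, ("b", "d"): 0.2}
print(recommend("b", transitions, semantic_distance, n=1))
# prints ['d']
```

In the usage log alone, "c" and "d" are equally likely to follow "b"; the semantic distance breaks the tie in favour of the semantically closer unit, which is the kind of enrichment the paragraph above describes.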

