As a method of knowledge discovery, data mining is widely used and is among the most active areas of database research. Web mining applies traditional data mining techniques to extract information and knowledge from the Web environment. Web usage mining is the most widely used form of web mining, with applications in e-commerce, internet advertising, intelligent recommendation systems, internet marketing, and intelligent decision support. A good mining model is the key to successful web usage mining, and this dissertation investigates such a model.

Building on existing theory and results in web user access information mining, the dissertation improves and implements several methods and algorithms. It designs a database to represent the corresponding data, constructs a database-based Web user access information mining system model, and implements several functional modules.

Data preprocessing prepares the data for web mining. The dissertation implements data cleaning in SQL Server 2000 and introduces a data cleaning method that detects crawler requests by character matching. In the user identification phase, a method based on cookies, IP addresses, and user agents is used. The dissertation gives concrete algorithms for session identification and transaction identification, the latter using the maximum forward path.

Pattern discovery is the core of web mining. The dissertation first constructs a data representation of the user access interest dimension, uses a concept hierarchy to generalize the page data, derives a data set suitable for BP (backpropagation) networks, and finally uses a BP network to construct a classifier. It then introduces and implements algorithms for finding frequent access paths based on association rules and sequential patterns. Finally, it develops an extensible and practical Matlab routine that computes the relation matrix and performs statistical analysis.

On the basis of this work, the dissertation presents a database-based Web mining system model and describes and analyzes each module. In this model all operations are performed against the database, and every discovered pattern is stored in the database so that patterns can be managed and applied easily. The dissertation applies web user access information mining to Shanghai agriculture information and finds several useful patterns. The experimental data show that the web user access information mining system is practical and effective.

The dissertation uses SQL Server 2000 as the database system and SQL statements to implement data preprocessing; all functions are developed in C++ and Matlab. Web user access information mining is a widely used web mining technique. It can reveal users' interests, help improve site structure, support customized services and better marketing policies, and recommend content and predict user behavior. The model given in this dissertation is applicable in practice, and the research has both theoretical importance and practical value for web user access information mining.
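The session and transaction identification steps mentioned above can be illustrated with a minimal Python sketch. It is not the dissertation's C++ or SQL implementation: the function names, the (timestamp, page) input format, the 30-minute session timeout, and the handling of page reloads are all assumptions made for illustration. Sessions are split wherever the gap between two consecutive requests exceeds the timeout, and each session is then cut into transactions by emitting the current forward path whenever the user moves back to a page already on that path.

```python
from typing import List, Tuple


def split_sessions(requests: List[Tuple[float, str]],
                   timeout: float = 1800.0) -> List[List[str]]:
    """Split one user's time-ordered (timestamp_seconds, page) requests
    into sessions. The 30-minute timeout is an assumed heuristic."""
    sessions: List[List[str]] = []
    last_time = None
    for timestamp, page in requests:
        # Start a new session on the first request or after a long gap.
        if last_time is None or timestamp - last_time > timeout:
            sessions.append([])
        sessions[-1].append(page)
        last_time = timestamp
    return sessions


def maximum_forward_paths(session: List[str]) -> List[List[str]]:
    """Cut one session into transactions using maximum forward paths."""
    paths: List[List[str]] = []
    current: List[str] = []      # pages on the path currently being walked
    moving_forward = False       # True if the previous step added a new page
    for page in session:
        if current and page == current[-1]:
            # Assumption: a repeated request for the same page is a reload.
            continue
        if page in current:
            # Backward move: the path walked so far is maximal, so emit it
            # once, then rewind the path to the revisited page.
            if moving_forward:
                paths.append(list(current))
            del current[current.index(page) + 1:]
            moving_forward = False
        else:
            current.append(page)
            moving_forward = True
    # The session may end while still moving forward.
    if moving_forward and current:
        paths.append(current)
    return paths


if __name__ == "__main__":
    log = [(0.0, "A"), (60.0, "B"), (120.0, "C"), (180.0, "B"),
           (240.0, "D"), (4000.0, "A"), (4060.0, "E")]
    for pages in split_sessions(log):
        print(pages, "->", maximum_forward_paths(pages))
```

In this sketch the first five requests form one session with the forward paths A-B-C and A-B-D, while the request at 4000 seconds starts a second session because the gap exceeds the assumed timeout.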