Design and Implementation of a Personalized Service System Based on the Web and Data Mining
Abstract
The shift from a "revolution in information means" to a "revolution in information content" has triggered a worldwide wave of digital library construction. To fulfil its basic function of serving users, a digital library should position itself as an information service, taking networked library services as both the starting point and the ultimate goal of system design. It must satisfy users' general needs and also tailor its resources and services to user types and demand characteristics, so that it can develop toward an intelligent knowledge network. Personalized service is one way to address this problem.
    The personalized service system described in this paper lets registered users customize the library's digital resources. It applies an association rule mining algorithm to the transaction database of customized resources to discover the combination patterns in what registered users customize, that is, association rules, and uses these rules to make customization recommendations and provide personalized service. The system also compiles statistics on registered users' access history to guide librarians in deciding which resources to collect.
    The paper describes in detail the design and implementation of the personalized service system of the Beijing University of Technology library. It surveys the main technologies on which personalized service currently depends, and analyzes, compares, and selects among the methods each technology offers. It then discusses the key technologies used in implementing the system, focusing on the FP-growth algorithm proposed by Professor Jiawei Han: its principle, its advantages, and its implementation. The scalability and efficiency of the system are also discussed. Finally, the paper demonstrates the system with a worked example, proposes improvements to it, and looks ahead to the future development of personalized service.
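The mining-and-recommendation pipeline summarized in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation, which this record does not reproduce. The resource names, the thresholds, and the function names `build_tree`, `fp_growth`, `rules`, and `recommend` are all hypothetical. The sketch builds an FP-tree from users' customization transactions, mines frequent itemsets by recursing on conditional pattern bases (so no candidate itemsets are generated), derives association rules by confidence, and recommends resources a user has not yet customized.

```python
from collections import defaultdict
from itertools import combinations


class Node:
    """One FP-tree node: an item, its count, its parent and its children."""
    def __init__(self, item, parent):
        self.item, self.parent, self.count, self.children = item, parent, 0, {}


def build_tree(weighted_transactions, min_count):
    """Build an FP-tree; return (header table item -> nodes, frequent item counts)."""
    support = defaultdict(int)
    for items, weight in weighted_transactions:
        for item in set(items):
            support[item] += weight
    frequent = {i: c for i, c in support.items() if c >= min_count}
    root, header = Node(None, None), defaultdict(list)
    for items, weight in weighted_transactions:
        # Keep frequent items only, ordered by descending support (ties by name).
        ordered = sorted((i for i in set(items) if i in frequent),
                         key=lambda i: (-frequent[i], i))
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = node.children[item] = Node(item, node)
                header[item].append(child)
            child.count += weight
            node = child
    return header, frequent


def fp_growth(weighted_transactions, min_count, suffix=frozenset()):
    """Yield (itemset, count) for every frequent itemset."""
    header, frequent = build_tree(weighted_transactions, min_count)
    for item, count in frequent.items():
        itemset = suffix | {item}
        yield itemset, count
        # Conditional pattern base: the prefix paths that lead to `item`.
        base = []
        for node in header[item]:
            path, parent = [], node.parent
            while parent is not None and parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                base.append((path, node.count))
        yield from fp_growth(base, min_count, itemset)


def rules(itemsets, min_confidence):
    """Derive rules A -> B with confidence = count(A and B) / count(A)."""
    out = []
    for itemset, count in itemsets.items():
        for size in range(1, len(itemset)):
            for lhs in map(frozenset, combinations(sorted(itemset), size)):
                confidence = count / itemsets[lhs]
                if confidence >= min_confidence:
                    out.append((lhs, itemset - lhs, confidence))
    return out


def recommend(customized, rule_list):
    """Rank resources implied by rules whose antecedent the user already has."""
    scores = {}
    for lhs, rhs, confidence in rule_list:
        if lhs <= customized:
            for item in rhs - customized:
                scores[item] = max(scores.get(item, 0.0), confidence)
    return sorted(scores, key=lambda i: (-scores[i], i))


# Hypothetical data: each transaction is the set of resources one user customized.
transactions = [
    ["IEEE", "ACM", "Springer"],
    ["IEEE", "ACM"],
    ["IEEE", "Springer"],
    ["ACM", "Springer"],
    ["IEEE", "ACM", "Elsevier"],
]
itemsets = dict(fp_growth([(t, 1) for t in transactions], min_count=2))
rule_list = rules(itemsets, min_confidence=0.6)
print(recommend({"IEEE"}, rule_list))
```

With this toy data, a user who has customized only "IEEE" would be offered "ACM", because three of the four users who chose "IEEE" also chose "ACM". The minimum support count and minimum confidence are illustrative values; a real deployment would tune them to the size of the transaction database.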
