Research and Application of Meta Search Engine Technology
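Abstract
A meta search engine is a search engine built on top of other search engines. It queries several component search engines at once, then merges and reprocesses the results they return before presenting them to the user. Meta search engines are currently an active research area in academia.
     This thesis first surveys the development and search principles of search engines and meta search engines. It then studies and analyzes several key meta search engine technologies, including component engine scheduling, search result merging, and personalized service, and on that basis proposes the algorithms designed in this thesis. The main research work is as follows:
     (1) An analysis of component engine scheduling algorithms, on the basis of which a scheduling algorithm suited to the characteristics of the component engines used in this thesis is proposed.
     (2) Tracking of users' search behavior (both implicit click-throughs and explicit votes), with analysis of that behavior used to update the user model dynamically. The user model provides the basis for scheduling the component search engines and for merging and ranking search results.
     (3) A search result merging algorithm based on user behavior. It computes the ranking value of each search result from the analysis of user behavior, yielding results and rankings that are close to the user's preferences.
     Finally, this thesis designs a meta search engine built on the analysis of users' search behavior. Compared with other meta search engines, it has a friendly user interface and gives users a shortcut for quickly viewing web page content. Because component engine scheduling and result merging are both driven by user behavior analysis, it also matches users' search engine preferences more closely.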
A meta search engine is built on component search engines. It sends the user's query to a number of component search engines simultaneously, merges the result lists they return into a single ranked list, and presents the merged results to the user. Meta search engines have become an active area of research.
    First, the state of the art of traditional search engines and meta search engines is reviewed. Several key technologies of meta search engines are then analyzed, including the scheduling of component search engines, the merging of search results, and personalized service. Based on this analysis, the algorithms of the meta search engine in this paper are proposed. The main work of this paper includes:
    (1) Analyzing the scheduling of component search engines and selecting a suitable scheduling algorithm based on the characteristics of the component search engines used in this paper.
    (2) Tracking users' behavior (including clicking and voting), analyzing that behavior, and continuously updating the user model. The user model provides the basis for component search engine scheduling and result merging.
    (3) Proposing a result merging algorithm based on users' behavior. It computes the rank value of each document for a user query and removes duplicate results, so that the merged results are close to the user's preferences.
    Finally, we design a meta search engine based on the analysis of users' behavior. Compared with existing meta search engines, it has a friendly user interface and provides a convenient way to preview the content of a web page quickly. Because its component search engine scheduling and result merging algorithms are both based on users' behavior, it is also better aligned with users' search engine preferences.
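    The three pieces of work listed above can be illustrated with a minimal sketch. It is not the thesis's algorithm, which the abstract does not specify. The class `UserModel`, the functions `normalize` and `merge`, the constants `CLICK_GAIN` and `VOTE_GAIN`, and the reciprocal-rank scoring are all assumptions chosen only to show how clicks and votes could feed engine scheduling and result merging.

```python
from collections import defaultdict

# Hypothetical constants: how strongly each signal shifts an engine's weight.
CLICK_GAIN = 0.1   # implicit signal: user clicked a result from this engine
VOTE_GAIN = 0.3    # explicit signal: user voted a result up (+1) or down (-1)
MIN_WEIGHT = 0.1   # floor so that no engine's weight drops to zero


class UserModel:
    """Per-user preference weight for each component engine (assumed design)."""

    def __init__(self, engines):
        self.weights = {name: 1.0 for name in engines}

    def record_click(self, engine):
        self.weights[engine] += CLICK_GAIN

    def record_vote(self, engine, vote):
        # vote is +1 or -1
        updated = self.weights[engine] + VOTE_GAIN * vote
        self.weights[engine] = max(MIN_WEIGHT, updated)

    def schedule(self, k):
        """Return the k engines this user currently prefers most."""
        return sorted(self.weights, key=self.weights.get, reverse=True)[:k]


def normalize(url):
    """Crude URL normalization so duplicates from different engines collapse."""
    return url.lower().rstrip("/")


def merge(results_by_engine, model):
    """Merge ranked URL lists into one list, removing duplicates.

    Each occurrence contributes engine_weight / (rank + 1) to the URL's score,
    so a result returned by several preferred engines rises in the ranking.
    """
    scores = defaultdict(float)
    for engine, urls in results_by_engine.items():
        for rank, url in enumerate(urls):
            scores[normalize(url)] += model.weights[engine] / (rank + 1)
    # Sort by descending score; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda u: (-scores[u], u))


model = UserModel(["A", "B", "C"])
model.record_click("B")
model.record_vote("B", +1)
model.record_vote("C", -1)
chosen = model.schedule(2)
merged = merge(
    {"A": ["http://x.org/", "http://y.org"],
     "B": ["http://y.org", "http://X.org"]},
    model,
)
```

    In this sketch the user model holds one weight per engine. A fuller model could keep weights per query topic, but the abstract does not say which granularity the thesis uses.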
