Design and Implementation of a WWW-Based News Search Engine
Abstract
Quickly finding the news one wants in the vast sea of the Internet is a thorny problem. As part of a project on an information service system for high-speed trains, this thesis applies current key technologies to design a practical news search engine, so that passengers can dynamically receive timely news even while aboard a train travelling at high speed. This can effectively improve the quality of on-board service and raise the standard of China's high-speed trains.
    The thesis follows a line of discussion from general-purpose search engines to topic-specific search engines to news search engines, and describes the news search engine in detail. Having introduced news search engines by way of search engines in general, it proposes an efficient extraction algorithm for news search and gives a flow chart that clearly shows how the algorithm extracts content during a news search; this algorithm is one of the main contributions of the thesis. The feasibility of the algorithm is demonstrated through a program implementation, in which the author binds the algorithm into the program by means of dynamic link library (DLL) technology. To provide personalized service, the software updates news on a timer and actively presents up-to-date news to users, thereby realizing active information delivery.
    Building on the relevant search engine theory, the thesis designs and implements a news search engine; the software has been tested and achieves the expected design goals.
