Applied Research on Information Filtering Methods Based on Web Information Retrieval Technology
Abstract
Using web search engines to find needed information has become a routine activity for most Internet users. The most frequently used are the well-known general-purpose search sites such as GOOGLE, Baidu, and Yahoo, whose strong search capabilities keep them near the top of global website traffic rankings. Users also turn to these sites for reference information when they need to settle questions whose answers seem plausible but uncertain. Obtaining sufficient information from the Internet has become an indispensable part of online activity, and web information retrieval and filtering have become basic technologies that users cannot do without.
     With a worldwide user population numbering in the hundreds of millions, the needs of individual users for retrieving information and obtaining content are inevitably varied and diverse. General-purpose search sites such as GOOGLE, Baidu, and Yahoo continually improve their search technology and capabilities in response to changing user needs, which raises their visibility and usefulness and brings steady growth in traffic. However improved, their guiding technical principle is necessarily to find what the needs of many users have in common, that is, to start from a macro-level view, and so they cannot attend to the individual needs of particular users. As a result, specialized search software with narrow coverage but sufficient search depth has emerged, for example 随心信息搜索软件 (Suixin Information Search Software), 网络信息采集专家 (Network Information Collection Expert), 企业信息搜索王 (Enterprise Information Search King), 小蜜蜂采集器 (Little Bee Collector), and 火车采集器 (Train Collector).
     This thesis addresses such individual needs. Starting from the online difficulties faced by certain highly specialized professional positions, it applies standalone rather than general-purpose information collection technology (software) to make the retrieval, filtering, and capture of specialized web information fast, accurate, comprehensive, and automated.
