A Semantic Query Extension Method for Enterprise Information Retrieval (面向企业信息检索的语义扩展查询方法)
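  • English title: A Semantic Query Extension Method for Enterprise Information Retrieval
  • Authors: Geng Shuang (耿爽); Yang Chen (杨辰); Niu Ben (牛奔); Yi Wenjie (蚁文洁); Liu Lei (刘雷)
  • Affiliation: College of Management, Shenzhen University
  • Keywords: enterprise text retrieval; query extension; knowledge domain classification; query classification
  • Journal code: QBXB
  • Journal (English): Journal of the China Society for Scientific and Technical Information
  • Publication date: 2019-07-24
  • Publisher: 情报学报 (Journal of the China Society for Scientific and Technical Information)
  • Year: 2019
  • Issue: v.38
  • Funding: National Natural Science Foundation of China, Young Scientists Fund project "Research on Expert Team Recommendation Based on Mining Research Social Networks" (71701134); Ministry of Education Humanities and Social Sciences Research Fund project "Research on Collaborator Recommendation Based on Online Research Social Platforms" (16YJC630153); Natural Science Foundation of Guangdong Province, Doctoral Startup Project (2017A030310427)
  • Language: Chinese
  • Page: QBXB201907009
  • Page count: 8
  • CN: 07
  • ISSN: 11-2257/G3
  • Classification number: 80-87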
Abstract
Conventional information retrieval methods tend to achieve low precision when deployed inside an enterprise, and supervised approaches suffer from a shortage of training data. To address both problems, this study proposes a query expansion method based on enterprise knowledge domain categories and semantic relevance. The method first models the enterprise document repository with a topic model. It then combines the result with expert opinion to build an enterprise knowledge classification, together with a weighted set of description terms for each category. Finally, it classifies each query into a knowledge domain category by semantic similarity and selects expansion terms from that category's description terms. Experiments on real data from an electronics manufacturing company show that the expanded queries reflect users' information needs more accurately and improve the precision of enterprise information retrieval.
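The classify-then-expand step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The category names, description terms, and weights are invented for illustration, and a plain cosine similarity over term weights stands in for the paper's semantic similarity measure, which the abstract does not specify. The function names `classify_query` and `expand_query` are likewise hypothetical.

```python
import math
from collections import Counter

# Hypothetical knowledge-domain categories with weighted description terms.
# In the paper these come from a topic model plus expert review; the names
# and weights below are invented purely for illustration.
CATEGORY_TERMS = {
    "quality": {"defect": 0.9, "inspection": 0.7, "solder": 0.6, "yield": 0.5},
    "procurement": {"supplier": 0.9, "purchase": 0.8, "contract": 0.6, "price": 0.5},
}


def cosine(a, b):
    """Cosine similarity between two sparse term-weight mappings."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify_query(query, categories=CATEGORY_TERMS):
    """Return (best_category, score), or (None, 0.0) if no category matches."""
    q_vec = Counter(query.lower().split())
    scores = {name: cosine(q_vec, terms) for name, terms in categories.items()}
    best = max(scores, key=scores.get)
    return (best, scores[best]) if scores[best] > 0 else (None, 0.0)


def expand_query(query, k=2, categories=CATEGORY_TERMS):
    """Append the k highest-weighted unused terms of the matched category."""
    tokens = query.lower().split()
    category, _ = classify_query(query, categories)
    if category is None:
        return tokens  # no matching domain: leave the query unchanged
    ranked = sorted(categories[category].items(), key=lambda kv: kv[1], reverse=True)
    extra = [term for term, _ in ranked if term not in tokens][:k]
    return tokens + extra


print(expand_query("solder defect report"))
```

In this toy setup the query "solder defect report" falls into the invented "quality" category and gains the two highest-weighted description terms it does not already contain. A real system would presumably replace the overlap-based cosine with an embedding-based or thesaurus-based similarity so that queries sharing no literal term with a category can still be classified.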
