Vietnamese-English Cross-Language Information Retrieval Based on Association Rule Consequent Expansion
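  • English title: Vietnamese-English Cross Language Information Retrieval Model Based on Association Rule Consequent Expansion
  • Authors: 黄武锋; 何冬蕾; 黄名选
  • Authors (English): HUANG Wu-feng; HE Dong-lei; HUANG Ming-xuan; School of Information and Statistics, Guangxi University of Finance and Economics
  • Keywords: all-weighted pattern mining; query expansion; cross-language information retrieval; information retrieval
  • English keywords: all-weighted patterns mining; query expansion; cross language information retrieval; information retrieval
  • Journal (Chinese name): 计算机技术与发展
  • Journal (English name): Computer Technology and Development
  • Institution: School of Information and Statistics, Guangxi University of Finance and Economics
  • Publication date: 2018-12-20 15:19
  • Publisher: Computer Technology and Development
  • Year: 2019
  • Issue: 04
  • Funding: National Natural Science Foundation of China (61762006, 61262028)
  • Language: Chinese
  • Pages: 170-174+180
  • Page count: 6
  • CN: 61-1450/TP
  • ISSN: 1673-629X
  • Classification number: TP311.13; TP391.3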
Abstract
To address query topic drift in cross-language information retrieval, this paper proposes a Vietnamese-English cross-language information retrieval model based on consequent expansion with all-weighted association rules. It presents the model's structure and functional modules and describes its key techniques and algorithms in detail. The model combines all-weighted pattern mining with user relevance feedback expansion. A Vietnamese query is translated into English by a machine translation system and used to retrieve English documents; the top-ranked documents from this initial retrieval form the user relevance feedback document set. All-weighted association rule mining is then applied to this set to find association rules related to the original query. The consequents of these rules serve as expansion terms, which are combined with the original query into a new query that is run against the English documents again to produce the final results. Experiments on the NTCIR-5 CLIR data set show that the model reduces query drift in Vietnamese-English cross-language retrieval and improves retrieval performance.
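The expansion step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: this record does not give the paper's all-weighted support and confidence formulas, so the weighted measures below are simplified assumptions, and the function names `mine_expansion_terms` and `expand_query`, the thresholds, and the `{term: weight}` document representation are all hypothetical. Translation and the two retrieval passes are left out.

```python
from collections import defaultdict


def mine_expansion_terms(feedback_docs, query_terms, min_support, min_confidence):
    """Return consequents of rules `query term -> candidate term`, best first.

    feedback_docs: list of {term: weight} dicts, one per feedback document.
    The support and confidence used here are simplified assumptions, not
    the measures defined in the paper.
    """
    query_terms = set(query_terms)
    # Total weight of every term in the feedback set, used as the normaliser.
    total_weight = sum(sum(doc.values()) for doc in feedback_docs)
    if total_weight == 0:
        return []

    single_weight = defaultdict(float)  # weight mass of each query term
    pair_weight = defaultdict(float)    # weight mass of each (query term, candidate) pair
    for doc in feedback_docs:
        for q in query_terms & doc.keys():
            single_weight[q] += doc[q]
            for term, weight in doc.items():
                if term not in query_terms:
                    pair_weight[(q, term)] += doc[q] + weight

    best_confidence = {}
    for (q, term), weight in pair_weight.items():
        antecedent_support = single_weight[q] / total_weight
        if antecedent_support == 0:
            continue
        support = weight / (total_weight * 2)  # 2 = number of items in the pair
        confidence = support / antecedent_support
        if support >= min_support and confidence >= min_confidence:
            best_confidence[term] = max(best_confidence.get(term, 0.0), confidence)

    # Highest confidence first; ties broken alphabetically for a stable order.
    return sorted(best_confidence, key=lambda t: (-best_confidence[t], t))


def expand_query(query_terms, expansion_terms):
    """Append the rule consequents to the original query, skipping duplicates."""
    return list(query_terms) + [t for t in expansion_terms if t not in query_terms]
```

In this sketch only the rule consequents are added to the query, and the thresholds decide how aggressively it is expanded. Raising `min_support` or `min_confidence` admits fewer expansion terms, which is the lever such a model would use to limit drift.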
