Research on a Question Classification Model for Wargaming-Oriented Question Answering Systems
  • English title: Research on Question Classification Model of Question Answering System Oriented WarGaming
  • Authors: 孙泽健 (SUN Zejian); 司光亚 (SI Guangya); 刘洋 (LIU Yang)
  • Author affiliation (English): Department of Information Operation & Command Training, NDU
  • Keywords: Word2vec; WMD algorithm; wargaming; question answering system; question classification
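  • Chinese journal title: JSSG
  • English journal title: Computer & Digital Engineering
  • Institution: Department of Information Operation & Command Training, National Defense University
  • Publication date: 2019-02-20
  • Publisher: 计算机与数字工程 (Computer & Digital Engineering)
  • Year: 2019
  • Issue: v.47; No.352
  • Funding: Supported by the National Natural Science Foundation of China Joint Fund of the Major Research Plan for Military-Civilian Shared Use, "Research on the Complexity Mechanism and Effectiveness Evaluation Methods of Weapon and Equipment Systems-of-Systems Based on Simulation Big Data" (No. U1435218), and by the National Natural Science Foundation of China Youth Fund, "Research on Analysis Methods for Information-Based Operational Systems Based on Simulation Big Data" (No. 61403401)
  • Language: Chinese
  • Page: JSSG201902011
  • Number of pages: 7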
  • CN:02
  • ISSN:42-1372/TP
  • Classification number: 53-58+64
Abstract
By analyzing the questions that commonly arise during wargaming exercises, this paper designs a question classification model tailored to the wargaming setting. The model is statistical: it uses Word2vec to generate word vectors and the TextRank algorithm combined with IDF values to generate word weights, and the two together form the question representation. To balance computational complexity against the accuracy of the question similarity calculation, the final classification combines two different question similarity models through an improved KNN (K-Nearest Neighbor) algorithm. WMD (Word Mover's Distance) computes question similarity from word vectors with relatively high accuracy, but its computational complexity is high. The paper therefore uses the improved KNN algorithm to combine WMD with a traditional algorithm so that the required question classification task is completed more effectively.
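The abstract does not spell out how the improved KNN combines the two similarity models, so the following is only a minimal sketch of one plausible arrangement: a cheap cosine similarity over weighted sentence vectors shortlists candidates, and a more expensive word-level distance re-ranks the shortlist before voting. Everything named here is an assumption for illustration: the toy 2-d `VECTORS` stand in for Word2vec embeddings, `WEIGHTS` stand in for TextRank/IDF word weights, `LABELED` is invented data, and `relaxed_wmd` is the relaxed lower bound of WMD rather than the exact transport problem.

```python
import math
from collections import Counter

# Hypothetical toy data: 2-d stand-ins for Word2vec vectors and for
# TextRank/IDF word weights. None of these values come from the paper.
VECTORS = {
    "where": (1.0, 0.0), "location": (0.9, 0.1), "fleet": (0.5, 0.5),
    "when": (0.0, 1.0), "time": (0.1, 0.9), "attack": (0.4, 0.6),
}
WEIGHTS = {"where": 2.0, "location": 2.0, "when": 2.0, "time": 2.0,
           "fleet": 1.0, "attack": 1.0}

# Hypothetical labeled questions, already tokenized.
LABELED = [
    (["where", "fleet"], "location"),
    (["location", "attack"], "location"),
    (["when", "attack"], "time"),
    (["time", "fleet"], "time"),
]


def sentence_vector(tokens):
    """Weighted average of word vectors: the cheap question representation."""
    total = sum(WEIGHTS[t] for t in tokens)
    dims = len(VECTORS[tokens[0]])
    return tuple(
        sum(WEIGHTS[t] * VECTORS[t][d] for t in tokens) / total
        for d in range(dims)
    )


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def relaxed_wmd(a, b):
    """Relaxed WMD: every word sends all its mass to the nearest word on
    the other side; the larger of the two directions is returned.
    This is a lower bound on exact WMD, not WMD itself."""
    def one_way(src, dst):
        mass = 1.0 / len(src)  # uniform mass per token occurrence
        return sum(
            mass * min(math.dist(VECTORS[s], VECTORS[t]) for t in dst)
            for s in src
        )
    return max(one_way(a, b), one_way(b, a))


def classify(query, labeled, shortlist=3, k=1):
    """Two-stage KNN: shortlist by cosine, re-rank by relaxed WMD, vote."""
    q = sentence_vector(query)
    coarse = sorted(
        labeled,
        key=lambda item: cosine(q, sentence_vector(item[0])),
        reverse=True,
    )[:shortlist]
    fine = sorted(coarse, key=lambda item: relaxed_wmd(query, item[0]))[:k]
    return Counter(label for _, label in fine).most_common(1)[0][0]


print(classify(["where", "attack"], LABELED))  # location
```

The point of the shortlist is that the word-level distance is only evaluated on `shortlist` candidates instead of the whole labeled set, which is where the trade-off between complexity and accuracy described in the abstract would show up.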
