Method of Sentence Similarity Calculation for Intelligent Customer Service
  • Original title (Chinese): 面向智能客服的句子相似度计算方法
  • Authors: JI Mingyu; WANG Chenlong; AN Xiang; MU Weiye
  • Affiliation: School of Information and Computer Engineering, Northeast Forestry University
  • Keywords: intelligent customer service; sentence similarity; word segmentation correction; word vector; recurrent neural network
  • Journal code: JSGG
  • Journal: Computer Engineering and Applications
  • Publication date: 2019-01-29 17:15
  • Publisher: Computer Engineering and Applications (计算机工程与应用)
  • Year: 2019
  • Issue: v.55; No.932
  • Funding: Fundamental Research Funds for the Central Universities (No.2572015CB32); National Natural Science Foundation of China, Young Scientists Fund (No.61801432)
  • Language: Chinese
  • Page: JSGG201913020
  • Page count: 6
  • CN: 13
  • Classification number: 129-134
Abstract
This paper studies a sentence similarity calculation method for intelligent customer service in the financial domain. First, a part-of-speech-based word segmentation correction model reduces segmentation errors on ambiguous Chinese words and finance-related vocabulary. Next, word vectors and a recurrent neural network extract word-level and sentence-level semantic features, respectively, and produce sentence vectors. A merge layer then computes difference features between the sentence vectors. Finally, dimension reduction and normalization of these difference features yield the sentence similarity score. Experimental results show that the method achieves high accuracy and F1 score.
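The pipeline described in the abstract (word vectors, recurrent encoder, merge layer, reduction to a normalized score) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not specify the recurrent cell, the merge operation, or the reduction step, so the plain tanh recurrent cell, the absolute-difference-plus-product merge, and the single linear projection followed by a sigmoid are all assumptions. The names make_params, encode, merge, and sentence_similarity are hypothetical, the weights are random and untrained, and the input is assumed to be already segmented (the segmentation correction step is not shown).

```python
import math
import random


def make_params(vocab, emb_dim=4, hid_dim=3, seed=0):
    """Randomly initialise untrained toy parameters (illustration only)."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    return {
        "emb": {w: [rng.uniform(-1, 1) for _ in range(emb_dim)] for w in vocab},
        "w_in": mat(hid_dim, emb_dim),
        "w_rec": mat(hid_dim, hid_dim),
        "w_out": [rng.uniform(-0.5, 0.5) for _ in range(2 * hid_dim)],
        "b_out": 0.0,
    }


def encode(tokens, p):
    """Run a simple tanh recurrent cell over the word vectors.

    The final hidden state serves as the sentence vector. The cell type
    is an assumption; the abstract only says "recurrent neural network".
    """
    h = [0.0] * len(p["w_rec"])
    for tok in tokens:
        x = p["emb"][tok]
        h = [
            math.tanh(
                sum(wi * xi for wi, xi in zip(p["w_in"][i], x))
                + sum(wr * hk for wr, hk in zip(p["w_rec"][i], h))
            )
            for i in range(len(h))
        ]
    return h


def merge(u, v):
    """Assumed merge layer: absolute difference and element-wise product."""
    return [abs(a - b) for a, b in zip(u, v)] + [a * b for a, b in zip(u, v)]


def sentence_similarity(tokens_a, tokens_b, p):
    """Project the difference features to a scalar and squash it into (0, 1)."""
    feats = merge(encode(tokens_a, p), encode(tokens_b, p))
    z = sum(w * f for w, f in zip(p["w_out"], feats)) + p["b_out"]
    return 1.0 / (1.0 + math.exp(-z))


# Toy, pre-segmented example sentences (hypothetical data).
vocab = ["how", "do", "i", "repay", "my", "loan", "credit", "card"]
params = make_params(vocab)
a = ["how", "do", "i", "repay", "my", "loan"]
b = ["how", "do", "i", "repay", "my", "credit", "card"]
score = sentence_similarity(a, b, params)
print(round(score, 3))
```

Because the weights here are random, the printed score carries no meaning; in a trained model the embeddings, recurrent weights, and output projection would be learned from labeled sentence pairs. Both sentences pass through the same encoder with shared parameters, so the score is symmetric in its two inputs.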
