A Real-Time Inference Model for International Business Based on Word Vectors
  • Published English title: Real-time inference model of international business based on word2vec
  • Authors: Zhang Shikun (张轼坤); Shen Feng (沈峰); Gao Liening (高列宁); Zhou Yunkang (周云康)
  • Affiliations: Software Development Center of Bank of Communications (Shanghai); College of Economics, Wuhan University of Technology
  • Keywords: SWIFT message; word2vec; GloVe; international customs and practice; inference model
  • Journal: Information Technology and Network Security (信息技术与网络安全), journal code WXJY
  • Publication date: 2019-05-10
  • Publisher: Information Technology and Network Security
  • Year: 2019
  • Volume/No.: v.38; No.505
  • Funding: Bank of Communications Domestic Trade Services Continued Development Project (M201410184); Bank of Communications Overseas Trade Services Continued Development Project (M201801322); Bank of Communications Hong Kong Trade Services Continued Development Project (M201801321)
  • Language: Chinese
  • Article ID: WXJY201905018
  • Page count: 7
  • Issue: 05
  • CN: 10-1543/TP
  • Pages: 89-95
Abstract
To address the inefficiency of manual SWIFT message analysis in existing international business operations, this paper proposes a real-time inference model for international business based on word vectors. Using TF-IDF values as a measure of word importance, the model computes candidate word sets in real time from message corpora such as MT700, MT710, MT720, and MT730. The GloVe algorithm generates word vectors for SWIFT messages, and a seq2seq model with an attention mechanism learns to produce message summaries. A word vector model, BITS2vec, is trained on 50,000 corpus entries drawn from the international customs and practice documents UCP600, ISBP745, URC522, URR725, URDG758, and ISP98; once the candidate word set is fed in, it automatically generates reference information on international customs and practice. The inference model was applied to real-time SWIFT message analysis in an international business system. The test results indicate that the model can adapt to the characteristics of the current SWIFT message corpus, provide relevant message summaries and international customs and practice reference information, and improve the quality and efficiency of SWIFT message processing.
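The candidate-word step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact TF-IDF variant or tokenization the authors used, so this example assumes the common form tf × log(N/df) over whitespace-split tokens. The function name `tfidf_candidates`, the `top_k` parameter, and the message fragments are hypothetical and exist only for illustration; they are not taken from the paper or from real SWIFT data.

```python
import math
from collections import Counter


def tfidf_candidates(docs, top_k=3):
    """Return the top_k highest-scoring TF-IDF terms for each tokenized document.

    Assumption: score = (term count / document length) * log(N / document frequency).
    The paper's actual weighting scheme is not stated in the abstract.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    results = []
    for doc in docs:
        if not doc:
            # An empty document has no candidate words.
            results.append([])
            continue
        tf = Counter(doc)
        scores = {
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        }
        # Highest score first; ties broken alphabetically for determinism.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        results.append(ranked[:top_k])
    return results


# Made-up message fragments for illustration only (not real SWIFT content).
messages = [
    "documentary credit issued subject to ucp latest version".split(),
    "advice of documentary credit confirmation added".split(),
    "transfer of documentary credit to second beneficiary".split(),
]
candidates = tfidf_candidates(messages, top_k=3)
```

Terms that occur in every message, such as "documentary" and "credit" here, receive a score of zero and drop out of the candidate sets, which is the property that makes TF-IDF useful for picking out message-specific vocabulary.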
