Chinese Question Answer Generation Based on SGAN

Original title: 基于SGAN的中文问答生成研究
Authors: Shen Jie (沈杰); Qu Suichun (瞿遂春); Ren Fuji (任福继); Qiu Aibing (邱爱兵); Xu Yang (徐杨)
Author affiliations: School of Electrical Engineering, Nantong University; Faculty of Engineering, The University of Tokushima
Keywords: question answering system; sequence adversarial model; reinforcement learning; Actor-Critic policy gradient; evaluation metrics
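Journal code: JYRJ
Journal: Computer Applications and Software
Institutions: School of Electrical Engineering, Nantong University (南通大学电气工程学院); Faculty of Advanced Science and Technology, The University of Tokushima (德岛大学先端科学技术部)
Publication date: 2019-02-12
Publisher: Computer Applications and Software (计算机应用与软件)
Year: 2019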
Volume: 36
Funding: National Natural Science Foundation of China (61473159)
Language: Chinese
Record ID: JYRJ201902036
Page count: 6
Issue: 02
CN: 31-1260/TP
Pages: 200-205

Abstract

Generative adversarial networks (GANs) are directly applicable only to continuous data, and the training of Chinese dialogue models is hampered by a lack of high-quality sample datasets. This paper studies question-answer generation for open-domain Chinese chit-chat. Because dialogue text is discrete data, the use of a standard GAN is limited. A new sequence generative adversarial network, SGAN (Sequence GAN), is therefore designed to address this problem. SGAN extends the GAN with a generator based on reinforcement learning, which allows it to handle sequence generation. The model is trained with an Actor-Critic policy gradient, and precision and recall are used as evaluation metrics. Experimental results show that the proposed dialogue sequence adversarial model can generate enough dialogue samples that are hard to tell apart from human-provided samples.

References

[1] Liu Qun. A survey of statistical machine translation[J]. Journal of Chinese Information Processing, 2003, 17(4): 2-13. (in Chinese)
[2] Sutskever I, Vinyals O, Le Q V. Sequence to sequence learning with neural networks[C]//Advances in neural information processing systems. 2014: 3104-3112.
[3] Rajpurkar P, Zhang J, Lopyrev K, et al. SQuAD: 100,000+ questions for machine comprehension of text[EB]. arXiv preprint arXiv:1606.05250, 2016.
[4] Reddy S, Chen D, Manning C D. CoQA: A conversational question answering challenge[EB]. arXiv preprint arXiv:1808.07042, 2018.
[5] Goodfellow I, Pouget-Abadie J, Mirza M, et al. Generative adversarial nets[C]//Advances in neural information processing systems. 2014: 2672-2680.
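[6] Wang Kunfeng, Gou Chao, Duan Yanjie, et al. Research progress and prospects of generative adversarial networks (GAN)[J]. Acta Automatica Sinica, 2017, 43(3): 321-332. (in Chinese)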
[7] Li J, Monroe W, Shi T, et al. Adversarial learning for neural dialogue generation[EB]. arXiv preprint arXiv:1701.06547, 2017.
[8] Grammatico S, Parise F, Colombino M, et al. Decentralized convergence to Nash equilibria in constrained deterministic mean field control[J]. IEEE Transactions on Automatic Control, 2016, 61(11): 3315-3329.
[9] Madry A, Makelov A, Schmidt L, et al. Towards deep learning models resistant to adversarial attacks[EB]. arXiv preprint arXiv:1706.06083, 2017.
[10] Tramèr F, Kurakin A, Papernot N, et al. Ensemble adversarial training: Attacks and defenses[EB]. arXiv preprint arXiv:1705.07204, 2017.
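[11] Tang Xianlun, Du Yiming, Liu Yuwei, et al. Image recognition method based on conditional deep convolutional generative adversarial networks[J]. Acta Automatica Sinica, 2018, 44(5): 855-864. (in Chinese)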
[12] Akhtar N, Mian A. Threat of adversarial attacks on deep learning in computer vision: A survey[EB]. arXiv preprint arXiv:1801.00553, 2018.
[13] Yu L, Zhang W, Wang J, et al. SeqGAN: Sequence generative adversarial nets with policy gradient[C]//The Thirty-First AAAI Conference on Artificial Intelligence (AAAI 2017). 2017: 2852-2858.
[14] Lillicrap T P, Hunt J J, Pritzel A, et al. Continuous control with deep reinforcement learning[EB]. arXiv preprint arXiv:1509.02971, 2015.
[15] Silver D, Lever G, Heess N, et al. Deterministic policy gradient algorithms[C]//Proceedings of the 31st International Conference on Machine Learning-Volume 32. 2014: I-387-I-395.
[16] Silver D, Huang A, Maddison C J, et al. Mastering the game of Go with deep neural networks and tree search[J]. Nature, 2016, 529(7587): 484-489.
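[17] Wang Kunfeng, Zuo Wangmeng, Tan Ying, et al. Generative adversarial networks: From generating data to creating intelligence[J]. Acta Automatica Sinica, 2018, 44(5): 769-774. (in Chinese)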
[18] Vinyals O, Le Q. A neural conversational model[EB]. arXiv preprint arXiv:1506.05869, 2015.
[19] Hu B, Lu Z, Li H, et al. Convolutional neural network architectures for matching natural language sentences[C]//Advances in neural information processing systems. 2014: 2042-2050.
[20] Konda V R, Tsitsiklis J N. Actor-critic algorithms[C]//Advances in neural information processing systems. 2000: 1008-1014.
[21] Pfau D, Vinyals O. Connecting generative adversarial networks and actor-critic methods[EB]. arXiv preprint arXiv:1610.01945, 2016.
[22] Wiseman S, Rush A M. Sequence-to-sequence learning as beam-search optimization[EB]. arXiv preprint arXiv:1606.02960, 2016.
[23] Lowe R, Pow N, Serban I, et al. The Ubuntu dialogue corpus: A large dataset for research in unstructured multi-turn dialogue systems[EB]. arXiv preprint arXiv:1506.08909, 2015.
[24] Kim Y. Convolutional neural networks for sentence classification[EB]. arXiv preprint arXiv:1408.5882, 2014.
[25] Papineni K, Roukos S, Ward T, et al. BLEU: a method for automatic evaluation of machine translation[C]//Proceedings of the 40th annual meeting on association for computational linguistics. Association for Computational Linguistics, 2002: 311-318.
[26] Kannan A, Kurach K, Ravi S, et al. Smart reply: Automated response suggestion for email[C]//Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2016: 955-964.
[27] Kannan A, Vinyals O. Adversarial evaluation of dialogue models[EB]. arXiv preprint arXiv:1701.08198, 2017.
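[28] Wang Zhisheng, Li Qi, Wang Jing, et al. Real-time personalized recommendation based on implicit user feedback data streams[J]. Chinese Journal of Computers, 2016, 39(1): 52-64. (in Chinese)
[29] Du Hui, Xu Xueke, Wu Dayong, et al. Microblog sentiment classification based on sentiment word embeddings[J]. Journal of Chinese Information Processing, 2017, 31(3): 170-176. (in Chinese)
[30] Hu Guoping, Zhang Dan, Su Yu, et al. Test question knowledge point prediction: A convolutional neural network model enhanced with teaching and research knowledge[J]. Journal of Chinese Information Processing, 2018, 32(5): 137-146. (in Chinese)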
