Abstract
Automatic generation of legal texts can alleviate the shortage of human resources in China's legal service industry, and the emergence of the Generative Adversarial Network (GAN) offers a new approach to this task. This paper proposes a GAN-based automatic text generation model, ED-GAN (Generative Adversarial Network based on Encoder-Decoder). In the generator, the keyword sequence of the case elements is first encoded by the encoder's LSTM into a hidden vector, which is then fed into the decoder's LSTM as its initial state. At each decoding time step, the hidden state is produced by combining the output and hidden state of the previous step, and the outputs of all time steps form the generated text sequence. Finally, a CNN is used as the discriminator to measure the gap between generated and real texts. Experiments show that the proposed model can generate satisfactory legal texts.
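The encoder-decoder generator and CNN discriminator described above can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: all sizes and weights are made-up, a plain tanh recurrence stands in for the LSTM cells, and a single 1-D convolution with max-pooling stands in for the CNN discriminator.

```python
import numpy as np

# Hypothetical sizes for illustration only (not taken from the paper).
VOCAB, EMB, HID = 50, 16, 32
rng = np.random.default_rng(0)

def rnn_step(x, h, Wx, Wh):
    # One recurrent step with tanh; a simplified stand-in for an LSTM cell.
    return np.tanh(x @ Wx + h @ Wh)

def encode(keyword_ids, emb, Wx, Wh):
    # Encoder: run the case-element keyword sequence through the
    # recurrence and return the final hidden vector.
    h = np.zeros(HID)
    for t in keyword_ids:
        h = rnn_step(emb[t], h, Wx, Wh)
    return h

def decode(h, emb, Wx, Wh, Wo, steps):
    # Decoder: start from the encoder's hidden vector; each step's
    # output token is fed back in to produce the next hidden state.
    out_ids, x = [], np.zeros(EMB)
    for _ in range(steps):
        h = rnn_step(x, h, Wx, Wh)
        tok = int(np.argmax(h @ Wo))  # greedy choice, for the sketch
        out_ids.append(tok)
        x = emb[tok]                  # output feeds the next time step
    return out_ids

def cnn_score(ids, emb, filt, k=3):
    # Toy discriminator: one 1-D convolution over the embedded sequence,
    # max-pooled and squashed to a real/fake probability.
    seq = emb[ids]  # (T, EMB)
    convs = [np.sum(seq[i:i + k] * filt) for i in range(len(ids) - k + 1)]
    return 1.0 / (1.0 + np.exp(-max(convs)))

# Random made-up parameters.
emb = rng.normal(size=(VOCAB, EMB))
Wx, Wh = rng.normal(size=(EMB, HID)), rng.normal(size=(HID, HID))
Wo = rng.normal(size=(HID, VOCAB))
filt = rng.normal(size=(3, EMB))

h = encode([4, 8, 15], emb, Wx, Wh)          # keywords -> hidden vector
generated = decode(h, emb, Wx, Wh, Wo, steps=5)
score = cnn_score(generated, emb, filt)      # discriminator score in (0, 1)
```

In the actual model, the recurrent cells are LSTMs, the decoder is trained adversarially against the CNN discriminator, and decoding samples tokens rather than taking a greedy argmax.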
1 China Judgements Online. http://wenshu.court.gov.cn, 2016.