Research on Generative Adversarial Networks for Named Entity Recognition
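  • English title: Research on Generative Adversarial Networks of Named Entity Recognition
  • Authors: 冯建周; 马祥聪; 刘亚坤; 宋沙沙
  • Authors (English): FENG Jian-zhou; MA Xiang-cong; LIU Ya-kun; SONG Sha-sha; Yanshan University College of Information Science and Engineering; Yanshan University Key Laboratory of Hebei Software Engineering
  • Keywords: named entity recognition; generative adversarial networks; BiLSTM; Wasserstein distance; CWGAN
  • Keywords (English): named entity recognition; generative adversarial networks; bidirectional LSTM; Wasserstein distance; conditional Wasserstein generative adversarial nets (CWGAN)
  • Journal code: XXWX
  • Journal (English): Journal of Chinese Computer Systems
  • Affiliations: College of Information Science and Engineering, Yanshan University; Key Laboratory of Software Engineering of Hebei Province, Yanshan University
  • Publication date: 2019-06-14
  • Publisher: 小型微型计算机系统 (Journal of Chinese Computer Systems)
  • Year: 2019
  • Issue: v.40
  • Funding: National Natural Science Foundation of China, Youth Fund project (61602401); Hebei Province Higher Education Science and Technology Research Youth Fund project (QN2018074)
  • Language: Chinese
  • Page: XXWX201906010
  • Number of pages: 6
  • CN: 06
  • ISSN: 21-1106/TP
  • Classification code: 57-62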
Abstract
This paper combines the conditional generative adversarial network (CGAN) with the improved Wasserstein generative adversarial network (WGAN-GP) and proposes a conditional Wasserstein generative adversarial network (CWGAN) suited to the named entity recognition task. Following the CGAN idea of modeling the probability distribution of images conditioned on a textual description, CWGAN obtains the probability distribution of the label sequence conditioned on the sentence sequence. Both the generator and the discriminator use a BiLSTM structure. They differ in role: the generator produces a probability distribution over named entity tags, while the discriminator scores the quality of the generator's output and feeds that score back to the generator, which updates its gradients accordingly to improve the quality of the generated tag probabilities. In addition, CWGAN uses the gradient penalty from the improved Wasserstein GAN to keep gradients stable during backpropagation, and it optimizes the objective function by reducing the Wasserstein distance between the real sample distribution and the generated samples. Experiments verify the feasibility and superiority of the method.
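The gradient-penalty critic loss mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the record does not give their objective, so the sketch assumes the standard WGAN-GP form, D(fake) - D(real) + lam * (||grad D(x_hat)|| - 1)^2, applied to a single sample. A toy linear critic stands in for the BiLSTM discriminator so that the gradient can be written analytically. All names (`linear_critic`, `critic_loss`, `lam`, `eps`) are hypothetical, and interpolating only the tag distribution while holding the sentence features fixed is likewise an assumption.

```python
import math
import random


def interpolate(real, fake, eps):
    """Point on the line between real and fake: eps*real + (1-eps)*fake."""
    return [eps * r + (1.0 - eps) * f for r, f in zip(real, fake)]


def linear_critic(w, cond, tags):
    """Toy critic (assumption): score = w . [cond ; tags].
    cond plays the role of the sentence features, tags the label distribution."""
    x = cond + tags
    return sum(wi * xi for wi, xi in zip(w, x))


def grad_wrt_tags(w, cond, tags):
    """For a linear critic the gradient w.r.t. tags is the matching slice of w,
    independent of where it is evaluated."""
    return w[len(cond):]


def critic_loss(w, cond, real_tags, fake_tags, lam=10.0, eps=None):
    """Single-sample WGAN-GP style critic loss with a gradient penalty."""
    if eps is None:
        eps = random.random()  # interpolation coefficient in [0, 1)
    x_hat = interpolate(real_tags, fake_tags, eps)
    grad = grad_wrt_tags(w, cond, x_hat)
    grad_norm = math.sqrt(sum(g * g for g in grad))
    penalty = lam * (grad_norm - 1.0) ** 2  # pushes the gradient norm toward 1
    wasserstein_term = (linear_critic(w, cond, fake_tags)
                        - linear_critic(w, cond, real_tags))
    return wasserstein_term + penalty


# Hypothetical inputs: 2 sentence features, 3 tag classes.
cond = [1.0, 2.0]
real_tags = [1.0, 0.0, 0.0]   # one-hot gold label
fake_tags = [0.2, 0.5, 0.3]   # generator's tag distribution
w = [0.5, -0.5, 0.6, 0.8, 0.0]
loss = critic_loss(w, cond, real_tags, fake_tags, eps=0.5)
```

With a critic whose tag-side weights have unit norm, as in this example, the penalty term vanishes and the loss reduces to the difference between the critic scores of the generated and the real tag distributions.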
