Abstract
Existing Chinese named entity recognition methods rely on large amounts of annotated data, but in some domains such data is very expensive to obtain. This paper introduces transfer learning to reduce the entity recognition model's need for annotated data. Starting from large-scale unstructured text, it builds a language prediction model with a bidirectional recurrent neural network and uses it as the transfer learning source model. A character-level vector generation algorithm based on context features then transfers the knowledge in the source model to the entity recognition model, yielding a transfer learning model called Trans-NER. Experimental results show that the proposed model outperforms the other entity recognition models.
1 http://wenshu.court.gov.cn/