Abstract
A BiLSTM-CRF model is constructed by combining a Conditional Random Field (CRF) with a Bidirectional Long Short-Term Memory (BiLSTM) network to extract three kinds of entity information (tenderer, bidding agent and bidding number) from commercial bidding text sequences. The normalized bidding text sequence is vectorized character by character, the forward and backward text features of the serialized text are obtained by the BiLSTM network, and the corresponding entities are extracted from the bidirectional text features by the CRF layer. Experimental results show that, compared with the traditional machine learning algorithm CRF, the precision, recall and F1 value of the three entity types under the proposed model are improved by 15.21%, 12.06% and 13.70% on average, respectively.
[24] BAHDANAU D,CHO K,BENGIO Y.Neural machine translation by jointly learning to align and translate[EB/OL].[2018-09-15].https://arxiv.org/pdf/1409.0473.pdf.
[25] RATNAPARKHI A.A maximum entropy model for part-of-speech tagging[C]//Proceedings of Conference on Empirical Methods in Natural Language Processing.Washington D.C.,USA:IEEE Press,1996:133-142.
[26] MCCALLUM A,FREITAG D,PEREIRA F C N.Maximum entropy Markov models for information extraction and segmentation[C]//Proceedings of the 17th International Conference on Machine Learning.Washington D.C.,USA:IEEE Press,2000:591-598.
[27] LAFFERTY J D,MCCALLUM A,PEREIRA F C N.Conditional random fields:probabilistic models for segmenting and labeling sequence data[C]//Proceedings of the 18th International Conference on Machine Learning.[S.l.]:Morgan Kaufmann Publishers Inc.,2001:282-289.