Abstract
Traditional resume information entity extraction methods generalize poorly and are hard to maintain. To tackle these problems, a resume entity extraction method based on a deep neural network is proposed. After preprocessing steps such as data cleaning and word segmentation, the unstructured resume text is represented as a word sequence. Each word is mapped to a low-dimensional real-valued vector through a word-embedding table trained with Word2Vec on a large-scale corpus in an unsupervised manner. A bidirectional LSTM layer fuses the context surrounding each word to be tagged and outputs scores over all possible tag sequences to a CRF layer, which introduces constraints between adjacent tags to solve for the optimal tag sequence. The model is trained with stochastic gradient descent, with dropout applied to prevent overfitting. Experimental results show that the proposed method improves both the parsing and tagging performance and the generalization ability.
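The pipeline described above (pre-trained embeddings, a bidirectional LSTM producing per-token tag scores, and a CRF layer whose transition matrix constrains adjacent tags) can be sketched as follows. This is a minimal illustration in PyTorch, not the paper's implementation: the layer sizes, dropout rate, and random embedding initialization are assumptions, and only CRF Viterbi decoding is shown (training would add the CRF forward-algorithm loss).

```python
import torch
import torch.nn as nn

class BiLSTMCRF(nn.Module):
    """Sketch of the abstract's pipeline: embedding -> BiLSTM -> tag scores,
    decoded with a CRF-style Viterbi search over a learned transition matrix."""
    def __init__(self, vocab_size, tagset_size, embed_dim=100, hidden_dim=128):
        super().__init__()
        # In the paper the embedding table would be initialized from Word2Vec
        # vectors pre-trained on a large corpus; randomly initialized here.
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim // 2,
                            batch_first=True, bidirectional=True)
        self.dropout = nn.Dropout(0.5)  # regularization, as in the abstract
        self.hidden2tag = nn.Linear(hidden_dim, tagset_size)
        # transitions[i, j]: score of moving from tag j to tag i;
        # this is where the CRF encodes constraints between adjacent tags.
        self.transitions = nn.Parameter(torch.randn(tagset_size, tagset_size))

    def emissions(self, word_ids):
        # word_ids: (batch, seq_len) -> tag scores (batch, seq_len, n_tags)
        out, _ = self.lstm(self.embed(word_ids))
        return self.hidden2tag(self.dropout(out))

    @torch.no_grad()
    def viterbi_decode(self, emission):
        # emission: (seq_len, n_tags) for one sentence; returns best tag path.
        seq_len, _ = emission.shape
        score = emission[0]          # best score ending in each tag so far
        backptr = []
        for t in range(1, seq_len):
            # total[cur, prev] = score[prev] + transition(prev -> cur)
            total = score.unsqueeze(0) + self.transitions
            backptr.append(total.argmax(dim=1))
            score = total.max(dim=1).values + emission[t]
        # follow back-pointers to recover the optimal tag sequence
        best_tag = int(score.argmax())
        path = [best_tag]
        for ptrs in reversed(backptr):
            best_tag = int(ptrs[best_tag])
            path.append(best_tag)
        return list(reversed(path))
```

At training time, the negative log-likelihood of the gold tag sequence under the CRF would be minimized with stochastic gradient descent, matching the procedure the abstract describes.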