Abstract
Named entity recognition adapts poorly across domains, and most existing work targets named entities in a single domain. This paper compares the strengths and weaknesses of three currently popular models: the conditional random field (CRF) model, the hidden Markov model, and the maximum entropy model. It then combines a CRF with rules: the model is trained with word features and part-of-speech features as feature templates, and rules are applied alongside the trained model to extract named entities. Experimental results show that the proposed method effectively improves the accuracy of named entity recognition.
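To make the idea of a word and part-of-speech feature template concrete, the following is a minimal sketch rather than the paper's implementation. The window size, the feature names, the function names `token_features` and `sentence_features`, and the example sentence with its tags are all assumptions made for illustration. Each token is mapped to a dictionary of features drawn from itself and its neighbours, which is the kind of input a CRF trainer typically consumes.

```python
def token_features(sentence, i, window=1):
    """Build word and part-of-speech features for the token at index i.

    `sentence` is a list of (word, pos) pairs. The window size and the
    feature names are illustrative assumptions, not taken from the paper.
    """
    features = {"bias": 1.0}
    for offset in range(-window, window + 1):
        j = i + offset
        if 0 <= j < len(sentence):
            word, pos = sentence[j]
            features[f"word[{offset}]"] = word
            features[f"pos[{offset}]"] = pos
        else:
            # Mark context positions that fall outside the sentence.
            features[f"pad[{offset}]"] = True
    return features


def sentence_features(sentence, window=1):
    """Apply the template to every token of one sentence."""
    return [token_features(sentence, i, window) for i in range(len(sentence))]


# Hypothetical pre-segmented, POS-tagged sentence.
example = [("Zhang", "NR"), ("visited", "VV"), ("Beijing", "NR")]
feats = sentence_features(example)
```

A rule component, as described in the abstract, would then operate on the labels predicted by the trained model; its form is not specified here, so it is left out of the sketch.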