Multi-granularity sequence labeling model for acronym expansion identification
Abstract
Identifying the expansions of acronyms benefits many natural language processing and information retrieval tasks. In this work, we study the problem of finding expansions in text for given acronym queries and model it as a sequence labeling task. This task is challenging for traditional sequence labeling models such as Conditional Random Fields (CRF) because of the complexity of the input sentences and the substructure of the label categories. We propose a Latent-state Neural Conditional Random Fields model (LNCRF) to address these challenges. First, we extend CRF by coupling it with nonlinear hidden layers, which learn multi-granularity hierarchical representations of the input data within the CRF framework. Second, we introduce latent variables that implicitly capture fine-grained information from the intrinsic substructures within the structured output labels. Experimental results on real data show that our model outperforms the state-of-the-art baselines.
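To make the sequence labeling formulation concrete, the sketch below shows one common way such a task can be encoded: each token receives a tag, and the expansion is read off from the tagged span. The B/I/O tag scheme, the function names `encode_bio` and `decode_bio`, and the example sentence are assumptions for illustration only; the abstract does not state which label set the paper uses, and this sketch does not implement LNCRF itself.

```python
from typing import List, Tuple


def encode_bio(tokens: List[str], span: Tuple[int, int]) -> List[str]:
    """Tag tokens in the half-open range [start, end) as the expansion.

    Hypothetical scheme: B = first expansion token, I = later expansion
    tokens, O = everything else.
    """
    start, end = span
    labels = ["O"] * len(tokens)
    if start < end:
        labels[start] = "B"
        for i in range(start + 1, end):
            labels[i] = "I"
    return labels


def decode_bio(tokens: List[str], labels: List[str]) -> List[str]:
    """Recover expansion strings from a predicted tag sequence."""
    spans: List[str] = []
    current: List[str] = []
    for token, label in zip(tokens, labels):
        if label == "B":
            # A new span starts; close any span still open.
            if current:
                spans.append(" ".join(current))
            current = [token]
        elif label == "I" and current:
            current.append(token)
        else:
            # "O" (or a stray "I" with no open span) ends the current span.
            if current:
                spans.append(" ".join(current))
            current = []
    if current:
        spans.append(" ".join(current))
    return spans


# Acronym query "CRF"; the expansion occupies token positions 2..4.
tokens = "We train Conditional Random Fields ( CRF ) models".split()
labels = encode_bio(tokens, (2, 5))
expansions = decode_bio(tokens, labels)
```

Under this framing, a labeling model such as a CRF would predict `labels` from `tokens` and the acronym query, and `decode_bio` would turn the prediction back into candidate expansions.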
