Corpus based part-of-speech tagging
  • Authors: Chengyao Lv; Huihua Liu; Yuanxing Dong…
  • Journal: International Journal of Speech Technology
  • Publication date: September 2016
  • Volume: 19
  • Issue: 3
  • Pages: 647-654
  • Full-text size: 498 KB
  • Journal category: Engineering
  • Journal subjects: Signal, Image and Speech Processing
    Social Sciences
    Artificial Intelligence and Robotics
  • Publisher: Springer Netherlands
  • ISSN: 1572-8110
Abstract
In natural language processing, a part-of-speech (POS) tagger is a crucial subsystem in a wide range of applications. It labels (or classifies) unannotated words of natural language with POS labels corresponding to categories such as noun, verb or adjective. Mainstream approaches are generally corpus-based: a POS tagger learns from a corpus of pre-annotated data how to correctly tag unlabeled data. Presented here is a brief account of the state of the art in POS tagging. POS tagging approaches use a labeled corpus to train computational models. Several typical models of three kinds of tagging are introduced in this article: rule-based tagging, statistical approaches and evolutionary algorithms. The advantages and pitfalls of each kind of tagging are discussed and analyzed. Some rule-based and stochastic methods have achieved accuracies of 93–96%, while some evolutionary algorithms reach about 96–97%.

Keywords: Natural language processing; POS tagging; Hidden Markov models; Support vector machine; Neural networks; Gene expression programming
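To make the corpus-based idea in the abstract concrete, the following is a minimal sketch of a most-frequent-tag baseline: it counts, for each word in a small pre-annotated corpus, which tag occurs most often, and then applies that tag to unlabeled text. This is not one of the models surveyed in the article; the toy corpus, the tag names (DET, NOUN, VERB) and the function names `train_baseline` and `tag_words` are assumptions made purely for illustration.

```python
from collections import Counter, defaultdict


def train_baseline(tagged_sentences):
    """Learn a word -> most frequent tag lexicon from an annotated corpus.

    Also returns the overall most frequent tag, used for unknown words.
    """
    word_tag_counts = defaultdict(Counter)
    all_tags = Counter()
    for sentence in tagged_sentences:
        for word, tag in sentence:
            word_tag_counts[word.lower()][tag] += 1
            all_tags[tag] += 1
    lexicon = {
        word: counts.most_common(1)[0][0]
        for word, counts in word_tag_counts.items()
    }
    default_tag = all_tags.most_common(1)[0][0]
    return lexicon, default_tag


def tag_words(words, lexicon, default_tag):
    """Tag each word with its most frequent training tag, or the default."""
    return [(w, lexicon.get(w.lower(), default_tag)) for w in words]


# Hypothetical toy corpus; a real tagger would be trained on a large treebank.
corpus = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("a", "DET"), ("dog", "NOUN"), ("sleeps", "VERB")],
    [("dogs", "NOUN"), ("chase", "VERB"), ("cats", "NOUN")],
]

lexicon, default_tag = train_baseline(corpus)
print(tag_words(["the", "cat", "barks"], lexicon, default_tag))
```

A baseline like this ignores context entirely, which is the gap that the rule-based, statistical and evolutionary approaches named in the abstract aim to close.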
