Hierarchical Amharic Base Phrase Chunking Using HMM with Error Pruning
  • Keywords: Amharic language processing; Base phrase chunking; Partial parsing
  • Journal: Lecture Notes in Computer Science
  • Publication year: 2016
  • Volume: 9561
  • Issue: 1
  • Pages: 126-135
  • Full-text size: 2,584 KB
  • Author affiliations: Abeba Ibrahim (16)
    Yaregal Assabie (16)

    16. Department of Computer Science, Addis Ababa University, Addis Ababa, Ethiopia
  • Series title: Human Language Technology. Challenges for Computer Science and Linguistics
  • ISBN: 978-3-319-43808-5
  • Publication category: Computer Science
  • Publication subjects: Artificial Intelligence and Robotics
    Computer Communication Networks
    Software Engineering
    Data Encryption
    Database Management
    Computation by Abstract Devices
    Algorithm Analysis and Problem Complexity
  • Publisher: Springer Berlin / Heidelberg
  • ISSN: 1611-3349
  • Volume sort order: 9561
Abstract
Segmenting text into non-overlapping syntactic units (chunks) has become an essential component of many natural language processing applications. This paper presents an Amharic base phrase chunker that groups syntactically correlated words at different levels using a hidden Markov model (HMM). Rules are then applied to correct phrases that the HMM chunks incorrectly. Phrase boundaries are identified using the IOB2 chunk specification. To test the performance of the system, a corpus was collected from Amharic news outlets and books. The training and test datasets were prepared using 10-fold cross-validation. Test results on the corpus showed an average accuracy of 85.31% before applying the error-correction rules and 93.75% after applying them.
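
The IOB2 specification mentioned in the abstract marks the first token of every chunk with a B- tag, each following token of the same chunk with an I- tag, and tokens outside any chunk with O. The following is a minimal sketch of that encoding and its inverse, not the authors' implementation. The function names, the (label, start, end) span format, and the NP/VP labels are assumptions made for illustration only.

```python
from typing import List, Tuple

# A chunk is (label, start, end) with `end` exclusive. This span format is
# an assumption for illustration; the paper does not specify one.
Chunk = Tuple[str, int, int]


def chunks_to_iob2(n_tokens: int, chunks: List[Chunk]) -> List[str]:
    """Encode non-overlapping chunk spans as one IOB2 tag per token."""
    tags = ["O"] * n_tokens
    for label, start, end in chunks:
        tags[start] = f"B-{label}"        # IOB2: every chunk opens with B-
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"        # remaining tokens of the chunk
    return tags


def iob2_to_chunks(tags: List[str]) -> List[Chunk]:
    """Decode IOB2 tags back into chunk spans.

    An I- tag that does not continue an open chunk of the same label is
    treated as outside any chunk.
    """
    chunks: List[Chunk] = []
    label, start = None, 0
    for i, tag in enumerate(tags + ["O"]):    # sentinel closes the last chunk
        continues = tag.startswith("I-") and label == tag[2:]
        if label is not None and not continues:
            chunks.append((label, start, i))
            label = None
        if tag.startswith("B-"):
            label, start = tag[2:], i
    return chunks


# Hypothetical four-token sentence: a two-token NP, a one-token VP,
# and a final token that belongs to no chunk.
example_tags = chunks_to_iob2(4, [("NP", 0, 2), ("VP", 2, 3)])
example_chunks = iob2_to_chunks(example_tags)
```

Because every chunk begins with an explicit B- tag, two adjacent chunks of the same type remain distinguishable, which is the usual reason for preferring IOB2 when a sequence model such as an HMM predicts one tag per token.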
