Arabic Question Answering: Systems, Resources, Tools, and Future Trends
  • Authors: Mohamed Shaheen (1)
    Ahmed Magdy Ezzeldin (1)
  • Keywords: QA ; Factoid questions ; QA4MRE ; Question analysis ; Passage retrieval ; Answer extraction ; Answer validation ; Test-sets ; Evaluation ; Metrics ; Language resources ; NLP ; Information retrieval ; Stemming ; Corpus ; NER ; Lemmatization ; Morphological analysis ; Part-of-speech tagging ; Diacritization ; Overview ; Review ; Survey
  • Journal: Arabian Journal for Science and Engineering
  • Publication year: 2014
  • Publication date: June 2014
  • Volume: 39
  • Issue: 6
  • Pages: 4541-4564
  • Author affiliations: Mohamed Shaheen (1)
    Ahmed Magdy Ezzeldin (1)

    1. College of Computing and Information Technology, Arab Academy for Science, Technology and Maritime Transport, Alexandria, Egypt
Abstract
Arabic is the 6th most widespread natural language in the world with more than 350 million native speakers. Arabic question answering systems are gaining great importance due to the increasing amounts of Arabic content on the Internet and the increasing demand for information that regular information retrieval techniques cannot satisfy. In spite of the importance of Arabic question answering, there is no review that covers Arabic question answering systems, tools, resources, and test-sets so far, which was the motivation for this work. In this survey, different Arabic question answering systems are demonstrated and analyzed and the main question answering tasks like question analysis, passage retrieval, and answer extraction are explored. The main difficulties of Modern Standard Arabic and how these difficulties are tamed and classified are also explained. Arabic question answering evaluation metrics, test-sets, and language resources are reviewed, and future trends are also highlighted to guide new research in this area. This survey provides guidance for new research in Arabic question answering to get up-to-date knowledge about the state-of-the-art approaches in this area. It also demonstrates the tools created and used by researchers to build an Arabic question answering system.
