A Survey of Central Asian Language Processing (中亚语言自然语言处理综述)
  • English title: A Survey of Central Asian Language Processing
  • Authors: Tuergun Ibrahim (吐尔根·依布拉音); Kahaerjiang Abiderexiti (卡哈尔江·阿比的热西提); Aishan Wumaier (艾山·吾买尔); Maihemuti Maimaiti (买合木提·买买提)
  • Keywords: Turkish; Kazakh; agglutinative languages; morphologically complex languages
  • Chinese journal title: MESS
  • English journal title: Journal of Chinese Information Processing
  • Institutions: School of Information Science and Engineering, Xinjiang University; Xinjiang Laboratory of Multi-Language Information Technology, Xinjiang University
  • Publication date: 2018-05-15
  • Publisher: 中文信息学报 (Journal of Chinese Information Processing)
  • Year: 2018
  • Issue: v.32
  • Funding: National Natural Science Foundation of China (61462083, 61762084, 61331011, 61463048); National 973 Program (2014cb340506)
  • Language: Chinese
  • Page: MESS201805001
  • Number of pages: 14
  • CN: 05
  • ISSN: 11-2325/N
  • Classification code: 6-18+26
Abstract
This paper surveys the state of natural language processing for Turkish, Kazakh, and other Central Asian languages that belong to the same language family. It first reviews progress on morphological analysis, syntactic parsing, named entity recognition, and machine translation for Turkish, Kazakh, and the other Central Asian languages in turn. It then discusses language-independent approaches to the morphological analysis of agglutinative languages. Finally, it identifies the problems and challenges facing Central Asian language processing in China and abroad, and offers suggestions for future research.
