Abstract
Word sense disambiguation is a fundamental research problem in natural language processing and is crucial to work in the field. Imprecise extraction of the words associated with an ambiguous word lowers disambiguation accuracy. To address this problem, this paper designs five compound feature templates based on dependency syntax and trains a maximum entropy model with them. Experiments show that the compound templates both reduce computational complexity and improve disambiguation performance. Applied to more than 500 complex sentences, the method achieves good disambiguation accuracy.