Research on Word-Sense-Based Disambiguation Methods for Chinese
Abstract
Polysemy is a pervasive linguistic phenomenon, yet in a specific context a word carries only one definite meaning. Determining the sense of a polysemous word in a given linguistic environment is the task of word sense disambiguation (WSD). This thesis examines WSD for Chinese. It first states the purpose and significance of WSD research. It then surveys the WSD methods in common use, grouped by the knowledge source each relies on, and explains several typical methods in detail. Next, working from the syntactic parse tree, it applies a "head-word association method" to extract the feature words that most strongly indicate the sense of a polysemous word. On that basis, by computing correlation coefficients between the sememes of each sense of the polysemous word and the sememes of the feature words, it proposes a WSD method based on sememe co-occurrence frequency. Finally, drawing on the main content of the thesis, it outlines a development approach for a Chinese WSD system and implements some of its modules in code.
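The sememe co-occurrence idea summarized in the abstract can be illustrated with a small sketch. The abstract does not give the thesis's actual correlation formula, so the scoring below (average raw co-occurrence count between a sense's sememes and the feature words' sememes) is an assumption made for illustration only. The function names `build_cooccurrence` and `disambiguate`, the toy corpus, and the sememe labels are all hypothetical; a real system would draw sememes from a resource such as HowNet.

```python
from collections import Counter
from itertools import combinations_with_replacement


def build_cooccurrence(corpus_sememes):
    """Count, per sentence, how often each pair of sememes appears together.

    corpus_sememes: list of sentences, each a list of sememe labels.
    Keys are alphabetically sorted pairs; (a, a) counts sentences containing a.
    """
    counts = Counter()
    for sentence in corpus_sememes:
        unique = sorted(set(sentence))
        for pair in combinations_with_replacement(unique, 2):
            counts[pair] += 1
    return counts


def disambiguate(senses, feature_sememes, counts):
    """Pick the sense whose sememes co-occur most with the feature sememes.

    Assumed scoring (not the thesis's formula): sum of co-occurrence counts
    over all (sense sememe, feature sememe) pairs, divided by the number of
    sense sememes so that senses with more sememes are not favored.
    """
    scores = {}
    for sense, sememes in senses.items():
        total = sum(
            counts[tuple(sorted((s, f)))]
            for s in sememes
            for f in feature_sememes
        )
        scores[sense] = total / len(sememes) if sememes else 0.0
    best = max(scores, key=scores.get)
    return best, scores


# Toy data: sememe sequences standing in for a sense-annotated corpus.
corpus = [
    ["bird", "fly", "sing"],
    ["flower", "bloom", "plant"],
    ["animal", "fly"],
    ["plant", "bloom", "red"],
]
counts = build_cooccurrence(corpus)

# Two hypothetical senses of one ambiguous word, each described by sememes.
senses = {
    "bird": ["bird", "animal"],
    "flower": ["flower", "plant"],
}

# Sememes of the feature words extracted from the context.
best, scores = disambiguate(senses, ["bloom", "red"], counts)
print(best, scores)
```

In this toy setup the context sememes co-occur only with the sememes of the "flower" sense, so that sense is selected. Normalizing by the number of sense sememes is one simple design choice; a weighted or probability-based correlation could be substituted without changing the overall structure.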
