Currently, Chinese syntactic analysis is basically targeted at the single sentence. However, the boundary of a Chinese single sentence is very difficult to determine automatically in a real corpus. The main formal marker at the sentence level is punctuation. Formalization is a prerequisite for Chinese language processing, so the punctuation sentence becomes the basic unit with which a computer processes Chinese sentences automatically. The boundary of a punctuation sentence is clear, but the syntactic elements of many punctuation sentences are incomplete, and the missing elements must be found in context. The syntactic analysis of relations across punctuation sentences, however, has not been studied systematically. As a result, the parsing and generation of long Chinese sentences give poor results, and this has become the greatest difficulty for machine translation between Chinese and foreign languages and for deep understanding in Chinese language processing. To solve this problem, we must first investigate the syntactic relations between Chinese punctuation sentences carefully and summarize rules and constraints.
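Because the boundary of a punctuation sentence is fixed by punctuation marks, segmentation itself can be sketched in a few lines. The following is a minimal illustration, not the procedure used in this work: the set of delimiting marks and the function name `split_punctuation_sentences` are assumptions, since the abstract does not list which marks close a punctuation sentence.

```python
import re

# Assumed delimiter set: full-width comma, semicolon, colon, period,
# question mark and exclamation mark. The abstract does not enumerate them.
PUNCTUATION_MARKS = "，；：。？！"


def split_punctuation_sentences(text: str) -> list[str]:
    """Split text into punctuation sentences, keeping each closing mark."""
    # One run of non-delimiter characters, optionally followed by one delimiter.
    pattern = f"[^{PUNCTUATION_MARKS}]+[{PUNCTUATION_MARKS}]?"
    pieces = (m.group().strip() for m in re.finditer(pattern, text))
    return [p for p in pieces if p]


segments = split_punctuation_sentences("他走进房间，看见一本书，拿起来就读。")
for segment in segments:
    print(segment)
```

In this toy input the second and third segments have no overt subject, which is the kind of incompleteness that has to be resolved from context.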
     This work is based on the theoretical framework of the punctuation sentence. Its main purpose is to identify the elements shared between punctuation sentences and, so that computers can process punctuation sentences conveniently, to find formal constraint rules beyond the stack-type rules in the syntactic relations. The work consists of two parts:
     (1) Marking the corpus and compiling survey statistics. We marked the whole of Qian Zhongshu's "WeiCheng", 226,641 words and 24,115 punctuation sentences. The tags include the syntactic relations between punctuation sentences, the shared components, and the shallow syntactic structure within each punctuation sentence, and we obtained statistics for each kind of punctuation sentence in the marked corpus. We also used text retrieval tools to carry out specialized investigations and statistics on modern and contemporary Chinese novels totaling tens of millions of characters.
     (2) Mining the constraints.
     On the basis of the marked corpus and the specialized investigations, we summarized constraints on punctuation sentences from about a hundred aspects, large and small. We focus on punctuation sentences in which the yuanpei-sentence and the xupei-sentence are homologous and ordinal. The contents include:
     Whether punctuation sentences that begin with a noun or pronoun lack a subject.
     If the structure of the yuanpei-sentence is subject-verb-object, the subject of the xupei-sentence is either the subject or the object of the yuanpei-sentence. We discuss punctuation sentences in which the predicate of the yuanpei-sentence is a sense verb, “有”, a verb taking a sentential object, a two-verb structure, “像”, “V着”, or “V完”, as well as the effect of connectives (relevance words), adverbs, adjectives, and nouns on the shared elements.
     How to identify the adverbial modifier of the xupei-sentence, covering the various forms of adverbial. The scope of negative words in punctuation sentences is discussed in a dedicated chapter.
     How to identify the attribute of the xupei-sentence, covering quantifiers, adjectives, pronouns, nouns, and noun phrases.
     If the yuanpei-sentence is a 把-sentence or a 被-sentence, how to identify the shared components.
     How to identify whether the whole or only part of a noun phrase connected with “跟” in the yuanpei-sentence is shared by the xupei-sentence.
     If the yuanpei-sentence is a jianyu-sentence, how to identify the shared components.
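     The corpus marking and statistics described in part (1) can be pictured with a small record type and a tally. This is only a sketch under stated assumptions: the class name, the field names, and the three sample records are hypothetical and are not taken from the marked corpus. The type labels are borrowed from the punctuation-sentence types named in this abstract, and their assignment to the toy records is purely illustrative.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class MarkedPunctuationSentence:
    text: str            # the punctuation sentence itself
    sentence_type: str   # e.g. "subject-predicate", "predicate-object"
    shared_element: str  # hypothetical field: role of the element shared with context, "" if none


def type_statistics(sentences):
    """Count how many marked punctuation sentences fall under each type."""
    return Counter(s.sentence_type for s in sentences)


# Toy records for illustration only (not from the marked corpus).
sample = [
    MarkedPunctuationSentence("他走进房间，", "subject-predicate", ""),
    MarkedPunctuationSentence("看见一本书，", "predicate-object", "subject"),
    MarkedPunctuationSentence("拿起来就读。", "predicate-object", "subject"),
]
stats = type_statistics(sample)
for sentence_type, count in stats.most_common():
    print(sentence_type, count)
```

     Keeping the tags as plain fields makes it straightforward to tally any of them, for example counting by `shared_element` instead of by type.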
     This work has the following characteristics:
     (1) Scope of study: in addition to the subject-predicate punctuation sentences covered by previous studies, we also studied attribute-head, adverb-head, predicate-object, predicate-complement, and preposition-object punctuation sentences, extending the research across the whole syntactic system of the punctuation sentence.
     (2) Research perspective: we focus on the formal features of the constraints, so the results are convenient to apply and lay a solid foundation for automatic computer processing.
     (3) Research methods: besides examples, we not only look for cognitive explanations through the traditional method of introspection, but also rely on statistics of language phenomena in a real corpus, and treat the statistical data as corroboration of the reliability of the rules. The major innovation of this paper is the deep mining of language features from many perspectives. The main features are as follows:
     If the yuanpei-sentence has the structure “subject-verb-object” and the xupei-sentence lacks a subject, how to determine whether the xupei-sentence takes the subject or the object of the yuanpei-sentence. The paper points out several important distinguishing features:
     One of the main indicators for distinguishing the subject topic from the object topic is the difference between static and dynamic sentences. Both kinds of punctuation sentence are formally defined, and their relation to the subject topic and the object topic is pointed out.
     According to the effect of verbs on nouns, verbs are divided into those that affect only agent nouns and those that also affect patient nouns, in order to determine whether the subject changes.
     The concept of informativity is put forward. If the yuanpei-sentence is a “有” sentence, and/or the predicate of the xupei-sentence is a middle-state adjective phrase, determining the subject of the xupei-sentence depends on the informativity of the object in the yuanpei-sentence: the lower the informativity of the object, the more likely the object is the subject of the xupei-sentence. We divide punctuation sentences into independent and dependent punctuation sentences to judge whether two punctuation sentences are related to each other.
     We divide nouns overall into independent and non-independent nouns to judge whether a punctuation sentence is complete.
     For punctuation sentences whose predicate contains two verbs in a main-subordinate relation, the paper uses a sentence transformation method to reduce them to single-predicate subject-verb-object sentences and then determines the subject of the xupei-sentence.
     We divide verb and adjective predicates overall into directional and non-directional predicates to settle whether a parallel noun phrase is shared as a whole or only in part.
     Adverbial modifiers are divided into sentence adverbial modifiers and lexical adverbial modifiers to judge whether an adverbial modifier is shared.
     The above concepts and classifications are introduced for the first time in this paper.
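     For the subject-verb-object case described above, the search space is small: when the xupei-sentence lacks a subject, the candidates are the subject and the object of the yuanpei-sentence. The sketch below only enumerates those candidates. It does not implement the static/dynamic, verb-class, or informativity criteria, whose formal definitions are not given in this abstract. The `Clause` type and the function name are hypothetical, and the sketch assumes that the yuanpei-sentence is the earlier punctuation sentence supplying the shared element.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Clause:
    subject: Optional[str]  # None when the punctuation sentence has no overt subject
    verb: str
    obj: Optional[str]      # None when there is no object


def subject_candidates(yuanpei: Clause, xupei: Clause) -> list[str]:
    """Return the possible subjects of the xupei-sentence.

    If the xupei-sentence has its own subject, that is the only candidate.
    Otherwise the candidates are the subject and the object of the
    yuanpei-sentence, in that order. Choosing between them is left to the
    constraints studied in the thesis and is not attempted here.
    """
    if xupei.subject is not None:
        return [xupei.subject]
    return [c for c in (yuanpei.subject, yuanpei.obj) if c is not None]


# Toy example: "他看见一个孩子，(?)哭了。"
yuanpei = Clause(subject="他", verb="看见", obj="一个孩子")
xupei = Clause(subject=None, verb="哭", obj=None)
candidates = subject_candidates(yuanpei, xupei)
print(candidates)
```

     A fuller treatment would rank or filter this candidate list using the criteria listed above.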
     Words of each part of speech are classified in detail by semantics to help determine the shared components across punctuation sentences. Many of these classes have appeared in the linguistics literature, but the methods of defining them and the purposes differ; some are put forward here for the first time. This paper uses these word classes together, redefines some of them, and gives word lists covering high-frequency words. They include:
     Verb classes: existential-presentative verbs, pre-existential-presentative verbs, sensory verbs, cognitive verbs, mental verbs, motion verbs, command verbs, body-motion verbs.
     Noun classes: organ nouns, attribute nouns, family nouns, mental nouns.
     Adjective classes: dynamic adjectives, static adjectives, middle adjectives.
     Adverb classes: momently-motion adverbs, mental adverbs, modal adverbs, time adverbs, conjunctive adverbs, scope adverbs, extent adverbs, and so on.
     The concept of mental words is also put forward, covering mental nouns, mental verbs, mental adjectives, and mental adverbs.
     The word classes put forward for the first time in this paper are: organ nouns, middle adjectives, momently adverbs, mental adverbs, mental nouns, and mental words.
     The word classes that appear in the linguistics literature but whose definition and scope differ here are: pre-existential-presentative verbs, body-action verbs, dynamic adjectives, and static adjectives.
     We also use parallel structures to address the problem.
     And so on.
     This work is very preliminary in the field of syntactic relations between punctuation sentences. Owing to time constraints, many issues are not addressed, and many problems have received only a first step. The results are still rather disorganized and not very systematic, and they do not yet cover algorithms or procedures. These will be carried out gradually in future work.
