Semantic Interpretation of Chinese Verb Nominalization Compounds
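Abstract
Web information and information retrieval have become an indispensable part of daily life. Most of this information takes the form of language, yet what people actually care about is the content that the language carries, which ultimately comes down to the semantic concepts of natural language. Computing the semantic relations between the constituent structures of natural language is the key to natural language understanding; in essence, it means computing the correspondence between linguistic structure and linguistic meaning. Finding new approaches, theories, and methods that bring structure and meaning as close as possible to an isomorphic correspondence, and in particular that support computing the conceptual meaning of compound structures dynamically, is of both theoretical importance and broad practical value. Although language is expressed in many forms, such as sentences and phrases, from the standpoint of conceptual analysis they can all be reduced to the combination and layering of lexical concepts. This is consistent with the current focus on lexical theory in linguistics, in China and abroad.
     A compound is a linguistic structure formed by directly combining several nominal words, which as a whole is equivalent to a new nominal word. Unlike phrases and sentences, compounds are formed without function markers, which is a major obstacle to computing their meaning and has in practice become a bottleneck for semantic computation. This dissertation addresses the semantic interpretation of Chinese compounds that contain a verb nominalization. The research starts from the analysis of examples: it examines the shortcomings of earlier grammatical studies, annotates the conceptual relations between the components of a compound, and summarizes the inherent characteristics of conceptual coupling in compounds as well as the potential for expressions in different languages to align naturally at the compound level. First, as data preparation, it studies the identification of verb nominalization compounds. Next, it builds semantic interpretation models for the two basic types of verb nominalization compound (the NV type and the VN type). Finally, it explores the application of attribute knowledge to compound interpretation.
     Specifically, the original contributions of this dissertation are as follows:
     1. A compound identification method based on a subject thesaurus and the World Wide Web is proposed. To resolve the structural ambiguity that arises when Chinese nouns and verbs combine, two new sets of classification features are constructed: lexical compounding ability and referential pattern features. The features are obtained from two independent resources, a subject thesaurus and the World Wide Web. The advantage is that they do not depend on the specific context in which a compound occurs, so they can be used to identify low-frequency compounds in documents, a problem that earlier identification models could not handle. Machine learning experiments show that the two new feature sets greatly improve the performance of verb nominalization compound identification.
     2. The semantic relations involved in Chinese NV compounds are summarized, and a compound interpretation model based on lexico-syntactic patterns is built. The model defines a new form of lexical pattern, the functional lexicalized pattern, and uses it as a classification feature to label the semantic relations between the words of a compound. The main advantage of the model is its low dependence on resources: earlier methods relied mainly on lexical ontologies and syntactically annotated corpora, whereas this model obtains its classification features from plain text corpora, which greatly improves its applicability and portability. Experiments show that the model based on functional lexicalized patterns performs very well.
     3. A set of semantic relation labels for Chinese VN compounds is proposed, and a machine-translation-driven compound interpretation model is designed. Based on the hypothesis that compounds are naturally isomorphic across languages, the model first automatically translates a Chinese compound into an aligned English compound, and then uses the English compound as additional information for interpreting the Chinese compound. The main advantage of the model is that it can draw on cross-lingual resources and interpret aligned compounds in several languages at the same time, which can to some extent alleviate the lack of resources in certain languages. Experiments confirm that the bilingual interpretation model outperforms the monolingual model.
     4. A framework for acquiring an attribute knowledge base is constructed. A lexical concept can be described as a set of attributes and attribute values, and attribute knowledge is very important for compound interpretation. Attribute acquisition has two stages: acquiring attribute words and determining attribute hosts. For attribute word acquisition, a co-bootstrapping algorithm over a machine-readable dictionary and the World Wide Web is designed. The algorithm makes full use of the way Chinese words are built from meaning-bearing morphemes, and combines the machine-readable dictionary and the Web as sources of attribute knowledge. Determining the attribute host is treated as a selectional constraint resolution problem: the host of an attribute is determined by evaluating the selectional association between the attribute and the candidate concept classes. The distinguishing feature of the method is that it can acquire attribute-centered lexical knowledge dynamically and efficiently.
     5. Using the acquired attribute knowledge, a word similarity model based on attribute words is proposed. Unlike earlier similarity measures based on lexical taxonomies, the model represents a lexical concept by the attribute words it can take. Attribute words characterize different aspects of a concept: the more key attributes two lexical concepts share, the more similar they are. Representing lexical concepts as attribute word vectors therefore captures finer distinctions between them. On standard evaluation datasets and in the application to compound interpretation, the model performs better than other word similarity models.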
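As an illustration of the attribute host step in the fourth contribution, the sketch below scores candidate concept classes for an attribute with a selectional association measure and picks the highest-scoring class as the host. The abstract does not give the formula, so the Resnik-style definition used here is an assumption, and the function names and the toy counts are hypothetical.

```python
import math
from collections import Counter


def selectional_association(pair_counts, attribute):
    """Resnik-style association between one attribute and each concept class.

    pair_counts maps (attribute, concept_class) -> co-occurrence count.
    This formula is an assumed stand-in, not the dissertation's own measure.
    """
    class_totals = Counter()
    attr_counts = Counter()
    for (attr, cls), n in pair_counts.items():
        class_totals[cls] += n
        if attr == attribute:
            attr_counts[cls] += n

    total = sum(class_totals.values())
    attr_total = sum(attr_counts.values())
    if attr_total == 0:
        return {}

    # Contribution of each class: P(c|a) * log(P(c|a) / P(c)).
    contributions = {}
    for cls, n in attr_counts.items():
        p_cls_given_attr = n / attr_total
        p_cls = class_totals[cls] / total
        contributions[cls] = p_cls_given_attr * math.log(p_cls_given_attr / p_cls)

    # The sum of contributions is the attribute's selectional preference strength.
    strength = sum(contributions.values())
    if strength == 0:
        return {cls: 0.0 for cls in contributions}
    return {cls: value / strength for cls, value in contributions.items()}


def pick_host(pair_counts, attribute):
    """Return the concept class with the highest association, or None."""
    scores = selectional_association(pair_counts, attribute)
    return max(scores, key=scores.get) if scores else None


# Toy co-occurrence counts, invented for illustration only.
TOY_COUNTS = {
    ("price", "artifact"): 40,
    ("price", "person"): 2,
    ("age", "person"): 30,
    ("age", "artifact"): 10,
}

print(pick_host(TOY_COUNTS, "price"))
print(pick_host(TOY_COUNTS, "age"))
```

In a real setting the counts would come from the acquired attribute words and a concept hierarchy; here they only show how the scores separate a plausible host from an implausible one.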
Web information and information retrieval have become an essential part of daily life. Language is the main form that information takes, so there is an urgent need for computers to understand the content and semantics of linguistic information. Computing the semantic relations between natural language structures is the key to natural language understanding. In essence, semantic computation means computing the correspondence between structures and semantic representations. Although language structures take many forms, they can all be reduced to word combinations. This accords with the trend toward lexical approaches in linguistic theory.
     A compound is a consecutive sequence of nominal words that functions as a whole as a new nominal word. The semantics of word combination has been a major concern for scholars in this area, because research on it matters for both theory and application. However, unlike other language structures, compounds are formed without semantic clues such as function words, which makes computing the semantics of compound expressions a major challenge. This dissertation focuses on the semantic interpretation of a subset of Chinese compounds, those that involve a verb nominalization. First, as data preparation, it explores the problem of compound identification. Second, it constructs interpretation models for the two basic types of verb nominalization compound (NV compounds and VN compounds). Finally, it explores the application of attribute knowledge to compound interpretation.
     Specifically, the original contributions of this dissertation are as follows:
     1. The author proposes a method for compound identification based on a thesaurus and the Web. To resolve the structural ambiguity in combinations of Chinese verbs and nouns, the identification model introduces two novel feature sets: compounding ability and referential patterns. Acquiring these features does not rely on the context of the compound candidate. Instead, it uses two independent sources, a thesaurus and the Web. The advantage of this approach is that it can recognize compounds that occur with low frequency in text. Machine learning experiments show that the novel features greatly improve the performance of compound identification.
     2. The author introduces the semantic relations involved in Chinese NV compounds and then implements a compound interpretation model based on lexical syntactic patterns (LSPs). A new form of LSP, the functional lexicalized pattern (FLP), is defined. The FLP vector of an NV compound provides the features for labeling its semantic relations. Unlike previous approaches, which rely mainly on ontologies or treebanks, the model acquires its classification features from plain text, which makes it more robust and easier to generalize.
     3. The author presents the set of semantic relations of Chinese VN compounds and then proposes a translation-driven bilingual compound interpretation model. The model first translates Chinese compounds into their English equivalents. It then uses the English compounds as additional information for interpreting the Chinese VN compounds. The main merit of the model is that it can use cross-lingual resources to interpret compounds in several languages at the same time. The experiments verify that the bilingual model performs better than the monolingual model.
     4. For application in compound interpretation, the author designs a framework for constructing an attribute knowledge base. It has two phases: attribute word acquisition and attribute host computation. In the first phase, the author proposes a multi-resource bootstrapping algorithm that bootstraps from a set of Chinese morphemes and exploits both a machine-readable dictionary (MRD) and the Web as resources. In the second phase, the author models host computation as a selectional constraint resolution problem. The distinguishing feature of the framework is that it can acquire attribute-centered knowledge dynamically. The experiments show that the algorithms are very effective.
     5. Applying the acquired attribute knowledge, the author presents an attribute-based word similarity model. Unlike previous models, which mainly exploit IS-A taxonomies, the proposed model represents a word concept by the attribute words it can take. The more important attributes two concepts share, the more similar they are. Such an attribute representation captures finer-grained differences between word concepts. The attribute-based word similarity model achieves good results both on standard datasets and in the application to compound interpretation.
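The idea behind the fifth contribution, that concepts sharing more important attributes are more similar, can be sketched with weighted attribute vectors compared by cosine similarity. The abstract names neither the weighting scheme nor the similarity function, so both are assumptions here, and the attribute weights below are invented toy values.

```python
import math


def cosine_similarity(vec_a, vec_b):
    """Cosine similarity between two sparse attribute-weight vectors (dicts)."""
    shared = set(vec_a) & set(vec_b)
    dot = sum(vec_a[key] * vec_b[key] for key in shared)
    norm_a = math.sqrt(sum(w * w for w in vec_a.values()))
    norm_b = math.sqrt(sum(w * w for w in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical attribute vectors: attribute word -> importance weight.
TOY_ATTRIBUTES = {
    "car": {"price": 3.0, "speed": 2.0, "color": 1.0},
    "truck": {"price": 2.0, "speed": 1.0, "weight": 2.0},
    "teacher": {"age": 3.0, "salary": 2.0},
}

car_truck = cosine_similarity(TOY_ATTRIBUTES["car"], TOY_ATTRIBUTES["truck"])
car_teacher = cosine_similarity(TOY_ATTRIBUTES["car"], TOY_ATTRIBUTES["teacher"])
print(f"car vs truck:   {car_truck:.3f}")
print(f"car vs teacher: {car_teacher:.3f}")
```

Because the vectors are keyed by attribute words rather than by positions in a taxonomy, two concepts can score as similar even when they sit far apart in an IS-A hierarchy, which is the contrast the abstract draws with taxonomy-based measures.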
