A Commonsense Reasoning Model Based on Natural Language and Memory Reconsolidation
Abstract
In the past decade or so, the combination of natural-language text processing and inference has played an increasingly important role, with applications in many areas such as widely used web search engines and literature-based data collection in bioinformatics. Using statistical supervised and semi-supervised machine learning methods trained on annotated data, researchers have obtained many efficient text-processing models. However, while text data keeps growing, manually annotated data remains comparatively scarce, and annotating data in every domain would not only require enormous human effort but would also be practically infeasible in terms of time. People therefore still have to rely on their own knowledge and on reasoning to further filter out the material they need and to summarize the gist of a document. On the other hand, inference engines, such as those in expert systems, can logically group the data of a specific domain according to its relevance, and some of that data is also valid in other domains. It is therefore worthwhile to combine text processing with an inference engine: using the engine to extend annotated text data from one domain into others and to infer new information from it, so that computers can "understand" natural language and lighten the burden of information processing on people.
     This work combines natural-language semantic processing with an inference model to simulate how the human brain comprehends text. Drawing on the theory of memory reconsolidation, it builds an intelligent reasoning system that can understand and analyze descriptive sentences and give feedback on them. The feedback is derived plausibly from commonsense and, to some extent, relieves people of the burden of filtering data and extracting key meanings. The main contributions are:
     1. A word sense disambiguation (WSD) mechanism. Polysemous words occur frequently in text. Humans can easily resolve their intended meaning from context, but for a machine this is a difficult problem. This thesis builds a new WSD mechanism that combines the widely used WordNet and VerbNet databases and determines the exact sense of a word from its surrounding context (an illustrative sketch follows this list).
     2. An extended semantic network (ESN) used as the memory model. A traditional semantic network [1] can only represent concept entities and single relations between them. In the ESN proposed here, vertices and edges carry sets of attributes, so the network can express the more complex meaning of a sentence. The ESN serves well as a model of both short-term and long-term memory, allowing reasoning and memory reconsolidation to proceed smoothly (see the second sketch after this list).
     3. A mechanism for converting the commonsense base and natural language into Bayesian belief networks. In traditional inference engines, the input data must be converted from natural language by hand, and the reasoning framework must be built manually from commonsense and regularities. The memory model in this thesis automatically converts information from sentences into inference-engine data, and it automatically learns inference rules from the commonsense base or from natural language to construct Bayesian belief networks (the third sketch after this list shows the kind of representation targeted).
     4. Scenario- and topic-based adaptive selection of commonsense and real-time dynamic composition of Bayesian belief networks. A traditional inference engine can only reason about one fixed domain. The reasoning system in this thesis uses the word sense disambiguation mechanism to determine the scenario and topic of a sentence and adaptively selects commonsense knowledge to build a Bayesian belief network on the fly, so that the system can infer the relevant information while keeping the computational cost of the network low and saving time.
     5. Bayesian belief network parameter optimization and information updating based on memory reconsolidation. Memory reconsolidation is a recent finding in cognitive science and neuroscience: experiments show that the information stored in the human brain is not fixed; after each recall, old memories change as they are combined with new ones (for example, a person's current appearance imagined from a childhood photograph may itself be stored as a memory). Based on this theory, the adaptive Bayesian belief network proposed here adjusts its parameters according to new memories and updates the old ones, thereby improving inference (the last sketch after this list illustrates this kind of update).
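To make contribution 1 concrete, the following is a minimal sketch of context-based disambiguation over WordNet using NLTK's implementation of the Lesk algorithm. It only approximates the mechanism described above: the thesis's own method also consults VerbNet, which is not shown here, and the example sentences and the disambiguate helper are illustrative.

```python
# A minimal sketch of context-driven word sense disambiguation over WordNet,
# using NLTK's Lesk implementation as a stand-in for the mechanism described
# above (which additionally consults VerbNet).
# Requires: pip install nltk, then nltk.download('wordnet').
from nltk.wsd import lesk

def disambiguate(sentence: str, target: str):
    """Pick the WordNet synset of `target` that best overlaps the context."""
    context = sentence.lower().replace(".", "").split()
    sense = lesk(context, target)  # gloss-overlap scoring against the context
    return sense, (sense.definition() if sense else None)

if __name__ == "__main__":
    for text in ("He deposited the money at the bank",
                 "They had a picnic on the bank of the river"):
        synset, gloss = disambiguate(text, "bank")
        print(f"{text!r} -> {synset}: {gloss}")
```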
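Contribution 2 describes vertices and edges that carry attribute sets. Below is a minimal sketch of such a structure built on a networkx directed multigraph; the concept names, relation labels, and attributes are illustrative placeholders rather than the thesis's actual schema.

```python
# A minimal sketch of an "extended semantic network": a directed multigraph
# whose vertices and edges both carry attribute dictionaries, so a single
# relation can record tense, confidence, and other properties of a sentence.
import networkx as nx

esn = nx.MultiDiGraph()

# Vertices are concept entities with attributes (part of speech, sense id, ...).
esn.add_node("dog", pos="noun", sense="dog.n.01")
esn.add_node("bone", pos="noun", sense="bone.n.01")

# Edges are typed relations that also carry attributes.
esn.add_edge("dog", "bone", key="eats", tense="present", confidence=0.9)

# Querying the memory model: list all relations leaving "dog".
for src, dst, rel, attrs in esn.edges("dog", keys=True, data=True):
    print(f"{src} -[{rel} {attrs}]-> {dst}")
```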
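Contributions 3 and 4 turn selected commonsense into a Bayesian belief network that is then queried. The sketch below shows the kind of representation and query such a conversion could target: a hand-written network over three binary variables with inference by exhaustive enumeration. The variables, probabilities, and the query helper are assumptions for illustration, not the thesis's automatically learned networks.

```python
# A minimal sketch of a Bayesian belief network over binary variables,
# queried by exhaustive enumeration. Values are illustrative placeholders.
from itertools import product

# Each entry: variable -> (parents, CPT); the CPT maps a tuple of parent
# truth values to P(variable = True | parents).
network = {
    "rain":      ((), {(): 0.2}),
    "sprinkler": (("rain",), {(True,): 0.01, (False,): 0.4}),
    "wet_grass": (("rain", "sprinkler"),
                  {(True, True): 0.99, (True, False): 0.90,
                   (False, True): 0.90, (False, False): 0.0}),
}

def joint(assignment):
    """Probability of a full truth assignment: product of local CPT terms."""
    p = 1.0
    for var, (parents, cpt) in network.items():
        p_true = cpt[tuple(assignment[parent] for parent in parents)]
        p *= p_true if assignment[var] else 1.0 - p_true
    return p

def query(target, evidence):
    """P(target = True | evidence), summing the joint over hidden variables."""
    hidden = [v for v in network if v != target and v not in evidence]
    totals = {True: 0.0, False: 0.0}
    for values in product((True, False), repeat=len(hidden)):
        assignment = dict(evidence, **dict(zip(hidden, values)))
        for value in (True, False):
            assignment[target] = value
            totals[value] += joint(assignment)
    return totals[True] / (totals[True] + totals[False])

# Observing wet grass raises the belief that it rained.
print("P(rain | wet_grass) =", round(query("rain", {"wet_grass": True}), 3))
```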
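For contribution 5, the following sketch shows one simple way a recalled belief could be "reconsolidated" with new evidence: the stored conditional probability is encoded as Beta pseudo-counts and rewritten every time it is recalled together with an observation. The prior strength and the example values are assumptions for illustration; the thesis's actual update rule for the belief-network parameters is not reproduced here.

```python
# A minimal sketch of "reconsolidating" a stored belief: each recall alongside
# a new observation rewrites the old probability via Beta pseudo-counts.
class ReconsolidatedBelief:
    def __init__(self, p_true: float, strength: float = 10.0):
        # Encode the old memory as pseudo-counts of a Beta distribution.
        self.alpha = p_true * strength
        self.beta = (1.0 - p_true) * strength

    @property
    def p(self) -> float:
        return self.alpha / (self.alpha + self.beta)

    def recall(self, observed: bool) -> float:
        """Recalling the memory alongside a new observation updates it."""
        if observed:
            self.alpha += 1.0
        else:
            self.beta += 1.0
        return self.p

belief = ReconsolidatedBelief(p_true=0.9)   # old memory: grass is usually wet after rain
for outcome in (False, False, True):        # new, partly contradictory experiences
    print("updated belief:", round(belief.recall(outcome), 3))
```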
The combination of natural language processing (NLP) and inference engines has played an increasingly important role over the past decade in many application fields, such as search engines and data collection in bioinformatics. Researchers have been using statistical supervised and semi-supervised machine learning methods with tagged data to train efficient NLP models. However, with the ever-increasing amount of text data, tagging the data in all fields by hand becomes an impossible task, so people still have to refine the selected material with their own knowledge. On the other hand, inference engines that can be embedded in expert systems can logically collect useful data in one field and apply it to another. This suggests that a sensible way to handle such unlimited data is to use an inference engine to extend the knowledge of one field to another.
     In this thesis, NLP techniques and an inference engine are combined to simulate how the human brain recognizes text data. With the theory of memory reconsolidation, the text-based inference system can analyze descriptive sentences and give feedback on the input.
     The main contributions of this thesis include:
     1) A mechanism for word sense disambiguation, in which the lexical databases WordNet and VerbNet supply the data used to select the correct sense of a word.
     2) An extended semantic network that serves as a memory model for storing refined text data; it outperforms the traditional semantic network in storing complicated text and ensures that memory reconsolidation can be performed smoothly.
     3) A framework that automatically transfers data between the text storage model and the inference model.
     4) Automatic selection, creation, and combination of Bayesian networks from the knowledge base to support inference in different scenarios.
     5) A simulated memory reconsolidation process that improves inference on limited data.
References
[1] Sowa J.F.. Semantic Networks [M]. Encyclopedia of Artificial Intelligence (1987)
    [2] Booth T.L., Thompson R.A.. Applying probability measures to abstract languages [J]. IEEE Transactions on Computers C-22 (1973) 442–450
    [3] Baker J.K.. Trainable grammars for speech recognition [A]. In: Klatt D.H., Wolf J. (eds.): Speech Communication Papers for the 97th Meeting of the Acoustical Society of America[C] (1979) 547–550
    [4] Hindle D., Rooth M.. Structural ambiguity and lexical relations [J]. Computational Linguistics 19 (1993) 103-120
    [5] Ford M., Bresnan J., Kaplan R.M.. A competence-based theory of syntactic closure [M]. MIT Press, Cambridge, MA (1982)
    [6] Charniak E.. Tree-bank grammars [C]. Proc. of the 13th National Conference on Artificial Intelligence (1996) 1031–1036
    [7] Charniak E.. A maximum-entropy-inspired parser [C]. NAACL 1 (2000) 132-139
    [8] Charniak E.. Immediate-head parsing for language models [C]. ACL 39 (2001)
    [9] Collins M.J.. A new statistical parser based on bigram lexical dependencies [C]. ACL (1996) 184-191
    [10] Charniak E.. An improved maximum-entropy-inspired parser [C]. NAACL 1 (2001) 135-178
    [11] Collins M.. Head-Driven Statistical Models for Natural Language Parsing [D]. University of Pennsylvania (1999)
    [12] Klein D., Manning C.D.. Fast Exact Inference with a Factored Model for Natural Language Parsing [C]. In Advances in Neural Information Processing Systems 15 (NIPS 2002). MIT Press, Cambridge, MA (2003) 3-10
    [13] Klein D., Manning C.D.. Accurate Unlexicalized Parsing [C]. Proceedings of the 41st Meeting of the Association for Computational Linguistics (2003) 423-430
    [14] Dahlgren K., McDowell J.. Knowledge Representation for Commonsense Reasoning with Text [J]. Computational Linguistics 5 (1989) 149-170
    [15] Dahlgren K.. Using Commonsense Knowledge to Disambiguate Word Sense [M]. Amsterdam, North Holland (1988)
    [16] Yarowsky D.. Unsupervised word sense disambiguation rivaling supervised methods [C]. Proceedings of the 33rd annual meeting on Association for Computational Linguistics. Association for Computational Linguistics, Cambridge, Massachusetts (1995)
    [17] Fellbaum C.. WordNet: An Electronic Lexical Database [M]. MIT Press (1998)
    [18] Miller G.. Nouns in WordNet [M]. The MIT Press (1998)
    [19] Ide N., Véronis J.. Word sense disambiguation: The state of the art [J]. Computational Linguistics 24 (1998) 1-40
    [20] Patwardhan S., Banerjee S., Pedersen T.. Unsupervised Word Sense Disambiguation Using Contextual Semantic Relatedness [C]. Proceedings of the 4th International Workshop on Semantic Evaluations (SemEval-2007), Prague (2007) 390–393
    [21] Navigli R., Velardi P.. Structural semantic interconnections: a knowledge-based approach to word sense disambiguation [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence 27 (2005) 1075-1086
    [22] Chklovski T., Mihalcea R.. Building a sense tagged corpus with Open Mind Word Expert [C]. Proceedings of the ACL 2002 Workshop on "Word Sense Disambiguation: Recent Successes and Future Directions", Philadelphia (2002)
    [23] Diab M., Resnik P.. An unsupervised method for word sense tagging using parallel corpora [C]. Proceedings of ACL 2002, Philadelphia (2002)
    [24] Galley M., McKeown K.. Improving word sense disambiguation in lexical chaining [C]. Proceedings of IJCAI 2003, Acapulco, Mexico (2003)
    [25] Mihalcea R.. Using Wikipedia for Automatic Word Sense Disambiguation [C]. Proceedings of NAACL HLT, Rochester, NY (2007) 196–203
    [26] Gabrilovich E., Markovitch S.. Overcoming the brittleness bottleneck using wikipedia: Enhancing text categorization with encyclopedic knowledge [C]. Proceedings of AAAI2006, Boston (2006)
    [27] Sun M.S., Xu D.L.. Disyllabic Chinese Word Extraction Based on Character Thesaurus and Semantic Constraints in Word-Formation [C]. Proceedings of the 11th International Conference on Text, Speech and Dialogue, Brno, Czech Republic (2008)
    [28] Lidja I., Myunghee K., Richard K., Benoit L.. LFS. A Knowledge-Based Summarizer for Economic Statistics [J]. Computational Linguistics 22 (1992) 219-223
    [29] Binsted K., Cawsey A., Jones R.. Generating Personalised Patient Information Using the Medical Record [C]. Artificial Intelligence in Medicine. 5th Conference on Artificial Intelligence in Medicine Europe, AIME '95. Proceedings (1995) 29-41
    [30] Lavoie B., Rambow O.. A fast and portable realizer for text generation systems [C]. Proceedings of the 5th. Conference on Applied Natural Language Processing. Association for Computational Linguistics, Washington, D.C (1997) 265-268
    [31] Lester J.C., Porter B.W.. Developing and empirically evaluating robust explanation generators: The KNIGHT experiments [J]. Computational Linguistics 23 (1997) 65-101
    [32] Collins A.M., Quillian M.R.. Retrieval time from semantic memory [J]. Journal of verbal learning and verbal behavior 8 (1969) 240–248
    [33] Collins A.M., Quillian M.R.. Does category size affect categorization time? [J] Journal of verbal learning and verbal behavior 9 (1970) 432–438
    [34] Collins A.M., Loftus E.F.. A spreading-activation theory of semantic processing [J]. Psychol. Rev. 82 (1975) 407–428
    [35] Richens R.H.. Interlingual machine translation [J]. Computer Journal 1 (1958) 144 -147
    [36] Sowa J.F.. Conceptual graphs for a data base interface [J]. IBM Journal of Research and Development 20 (1974) 336-357
    [37] Sowa J.F.. Generating language from conceptual graphs [J]. Computers & Mathematics with Applications 9 (1983) 29-43
    [38] Sowa J.F.. On conceptual structures : A response to the review by S.W. Smoliar [J]. Artificial Intelligence 34 (1988) 388-394
    [39] Sowa J.F.. Logical foundations of artificial intelligence [J]: Michael R. Genesereth and Nils J. Nilsson, (Morgan Kaufmann, Los Altos, CA, 1987); 406 + xviii pages. Artificial Intelligence 38 (1989) 125-131
    [40] Sowa J.. Conceptual graphs [J]. Knowledge-Based Systems 5 (1992) 171-172
    [41] Sowa J.F.. Conceptual graphs as a universal knowledge representation [J]. Computers & Mathematics with Applications 23 (1992) 75-93
    [42] Kipper K., Palmer M., Rambow O.. Extending PropBank with VerbNet Semantic Predicates [J]. Workshop on Applied Interlinguas, held in conjunction with AMTA-2002, Tiburon, CA (2002)
    [43] See http://verbs.colorado.edu/~mpalmer/projects/verbnet.html
    [44] Ide N., Véronis J.. Word sense disambiguation: The state of the art [J]. Computational Linguistics 24 (1998) 1-40
    [45] Kilgarriff A., Rosenzweig R.. Framework and results for English Senseval [J]. Computers and the Humanities 34 (2000)
    [46] Mihalcea R.. Using wikipedia for automatic word sense disambiguation [C]. Proceedings of NAACL HLT, Vol. 2007 (2007)
    [47] Mitchell T.M., Shinkareva S.V., Carlson A., et al. Predicting Human Brain Activity Associated with the Meanings of Nouns [J]. Science 320 (2008) 1191-1195
    [48] Turner R., Sripada S., Reiter E., et al.. Generating spatio-temporal descriptions in pollen forecasts [C]. Proceedings of the Eleventh Conference of the European Chapter of the Association for Computational Linguistics (EACL 2006), Trento, Italy (2006) 163-166
    [49] See http://snowball.tartarus.org/index.php
    [50] Cheong Y., Young R.M.. Narrative Generation for Suspense: Modeling and Evaluation [R].
    [51] See http://en.wikipedia.org/wiki/File:Semantic_Net.svg
    [52] Miller G.A.. The magical number seven, plus or minus two: Some limits on our capacity for processing information [J]. Psychol. Rev. 63 (1956) 81-97
    [53] Baddeley A.D.. The influence of acoustic and semantic similarity on long-term memory for word sequences [J]. Quart. J. exp. Psychol 18 (1966) 302-309
    [54] Collins A.M., Quillian M.R.. Retrieval time from semantic memory [J]. Journal of verbal learning and verbal behavior 8 (1969) 240-248
    [55] Collins A.M., Quillian, M.R.. Does category size affect categorization time [J]. Journal of verbal learning and verbal behavior 9 (1970) 432-438
    [56] Tulving E., Thomson D.M.. Encoding Specificity and Retrieval Presses in Episodic Memory [J]. Psychol. Rev. 80 (1973) 352-373
    [57] Pearl J.. Theoretical Bounds on Complexity of Inexact Computations [J]. IEEE Transactions on Information Theory 22 (1976) 580-586
    [58] Pearl J.. Memory Versus Error Characteristics for Inexact Representations of Linear Orders [J]. IEEE Transactions on Computers 25 (1976) 922-928
    [59] Pearl J.. Summarizing Data Using Probabilistic Assertions [J]. IEEE Transactions on Information Theory 23 (1977) 459-465
    [60] Pearl J.. Framework for processing value judgments [J]. IEEE Transactions on Systems Man and Cybernetics 7 (1977) 349-354
    [61] Pearl J.. Note on management of probability assessors [J]. IEEE Transactions on Systems Man and Cybernetics 7 (1977) 402-403
    [62] Pearl J.. Connection between complexity and credibility of inferred models [J]. International Journal of General Systems 4 (1978) 255-264
    [63] Pearl J.. Economic basis for certain methods of evaluating probabilistic forecasts [J]. International Journal of Man-Machine Studies 10 (1978) 175-183
    [64] Pearl J.. Capacity and error estimates for boolean classifiers with limited complexity [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence 1 (1979) 350-356
    [65] Pearl J.. Asymptotic properties of minimax trees and game-searching procedures [J]. Artificial Intelligence 14 (1980) 113-138
    [66] Pearl J., Crolotte A.. Storage space versus validity of answers in probabilistic question-answering systems [J]. IEEE Transactions on Information Theory 26 (1980) 633-640
    [67] Pearl J.. A space-efficient online method of computing quantile estimates [J]. Journal of Algorithms 2 (1981) 164-177
    [68] Pearl J.. The solution for the branching factor of the alpha-beta-pruning algorithm and its optimality [J]. Communications of the ACM 25 (1982) 559-564
    [69] Pearl J., Kim J.H.. Studies in semi-admissible heuristics [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence 4 (1982) 392-399
    [70] Pearl J., Leal A., Saleh J.. Goddess-A goal-directed decision structuring system [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence 4 (1982) 250-262
    [71] Pearl J.. On the nature of pathology in game searching [J]. Artificial Intelligence 20 (1983) 427-453
    [72] Pearl J.. Special issue on search and heuristics-preface [J]. Artificial Intelligence 21 (1983) 1-6
    [73] Pearl J.. Knowledge versus search-A quantitative-analysis using A-star [J]. Artificial Intelligence 20 (1983) 1-13
    [74] Pearl J.. Some recent results in heuristic-search theory [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence 6 (1984) 1-13
    [75] Spear N.E., Mueller C.W.. Consolidation as a function of retrieval [A]. In: Weingartner H., P.E.S. (ed.): Memory consolidation: Psychobiology of cognition [M]. Lawrence Erlbaum Associates, Hillsdale, NJ (1984) 111-147
    [76] Arnborg S.. Efficient algorithms for combinatorial problems on graphs with bounded decomposability-A survey [J]. BIT Numerical Mathematics 25 (1985) 1-23
    [77] Lenat D.B., Prakash M., Shepherd M.. CYC: Using common sense knowledge to overcome brittleness and knowledge acquisition bottlenecks [J]. AI Magazine 6 (1986) 65-85
    [78] Pearl J.. Fusion, Propagation, and Structuring in Belief Networks [J]. Artificial Intelligence 29 (1986) 241-288
    [79] Pearl J.. On evidential reasoning in a hierarchy of hypotheses [J]. Artificial Intelligence 28 (1986) 9-15
    [80] Pearl J.. Probabilistic reasoning using graphs [A]. Lecture Notes in Computer Science [C] 286 (1987) 200-202
    [81] Pearl J.. Distributed revision of composite beliefs [J]. Artificial Intelligence 33 (1987) 173-215
    [82] Pearl J.. Evidential reasoning using stochastic simulation of causal-models [J]. Artificial Intelligence 32 (1987) 245-257
    [83] Pearl J., Korf R.E.. Search techniques [J]. Annual Review of Computer Science 2 (1987) 451-467
    [84] Dahlgren K.. Naive Semantics for Natural Language Understanding [M]. Kluwer Academic Press, Boston, MA (1988)
    [85] Pearl J.. Embracing causality in default reasoning [J]. Artificial Intelligence 35 (1988) 259-271
    [86] Pearl J.. Reasoning under Uncertainty [J]. Annual Review of Computer Science 4 (1989) 37-72
    [87] Pearl J., Geiger D., Verma T.. Conditional-independence and its representations [J]. Kybernetika 25 (1989) 33-42
    [88] Baddeley A.D., Della S.S.. Working memory and executive control [J]. Philosophical Transactions of the Royal Society of London 351 (1996) 1397-1404
    [89] Christina R., Strong M.M., Mishra K., et al. Emotionally Driven Natural Language Generation for Personality Rich Characters in Interactive Games [C]. Third Conference on Artificial Intelligence for Interactive Digital Entertainment (AIIDE-07), (2007) 98-100
    [90] Watt E., Bui A.A.. Evaluation of a dynamic bayesian belief network to predict osteoarthritic knee pain using data from the osteoarthritis initiative [C]. AMIA Annu Symp Proc (2008) 788-792
    [91] Kipper K., Korhonen A., Ryant N., et al. Extending VerbNet with Novel Verb Classes [C]. Fifth International Conference on Language Resources and Evaluation (LREC 2006), Genoa, Italy (2006)
    [92] Kipper K., Korhonen A., Ryant N., et al. Extending VerbNet with Novel Verb Classes [C]. Fifth International Conference on Language Resources and Evaluation (LREC 2006), Genoa, Italy (2006)
    [93] Klein D., Manning C.D.. Fast Exact Inference with a Factored Model for Natural Language Parsing [C]. In Advances in Neural Information Processing Systems 15 (NIPS 2002). MIT Press, Cambridge, MA (2003) 3-10
    [94] Deemter V.K., E.R., Horacek H.. Formal Issues in Natural Language Generation [J]. Research on Language & Computation 4 (2006) 1-7
    [95] Liao S., Qing H., Yi W.. A functional-dependencies-based Bayesian networks learning method and its application in a mobile commerce system [J]. IEEE Transactions on Systems, Man, and Cybernetics, Part B 36 (2005) 660-671
    [96] Zheng D.Q., Li S., Zhao T.J.. A Hybrid Chinese Language Model based on a Combination of Ontology with Statistical Method [C]. Proceedings of IJCNLP 2005, Jeju island, Korea (2005)
    [97] Geiger D., Verma T., Pearl, J.. Identifying independence in bayesian networks [J]. Networks 20 (1990) 507-534
    [98] Lin D., Zhao S., Qin L., et al. Identifying synonyms among distributionally similar words [C]. Proceedings of the 18th International Joint Conference on Artificial Intelligence (IJCAI03), Acapulco, Mexico (2003) 1492–1493
    [99] Niculescu-Mizil A., Caruana R.. Inductive transfer for Bayesian network structure learning [A]. In: Meila M., Shen X. (eds.): Proceedings of the 11th International Conference on AI and Statistics (AISTATS) [C], Vol. 2 (2007) 339-346
    [100] Loureiro O., Siegelmann H.. Introducing an active cluster-based information retrieval paradigm [J]. J Am Soc Inf Sci Tec 56 (2005) 1024-1030
    [101] Miller G., Beckwith R., Fellbaum C., et al. Introduction to WordNet: an on-line lexical database [J]. International Journal of Lexicography 3 (1990) 235-244
    [102] Levy R., Manning C.D.. Is it harder to parse Chinese, or the Chinese Treebank? [C] ACL (2003)
    [103] Cozman F.G.. JavaBayes system [R]. The ISBA Bulletin 7 (2001) 16-21
    [104] Naphade M., Smith J.R., Tesic J., et al. Large-Scale Concept Ontology for Multimedia [J]. IEEE MultiMedia 13 (2006) 86-91
    [105] Chklovski T.. Learner: a system for acquiring commonsense knowledge by analogy [C]. Proceedings of the 2nd international conference on Knowledge capture. ACM, Sanibel Island, FL, USA (2003)
    [106] Murdock B.B., Kahana M.J.. List-Strength and List-Length Effects: Reply to Shiffrin, Ratcliff, Murnane, and Nobel (1993) [J]. Journal of Experimental Psychology: Learning, Memory, and Cognition 19 (1993) 1450-1453
    [107] Reijmers L.G., Perkins B.L., Matsuo N., et al. Localization of a Stable Neural Correlate of Associative Memory [J]. Science 317 (2007) 1230-1233
    [108] Lease M., Charniak E., Johnson M., et al. A Look At Parsing and Its Applications [C]. In Proceedings of the Twenty-First National Conference on Artificial Intelligence (AAAI-06) (2006) 16-20
    [109] Zheng D.Q., Zhao T.J., Li S.. Machine Learning for Automatic Acquisition of Chinese Linguistic Ontology Knowledge [C]. Proceedings of ICMLC 2005, Guangzhou, China (2005)
    [110] Mjolsness E., DeCoste D.. Machine Learning for Science: State of the Art and Future Prospects [J]. Science 293 (2001) 2051-2055
    [111] Lu S.L., Guo A., Becker K., et al. Microarray analysis of global gene expression in the mammary gland following estrogen and progesterone treatment of ovariectomized mice [J]. Cancer Epidem Biomar 12 (2003) 1295s-1296s
    [112] Lu S.L., Minter L., Guo A.Y., et al. Microarray analysis of hormonal treated mammary gland reveals the role of extracellular matrix in p53 activities [J]. Cancer Epidem Biomar 13 (2004) 1836s-1836s
    [113] MacCartney B., Manning C.D.. Modeling Semantic Containment and Exclusion in Natural Language Inference [C]. In Proceedings of the 22nd International Conference on Computational Linguistics. Coling 2008 Organizing Committee, Manchester, UK (2008) 521-528
    [114] Tronson N.C., Taylor J.R.. Molecular mechanisms of memory consolidation [J]. Nature Reviews Neuroscience 8 (2007) 262-275
    [115] Heckerman D.. A tutorial on learning Bayesian networks [A]. In: Jordan, M.I. (ed.): Learning in Graphical Models [C] (1998)
    [116] Friedman N.. The Bayesian structural em algorithm [A]. In: Cooper G.F., Moral S. (eds.): Uncertainty in Artificial Intelligence [C]: Proceedings of the Fourteenth Conference. Morgan Kaufmann, Madison, Wisconsin (1998) 129-138
    [117] Russell S.J., Norvig P.. Artificial intelligence: a modern approach [M]. Prentice Hall, Englewood Cliffs, N.J. (1995)
