Abstract
Combining ontology with text mining, this paper proposes a text mining model based on a domain ontology. An ontology structure is built first and the ontology's "concept-concept" similarity matrix is introduced. Documents are then represented with an ontology-based concept vector space model in place of the traditional vector space model, and text mining is carried out on that representation.
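The abstract does not say how the "concept-concept" matrix enters the document representation, so the following is only a minimal sketch under stated assumptions, not the paper's method. It assumes each document is first reduced to a vector of concept counts and that this vector is then multiplied by the similarity matrix so that related concepts share weight. The concept list, the matrix values, the term-to-concept lexicon, and the function names are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical toy ontology: three concepts and a hand-made symmetric
# "concept-concept" similarity matrix (illustrative values only).
CONCEPTS = ["vehicle", "car", "fruit"]
SIMILARITY = [
    [1.0, 0.8, 0.0],
    [0.8, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

# Hypothetical lexicon mapping surface terms to ontology concepts.
TERM_TO_CONCEPT = {
    "vehicle": "vehicle",
    "truck": "vehicle",
    "car": "car",
    "sedan": "car",
    "apple": "fruit",
}


def concept_vector(tokens):
    """Count how often each ontology concept is mentioned in a token list."""
    counts = [0.0] * len(CONCEPTS)
    for token in tokens:
        concept = TERM_TO_CONCEPT.get(token.lower())
        if concept is not None:
            counts[CONCEPTS.index(concept)] += 1.0
    return counts


def enrich(vector):
    """Spread weight to related concepts by computing vector * SIMILARITY."""
    size = len(CONCEPTS)
    return [
        sum(vector[i] * SIMILARITY[i][j] for i in range(size))
        for j in range(size)
    ]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


doc_a = "the truck is a vehicle".split()
doc_b = "a red sedan".split()
doc_c = "an apple".split()

raw_ab = cosine(concept_vector(doc_a), concept_vector(doc_b))
enriched_ab = cosine(enrich(concept_vector(doc_a)), enrich(concept_vector(doc_b)))
enriched_ac = cosine(enrich(concept_vector(doc_a)), enrich(concept_vector(doc_c)))
```

In this toy setting, `doc_a` and `doc_b` share no concept, so their plain concept vectors are orthogonal. After enrichment they become similar because "vehicle" and "car" are related in the matrix, while `doc_a` and `doc_c` remain unrelated.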