Research on the Frequent Substructure Mining Algorithm for Graph Classification

Author: Zou Xiaohong (邹晓红)
Thesis level: Doctoral
Discipline: Circuits and Systems
Keywords: Graph mining; Subgraph isomorphism; Canonical code; Feature graph pattern; Information entropy; Associative classification
Degree year: 2011
Advisors: Tian Naishuo (田乃硕); Guo Jingfeng (郭景峰)
Discipline code: 080902
Degree-granting institution: Yanshan University (燕山大学)

Abstract

With the development of computer and information technology, data mining has been widely applied in artificial intelligence, pattern recognition, bioinformatics, and many other fields. Demand for mining complex types of data is now rising, and researchers have begun to study new applications and theory in this area, drawing on the experience and methods of structured data mining to help solve the new problems.
     In computer science, a graph offers an intuitive representation and can express rich semantics. At the same time, the graph is one of the most complex data structures: compared with ordinary data, these rich semantics increase both the complexity of the data structure and the difficulty of mining interesting graph structures. Graph mining therefore has to combine graph theory with data mining techniques. It has a wide range of applications in both research and business, for example CAD circuit analysis, molecular model analysis, discovery of user interests in Web browsing, and data compression. Within graph mining, frequent subgraph mining is the foundation of graph classification and graph clustering, so how to mine interesting frequent subgraph patterns from large collections of graphs has become one of the most active research topics in the data mining field. This is the focus of the thesis, whose main contents are as follows.
     First, the thesis introduces the basic concepts of graphs, the types of graph mining methods, and the classic frequent subgraph mining algorithms. It then analyzes and summarizes the existing classic algorithms, pointing out the strengths and weaknesses of representative algorithms and the main open problems in graph mining, which sets the direction for the subsequent research.
     Second, to address the redundant candidates that the pattern-growth algorithm gSpan generates when extending frequent subgraphs, an improved algorithm, CSGM, is proposed. CSGM replaces the adjacency-list storage structure of gSpan with the ADI++ storage structure, which lets it handle larger graph datasets, and it introduces an effective method for deleting non-minimum DFS codes, so that every extension of a frequent subgraph yields the canonical code of the graph. This avoids both the test of whether a code is canonical and the support computation for graphs with non-canonical codes. In addition, a hash table that stores the hash addresses of isomorphic graphs is used during mining to compute the support of frequent subgraphs, which avoids repeated scans of the graph database and reduces the number of subgraph isomorphism tests. Comprehensive experiments on real and synthetic datasets verify the correctness of the algorithm, its running time, and its ability to handle large graph databases.
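The idea of counting support through a hash table keyed by a canonical code can be illustrated with a minimal sketch. It is not the CSGM algorithm, and it does not compute gSpan's minimum DFS code: as a stand-in, `canonical_code` below finds a canonical form by brute force over all vertex renumberings, which is only practical for very small graphs. The function names and the input layout (vertex label list plus labeled edge triples) are assumptions made for this illustration.

```python
from itertools import permutations


def canonical_code(vertex_labels, edges):
    """Brute-force canonical form of a small labeled undirected graph.

    vertex_labels: list of labels, where the index is the vertex id.
    edges: iterable of (u, v, edge_label) triples.
    Returns the lexicographically smallest (labels, edges) encoding over
    all vertex renumberings, so isomorphic graphs get the same code.
    The cost is factorial in the number of vertices (illustration only).
    """
    n = len(vertex_labels)
    best = None
    for perm in permutations(range(n)):  # perm[old_id] == new_id
        labels = [None] * n
        for old, new in enumerate(perm):
            labels[new] = vertex_labels[old]
        renamed = sorted(
            (min(perm[u], perm[v]), max(perm[u], perm[v]), label)
            for u, v, label in edges
        )
        code = (tuple(labels), tuple(renamed))
        if best is None or code < best:
            best = code
    return best


def count_support(occurrences):
    """Count support with a hash table keyed by canonical code.

    occurrences: iterable of (graph_id, vertex_labels, edges), one entry
    per subgraph occurrence found in the graph with that id.
    Returns {canonical_code: number of distinct graphs containing it}.
    """
    table = {}
    for graph_id, labels, edges in occurrences:
        # Isomorphic subgraphs land in the same bucket, so no pairwise
        # isomorphism test is needed to merge their occurrence lists.
        table.setdefault(canonical_code(labels, edges), set()).add(graph_id)
    return {code: len(graph_ids) for code, graph_ids in table.items()}
```

For example, a C-O edge numbered (0, 1) in one graph and (1, 0) in another maps to a single table entry with support 2. In a pattern-growth miner the canonical code would instead be the minimum DFS code, which is obtained during extension rather than by enumeration.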
     Third, the frequent subgraphs obtained by mining are used as a candidate feature set, and a feature pattern selection algorithm is proposed to select from it a small set of discriminative frequent patterns to serve as classification features. An objective function is defined using the discriminative measures of information entropy and information gain, and the method for selecting classification features is described. To reduce the search space of graph patterns and improve mining efficiency, two pruning strategies, vertical pruning and horizontal pruning, are also given. Experiments verify the quality of the mining results and the running efficiency of the algorithm.
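How information gain can score a candidate pattern is sketched below. The abstract does not give the thesis's objective function or its pruning rules, so this only shows the standard definition IG = H(C) - H(C | pattern present/absent) for a binary "graph contains the pattern" feature. The names `information_gain` and `rank_patterns` and the input layout are assumptions for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(contains, labels):
    """Information gain of a binary pattern feature.

    contains[i] is True if graph i contains the pattern;
    labels[i] is the class label of graph i.
    """
    with_pattern = [y for x, y in zip(contains, labels) if x]
    without_pattern = [y for x, y in zip(contains, labels) if not x]
    n = len(labels)
    conditional = (
        len(with_pattern) / n * entropy(with_pattern)
        + len(without_pattern) / n * entropy(without_pattern)
    )
    return entropy(labels) - conditional


def rank_patterns(occurrence, labels):
    """Sort patterns by information gain, highest first.

    occurrence: {pattern_name: list of booleans, one per graph}.
    """
    scored = [(name, information_gain(col, labels))
              for name, col in occurrence.items()]
    return sorted(scored, key=lambda item: -item[1])
```

A pattern that occurs in exactly the graphs of one class receives the maximum score H(C), while a pattern distributed identically across classes scores zero, which is why such a measure can be used to keep only a small discriminative subset of the frequent subgraphs.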
     Finally, a graph classification algorithm based on frequent patterns is proposed. The relationship between the support of a pattern and its predictive power is first analyzed theoretically, and this analysis yields a strategy for setting the minimum support threshold min_sup. Discriminative frequent subgraphs are then used as feature graph patterns to construct classification rules, and a classifier is trained by associative classification to identify the class of a new graph. Experiments verify the performance of the classifier.
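The general shape of associative classification over pattern features can be sketched as follows. This is a simplified single-pattern rule scheme in the spirit of rule-based associative classifiers, not the classifier proposed in the thesis. Each training graph is assumed to be already reduced to the set of feature patterns it contains, and `min_conf`, `build_rules`, and `classify` are hypothetical names introduced here.

```python
from collections import Counter


def build_rules(training, min_sup, min_conf):
    """Build rules of the form 'pattern -> class' from training data.

    training: list of (set_of_pattern_ids, class_label), one per graph.
    A rule is kept if its support (fraction of graphs that contain the
    pattern and have the class) and its confidence both reach the
    thresholds. Returns (rules, default_class); rules are sorted by
    confidence, then support, both descending.
    """
    n = len(training)
    pattern_count = Counter()
    pair_count = Counter()
    for patterns, label in training:
        for p in patterns:
            pattern_count[p] += 1
            pair_count[(p, label)] += 1
    rules = []
    for (p, label), k in pair_count.items():
        support = k / n
        confidence = k / pattern_count[p]
        if support >= min_sup and confidence >= min_conf:
            rules.append((confidence, support, p, label))
    rules.sort(key=lambda r: (-r[0], -r[1], str(r[2])))
    # Fall back to the majority class when no rule matches.
    default_class = Counter(lbl for _, lbl in training).most_common(1)[0][0]
    return rules, default_class


def classify(patterns, rules, default_class):
    """Label a new graph with the first (best-ranked) matching rule."""
    for _confidence, _support, p, label in rules:
        if p in patterns:
            return label
    return default_class
```

A new graph is first matched against the feature patterns (a subgraph isomorphism step omitted here), and the resulting pattern set is passed to `classify`. Raising `min_sup` shrinks the rule set and speeds up mining, but may discard discriminative patterns, which is the trade-off a support-setting strategy has to balance.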

