Research on Data Mining Techniques for Passing On the Experience of Renowned Senior TCM Physicians
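Abstract
Traditional Chinese medicine (TCM) is an outstanding part of the Chinese nation's cultural heritage. Amid the current worldwide trend of returning to nature, its advantages are increasingly evident and its standing increasingly important. TCM is a discipline that depends heavily on clinical experience. The diagnostic and therapeutic experience of contemporary China's renowned senior TCM physicians is the product of combining clinical practice with TCM theory and then breaking through and innovating on it; it embodies both the principles of basic TCM theory and the physicians' own original insights, and it is a valuable asset for the development of Chinese medicine. Inheriting their academic thought and clinical experience can therefore not only enrich the theoretical system of Chinese medicine but also strongly promote the development of medical science as a whole.
     Traditional methods of studying the academic thought and clinical experience of renowned senior TCM physicians have increasingly shown their shortcomings, and it has become urgent to apply modern science and technology to analyze this clinical experience scientifically. Data mining is an effective information-processing technique. Applying it to this academic thought and clinical experience makes it possible to analyze comprehensively the regularities they contain, to characterize each physician's individualized diagnosis and treatment information, and to extract the new theories, methods, and knowledge latent in clinical experience, so that the experience of renowned physicians can be effectively summarized and passed on.
     This thesis studies the data mining techniques involved in passing on that experience. Using one renowned senior TCM physician's clinical diagnostic case records for chronic gastritis as raw data, it examines the application of several algorithms from different angles. For association rule mining, it analyzes the classical algorithms Apriori and FP-Growth and, to address the shortcomings of association rule mining based on the support-confidence framework, studies a genetic-algorithm-based algorithm for mining positively correlated association rules. FP-Growth and the genetic-algorithm-based algorithm are then both applied to the TCM clinical data, and the results of the two algorithms are analyzed. For decision tree classification, it analyzes two important decision tree learning algorithms, ID3 and C4.5. Because C4.5 offers relatively high accuracy and strong adaptability, it is applied to TCM syndrome differentiation: using TCM syndrome differentiation data for chronic gastritis as experimental data, a decision tree for classifying chronic gastritis syndromes is built and analyzed.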
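The shortcoming of the support-confidence framework noted above can be illustrated with a small sketch. It is not the thesis's algorithm or data: the ten toy records and the item names `symptom_x`, `herb_y`, and `herb_z` are hypothetical, and lift is used here as one common measure of positive correlation (the abstract does not say which measure the thesis adopts). A rule can clear a typical confidence threshold while its antecedent and consequent are in fact negatively correlated.

```python
# Toy illustration (hypothetical records, not the thesis's case data).

def support(itemset, records):
    """Fraction of records that contain every item in `itemset`."""
    return sum(itemset <= record for record in records) / len(records)

def confidence(antecedent, consequent, records):
    """Estimated P(consequent | antecedent)."""
    return support(antecedent | consequent, records) / support(antecedent, records)

def lift(antecedent, consequent, records):
    """Confidence divided by the consequent's baseline frequency.
    Above 1 suggests positive correlation, below 1 negative correlation."""
    return confidence(antecedent, consequent, records) / support(consequent, records)

records = [
    {"symptom_x", "herb_y", "herb_z"},
    {"symptom_x", "herb_y", "herb_z"},
    {"symptom_x", "herb_y", "herb_z"},
    {"symptom_x", "herb_y"},
    {"symptom_x"},
    {"herb_y", "herb_z"},
    {"herb_y"},
    {"herb_y"},
    {"herb_y"},
    {"herb_y"},
]

x, y, z = {"symptom_x"}, {"herb_y"}, {"herb_z"}

# symptom_x -> herb_y has high confidence only because herb_y is in almost
# every record; its lift is below 1.
conf_xy, lift_xy = confidence(x, y, records), lift(x, y, records)

# symptom_x -> herb_z has lower confidence but lift above 1.
conf_xz, lift_xz = confidence(x, z, records), lift(x, z, records)
```

A miner that filters on support and confidence alone would keep `symptom_x -> herb_y` and might discard `symptom_x -> herb_z`, which is the kind of outcome a positive-correlation criterion is meant to avoid.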
Traditional Chinese Medicine (TCM) is an outstanding part of the cultural heritage of the Chinese nation. Amid today's trend of returning to nature, its advantages are becoming more and more prominent and its status increasingly important. TCM is a discipline that demands extensive clinical experience. The clinical experience of renowned senior TCM physicians is a synthesis of practice and theory, and a valuable asset for the development of TCM. Studying the academic thinking and clinical experience of these physicians can not only enrich the theoretical system but also play a tremendous role in promoting the development of TCM.
     For studying the academic thinking and clinical experience of renowned senior TCM physicians, traditional methods have proved inadequate, so it is necessary to adopt modern science and technology to achieve this goal. Data mining is an effective technology for this purpose. Through data mining, academic thinking and clinical experience can be analyzed, and new knowledge such as new theories and new rules can be extracted. Accordingly, the clinical experience of renowned senior TCM physicians can be passed on effectively.
     In this thesis, we focus on several data mining technologies applied to TCM. A renowned senior TCM physician's medical records on chronic gastritis are used as the original data, and the application of different algorithms is studied from different angles. For association rule mining, the classical algorithms Apriori and FP-Growth are compared and, in view of the limitations of support-confidence algorithms, a new algorithm for mining positively correlated association rules based on Genetic Algorithms (GAs) is designed. The new algorithm and FP-Growth are then both used to mine association rules from the chronic gastritis medical records, and the results of the two algorithms are compared. For decision trees, the ID3 and C4.5 algorithms, two important decision tree learning algorithms, are studied. Because C4.5 offers high accuracy and strong adaptability, it is applied to TCM syndrome differentiation. A decision tree for the syndrome differentiation of chronic gastritis is built from chronic gastritis syndrome differentiation data, and the result is analyzed.
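One well-known difference between ID3 and C4.5 is the split criterion: ID3 ranks attributes by information gain, while C4.5 normalizes that gain by the split information (gain ratio), which penalizes attributes with many distinct values. The following is a minimal sketch of that difference only, not the thesis's implementation. The rows, the attribute names `record_id` and `sign`, and the labels are hypothetical, and C4.5's other features (continuous attributes, pruning, missing values) are omitted.

```python
from collections import Counter
from math import log2

def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())

def info_gain(rows, attr):
    """ID3 criterion: label entropy minus weighted entropy after splitting on attr."""
    groups = {}
    for row in rows:
        groups.setdefault(row[attr], []).append(row["label"])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row["label"] for row in rows]) - remainder

def gain_ratio(rows, attr):
    """C4.5 criterion: information gain divided by the entropy of attr itself."""
    split_info = entropy([row[attr] for row in rows])
    return info_gain(rows, attr) / split_info if split_info > 0 else 0.0

# Hypothetical toy rows: `record_id` is unique per row, `sign` is binary.
labels = ["A", "A", "A", "A", "A", "B", "B", "B"]
signs = ["p", "p", "p", "p", "q", "q", "q", "q"]
rows = [
    {"record_id": i, "sign": s, "label": lab}
    for i, (s, lab) in enumerate(zip(signs, labels))
]

# Information gain favors the useless unique identifier ...
id3_choice = max(["record_id", "sign"], key=lambda a: info_gain(rows, a))
# ... while gain ratio favors the binary attribute.
c45_choice = max(["record_id", "sign"], key=lambda a: gain_ratio(rows, a))
```

On this toy data `id3_choice` is `record_id` and `c45_choice` is `sign`: splitting on a unique identifier yields pure but meaningless leaves, and dividing by the split information removes that advantage.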