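Research on a Visual Pattern Analysis Method for Traditional Chinese Medicine Fingerprints Based on Multivariate Graph Representation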
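Abstract
The modernization of Traditional Chinese Medicine (TCM) is currently a major research focus in the pharmaceutical community in China and abroad. TCM fingerprint technology is a powerful tool for this modernization: it can be used to authenticate TCM, control its quality, and evaluate its safety and efficacy. The interpretation and processing of TCM fingerprints usually relies on computers, which analyze the measured values of chemical constituents and the related pharmacological effects to find the underlying regularities quickly and accurately, so that these can serve as indicators for quality control of medicinal materials. Because TCM works through the synergy of multiple components, the integrity of the fingerprint data is critical to the research. In current studies, however, some approaches break this integrity during analysis, while others preserve it but are unintuitive and hard to understand. How to visualize the analysis process while preserving the integrity of the fingerprint data is therefore an important topic in TCM fingerprint research.
     Building on the theory of multivariate graph representation, this thesis addresses the data processing problem of TCM fingerprints and proposes a new visual pattern analysis method. The method unifies integrity and visibility in TCM fingerprint data processing and provides a new information processing tool for the identification and evaluation of TCM. Three basic problems are studied: multivariate graph representation and feature extraction for multivariate data; the construction of a radar plot classification model for TCM fingerprints; and the construction of the standard fingerprint.
     1. The multivariate graph representation of multivariate data and feature extraction methods based on the multivariate graph principle were studied. A between-class overlap coefficient matrix was proposed to remove vectors with large variance but poor classification information. The traditional scatter matrix was optimized to distinguish vectors whose class means lie at similar distances from the global mean. Each sample point was represented by its distances to the individual classes before classification. The use of human-computer interaction in multivariate graph feature extraction was also studied. Experiments demonstrated the feasibility and advantages of these methods.
     2. The classification of TCM fingerprints based on the radar plot representation principle was studied. Building on radar plot representation and feature extraction for multivariate data, the centroid feature of the radar plot was extended and the adjacent amplitude ratio feature was optimized. For the classification of TCM fingerprints, a centroid feature peak parameter and a classification model based on multi-layer radar plot representation were proposed. The experiments achieved good results.
     3. The construction of the standard fingerprint was studied. To address the problems of traditional construction methods, and based on the clustering principle, a clustering rule that fuses multi-parameter distances was taken as the standard fingerprint. The method is aimed at classification, so its result is no longer a characteristic fingerprint or feature library in the traditional sense, but a classification standard. Experiments analyzed the feasibility and advantages of the method.
     4. A near-infrared spectrometer was used to obtain fingerprints of Radix Puerariae (pure and adulterated powder) and of different types of ginseng. The multi-layer radar plot model and related techniques were applied to authenticate Radix Puerariae and to identify the ginseng types. The results verified the feasibility and effectiveness of the visual pattern analysis method for TCM fingerprints proposed in this thesis.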
Modernizing Traditional Chinese Medicine (TCM) is an active research topic in the pharmaceutical community at home and abroad. Fingerprint technology is a powerful tool for this effort: it can be used to authenticate TCM, control its quality, and evaluate its safety and efficacy. TCM fingerprints are usually analyzed and processed by computer, which examines the measured chemical composition values and the related pharmacological effects in order to find the underlying regularities quickly and accurately and to use them as indicators for quality control. Because TCM acts through the synergy of multiple components, the integrity of the fingerprint data is essential to the research. In current studies, however, some analyses break this integrity, while others preserve it but are unintuitive and hard to understand. How to visualize the analysis process while preserving the integrity of the fingerprint data is therefore an important issue in TCM fingerprint research.
     To address the data processing problems of TCM fingerprints, this thesis proposes a novel visual pattern analysis method based on multivariate graph representation theory. The method unifies the integrity and the visibility of TCM fingerprint data processing and provides a new information processing tool for the identification and evaluation of TCM. The work focuses on three basic problems: multivariate graph representation and feature extraction for multivariate data, the construction of a radar plot classification model for TCM fingerprint data, and the construction of the standard fingerprint.
     First, the multivariate graph representation of multivariate data and feature extraction methods based on the multivariate graph principle were studied. A between-class overlap coefficient matrix was proposed to eliminate variables with large variance but little classification information. The traditional scatter matrix was optimized to separate variables whose class means lie at similar distances from the global mean. Each sample was represented by its distances to the class hyperplanes, and these distances were used as features for classification. The application of human-computer interaction (HCI) to feature extraction was also studied. Experiments demonstrated the feasibility and advantages of these methods.
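As a rough illustration of the distance-based representation mentioned above, the following Python sketch re-expresses each sample as its set of distances to the classes and assigns it to the nearest one. The abstract does not define the distance, so Euclidean distance to each class mean is used here only as a stand-in for the distance to a class hyperplane; the function names and the toy data are hypothetical.

```python
import math


def class_means(samples, labels):
    """Return {label: mean vector} for the labelled training samples."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}


def distance_features(x, means):
    """Re-express sample x as its distance to every class.

    Assumption: Euclidean distance to the class mean stands in for the
    class distance used in the thesis.
    """
    return {y: math.dist(x, m) for y, m in means.items()}


def nearest_class(x, means):
    """Classify x by the smallest entry of its distance representation."""
    feats = distance_features(x, means)
    return min(feats, key=feats.get)


# Toy data: two classes of three-variable "fingerprints" (made up).
train = [[1.0, 2.0, 1.0], [1.2, 1.8, 1.1], [4.0, 0.5, 3.0], [4.2, 0.7, 2.8]]
labels = ["A", "A", "B", "B"]
means = class_means(train, labels)

print(distance_features([1.1, 2.0, 1.0], means))
print(nearest_class([4.1, 0.6, 3.1], means))
```

The distance representation has one dimension per class rather than one per measured variable, which is what makes it convenient to plot and inspect.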
     Second, the classification of TCM fingerprints based on the radar plot representation principle was studied. Building on radar plot representation and feature extraction for multivariate data, the centroid feature of the radar plot was extended and the adjacent amplitude ratio feature was optimized. For the TCM fingerprint classification problem, a centroid feature peak parameter and a multi-layer radar plot classification model were proposed. The experiments achieved good results.
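The two radar plot features named above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the thesis's formulation: the variables are placed on equally spaced axes, the centroid is taken as the area centroid of the resulting polygon (shoelace formula), and the adjacent amplitude ratio is taken as the ratio of each value to its predecessor, wrapping around the plot. All function names are hypothetical.

```python
import math


def radar_vertices(values):
    """Place value i on axis i of an equally spaced radar plot."""
    n = len(values)
    return [
        (r * math.cos(2 * math.pi * i / n), r * math.sin(2 * math.pi * i / n))
        for i, r in enumerate(values)
    ]


def polygon_centroid(vertices):
    """Area centroid of a simple polygon via the shoelace formula."""
    area2 = cx = cy = 0.0  # area2 accumulates twice the signed area
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    return cx / (3 * area2), cy / (3 * area2)


def adjacent_amplitude_ratios(values):
    """Ratio of each value to the previous one, cyclically.

    Assumes all values are nonzero.
    """
    n = len(values)
    return [values[(i + 1) % n] / values[i] for i in range(n)]


# A made-up four-variable profile with one dominant component.
profile = [3.0, 1.0, 1.0, 1.0]
centroid = polygon_centroid(radar_vertices(profile))
ratios = adjacent_amplitude_ratios(profile)
print(centroid)
print(ratios)
```

In this sketch the centroid shifts toward the axis of the dominant variable, while the ratios capture the local shape between neighbouring axes, so both depend on the whole profile rather than on a single variable.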
     Third, the construction of the standard fingerprint was studied. To address the problems of traditional construction methods, and based on the clustering principle, a clustering rule that fuses multi-parameter distances was put forward as the standard fingerprint. Because the method is aimed at classification, its result is no longer a characteristic fingerprint or feature library in the traditional sense, but a classification standard. Experiments analyzed the feasibility and superiority of the method.
     Finally, near-infrared spectroscopy was used to obtain fingerprints of Radix Puerariae (pure and adulterated powder) and of different types of ginseng. The multi-layer radar plot model and related techniques were applied to authenticate Radix Puerariae and to identify the ginseng types. The experimental results verified the feasibility and effectiveness of the proposed visual pattern analysis method for TCM fingerprints.
