Laboratory Medicine Practice in Diagnosing Colorectal Cancer with Gastrointestinal Tumor Markers
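Abstract
Data mining is the non-trivial process of extracting valid, novel, potentially useful, and ultimately understandable patterns from large volumes of data. In the broad sense, data mining is the process of "mining" interesting knowledge from large amounts of data stored in databases, data warehouses, or other information repositories. Data mining is also called Knowledge Discovery in Database (KDD), although some regard it as just one essential step in the knowledge discovery process. That process consists of the following steps: (1) data cleaning, (2) data integration, (3) data selection, (4) data transformation, (5) data mining, (6) pattern evaluation, and (7) knowledge representation. Data mining can interact with the user or with a knowledge base.
     Data mining draws on ideas from the following fields: (1) sampling, estimation, and hypothesis testing from statistics; (2) search algorithms, modeling techniques, and learning theory from artificial intelligence, pattern recognition, and machine learning. It has also rapidly absorbed ideas from other fields, including optimization, evolutionary computing, information theory, signal processing, visualization, and information retrieval. Several other fields play an important supporting role. In particular, database systems are needed to provide efficient storage, indexing, and query processing. Techniques from high-performance (parallel) computing are often important for handling massive data sets. Distributed techniques also help with massive data and become essential when the data cannot be gathered in one place for processing.
     Information technology and the life sciences are regarded as the defining disciplines of the 21st century. Human society in this century has been called the "information society", and informatization, networking, and high technology have become basic features of social development. In particular, the rapid development of the Internet and other modern information technologies since the 1990s, together with the completion of the Human Genome Project, has left people facing not merely a huge information database but a vast ocean of information. It is the organic combination of biotechnology and information technology that has catalyzed the birth of a new discipline: laboratory medicine informatics.
     Medicine is a science closely bound up with experiment and information, and laboratory medicine is no exception. Completing a diagnosis or a course of treatment is a process of acquiring, processing, and using information. Acquiring information more broadly, analyzing it more scientifically, and using it more rationally determine the quality and level of medical care, and computer technology plays a very important role in this. It is computer technology that has revolutionized medical laboratory testing and changed its learning philosophy and working methods. With the development of information technology, chiefly the use of gene databases and protein information, the establishment of highly integrated Laboratory Information Systems (LIS) and Hospital Information Systems (HIS), and the rapid growth of clinical medical informatics and disease informatics, medical laboratory education has adapted to the new situation. Through the joint efforts of the laboratory community, medical laboratory testing has quickly developed into laboratory medicine, which not only supplies experimental data to clinicians but also provides important information for clinical diagnostic and treatment decisions.
     Objective: To distill limited laboratory information into efficient diagnostic and therapeutic information, and to explore, at the technical level, a new approach to the clinical practice of laboratory medicine.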
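     Methods: Taking the diagnosis of colorectal cancer with three serum markers (CA72-4, CA199, and CEA) as an example, and relying on the data platform of the Laboratory Information System (LIS) and the Hospital Information System (HIS), an Artificial Neural Network (ANN) was used as the data mining tool and SPSS statistical software was used to build Receiver Operating Characteristic (ROC) data sets, so that every gastrointestinal tumor marker test report could be interpreted with a post-test probability.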
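As a rough illustration of what an ROC data set for a single marker might contain, the sketch below computes the (false positive rate, true positive rate) pair at every cutoff and the area under the resulting curve. It is not the SPSS procedure used in the study; the function names `roc_points` and `auc` and the six toy marker values are hypothetical and chosen only to make the example runnable.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) points, one per cutoff, calling 'score >= cutoff' positive.

    scores: marker concentrations (hypothetical values in this sketch)
    labels: 1 for disease, 0 for no disease
    """
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]  # cutoff above every score: nothing is called positive
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy data, not taken from the study.
toy_scores = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
toy_labels = [0, 0, 1, 0, 1, 1]
toy_auc = auc(roc_points(toy_scores, toy_labels))
print(round(toy_auc, 3))
```

Each row of such a table gives the sensitivity (TPR) and 1 - specificity (FPR) at one cutoff, which is the information needed later to turn a reported concentration into a post-test probability.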
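     Results: Of the 1206 gastrointestinal tumor marker specimens included in the study, colorectal cancer accounted for 11.5%. ROC data sets were built for CA199, CA72-4, and CEA in screening for and diagnosing colorectal cancer. The concentrations of all three serum markers in the colorectal cancer group were significantly higher than those in the healthy control group and the other-disease group (P<0.01). The areas under the ROC curve for CA199, CA72-4, CEA, and the predicted value of the ANN diagnostic model were 0.624, 0.692, 0.721, and 0.785, respectively, for screening colorectal cancer, and 0.607, 0.762, 0.687, and 0.795, respectively, for diagnosing it. Test reports annotated with a post-test probability objectively conveyed the reference value of the test results.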
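The post-test probability attached to a report can be sketched with the standard likelihood-ratio form of Bayes' theorem. The example below takes the 11.5% share of colorectal cancer among the specimens as the pre-test probability; the sensitivity of 0.60 and specificity of 0.80 are hypothetical placeholders, not values reported by the study, and the function name `post_test_probability` is likewise an assumption.

```python
def post_test_probability(pre_test, sensitivity, specificity, positive=True):
    """Convert a pre-test probability into a post-test probability.

    Uses the likelihood ratio of the observed result:
      positive result: LR+ = sensitivity / (1 - specificity)
      negative result: LR- = (1 - sensitivity) / specificity
    """
    if positive:
        likelihood_ratio = sensitivity / (1.0 - specificity)
    else:
        likelihood_ratio = (1.0 - sensitivity) / specificity
    pre_odds = pre_test / (1.0 - pre_test)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# Pre-test probability from the abstract; sensitivity and specificity are
# hypothetical values for one cutoff on an ROC data set.
p_pos = post_test_probability(0.115, 0.60, 0.80, positive=True)
p_neg = post_test_probability(0.115, 0.60, 0.80, positive=False)
print(round(p_pos, 3), round(p_neg, 3))
```

With these assumed inputs a positive result raises the probability of disease well above the pre-test value and a negative result lowers it, which is the kind of interpretation a report annotated with a post-test probability is meant to convey.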
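     Conclusion: The ANN model shows higher diagnostic efficiency when several test items are analyzed together, and building ROC data sets and issuing test reports annotated with post-test probabilities is a practical new approach to the clinical practice of laboratory medicine.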
Data mining is the non-trivial process of obtaining valid, novel, potentially useful, and ultimately comprehensible patterns from massive data. Broadly defined, it is the process of digging out interesting knowledge from large amounts of data held in databases, data warehouses, and other information stores. It is also known as knowledge discovery in databases, though some treat data mining as one basic step of knowledge discovery, a process made up of seven steps: (1) data cleaning, (2) data integration, (3) data selection, (4) data transformation, (5) data mining, (6) pattern evaluation, and (7) knowledge representation. Data mining can interact with users or with a knowledge base.
     Data mining uses ideas from several areas: (1) sampling, estimation, and hypothesis testing from statistics; (2) search algorithms, modeling techniques, and learning theory from artificial intelligence, pattern recognition, and machine learning. It has quickly adopted ideas from other areas as well, including optimization, evolutionary computing, information theory, signal processing, visualization, and information retrieval. Other areas play an important supporting role. In particular, database systems must provide efficient storage, indexing, and query processing. Techniques that originate in high-performance computing are often important for handling massive data. Distributed techniques also help to handle massive data, and they matter even more when the data cannot be brought together for processing.
     Information technology and bioscience are known as the signature disciplines of the 21st century. Society in this century is called the "information society", and informatization, networking, and high technology have become fundamental characteristics of social development. Especially since the 1990s, with the rapid progress of the Internet and other modern information technologies and the completion of the Human Genome Project, people face not just an enormous information database but a vast ocean of information. The combination of biotechnology and information technology has catalyzed the birth of a new branch of study, laboratory medical informatics.
     Medicine is a science in which experiment and information are tightly integrated, and laboratory medicine is no exception. Completing a diagnosis or a treatment is a process of acquiring, handling, and using information. Obtaining information more widely, analyzing it more scientifically, and using it more reasonably determine the quality and level of medical treatment, and computer technology plays a very important role in this. Computer technology has brought revolutionary change to laboratory medicine and has changed its learning philosophy and working style. As information technology has developed, mainly through the use of gene libraries and protein information, highly integrated laboratory information systems and hospital information systems have been built, clinical medical informatics and disease informatics have developed rapidly, and laboratory medicine education has adapted to the new situation. Through the common effort of all laboratory colleagues, laboratory medicine has quickly developed so that it offers not only experimental data to clinicians but also important information for clinical diagnosis and treatment.
     The aim was to refine limited laboratory information into efficient information for diagnosis and treatment, and to explore a new path for the clinical practice of laboratory medicine at the technical level.
     The use of three serum markers, CA72-4, CA199, and CEA, to diagnose colorectal cancer was taken as an example. Relying on the data platform of the laboratory information system (LIS) and the hospital information system (HIS), an artificial neural network (ANN) was used as the data mining tool and SPSS statistical software was used to build ROC data sets, and each gastrointestinal tumor marker test report was interpreted by its posterior probability.
     Among the 1206 gastrointestinal tumor marker specimens, colorectal cancer accounted for 11.25%. ROC data sets were built for CA199, CA72-4, and CEA in screening for and diagnosing colorectal cancer. The concentrations of the three serum markers in the colorectal cancer group were significantly higher than in the healthy control group and the other-disease group (P<0.01). For screening colorectal cancer, the areas under the ROC curve of CA199, CA72-4, CEA, and the predicted value of the ANN diagnostic model were 0.624, 0.692, 0.721, and 0.785, respectively, while for diagnosing colorectal cancer they were 0.607, 0.762, 0.687, and 0.795, respectively. Test reports assigned a posterior probability objectively provided the reference value of the results.
     The ANN model has higher diagnostic efficiency when multiple test items are analyzed, and building ROC data sets and issuing test reports that carry a posterior probability is a feasible new approach in the clinical practice of laboratory medicine.
