Design of a Classifier for Gastric Mucosa Tumor Microscopic Images
Abstract
Computer technology is now widely used in medical diagnosis, and the automatic recognition of tumor microscopic images with pattern recognition techniques is one of its important applications in medicine. Because medical images are inherently complex, with severe adhesion between cells, between glands, and between cells and glands, a single classifier based on a single feature often fails to meet clinical requirements. To improve on the performance of a single classifier, this paper takes gastric mucosa tumor microscopic images as its main object of study and designs a classification system based on the fusion of multiple features and multi-level classifiers.
     Feature extraction is the prerequisite for classifier design. Following the diagnostic criteria provided by pathologists and an analysis of the image features, we extract four gland features, seven cell features, and the R-channel color values of the image pixels, and fuse them as the basis for classifier design.
     Gastric mucosa microscopic images fall into three categories: normal, cancerous, and dysplastic. To classify them automatically, we design a two-level classifier. The first-level classifier labels an image as normal or abnormal (cancerous or dysplastic), and the second-level classifier assigns each image judged abnormal at the first level to the cancerous or the dysplastic class.
     The first-level classifier fuses a global classifier based on gland features with a local classifier based on cell features. The global classifier consists of one Mahalanobis distance classifier based on the four gland features and two minimum distance classifiers, each based on a single gland feature; these three classifiers are arranged in a progressive sequence. The local classifier is a decision tree built from the cell features. The global classifier first judges an image as normal or abnormal. If the image is judged normal, it is then passed to the local classifier for a further normal-or-abnormal decision. Any image judged abnormal by either the global or the local classifier is passed to the second-level classifier, which decides whether it is cancerous or dysplastic.
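
The decision flow described above can be sketched in a few lines of Python. This is only an illustration of the routing between stages, not the thesis implementation: the function name `classify_image`, the label strings, and the three callables standing in for the global classifier, the decision tree, and the second-level classifier are hypothetical placeholders, and the toy thresholds are invented for the demonstration.

```python
from typing import Callable, Sequence

Features = Sequence[float]


def classify_image(
    gland_features: Features,
    cell_features: Features,
    r_channel: Features,
    global_clf: Callable[[Features], str],
    local_clf: Callable[[Features], str],
    second_level_clf: Callable[[Features], str],
) -> str:
    """Route one image through both levels; returns 'normal', 'cancerous' or 'dysplastic'."""
    # First level, global stage: decision on the gland features.
    verdict = global_clf(gland_features)
    # A 'normal' verdict from the global stage is re-checked on the cell features.
    if verdict == "normal":
        verdict = local_clf(cell_features)
    if verdict == "normal":
        return "normal"
    # Abnormal by either stage: the second level separates the two abnormal classes.
    return second_level_clf(r_channel)


# Toy stand-ins with made-up thresholds (hypothetical, for illustration only).
def toy_global(gland: Features) -> str:
    return "abnormal" if max(gland) > 1.0 else "normal"


def toy_local(cells: Features) -> str:
    return "abnormal" if max(cells) > 1.0 else "normal"


def toy_second(r_values: Features) -> str:
    return "cancerous" if sum(r_values) / len(r_values) > 0.5 else "dysplastic"


result = classify_image([0.2] * 4, [0.1] * 6 + [3.0], [0.1],
                        toy_global, toy_local, toy_second)
print(result)
```

In the demonstration call the global stage sees nothing unusual, the local stage flags one cell feature, and the image is therefore handed to the second level. Passing the classifiers in as callables keeps the routing logic separate from how each stage is actually trained.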
     The second-level classifier is a pre-classifier based on an improved PCA+LDA method that operates on the R-channel color values of the image pixels. The method first applies PCA for dimensionality reduction, projecting the feature values onto the PCA subspace. In that subspace, the test sample is first assumed to belong to the cancerous class and is put through the LDA transform together with the training samples; in the resulting LDA subspace, the distance d1 between the test sample and the mean of the cancerous training samples is computed. The test sample is then assumed to belong to the dysplastic class, the same procedure is repeated, and the distance d2 between the test sample and the mean of the dysplastic training samples is computed. Finally, d1 and d2 are compared, which amounts to comparing the within-class compactness under the two assumptions: the smaller the distance, the higher the within-class compactness, and the test sample is assigned to that class. The improved PCA+LDA effectively overcomes the poor generalization of traditional PCA+LDA to test samples.
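
A minimal NumPy sketch of this two-hypothesis procedure is given below. Several details are assumptions made for the sketch, since the abstract does not specify them: PCA is fitted on the training samples only, the two-class Fisher direction is scaled to unit length so that d1 and d2 are comparable, each class mean is taken over the original training samples (without the test sample), and the data are synthetic Gaussian vectors rather than real R-channel values. All function and variable names are hypothetical.

```python
import numpy as np


def fit_pca(X, n_components):
    """Return (mean, components) of a PCA fitted on the rows of X."""
    mean = X.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]


def fisher_direction(Za, Zb, ridge=1e-6):
    """Unit-length two-class Fisher LDA direction for sample sets Za and Zb."""
    ma, mb = Za.mean(axis=0), Zb.mean(axis=0)
    Sw = (Za - ma).T @ (Za - ma) + (Zb - mb).T @ (Zb - mb)  # within-class scatter
    Sw += ridge * np.eye(Sw.shape[0])  # small ridge keeps Sw invertible
    w = np.linalg.solve(Sw, ma - mb)
    return w / np.linalg.norm(w)


def classify_improved_pca_lda(x, X_cancer, X_dysplastic, n_components=3):
    """Return (label, d1, d2) for one test vector x."""
    mean, comps = fit_pca(np.vstack([X_cancer, X_dysplastic]), n_components)
    Zc = (X_cancer - mean) @ comps.T
    Zd = (X_dysplastic - mean) @ comps.T
    z = (x - mean) @ comps.T

    # Hypothesis 1: x is cancerous, so it joins the cancerous set before LDA.
    w1 = fisher_direction(np.vstack([Zc, z]), Zd)
    d1 = float(abs((z - Zc.mean(axis=0)) @ w1))

    # Hypothesis 2: x is dysplastic, so it joins the dysplastic set before LDA.
    w2 = fisher_direction(Zc, np.vstack([Zd, z]))
    d2 = float(abs((z - Zd.mean(axis=0)) @ w2))

    # The smaller distance means the tighter class under that hypothesis.
    label = "cancerous" if d1 < d2 else "dysplastic"
    return label, d1, d2


# Synthetic, well-separated data (illustrative only).
rng = np.random.default_rng(0)
X_cancer = rng.normal(0.0, 1.0, size=(30, 10))
X_dysplastic = rng.normal(4.0, 1.0, size=(30, 10))
x_new = rng.normal(4.0, 1.0, size=10)

label, d1, d2 = classify_improved_pca_lda(x_new, X_cancer, X_dysplastic)
print(label, d1, d2)
```

Because the LDA direction is recomputed under each hypothesis, the test sample influences the subspace in which it is measured. The unit-length scaling is one possible way to make the two distances comparable; other normalizations, such as scaling by the within-class scatter, would be equally consistent with the description.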
     Experimental tests of the classification system achieved a high classification accuracy, indicating practical significance for medical research and clinical diagnosis and good prospects for application.
