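This thesis presents an in-depth study of key techniques in the segmentation, texture extraction, and recognition of cell images, chiefly the precise extraction of nucleus and cytoplasm boundaries in single-cell images, texture extraction from cell images, and multi-feature fusion classification of cell images. In addition, it attempts to improve the Extreme Learning Machine (ELM) classifier so that it can handle the imbalanced-data problem that is common in cell image classification. The main results are as follows: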
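     1. This thesis proposes a GVF Snake active contour model based on radiating gradients for precisely locating the nucleus and cytoplasm boundaries in single-cell images. The GVF Snake active contour model is a widely used object boundary tracking algorithm. In cell images, the boundary between the cytoplasm and the background is relatively blurred, interfering blood cells and inflammatory cells are often scattered near the nucleus and cytoplasm boundaries, and the staining intensity is uneven; all of these can easily attract the GVF Snake contour to wrong positions. To address these problems, and drawing on how staining intensity is distributed in cell images, the thesis proposes the following improvements: (1) making full use of gradient direction information, it introduces an edge map computed from radiating gradients, which extracts blurred edges more effectively than the conventional gradient edge map; (2) it introduces a stack-based gray-level difference compensation algorithm which, combined with positive gray-level difference suppression, effectively overcomes the false gradients caused by noise, blood cells, and inflammatory cells. Experiments on the Herlev cervical cell dataset verify the effectiveness of the method.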
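     3. This thesis proposes a cell image texture extraction method based on the maximum-entropy local multiple pattern. The Local Multiple Pattern (LMP) is an improvement on the Local Binary Pattern (LBP) and is an efficient image texture feature extraction method. However, LMP requires several thresholds to be set by hand, and its feature dimension is too high. To solve these two problems, the following improvements are proposed. First, the gray-level difference histogram of each image is computed, and the thresholds are derived automatically from this histogram according to the maximum entropy principle, so as to retain the most uncertainty and classification information. Second, a plane split-and-concatenate encoding scheme replaces the original encoding, keeping the feature dimension within a polynomial bound. Comprehensive comparisons with LBP and LMP on the Outex and KTH-TIPS texture datasets verify the effectiveness of the method. Experiments on HEp-2 cell staining pattern classification also give satisfactory results.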
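The maximum-entropy thresholding step of item 3 can be illustrated with a small sketch. It assumes that the gray-level difference is the absolute difference between a pixel and each of its 8 neighbours, and that "maximum entropy" means choosing thresholds that split the histogram into bins of (nearly) equal mass, since a discrete distribution has maximal entropy when its bins are equally probable. The function names `difference_histogram` and `equal_mass_thresholds` are hypothetical and are not taken from the thesis.

```python
from itertools import accumulate

# 8-neighbourhood offsets (row, column) around a centre pixel.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def difference_histogram(image):
    """Histogram of absolute gray-level differences between each interior
    pixel and its 8 neighbours (assumed definition; 8-bit gray levels)."""
    hist = [0] * 256
    rows, cols = len(image), len(image[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            for dr, dc in OFFSETS:
                hist[abs(image[r + dr][c + dc] - image[r][c])] += 1
    return hist


def equal_mass_thresholds(hist, levels):
    """Return levels - 1 thresholds that split hist into `levels` bins of
    near-equal mass; a difference d falls in the first bin whose threshold
    satisfies d <= t."""
    total = sum(hist)
    cumulative = list(accumulate(hist))
    thresholds = []
    for k in range(1, levels):
        target = k * total / levels
        # Smallest difference value whose cumulative count reaches the target.
        thresholds.append(next(i for i, c in enumerate(cumulative) if c >= target))
    return thresholds
```

Because the thresholds are read off each image's own histogram, no manual tuning is involved, which is the property the item describes.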
This thesis focuses on several issues in the segmentation, texture extraction, and classification of cell images: a method to accurately extract both the nucleus and the cytoplasm, two new texture extraction methods, and a fusion method for different kinds of features. In addition, it discusses the use of ELM for imbalanced dataset classification, a problem that is common in medical data. The main contributions are as follows:
     1. A radiating GVF Snake (RGVF) model is proposed for accurate extraction of both the nucleus and the cytoplasm from a single-cell image. The GVF Snake model is a widely used contour tracking method in image processing. However, when used to extract the nucleus and cytoplasm from cell images, the GVF Snake is easily attracted to wrong positions because (1) the boundaries between the cytoplasm and the background are often quite obscure; (2) many interferences exist near the edges of the nucleus and the cytoplasm, including inflammatory cells, blood cells, and other noise. To solve these problems, RGVF involves a new edge map computation method and a stack-based refinement, and is thus robust to contamination and can effectively locate obscure boundaries. The boundaries can also be traced correctly even when there are interferences near the cytoplasm and nucleus regions. Experiments on the Herlev dataset, which contains 917 images, show the effectiveness of the proposed algorithm.
     2. A novel texture extraction method based on Gabor filters is proposed. Texture features play an important role in cell classification. The cell image is first decomposed by convolution with multi-scale, multi-orientation Gabor filters, and then separated into several blocks. A Block Feature Vector (BFV) is obtained for each block through statistical techniques. The Total Feature Vector (TFV) of the whole image is then constructed by concatenating the BFVs in row-column order. In the classification stage, a robust multi-class classification method is built from many two-class classifiers using a voting mechanism. Before each two-class classifier, a feature extraction module adaptively selects the most important features. Comparison with published results on the Yale face database verifies the validity of the proposed method. The staining pattern classification results on the HEp-2 cell dataset also show that the proposed method is effective.
     3. A novel texture extraction method named Maximum Entropy based Local Multiple Pattern (MELMP) is proposed. The Local Multiple Pattern (LMP) has proved to be an efficient and robust texture extraction method. However, in LMP the thresholds have to be set manually and the feature dimension is quite high. To solve these problems, a maximum entropy based thresholding scheme, which computes the thresholds by dividing the intensity difference histogram of an image equally, is adopted, and split-concatenate encoding is used to form shorter and more effective feature vectors. Experimental results on four test suites with an SVM classifier show that the proposed method achieves better overall performance than both LBP and LMP in texture classification. The staining pattern classification results on the HEp-2 cell dataset are also very satisfying.
     4. A feature fusion framework based on posterior probability classifiers and the AdaBoost.M1 framework is introduced. How to fuse different features is quite important for achieving better cell image classification results. In this thesis: (1) within each boosting round, several posterior probability classifiers are trained, one per descriptor, and then combined into an integrated classifier; (2) AdaBoost.M1 is modified to enhance the performance of the integrated classifiers. Experimental results on the HEp-2 cell dataset show that the proposed method is effective and can significantly improve classification accuracy.
     5. Two strategies to deal with imbalanced classification are proposed, namely cost-sensitive ELM (CS-ELM) and ELM based cost-sensitive AdaBoost (ELM-AdaCx). First, cost-sensitive information is introduced into the training process of ELM to form CS-ELM. A genetic algorithm (GA) is utilized to find the optimal weights. Second, the proposed CS-ELM is utilized as the meta classifier and embedded into a cost-sensitive AdaBoost.M1 framework to form ELM-AdaCx. Experimental results on 19 datasets from the KEEL repository show that the proposed strategies achieve more balanced results than the basic ELM.
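To make the idea behind item 5 concrete, the following is a minimal sketch of one common way to inject misclassification costs into ELM training: a cost-weighted, ridge-regularized least-squares solution for the output weights. It is not the thesis's CS-ELM. The per-class costs are fixed by hand here, whereas the thesis searches for the weights with a genetic algorithm, and the function names, the tanh activation, and the parameters `n_hidden` and `reg` are all assumptions made for illustration.

```python
import numpy as np


def train_weighted_elm(X, y, class_costs, n_hidden=40, reg=100.0, seed=0):
    """Train an ELM whose least-squares fit is weighted by per-class costs.

    Solves beta = (H^T W H + I / reg)^-1 H^T W T, with W = diag(cost of each
    sample's class). The costs are an assumption, fixed by hand in this sketch.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    n_classes = len(class_costs)
    # Random, untrained input weights and biases (the defining trait of ELM).
    W_in = rng.standard_normal((n_features, n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W_in + b)                      # hidden-layer output matrix
    # Targets coded as +1 for the true class and -1 elsewhere.
    T = -np.ones((n_samples, n_classes))
    T[np.arange(n_samples), y] = 1.0
    w = np.asarray(class_costs, dtype=float)[y]    # one cost per sample
    HtW = H.T * w                                  # H^T W without building W
    beta = np.linalg.solve(HtW @ H + np.eye(n_hidden) / reg, HtW @ T)
    return W_in, b, beta


def predict_elm(model, X):
    """Predict the class with the largest output for each row of X."""
    W_in, b, beta = model
    return np.argmax(np.tanh(X @ W_in + b) @ beta, axis=1)


# Toy imbalanced problem: 95 majority samples, 5 minority samples.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, (95, 2)), rng.normal(2.0, 1.0, (5, 2))])
y = np.array([0] * 95 + [1] * 5)
model = train_weighted_elm(X, y, class_costs=[1.0, 19.0])
pred = predict_elm(model, X)
```

Setting the minority cost to roughly the inverse class ratio (19 here) is one simple heuristic; a search procedure such as the GA mentioned above would replace that hand-picked value.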

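The fusion scheme of item 4 combines descriptor-specific posterior probability classifiers inside each boosting round. The sketch below shows the two ingredients under stated assumptions: posteriors are fused by simple averaging (the abstract does not say how they are combined), and the reweighting step is the standard AdaBoost.M1 update rather than the modified version the thesis proposes. The function names are hypothetical.

```python
import math


def fuse_posteriors(posteriors_per_descriptor):
    """Average the class posteriors produced by several descriptor-specific
    classifiers for one sample (averaging is an assumption of this sketch)."""
    n = len(posteriors_per_descriptor)
    n_classes = len(posteriors_per_descriptor[0])
    return [sum(p[k] for p in posteriors_per_descriptor) / n
            for k in range(n_classes)]


def adaboost_m1_round(weights, predicted, actual):
    """One standard AdaBoost.M1 reweighting step.

    Returns (new_weights, vote), or None when the weighted error is 0 or at
    least 0.5, in which case AdaBoost.M1 stops boosting.
    """
    error = sum(w for w, p, a in zip(weights, predicted, actual) if p != a)
    if error == 0 or error >= 0.5:
        return None
    beta = error / (1.0 - error)
    # Shrink the weights of correctly classified samples, then renormalize.
    updated = [w * beta if p == a else w
               for w, p, a in zip(weights, predicted, actual)]
    total = sum(updated)
    return [w / total for w in updated], math.log(1.0 / beta)
```

In a full round, the fused posterior of each training sample would be turned into a predicted label (for example, the index of its largest entry) before calling `adaboost_m1_round`.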