Research on Feature Selection and Extraction in a Liver CT Computer-Aided Diagnosis System
Abstract
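This paper addresses open problems in computer-aided diagnosis (CAD) of the liver, with the goal of building a liver CAD platform that can be used for training in clinical diagnosis. The system consists of four main modules: preprocessing, region-of-interest extraction, feature extraction and feature selection, and classification. First, the multi-phase liver images are coarsely registered so that the four phase images show the same location in the liver. Next, regions of interest (ROIs) are extracted semi-automatically under the guidance of an experienced physician. Liver texture features are then extracted from each ROI and passed through feature selection, and the resulting feature set is fed to a classifier for recognition. The paper focuses on the texture feature extraction and feature selection algorithms of the liver CAD system.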
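     Because liver lesions vary widely in shape and follow no consistent pattern, shape features cannot be used to analyze and classify them as they are for breast lesions and lung nodules, so this paper studies texture features of the liver. Considering gray-level, spatial and temporal information, it extracts 7 features based on first-order statistical moments (including mean gray level, standard deviation, entropy, covariance, kurtosis and skewness), 32 features based on the gray-level co-occurrence matrix in four directions (including angular second moment, contrast, inverse difference moment and homogeneity), and 9 features based on the multi-phase images (including relative enhancement, enhancement trend and enhancement ratio), for a total of 48 liver texture features. These features can be used to classify liver CT images into four categories: normal, cyst, hemangioma and liver cancer. Some of the features carry redundant information, however, so a feature selection module follows feature extraction.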
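As a rough illustration of the first-order statistical features named above, the sketch below computes mean, standard deviation, histogram entropy, skewness and kurtosis for a small ROI. The abstract does not give the exact definitions used in the paper, so the formulas here (population moments, entropy in bits) and the function name `first_order_stats` are assumptions, not the author's implementation.

```python
import math
from collections import Counter


def first_order_stats(roi):
    """Return first-order statistics of a 2-D list of integer gray levels.

    Assumed definitions: population (1/n) moments and Shannon entropy
    in bits of the gray-level histogram.
    """
    pixels = [p for row in roi for p in row]
    n = len(pixels)
    mean = sum(pixels) / n
    variance = sum((p - mean) ** 2 for p in pixels) / n
    std = math.sqrt(variance)

    # Entropy of the normalized gray-level histogram.
    counts = Counter(pixels)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())

    if std == 0:
        # A constant ROI has no shape to its distribution.
        skewness = kurtosis = 0.0
    else:
        skewness = sum((p - mean) ** 3 for p in pixels) / (n * std ** 3)
        kurtosis = sum((p - mean) ** 4 for p in pixels) / (n * std ** 4)

    return {
        "mean": mean,
        "std": std,
        "entropy": entropy,
        "skewness": skewness,
        "kurtosis": kurtosis,
    }


# Toy 2x2 ROI with two gray levels (illustrative data only).
features = first_order_stats([[0, 0], [2, 2]])
```

In a real pipeline the ROI would be the pixel block outlined on the CT slice, and the same function would be applied to each of the four phase images.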
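     Depending on whether the classifier is part of the evaluation criterion function, feature selection algorithms fall into two types: filter and wrapper. The feature selection module in this paper uses filter algorithms, a wrapper algorithm, and a combined algorithm that joins the two. The filter algorithms pair traditional sequential search algorithms (sequential forward search, sequential backward search, plus-l take-away-r, sequential floating forward search and sequential floating backward search) with the Mahalanobis distance as the evaluation criterion function. The wrapper algorithm searches the feature space with a genetic algorithm and uses the classification accuracy of a support vector machine in the evaluation criterion function. The combined algorithm proceeds in two steps. In the first step, the filter algorithms perform an initial screening that yields five different feature subsets, and the union of these five subsets becomes the input to the second step. In the second step, the wrapper algorithm based on the genetic algorithm and the support vector machine selects a final, near-optimal feature set. The combined algorithm thus builds on the traditional sequential filter methods and adds the wrapper method on top: the genetic algorithm optimizes the search of the feature space, while the approach avoids the efficiency problems of traditional statistical methods and of using a wrapper method alone.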
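The filter step can be sketched as sequential forward search driven by a Mahalanobis distance criterion. The abstract does not say how the distance is computed for the four classes, so this sketch assumes the simplest case: two classes labeled 0 and 1, with the squared Mahalanobis distance between class means under a pooled covariance. The function names and toy data are illustrative only.

```python
import numpy as np


def mahalanobis_criterion(X, y, subset):
    """Squared Mahalanobis distance between the two class means,
    measured on the feature columns in `subset` (pooled covariance)."""
    cols = list(subset)
    A = X[y == 0][:, cols]
    B = X[y == 1][:, cols]
    diff = A.mean(axis=0) - B.mean(axis=0)
    pooled = ((len(A) - 1) * np.cov(A, rowvar=False)
              + (len(B) - 1) * np.cov(B, rowvar=False)) / (len(A) + len(B) - 2)
    pooled = np.atleast_2d(pooled)  # a single feature yields a scalar
    # Pseudo-inverse keeps the sketch stable if features are collinear.
    return float(diff @ np.linalg.pinv(pooled) @ diff)


def sequential_forward_selection(X, y, k):
    """Greedily add the feature that most increases the criterion
    until k features have been selected."""
    selected = []
    remaining = list(range(X.shape[1]))
    while len(selected) < k:
        best = max(remaining,
                   key=lambda f: mahalanobis_criterion(X, y, selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data: three noise features, with feature 2 shifted for class 1.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
y = np.repeat([0, 1], 20)
X[y == 1, 2] += 5.0
chosen = sequential_forward_selection(X, y, k=2)
```

Because the criterion never trains a classifier, each candidate subset is cheap to score, which is what makes the filter step a practical first pass over all 48 features.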
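     The study of feature extraction and feature selection algorithms in a liver CT computer-aided diagnosis system greatly improves the efficiency and accuracy of diagnosis. The paper extracts liver texture features, applies feature selection to high-dimensional feature sets in various combinations to find the near-optimal feature combination for classification, and uses experimental results to compare and validate the performance of the algorithms. The main contributions are as follows:
     1. The use of multi-phase liver CT images.
     2. The use of several evaluation criteria.
     3. The combination of several sequential feature selection algorithms, which exploits the strengths of each while avoiding their weaknesses, and the combination of the filter and wrapper methods.
     4. The combination of the support vector machine (SVM) and the genetic algorithm.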
The proposed liver lesion diagnosis system consists of four main modules: region of interest (ROI) extraction, feature extraction, feature selection and classification. Multi-phase abdominal CT images are first fed into an image registration module to eliminate differences in their spatial positions. An experienced radiologist can then draw the ROIs on just one of the images. After that, all features are extracted from the ROIs. A feature selection module is then applied to some of the features to create the feature set. Finally, a support vector machine (SVM) classifier categorizes the type of lesion.
     A major difference between liver computer-aided diagnosis (CAD) and other CAD applications such as breast and lung CAD is that shape priors are of no use in detecting and diagnosing hepatic lesions, because liver disease tends to be diffuse and even the same type of disease varies greatly in shape from case to case. Texture-based features are therefore the natural candidates for feature extraction. In this paper, the texture feature extraction methods are first-order statistics (FOS), the gray-level co-occurrence matrix (GLCM) and a temporal method. In image recognition, more input features are not necessarily better: they may produce many false-positive findings, and this "information overload" weakens classification performance. In addition, as the number of input features grows, the number of training samples required for classification grows exponentially. The key issue in a liver diagnosis system is therefore how to choose, from a large number of features, the feature set that yields high classification accuracy.
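To make the GLCM step concrete, the following sketch builds a normalized, symmetric co-occurrence matrix for one pixel offset and derives three commonly used features from it: angular second moment, contrast and inverse difference moment. The offsets chosen for the four directions, the symmetric counting and the function names are assumptions; the abstract does not specify these details.

```python
def glcm(image, dx, dy, levels):
    """Normalized symmetric gray-level co-occurrence matrix.

    `image` is a 2-D list of integers in [0, levels); (dx, dy) is the
    column/row offset between the two pixels of each counted pair.
    """
    rows, cols = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count each pair in both orders
                total += 2
    if total == 0:
        raise ValueError("image too small for this offset")
    return [[v / total for v in row] for row in counts]


def glcm_features(p):
    """Angular second moment, contrast and inverse difference moment."""
    asm = contrast = idm = 0.0
    for i, row in enumerate(p):
        for j, v in enumerate(row):
            asm += v * v
            contrast += (i - j) ** 2 * v
            idm += v / (1 + (i - j) ** 2)
    return {"asm": asm, "contrast": contrast, "idm": idm}


# Assumed offsets for 0, 45, 90 and 135 degrees (row index grows downward).
OFFSETS = {0: (1, 0), 45: (1, -1), 90: (0, -1), 135: (-1, -1)}

# Small 4-level test image (illustrative data only).
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 2, 2, 2],
         [2, 2, 3, 3]]
per_direction = {angle: glcm_features(glcm(image, dx, dy, levels=4))
                 for angle, (dx, dy) in OFFSETS.items()}
```

Computing the same set of descriptors in each of the four directions is how a per-direction feature count multiplies up; a CT ROI would first be quantized to a small number of gray levels so the matrix stays compact.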
     The basic task of feature selection is to find the most effective feature subset among high-dimensional features. It involves two sub-problems: the search strategy and the evaluation function. The genetic algorithm (GA) is a randomized search algorithm; it introduces randomness into the search to help it escape local minima. Sequential algorithms add or remove features one step at a time; examples are Sequential Forward Selection (SFS), Sequential Backward Elimination (SBE), Plus-l Take-Away-r Selection (PTA), Sequential Floating Forward Selection (SFFS) and Sequential Floating Backward Elimination (SFBE). Feature selection frameworks are also divided into two types, filters and wrappers, according to the nature of the evaluation function J(·) they use. In a filter framework, J(·) measures the performance of a feature set without involving the classification algorithm that will eventually use the features. In a wrapper framework, J(·) incorporates the classification algorithm.
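The separation between search strategy and evaluation function can be sketched by writing a sequential search that takes J as a parameter. The example below is a minimal Sequential Backward Elimination under that assumption; the toy weight-sum J is purely illustrative and stands in for either a filter criterion (such as a class-separability distance) or a wrapper criterion (such as classifier accuracy).

```python
def sequential_backward_elimination(n_features, J, k):
    """Start from all features and repeatedly drop the one whose removal
    leaves the best-scoring subset, until k features remain.

    J is any callable mapping a list of feature indices to a score;
    whether it is a filter or a wrapper criterion is up to the caller.
    """
    selected = list(range(n_features))
    while len(selected) > k:
        drop = max(selected,
                   key=lambda f: J([g for g in selected if g != f]))
        selected.remove(drop)
    return selected


# Toy criterion: each feature has a fixed, independent usefulness weight.
weights = [0.1, 0.9, 0.5, 0.05]


def toy_J(subset):
    return sum(weights[f] for f in subset)


kept = sequential_backward_elimination(len(weights), toy_J, k=2)
```

Swapping `toy_J` for a function that trains and scores a classifier on the candidate subset turns the same search into a wrapper method without changing the search code.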
     In this paper, a new two-step selection method based on the filter and wrapper approaches, the support vector machine (SVM) and the genetic algorithm (GA) is proposed to choose the most relevant features from a large feature set. First, traditional sequential algorithms (SFS, SBE, SFFS, SFBE and PTA) are applied to obtain five different feature subsets, which are combined into a new feature set. Then a GA searches the feature space of this new set, using a fitness function based on SVM accuracy. The advantages of this approach include the ability to accommodate different feature selection search strategies and to combine the filter and wrapper methods, which allows the system to find a small, near-optimal feature subset that performs well for the particular learning algorithm used to build the classifier.
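The two-step procedure can be sketched as follows: take the union of the subsets produced by the sequential filters, then let a genetic algorithm search bit masks over that union. To keep the sketch dependent on NumPy alone, the fitness function below is a nearest-class-mean classifier scored on a simple even/odd holdout split; this is a stand-in, not the SVM accuracy the paper describes, and an SVM cross-validation score would take its place in practice. The GA operators (binary tournament, uniform crossover, bit-flip mutation, elitism), all parameter values and the five example subsets are likewise assumptions.

```python
import numpy as np


def holdout_accuracy(X, y, mask):
    """Stand-in wrapper fitness: nearest-class-mean accuracy on odd rows
    after fitting class means on even rows (NOT an SVM)."""
    if not mask.any():
        return 0.0
    Xs = X[:, mask]
    Xtr, ytr, Xte, yte = Xs[0::2], y[0::2], Xs[1::2], y[1::2]
    classes = np.unique(ytr)
    means = np.stack([Xtr[ytr == c].mean(axis=0) for c in classes])
    dists = ((Xte[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    return float((classes[dists.argmin(axis=1)] == yte).mean())


def ga_select(X, y, candidates, fitness=holdout_accuracy,
              pop_size=20, generations=30, p_mut=0.05, seed=0):
    """Search subsets of `candidates` with a simple genetic algorithm."""
    rng = np.random.default_rng(seed)
    candidates = np.array(sorted(candidates))
    n = len(candidates)

    def to_mask(bits):
        mask = np.zeros(X.shape[1], dtype=bool)
        mask[candidates[bits]] = True
        return mask

    pop = rng.random((pop_size, n)) < 0.5
    best_bits, best_fit = pop[0].copy(), -1.0
    for _ in range(generations):
        fits = np.array([fitness(X, y, to_mask(b)) for b in pop])
        i = int(fits.argmax())
        if fits[i] > best_fit:
            best_bits, best_fit = pop[i].copy(), float(fits[i])
        # Binary tournament selection.
        a, b = rng.integers(pop_size, size=(2, pop_size))
        parents = np.where((fits[a] >= fits[b])[:, None], pop[a], pop[b])
        # Uniform crossover with a shuffled mate, then bit-flip mutation.
        mates = parents[rng.permutation(pop_size)]
        children = np.where(rng.random((pop_size, n)) < 0.5, parents, mates)
        children ^= rng.random((pop_size, n)) < p_mut
        children[0] = best_bits  # elitism: keep the best mask found so far
        pop = children
    return sorted(int(f) for f in candidates[best_bits]), best_fit


# Toy data: six features, of which features 1 and 4 separate the classes.
rng = np.random.default_rng(1)
X = rng.normal(size=(60, 6))
y = np.repeat([0, 1], 30)
X[y == 1, 1] += 5.0
X[y == 1, 4] += 5.0

# Step 1 (hypothetical output of five sequential filters), then the union.
filter_subsets = [[1, 2], [1, 4], [4], [0, 1], [2, 4]]
union = set().union(*filter_subsets)

# Step 2: GA search restricted to the union.
selected, score = ga_select(X, y, union)
```

Restricting the GA to the union shrinks the chromosome from the full feature count to only the features at least one filter considered useful, which is where the efficiency gain over a pure wrapper search comes from.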
     The main contributions are as follows:
     1. The use of multi-phase liver CT images.
     2. The use of a variety of evaluation criteria.
     3. Accommodating different feature selection search strategies and combining the filter and wrapper methods.
     4. The combined use of the support vector machine and the genetic algorithm.
