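Multi-Target Image Detection

English title: Detection of Multi-target Image
Subtitle: Face and Eye Detection
English subtitle: Face and Eyes Detection
Author: 徐来 (Xu Lai)
Thesis level: Master's
Discipline: Computer Application Technology
Keywords (translated from the Chinese): object detection; multi-scale search; AdaBoost classifier; Gabor transform
Keywords (English, as recorded): target detection; multi-scale search; adaboost; wavelet transform
Degree year: 2010
Advisors: 陈庆章 (Chen Qingzhang); 周德龙 (Zhou Delong)
Discipline code: 081203
Degree-granting institution: Zhejiang University of Technology (浙江工业大学)
Submission date: 2010-04-06

Abstract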

With the rapid development of pattern recognition and artificial intelligence, multi-target image detection has attracted wide attention and in-depth study both in China and abroad. Image detection and recognition has strong prospects and considerable potential economic value in science, technology, and security, and multi-target image detection also plays an important role in everyday life and in commercial applications.
     As society and technology advance, the demands placed on image detection and recognition keep rising. Detection rate and real-time performance are the two principal performance measures of the technology, and they are also where current research seeks breakthroughs. Image detection has long been limited by the diversity of target appearance, by occlusion, and by background complexity and other environmental factors, all of which reduce detection speed and accuracy.
     This thesis aims to improve the overall performance of an image target detection system and to detect multiple targets simultaneously. The main work and results are as follows:
     1. A video capture system is built with DirectShow and serves as the image acquisition module for target detection; combined with WDM video capture, it provides the video preview model. Because DirectShow simplifies the synchronized transfer of complex data streams across different hardware, it helps ensure that the subsequent image detection runs in real time.
     2. To speed up training, address the heavy computation and long training time of traditional AdaBoost, and improve detection accuracy, the search range for the threshold is fixed during feature selection when the weak classifiers are trained. This shortens the search, yields better thresholds, and reduces the number of weak classifiers, which in turn speeds up AdaBoost training.
     3. To meet the real-time requirements of dynamic image detection, a hierarchical search is combined with a fast half-pixel matching algorithm to match images accurately in real time. This greatly reduces the computation and time spent on search and matching during detection and improves matching precision. Wavelet transform theory is also studied: in keeping with the characteristics of the individual wavelet frequency bands, a weighted-average method is applied in the low-frequency region and a region-based method in the high-frequency region.
     4. Classifiers trained with AdaBoost detect faces in multiple poses, and the eyes are then coarsely located within each detected face region. Three eye-location methods are applied here: the Gabor wavelet transform, DCT-based eye template matching, and an AdaBoost-trained classifier. Their strengths, weaknesses, and performance are briefly compared. Binarization and integral projection, combined with geometric knowledge of the eyes, are then used to locate and mark the eyes precisely. Because the image is normalized after face detection, the search range for the eyes is greatly narrowed; further image processing mitigates the influence of illumination; and by fusing several methods the system locates the pupils quickly and accurately, improving overall detection performance.
     Experimental results show that the system's overall detection performance is noticeably better than that of traditional detection methods. Some weaknesses remain: for example, detection performance drops sharply for targets with a large rotation angle. How to correct the matching of rotated targets requires further research.
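
Item 2 of the abstract restricts the threshold search when an AdaBoost weak classifier is trained. The abstract does not say how the range is chosen, so the following is only a minimal sketch under an assumption: the range is taken as the interval between the weighted mean feature values of the two classes. The names `weighted_error`, `class_mean_range`, and `best_stump` are hypothetical and do not come from the thesis.

```python
from typing import Optional, Sequence, Tuple

# (weighted error, threshold, polarity) of a one-feature decision stump
Stump = Tuple[float, Optional[float], Optional[int]]


def weighted_error(values: Sequence[float], labels: Sequence[int],
                   weights: Sequence[float], threshold: float, polarity: int) -> float:
    """Weighted error of the stump that predicts +1 when polarity*x < polarity*threshold."""
    error = 0.0
    for x, y, w in zip(values, labels, weights):
        prediction = 1 if polarity * x < polarity * threshold else -1
        if prediction != y:
            error += w
    return error


def class_mean_range(values: Sequence[float], labels: Sequence[int],
                     weights: Sequence[float]) -> Tuple[float, float]:
    """ASSUMPTION (not from the thesis): search only between the weighted class means."""
    def mean(cls: int) -> float:
        total = sum(w for y, w in zip(labels, weights) if y == cls)
        return sum(w * x for x, y, w in zip(values, labels, weights) if y == cls) / total

    a, b = mean(1), mean(-1)
    return min(a, b), max(a, b)


def best_stump(values: Sequence[float], labels: Sequence[int], weights: Sequence[float],
               search_range: Optional[Tuple[float, float]] = None) -> Tuple[Stump, int]:
    """Return the lowest-error stump and the number of thresholds that were tried."""
    candidates = sorted(set(values))
    if search_range is not None:
        lo, hi = search_range
        candidates = [t for t in candidates if lo <= t <= hi]
    best: Stump = (float("inf"), None, None)
    for t in candidates:
        for polarity in (1, -1):
            err = weighted_error(values, labels, weights, t, polarity)
            if err < best[0]:
                best = (err, t, polarity)
    return best, len(candidates)


# Toy data: one feature value per sample, labels in {+1, -1}, uniform weights.
values = [1.0, 2.0, 3.0, 4.0, 6.0, 7.0, 8.0, 9.0]
labels = [1, 1, 1, 1, -1, -1, -1, -1]
weights = [1.0 / len(values)] * len(values)

full, n_full = best_stump(values, labels, weights)
narrow, n_narrow = best_stump(values, labels, weights,
                              class_mean_range(values, labels, weights))
print("full search:      ", full, "candidates:", n_full)
print("restricted search:", narrow, "candidates:", n_narrow)
```

On this toy data both searches find the same zero-error stump, while the restricted search tries half as many thresholds. In general a restricted range can miss the global optimum, so the range rule has to be chosen with care.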

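Item 4 of the abstract refines the eye position with binarization, integral projection, and geometric knowledge of the eyes. The sketch below illustrates the general idea on an already binarized face region (1 = dark pixel). The specific rules are assumptions made for illustration and are not taken from the thesis: the eye row is the peak of the horizontal projection within the upper half of the face, and each eye column is the peak of the vertical projection in the left or right half of a narrow band around that row. The function names are hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]  # binarized face region, 1 = dark pixel, 0 = background


def integral_projections(binary: Image) -> Tuple[List[int], List[int]]:
    """Horizontal projection (sum of each row) and vertical projection (sum of each column)."""
    rows = [sum(row) for row in binary]
    cols = [sum(col) for col in zip(*binary)]
    return rows, cols


def locate_eyes(binary: Image, band: int = 1) -> Tuple[int, int, int]:
    """Return (eye_row, left_eye_col, right_eye_col) under the assumed geometric rules."""
    height, width = len(binary), len(binary[0])
    rows, _ = integral_projections(binary)

    # ASSUMED geometric prior: the eyes lie in the upper half of the face region.
    upper = rows[: height // 2]
    eye_row = max(range(len(upper)), key=upper.__getitem__)

    # Vertical projection of a narrow band around the eye row.
    top, bottom = max(0, eye_row - band), min(height, eye_row + band + 1)
    _, cols = integral_projections(binary[top:bottom])

    # ASSUMED geometric prior: one eye in each half of the face width.
    mid = width // 2
    left_col = max(range(mid), key=cols.__getitem__)
    right_col = max(range(mid, width), key=cols.__getitem__)
    return eye_row, left_col, right_col


# Toy 8x10 face: two small dark blobs (eyes) above a wider dark bar (mouth).
face = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0, 0, 1, 1, 1, 0],
    [0, 0, 1, 0, 0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
]

print(locate_eyes(face))
```

Restricting the row search to the upper half keeps the mouth bar from being mistaken for the eye line, which is the role the geometric knowledge plays in this sketch.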
