Research on Local Invariant Feature Description Methods for Images
Abstract
Enabling machines to perceive and recognize objects and scenes in the natural world is difficult for a computer, even when the objects are very simple. The difficulty lies in how to represent natural objects: the representation must distinguish one object from others while remaining robust to differences caused by scale change, zooming, and translation. Choosing the features that describe the objects to be recognized is therefore a key problem in computer vision. In recent years, the emergence of local image features has driven major progress in computer vision research. Local features apply multi-scale analysis, statistics, and related techniques to local image information to form feature vectors, giving a better representation of the image; they are widely used in object recognition, registration, panoramic image stitching, robot vision, and other fields.
     This thesis analyzes current local image features. Through a comparative analysis of the mainstream local features, Harris corner detection, the Scale-Invariant Feature Transform (SIFT), Speeded-Up Robust Features (SURF), and Maximally Stable Extremal Regions (MSER), the currently most popular SIFT algorithm is chosen as the starting point. Improvements are proposed to address its shortcomings, and the improved algorithm is applied to scene image classification with the bag-of-words model. The main contributions are as follows:
     1. The scale-invariant feature transform proposed by Lowe has relatively low efficiency and cannot meet real-time requirements. This thesis proposes a SIFT variant based on circular projection: the first harmonic component is computed from the fast Fourier transform of the circular projection of each local region and is used to pre-screen the feature points extracted by SIFT. Local-region descriptors are then computed only for the retained feature points and used for image matching. Experimental results show that the pre-screening effectively reduces the number of feature points to be matched, improving both the execution efficiency and the registration rate of the algorithm.
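The pre-screening step described above can be sketched as follows. This is a minimal illustration under our own assumptions (concentric-ring binning for the circular projection and a 16-bin signature are choices made here for clarity), not the thesis's exact implementation:

```python
import numpy as np

def circular_projection(patch, n_bins=16):
    """Average intensity over n_bins concentric rings around the patch centre.
    The resulting 1-D signature is invariant to in-plane rotation of the patch."""
    h, w = patch.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    yy, xx = np.mgrid[0:h, 0:w]
    r = np.sqrt((yy - cy) ** 2 + (xx - cx) ** 2)
    edges = np.linspace(0.0, r.max() + 1e-9, n_bins + 1)
    proj = np.empty(n_bins)
    for i in range(n_bins):
        mask = (r >= edges[i]) & (r < edges[i + 1])
        proj[i] = patch[mask].mean() if mask.any() else 0.0
    return proj

def first_harmonic(proj):
    """Magnitude of the first FFT harmonic of the projection signature,
    used as a cheap scalar for pre-screening candidate keypoints."""
    return np.abs(np.fft.fft(proj)[1])
```

Two candidate keypoints whose first-harmonic values differ by more than a threshold can be rejected before any full descriptor is computed, which is where the speed-up over plain SIFT matching would come from.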
     2. The bag-of-words model constructs visual words by clustering, in feature space, the feature points detected by the SIFT algorithm. This thesis proposes a bag-of-words model based on Fan-SIFT, which uses the responses of the LoG operator at different angles to detect both fan-shaped and circular blobs in the image, and constructs visual words from descriptors built on the fan-shaped regions. Compared with the words built by SIFT, which detects only circular blobs, the visual words constructed by our method are more discriminative. Experiments on the 13-class scene dataset and the Caltech101 dataset show that the bag-of-words model generated with Fan-SIFT achieves higher scene-classification accuracy.
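The vocabulary-construction step of a bag-of-words model can be sketched as below. This is a generic illustration, not the thesis's pipeline: plain k-means stands in for whatever clustering was actually used, the vocabulary size `k` and iteration count are arbitrary, and random vectors stand in for real Fan-SIFT descriptors:

```python
import numpy as np

def build_vocabulary(descriptors, k=8, iters=20, seed=0):
    """Plain k-means over local descriptors; the cluster centres
    are the visual words of the bag-of-words vocabulary."""
    descriptors = np.asarray(descriptors, dtype=float)
    rng = np.random.default_rng(seed)
    centres = descriptors[rng.choice(len(descriptors), size=k, replace=False)].copy()
    for _ in range(iters):
        dists = np.linalg.norm(descriptors[:, None, :] - centres[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = descriptors[labels == j]
            if len(members):
                centres[j] = members.mean(axis=0)
    return centres

def bow_histogram(descriptors, centres):
    """Quantise each descriptor to its nearest visual word and return the
    L1-normalised word histogram used as the image representation."""
    descriptors = np.asarray(descriptors, dtype=float)
    dists = np.linalg.norm(descriptors[:, None, :] - centres[None, :, :], axis=2)
    hist = np.bincount(dists.argmin(axis=1), minlength=len(centres)).astype(float)
    return hist / hist.sum()
```

An image is then represented by `bow_histogram` of its descriptors, and any standard classifier can be trained on these histograms.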
     In addition, comparative experiments are carried out on the mainstream blob-type local features, focusing on matching results under scale change, viewpoint change, illumination change, and image blur, giving an intuitive picture of the descriptive power of the main blob features.
It is very difficult for a computer to perceive and recognize objects and scenes, even when the object to be recognized is very simple. The hardest part of computer recognition is how to represent the object: the representation must distinguish one object from another despite differences in size, viewpoint, and position. Feature selection is a key step in computer vision and greatly affects the results. During the past decade, progress on local features has advanced computer vision research. With the help of multi-scale analysis and statistical techniques, various kinds of local image features can be extracted from image regions, giving a better representation of the image. They are widely used in object recognition, registration, image stitching, robot vision, and other areas.
     We study various local image features in depth. A comparative study is carried out on Harris, SIFT, SURF, and MSER. The SIFT algorithm is selected as the starting point for its good performance, and we propose improvements to address its shortcomings. The improved feature is then applied to scene image classification, and experiments demonstrate its effectiveness. The details and contributions are as follows:
     1. The scale-invariant feature transform proposed by Lowe has low efficiency, which restricts its application. The algorithm proposed in this thesis, based on circular projection, applies the fast Fourier transform (FFT) to the projection function to compute the first harmonic component, which is used to pre-screen the feature points extracted by SIFT. After pre-screening, descriptors are computed from the local regions of the remaining points. Experiments show that the method yields fewer feature points than the original SIFT algorithm, improving efficiency while achieving better matching performance.
     2. The bag-of-words model uses clustered SIFT descriptors to form visual words. SIFT detects blob regions in an image with the LoG kernel. We replace the SIFT detector with the Fan-SIFT algorithm, which detects not only circular blob regions but also fan-shaped regions, and accordingly use a descriptor built on fan shapes. Fan-SIFT finds more varied kinds of blob regions and forms descriptors of smaller dimension. Experiments on the 13-class scene dataset and the Caltech101 dataset show better image-classification performance.
     We also carry out comparative experiments on blob-type image features, focusing on matching results under different scales, sizes, and positions, analyze the repeatability of different feature detectors, and give an intuitive picture of the quality of blob image features.
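The repeatability score used in such detector comparisons can be computed as sketched below. This follows the standard definition (fraction of keypoints that survive a known geometric transformation between the two images); the homography `H` and the 2-pixel tolerance `eps` are illustrative assumptions:

```python
import numpy as np

def repeatability(pts_a, pts_b, H, eps=2.0):
    """Fraction of keypoints detected in image A whose location, mapped into
    image B by the homography H, lies within eps pixels of some keypoint
    detected in B."""
    pts_a = np.asarray(pts_a, dtype=float)
    pts_b = np.asarray(pts_b, dtype=float)
    homog = np.hstack([pts_a, np.ones((len(pts_a), 1))]) @ H.T  # map A -> B
    mapped = homog[:, :2] / homog[:, 2:3]                        # dehomogenise
    dists = np.linalg.norm(mapped[:, None, :] - pts_b[None, :, :], axis=2)
    return float((dists.min(axis=1) <= eps).sum()) / len(pts_a)
```

A detector that finds the same physical structures in both views scores close to 1.0; a detector whose responses drift under scale, viewpoint, illumination, or blur changes scores lower.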
