Adaptive Threshold Based SIFT Algorithm and Application

Original title: 基于自适应阈值的SIFT算法研究及应用
Author: Huang Lingyun (黄令允)
Degree level: Master's
Discipline: Circuits and Systems
Keywords: Image Matching; SIFT; Adaptive Threshold; Size Compression; Phase Correlation
Degree year: 2010
Advisor: Lin Qiuhua (林秋华)
Discipline code: 080902
Degree-granting institution: Dalian University of Technology
Submission date: 2010-11-14

Abstract

Computer vision has long been an active research area. Put simply, it is the science of using computers to perceive and recognize surrounding objects intelligently. Image matching is a fundamental problem in computer vision. It presupposes a query image and a target image, where the query image is generally a part of the target image but differs from it in scale, rotation, illumination, and similar respects. The task of image matching is to locate the query image within the target image. The Scale Invariant Feature Transform (SIFT) is currently one of the most actively studied algorithms in image matching. SIFT features are invariant to image scaling, translation, and rotation, and are partially robust to illumination changes and to affine transformation or 3D projection. Because of these invariance properties, SIFT is widely used in image matching. However, SIFT is computationally expensive and slow, which limits its use in real-time problems.
     To further improve the practical applicability of SIFT, this thesis makes the following contributions:
     (1) The SIFT algorithm is studied in depth, and its computation time is found to be spent mainly in two steps: extremum detection and feature-vector description. For a typical image, SIFT can extract hundreds or even thousands of matched point pairs, far more than image stitching (mosaicking) requires. This thesis therefore proposes an adaptive-threshold improved SIFT algorithm, which automatically adjusts the threshold used in scale-space extremum detection to keep the number of SIFT feature points within a given range and thereby reduce the amount of computation.
     (2) A fast image stitching method based on image size compression and the adaptive-threshold SIFT algorithm is implemented. Image size compression methods, including nearest-neighbor, bilinear, and bicubic (cubic convolution) interpolation, are combined with the adaptive-threshold SIFT method and applied to image stitching. Experimental results show that the algorithm outperforms the traditional algorithm on problems in which a panorama is generated from a relatively large image sequence and good overall quality and fast stitching are required, but fine image detail is not.
     (3) An image stitching algorithm based on phase correlation and the adaptive-threshold SIFT algorithm is studied. In practical image stitching, adjacent images overlap in a definite region that occupies only a small part of each image, so detecting and describing SIFT extrema outside the overlap contributes nothing to stitching. This thesis combines phase correlation with the proposed adaptive-threshold SIFT algorithm, applies the combination to image stitching, and compares it with the traditional SIFT algorithm and with phase correlation combined with traditional SIFT. The experimental results show that the proposed algorithm has a certain speed advantage.
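
     The adaptive-threshold idea in contribution (1) can be illustrated with a minimal sketch. The abstract does not state the thesis's actual update rule, so the bisection search below, the function name `adapt_threshold`, and the starting value 0.03 (a contrast threshold commonly used with SIFT for intensities scaled to [0, 1]) are assumptions made only for illustration. The input is taken to be a list of response magnitudes at candidate scale-space extrema.

```python
def adapt_threshold(responses, min_count, max_count, threshold=0.03, max_iter=40):
    """Bisect a contrast threshold so that the number of kept extrema
    falls in [min_count, max_count]. Returns (threshold, kept_indices).

    Hypothetical helper: the thesis's real adjustment rule is not given
    in the abstract. If no threshold can satisfy the range (for example,
    many tied responses), the last threshold tried is returned.
    """
    strengths = [abs(r) for r in responses]
    if not strengths:
        return threshold, []
    lo, hi = 0.0, max(strengths)  # search interval for the threshold
    for _ in range(max_iter):
        kept = sum(1 for s in strengths if s >= threshold)
        if kept > max_count:      # too many points: raise the threshold
            lo = threshold
        elif kept < min_count:    # too few points: lower the threshold
            hi = threshold
        else:
            break
        threshold = (lo + hi) / 2.0
    return threshold, [i for i, s in enumerate(strengths) if s >= threshold]


# Example: 100 synthetic responses, keep between 10 and 20 of them.
responses = [0.01 * k for k in range(1, 101)]
t, kept = adapt_threshold(responses, 10, 20)
```

     Because the number of surviving points never increases as the threshold rises, a bisection of this kind settles quickly. Fewer surviving extrema means fewer descriptors to compute, which is where the abstract locates most of the running time.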

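Contribution (3) relies on phase correlation to relate adjacent images before SIFT is applied. The sketch below shows the standard phase correlation computation on tiny grayscale arrays, using a naive DFT from the Python standard library so that it stays self-contained. It assumes a purely circular translation between the two images, and the function names are illustrative. How the thesis converts the estimated shift into a restricted SIFT search region is not described in the abstract and is not reproduced here.

```python
import cmath
import random


def dft(seq, inverse=False):
    """Naive 1-D discrete Fourier transform (O(n^2)); fine for tiny inputs."""
    n = len(seq)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t, x in enumerate(seq):
            acc += x * cmath.exp(sign * 2j * cmath.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def dft2(img, inverse=False):
    """2-D DFT computed as row transforms followed by column transforms."""
    rows = [dft(r, inverse) for r in img]
    cols = [dft(list(c), inverse) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def phase_correlation(a, b):
    """Return (dy, dx) such that b is a circularly shifted by (dy, dx)."""
    fa, fb = dft2(a), dft2(b)
    cross = []
    for row_a, row_b in zip(fa, fb):
        row = []
        for va, vb in zip(row_a, row_b):
            p = vb * va.conjugate()          # cross-power spectrum term
            mag = abs(p)
            row.append(p / mag if mag > 1e-12 else 0j)  # keep phase only
        cross.append(row)
    corr = dft2(cross, inverse=True)         # ideally a single sharp peak
    h, w = len(a), len(a[0])
    _, dy, dx = max((corr[y][x].real, y, x) for y in range(h) for x in range(w))
    if dy > h // 2:                          # map wrapped indices to signed shifts
        dy -= h
    if dx > w // 2:
        dx -= w
    return dy, dx


# Example: an 8x8 random image and a copy shifted down 2 rows and left 3 columns.
rng = random.Random(0)
a = [[rng.random() for _ in range(8)] for _ in range(8)]
b = [[a[(y - 2) % 8][(x + 3) % 8] for x in range(8)] for y in range(8)]
shift = phase_correlation(a, b)
```

For two horizontally adjacent images of width w, an estimated horizontal shift dx suggests an overlap roughly w - |dx| pixels wide, which is one plausible way to limit where SIFT extrema are detected and described.
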
