Research on Image Classification for Traffic Scenes
Abstract
With the rapid development of intelligent traffic monitoring technology, the volume of traffic surveillance images and videos is growing quickly. Analyzing this mass of images and videos by hand is time-consuming and labor-intensive, so retrieving and managing it quickly and intelligently has become a major challenge. Traffic-scene-oriented image classification is the foundation for intelligent retrieval and management of traffic images and videos, and it is one of the key technologies that intelligent monitoring must solve. Research on it therefore has both theoretical and practical value.
     The main goal of this thesis is to classify traffic scene images. The research centers on image feature extraction, image representation, and classification. The main contents are as follows:
     First, this thesis extracts Local Binary Pattern (LBP) low-level image features and classifies images with a support vector machine (SVM). The experimental results fall short of expectations, mainly because, under complex backgrounds, low-level features cannot describe the semantic content of an image well. The visual words model, by contrast, can describe mid-level semantic features of an image. This thesis therefore extracts SIFT features, builds a visual words representation of each image, and classifies traffic scene images with an SVM. A comparison of the experimental results shows that the SIFT-based visual words model achieves the higher classification accuracy.
     Second, the visual words model ignores the spatial information of an image. This thesis introduces the spatial pyramid model to represent images. The model improves on the visual words model by incorporating the context of image blocks in the image feature space. Using this representation with an SVM classifier for traffic scene image classification improves accuracy significantly over the visual words model.
     Third, vector quantization in the traditional spatial pyramid model has a large quantization error, and image classification based on it has high computational complexity and a long running time. To address these problems, this thesis introduces locality-constrained linear coding (LLC) to improve the vector quantization coding, and uses the resulting image representation with the Liblinear classifier for traffic scene image classification. The method reduces computational complexity and running time while improving classification accuracy.
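The visual words representation used in the first contribution can be illustrated with a minimal sketch. It assumes hard assignment of each local descriptor to its nearest codebook entry, followed by a normalized count. The abstract does not state the exact quantization or normalization, so the function names, the toy 2-D descriptors (standing in for 128-D SIFT vectors), and the three-word codebook are all hypothetical.

```python
from math import dist  # Euclidean distance, Python 3.8+


def nearest_word(descriptor, codebook):
    """Return the index of the codebook entry closest to the descriptor."""
    return min(range(len(codebook)), key=lambda k: dist(descriptor, codebook[k]))


def bovw_histogram(descriptors, codebook):
    """Hard-assign each descriptor to a visual word and return normalized counts."""
    counts = [0] * len(codebook)
    for descriptor in descriptors:
        counts[nearest_word(descriptor, codebook)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(codebook)


# Hypothetical toy data: 2-D points stand in for SIFT descriptors.
codebook = [(0.0, 0.0), (1.0, 1.0), (4.0, 4.0)]
descriptors = [(0.1, 0.2), (0.9, 1.1), (1.2, 0.8), (3.9, 4.2)]
histogram = bovw_histogram(descriptors, codebook)
print(histogram)  # [0.25, 0.5, 0.25]
```

In a full pipeline the codebook would typically come from clustering training descriptors (for example with k-means), and the histogram would be the feature vector passed to the SVM.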
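The spatial pyramid representation of the second contribution might look like the following sketch. It assumes the level weights of the spatial pyramid matching scheme by Lazebnik et al. (level 0 weighted by 1/2^L, level l >= 1 weighted by 1/2^(L-l+1)), keypoint coordinates normalized to [0, 1), and visual-word indices computed beforehand. `spatial_pyramid` and its parameters are illustrative names, not the thesis's code.

```python
def spatial_pyramid(points, words, vocab_size, levels=2):
    """Concatenate weighted per-cell word histograms over a pyramid of grids.

    points: (x, y) keypoint positions, normalized to [0, 1)
    words:  visual-word index of each keypoint
    """
    feature = []
    for level in range(levels + 1):
        cells = 2 ** level  # the grid at this level is cells x cells
        # Assumed weighting: finer levels count more than coarser ones.
        if level == 0:
            weight = 1 / 2 ** levels
        else:
            weight = 1 / 2 ** (levels - level + 1)
        hist = [0.0] * (cells * cells * vocab_size)
        for (x, y), word in zip(points, words):
            col = min(int(x * cells), cells - 1)
            row = min(int(y * cells), cells - 1)
            hist[(row * cells + col) * vocab_size + word] += weight
        feature.extend(hist)
    return feature


# Hypothetical toy input: three keypoints, a two-word vocabulary, two levels.
points = [(0.1, 0.1), (0.9, 0.9), (0.6, 0.2)]
words = [0, 1, 0]
feature = spatial_pyramid(points, words, vocab_size=2, levels=1)
print(len(feature))  # 10
```

Level 0 reproduces the plain visual words histogram (up to the weight), while the finer levels record in which part of the image each word occurred. That location information is what the plain visual words model discards.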
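For the third contribution, the sketch below shows one way locality-constrained linear coding can replace hard vector quantization. It follows the approximated LLC procedure of Wang et al.: reconstruct the descriptor from its k nearest codewords under a sum-to-one constraint. The value of `beta`, the fallback used when the descriptor coincides with all selected codewords, the small pure-Python solver, and the toy data are assumptions made here for illustration. The abstract does not give the thesis's settings.

```python
from math import dist  # Euclidean distance, Python 3.8+


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def llc_code(x, codebook, knn=2, beta=1e-4):
    """Approximate LLC code of descriptor x over its knn nearest codewords."""
    nearest = sorted(range(len(codebook)), key=lambda k: dist(x, codebook[k]))[:knn]
    # Shift the selected codewords so that x sits at the origin.
    z = [[b - xi for b, xi in zip(codebook[k], x)] for k in nearest]
    # Local covariance C = z z^T, regularized on the diagonal for stability.
    c = [[sum(p * q for p, q in zip(zi, zj)) for zj in z] for zi in z]
    trace = sum(c[i][i] for i in range(knn))
    reg = beta * trace if trace > 0 else beta  # fallback is an assumption
    for i in range(knn):
        c[i][i] += reg
    w = solve(c, [1.0] * knn)
    total = sum(w)
    code = [0.0] * len(codebook)
    for k, wk in zip(nearest, w):
        code[k] = wk / total  # enforce the sum-to-one constraint
    return code


# Hypothetical toy data: x lies between the first two codewords.
codebook = [(0.0, 0.0), (2.0, 0.0), (10.0, 10.0)]
code = llc_code((0.5, 0.0), codebook, knn=2)
print(code)
```

Unlike hard assignment, the code spreads weight over several nearby codewords, which lowers the reconstruction error. In the LLC paper the per-descriptor codes are max-pooled within spatial pyramid cells and fed to a linear classifier, which is consistent with the use of Liblinear described above.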
