Research on Scene Classification Technologies with the Local Region Description Feature and Probabilistic Latent Semantic Analysis Model

Original title (Chinese): 局部描述特征结合概率潜在语义模型的场景分类技术研究
Author: Rong Yi (戎怡)
Degree level: Master's
Discipline: Signal and Information Processing
Keywords: Scene image classification ; Edge Improved Local Binary Pattern (EILBP) feature ; Edge Improved Center Symmetric Local Binary Pattern (EICS-LBP) feature ; Statistical edge dominant color pair feature ; Contextual semantic information ; Visual words ; Probabilistic Latent Semantic Analysis (PLSA) model
Degree year: 2010
Advisor: Hu Zhengping (胡正平)
Discipline code: 081002
Degree-granting institution: Yanshan University (燕山大学)
Thesis submission date: 2010-10-01

Abstract

Scene image classification is the task of sorting a collection of images, each carrying some semantic content, into categories in a way that mirrors how humans understand and discriminate scenes. It supports effective browsing and retrieval over large image collections and has become a core problem in computer vision and image understanding. Given the analogy between images and text, it is worthwhile to apply the bag-of-words model and the latent semantic analysis model from text analysis to the description and classification of scene images. This thesis addresses the trade-off between effectiveness and complexity in current image feature extraction algorithms through the following work:
     First, an Edge Improved Local Binary Pattern (EILBP) feature is built on local regions whose center points are obtained by densely sampling the edges of the grayscale image. The algorithm is simple and stable, and it gives a reasonable description of images that are rich in edge and contour information. The latent semantics of each image are then obtained with the Probabilistic Latent Semantic Analysis (PLSA) model, and scene classification is completed with a K-nearest-neighbor (KNN) classifier. Experimental results show that the feature is effective for scene image classification and performs especially well on grayscale images with many edges.
     Second, building on the EILBP feature and exploiting symmetry, an Edge Improved Center Symmetric Local Binary Pattern (EICS-LBP) feature is constructed on local regions whose center points are again obtained by densely sampling the edges of the grayscale image. To capture the color information of color images, a statistical edge dominant color pair feature is constructed, which describes the dominant color pairs along the edges of a local region. A new visual vocabulary is then created as a linear combination of the two visual vocabularies formed by clustering the corresponding features from the densely sampled regions. Finally, the latent semantics are obtained with an extended PLSA model and scene classification is completed with a KNN classifier. Experimental results show that the method classifies well and reaches high accuracy on color images with edge contours.
     Finally, traditional visual words ignore the dependencies among features and therefore cannot fully express the theme of an image. To overcome this defect, a contextual visual feature is proposed on the basis of the EICS-LBP feature and the statistical edge dominant color pair feature of the color image. The latent semantics are then obtained with the extended PLSA model and scene classification is completed with a KNN classifier. Experimental results show that the method classifies well, especially on color images that have edge contours and rich contextual information.
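The step shared by all three methods above (visual-word counts, then PLSA topic mixtures, then nearest-neighbor classification) can be illustrated with a minimal sketch. It implements the standard PLSA EM updates and a 1-nearest-neighbor rule, not the extended PLSA model of this thesis, whose details the abstract does not give. The function names, the toy count matrix, the scene labels, and the small positive floor in the accumulators are all assumptions made for illustration.

```python
import math
import random


def normalize(values):
    """Scale a list of positive numbers so that it sums to 1."""
    total = sum(values)
    return [v / total for v in values]


def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit standard PLSA by EM.

    counts[d][w] is how often visual word w occurs in image d.
    Returns (p_w_z, p_z_d): P(w|z) per topic and P(z|d) per image.
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])
    # Random positive start; every row is a probability distribution.
    p_w_z = [normalize([rng.random() + 0.1 for _ in range(n_words)])
             for _ in range(n_topics)]
    p_z_d = [normalize([rng.random() + 0.1 for _ in range(n_topics)])
             for _ in range(n_docs)]
    for _ in range(n_iter):
        # Tiny floor keeps every row strictly positive (assumption of this sketch).
        acc_w_z = [[1e-12] * n_words for _ in range(n_topics)]
        acc_z_d = [[1e-12] * n_topics for _ in range(n_docs)]
        for d in range(n_docs):
            for w in range(n_words):
                n = counts[d][w]
                if n == 0:
                    continue
                # E-step: P(z|d,w) is proportional to P(w|z) * P(z|d).
                post = normalize([p_w_z[z][w] * p_z_d[d][z]
                                  for z in range(n_topics)])
                # M-step accumulators: expected counts n(d,w) * P(z|d,w).
                for z in range(n_topics):
                    acc_w_z[z][w] += n * post[z]
                    acc_z_d[d][z] += n * post[z]
        p_w_z = [normalize(row) for row in acc_w_z]
        p_z_d = [normalize(row) for row in acc_z_d]
    return p_w_z, p_z_d


def log_likelihood(counts, p_w_z, p_z_d):
    """Sum over (d, w) of n(d,w) * log(sum over z of P(w|z) * P(z|d))."""
    total = 0.0
    for d, row in enumerate(counts):
        for w, n in enumerate(row):
            if n:
                mix = sum(p_w_z[z][w] * p_z_d[d][z]
                          for z in range(len(p_w_z)))
                total += n * math.log(mix)
    return total


def nearest_label(query, vectors, labels):
    """Return the label of the nearest vector (squared Euclidean distance)."""
    dists = [sum((a - b) ** 2 for a, b in zip(query, v)) for v in vectors]
    return labels[dists.index(min(dists))]


# Toy bag-of-visual-words counts: 4 images, 6 visual words (made-up data).
counts = [
    [5, 4, 6, 0, 1, 0],
    [6, 5, 4, 1, 0, 0],
    [0, 1, 0, 5, 6, 4],
    [1, 0, 0, 4, 5, 6],
]
labels = ["coast", "coast", "forest", "forest"]  # hypothetical scene labels
p_w_z, p_z_d = plsa(counts, n_topics=2)
# Classify image 0 from the topic mixtures of the other three images.
predicted = nearest_label(p_z_d[0], p_z_d[1:], labels[1:])
```

In this sketch all images are fitted jointly for brevity. A real pipeline would normally learn P(w|z) on training images only and then estimate P(z|d) for each unseen image with P(w|z) held fixed (the usual fold-in step) before applying the KNN classifier.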

