Image Object Classification Research Based on Discriminative Learning

Chinese title: 基于判别学习的图像目标分类研究
Author: Chen Hailin (陈海林)
Thesis level: Doctoral
Discipline: Signal and Information Processing
Keywords: support vector machine ; local feature ; kernel function ; hierarchical model ; corner distribution ; cascade classifier
Degree year: 2009
Advisor: Wu Xiuqing (吴秀清)
Discipline code: 081002
Degree-granting institution: University of Science and Technology of China (中国科学技术大学)
Thesis submission date: 2009-04-01

Abstract

Image content analysis and understanding is an important part of visual intelligence, and image object classification is an active research topic within it, with important practical applications and an extensive literature. The prevailing approach first builds a description of the image object, then learns the object categories with a machine learning method, and finally uses the learned model to classify or recognize unseen image objects. A semantic gap separates the low-level features a computer represents from the high-level semantics a human understands, which makes image object classification challenging and leaves room for further research. Because discriminative learning performs well in practice, this thesis studies how to fuse image object descriptions with discriminative learning and applies the result to image object classification.
     The thesis addresses two broad problems: general image object classification and specific image object classification. For general image object classification, it combines local-feature-based image descriptions with discriminative learning algorithms. For specific image object classification, it extracts different global features according to the characteristics of the target objects and pairs them with suitable discriminative classification methods. The main work and contributions are as follows:
     1. Exploiting the structure of local features in feature space, the thesis proposes a density-guided tree-structured kernel. The kernel is non-parametric, has computational complexity linear in the number of feature points, computes partial matchings between two feature sets of unequal cardinality, and requires no user-specified parameters. It is positive definite, so it can be used in kernel-based learning algorithms, and it couples the image object description with a discriminative classifier for locating or recognizing image objects. Experiments show that the kernel has good partial matching performance and good image object classification ability.
     2. Studying the positional correlation of local features in image space, the thesis proposes a local feature spatial correlation kernel. The kernel describes the relative positions of local features in the image, is positive definite, can be embedded in kernel-based learning algorithms, and is time-efficient. Experimental results show that it has good classification performance.
     3. Studying the relations among local features in image space and feature space simultaneously, the thesis proposes a bi-space pyramid matching kernel. The kernel is positive definite, has linear computational complexity, and can be embedded in kernel-based learning algorithms. Experimental results show that it has good classification performance.
     4. After a careful analysis of the semantic content of remote sensing images, the thesis designs a hierarchical model of that content, which can be applied to remote sensing image classification, retrieval, and object detection and recognition. It also proposes an airplane detection method for medium- and low-resolution remote sensing images based on corner distribution features. The method uses the corner distribution of airplanes to obtain coarse object locations quickly, which reduces the computation needed by the subsequent classification, and then discriminates airplanes using simple and effective spatial structure features and a decision tree. The experiments give good results.
     5. For automatic identification of the language type (Chinese or English) of characters in camera-based images, the thesis proposes a cascade classifier based on posterior probability estimation. The node classifiers use a discriminative learning algorithm, and the node thresholds are designed in two ways, independent threshold design and dependent threshold design, so that a cascade satisfying the overall requirement can be designed from theory. This provides a theoretical method for designing classifiers with a high classification rate. To capture the structural differences between Chinese and English characters, the thesis uses horizontal and vertical stroke vectors and a gradient orientation correlogram, both based on pixel gradient information, together with a Census Transform histogram based on the relative gray levels of position-correlated pixels. These features are robust to illumination, noise, and resolution, and can be applied to camera-based images. Theoretical analysis and experimental results show that dependent threshold design gives the cascade classifier a higher classification rate, and that the proposed method classifies the language type of camera-based Chinese and English characters well.
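Item 5 names a Census Transform histogram as one of the character features. The abstract gives no implementation details, so the following Python sketch shows only the generic technique: each interior pixel is compared with its eight neighbours, the comparison results are packed into an 8-bit code, and the codes are counted in a 256-bin normalized histogram. The function names, the 3x3 neighbourhood, the bit order, and the "centre >= neighbour" convention are assumptions made for illustration and are not taken from the thesis.

```python
import numpy as np


def census_transform(gray):
    """Return the 8-bit census code of every interior pixel of a 2-D array."""
    gray = np.asarray(gray, dtype=np.int32)
    h, w = gray.shape
    center = gray[1:-1, 1:-1]
    codes = np.zeros((h - 2, w - 2), dtype=np.uint8)
    # Visit the 8 neighbours in row-major order; each one contributes one bit.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
               (0, 1), (1, -1), (1, 0), (1, 1)]
    for dy, dx in offsets:
        neighbour = gray[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Assumed convention: the bit is 1 when the centre is >= the neighbour.
        codes = (codes << 1) | (center >= neighbour).astype(np.uint8)
    return codes


def census_histogram(gray):
    """Return the normalized 256-bin histogram of census codes."""
    codes = census_transform(gray)
    hist = np.bincount(codes.ravel(), minlength=256).astype(np.float64)
    return hist / hist.sum()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    patch = rng.integers(0, 100, size=(8, 8))
    brighter = 2 * patch + 10  # monotonically increasing intensity change
    # The codes depend only on the ordering of gray levels, so both match.
    print(np.array_equal(census_histogram(patch), census_histogram(brighter)))
```

Because the codes depend only on the relative order of gray levels, any strictly increasing change of intensity leaves the histogram unchanged, which is one reason such a feature can tolerate illumination changes in camera-based images.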

