Research on an Image Tag Generation Algorithm Based on Large-Scale Object Recognition

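English title: Recognition Techniques of Mass Objects Based Research on Image Tagging Algorithm
Author: Qi Haizhi (齐海智)
Degree level: Master's
Discipline: Circuits and Systems
Keywords (translated from Chinese): intelligent recognition; texton boosting algorithm (TextonBoost); filter bank; shared-set search algorithm
English keywords: intelligent recognition; TextonBoost; filter bank; shared subset searching strategy
Degree year: 2013
Advisor: Lü Yuqin (吕玉琴)
Discipline code: 080902
Degree-granting institution: Beijing University of Posts and Telecommunications
Submission date: 2012-12-10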

Abstract

Since the start of the 21st century, the spread of computers and the growth of the Internet have greatly changed how people work and live. Computers are no longer the preserve of engineers, and the Internet has become a daily necessity. Multimedia technology, a core part of the Internet, has driven one service upgrade after another. Text-only services no longer satisfy users, and image-centred applications keep emerging, from weather forecasting to industrial product inspection, and from residential security to medical image analysis. Extracting information from the image itself, that is, image tagging, has therefore become an active research topic.

Research on intelligent recognition is nevertheless still at an early stage. Widely used methods such as template matching, neural networks and support vector machines can recognise only a small number of object classes, and do so slowly. The difficulty is that computational cost grows exponentially with the number of classes, and each newly introduced class worsens inter-class interference.

To address this, the thesis studies TextonBoost, a recognition technique for larger numbers of object classes that is popular abroad, and improves it in two ways: a new filter bank design and a new shared-set search strategy. The new filter bank drops the edge and luminance information of the a and b colour components used in the original filter bank, merges those components into a single H component, and replaces the original horizontal and vertical filters with Gabor filters that extract information in four directions. Experiments show that, under the same conditions, the new filter bank shortens clustering time by 14.5% and reduces the cluster radius by roughly half. The new shared-set search strategy introduces randomness, which raises the probability that the greedy search reaches the global optimum, makes almost all solutions "good enough", and improves time performance by 33%. Applied to image tag recognition, the algorithm can quickly identify the object classes in a given image within a certain error margin.
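The idea of adding randomness to a greedy shared-set search can be illustrated with a minimal sketch. The abstract does not give the thesis's actual procedure, so everything below is an assumption: the function name `randomized_greedy_subset`, the restricted-candidate-list size `rcl_size`, the restart count, and the toy cost table are all hypothetical, and the cost function merely stands in for the training error of a weak learner shared by a subset of classes. The sketch follows the general GRASP pattern (greedy construction with a random pick among the best few candidates, repeated over several restarts).

```python
import random


def randomized_greedy_subset(classes, cost, rcl_size=2, restarts=10, seed=0):
    """GRASP-style search for a low-cost subset of classes.

    `cost` is a hypothetical callable mapping a frozenset of classes to a
    number (lower is better). With rcl_size=1 this is plain greedy
    forward selection.
    """
    rng = random.Random(seed)
    best_subset, best_cost = None, float("inf")
    for _ in range(restarts):
        current, current_cost = frozenset(), float("inf")
        remaining = set(classes)
        while remaining:
            # Score every one-class extension that improves on the current cost.
            improving = [
                (cost(current | {c}), c)
                for c in sorted(remaining)
                if cost(current | {c}) < current_cost
            ]
            if not improving:
                break
            improving.sort(key=lambda pair: pair[0])
            # Restricted candidate list: pick at random among the best few.
            current_cost, chosen = rng.choice(improving[:rcl_size])
            current = current | {chosen}
            remaining.discard(chosen)
        if current_cost < best_cost:
            best_subset, best_cost = current, current_cost
    return best_subset, best_cost


# Toy cost table (invented for illustration) with a deceptive local optimum:
# class "a" looks best on its own, but the pair {"b", "c"} is far better.
TOY_COSTS = {
    frozenset("a"): 5.0,
    frozenset("b"): 6.0,
    frozenset("c"): 9.0,
    frozenset("ab"): 7.0,
    frozenset("ac"): 8.0,
    frozenset("bc"): 1.0,
    frozenset("abc"): 10.0,
}


def toy_cost(subset):
    return TOY_COSTS[subset]
```

On this toy table, plain greedy selection (`rcl_size=1`) starts with class "a" and stops there at cost 5, because no extension of {"a"} is cheaper. With `rcl_size=2`, some restarts begin from "b" instead and can reach {"b", "c"} at cost 1, which is the kind of escape from a local optimum that the randomised strategy is meant to provide.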
