Research on an Image Content Filtering Algorithm Based on Multi-Instance Learning
Abstract
This paper applies the multi-instance learning (MIL) algorithm to the design and implementation of a content-based pornographic image monitoring system, and explores, in both theory and practice, the application of MIL to image filtering.
     Feature extraction is an important means by which a machine obtains image content. Given the characteristics of pornographic images, this paper selects color, texture, and shape features as feature values. These feature values are assembled into feature vectors and passed to the MIL learner, whose predictions for bags of unknown concept are used to detect, and thereby filter, images.
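The step from an image to a bag of feature vectors can be sketched as follows. This is only an illustrative sketch, not the feature extractor used in the thesis: it assumes that each fixed-size block of the image is one instance, uses the mean normalized R, G, B values as the color feature and grey-level variance as a crude texture proxy, and leaves shape features out entirely. The names block_features and image_to_bag are hypothetical.

```python
from statistics import mean, pvariance


def block_features(pixels):
    """Feature vector for one block of (r, g, b) pixels, channels in 0..255."""
    r = mean(p[0] for p in pixels) / 255.0
    g = mean(p[1] for p in pixels) / 255.0
    b = mean(p[2] for p in pixels) / 255.0
    # Assumed texture proxy: variance of the normalized grey level in the block.
    gray = [(p[0] + p[1] + p[2]) / 3.0 / 255.0 for p in pixels]
    texture = pvariance(gray)
    return [r, g, b, texture]


def image_to_bag(image, block=2):
    """Split an image (list of rows of RGB tuples) into block x block tiles.

    Each tile becomes one instance, and the list of instances is the bag
    that represents the whole image.
    """
    height, width = len(image), len(image[0])
    bag = []
    for top in range(0, height, block):
        for left in range(0, width, block):
            tile = [image[y][x]
                    for y in range(top, min(top + block, height))
                    for x in range(left, min(left + block, width))]
            bag.append(block_features(tile))
    return bag
```

A 4 x 4 image with block=2 yields a bag of four instances, each a four-element vector; the bag, rather than any single instance, carries the image label.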
     The machine learning stage is carried out with MIL. First, the concepts of image filtering are mapped onto the multi-instance framework, so that finding the features of pornographic images becomes the problem of finding the target concept in a multi-instance problem. Within this framework the target concept is found with the EM_DD algorithm, which is improved with simulated annealing to increase both the speed and the precision of the search.
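The abstract gives no details of how simulated annealing is combined with EM_DD, so the following is only a simplified sketch of one plausible arrangement, not the thesis's implementation. It assumes the usual diverse density model in which an instance x belongs to the concept point t with probability exp(-||x - t||^2), keeps all feature scales fixed at 1, and replaces the gradient-based M-step of EM-DD with a simulated annealing search. All function names and parameter values are hypothetical.

```python
import math
import random


def instance_prob(x, t):
    """Assumed model: Pr(x belongs to concept t) = exp(-squared distance)."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(x, t)))


def neg_log_dd(t, picked, labels):
    """Negative log diverse density using one selected instance per bag."""
    total = 0.0
    for x, positive in zip(picked, labels):
        p = instance_prob(x, t)
        total -= math.log(max(p if positive else 1.0 - p, 1e-9))
    return total


def anneal(t, picked, labels, rng, steps=300, temp=1.0, cooling=0.98, step=0.3):
    """M-step: simulated annealing over the concept point t."""
    cur, cur_cost = list(t), neg_log_dd(t, picked, labels)
    best, best_cost = cur, cur_cost
    for _ in range(steps):
        cand = [v + rng.gauss(0.0, step) for v in cur]
        cost = neg_log_dd(cand, picked, labels)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if cost < cur_cost or rng.random() < math.exp((cur_cost - cost) / temp):
            cur, cur_cost = cand, cost
            if cost < best_cost:
                best, best_cost = cand, cost
        temp *= cooling
    return best, best_cost


def em_dd(bags, labels, start, rng, max_iters=20):
    """Alternate instance selection (E-step) and concept search (M-step)."""
    t, prev = list(start), float("inf")
    for _ in range(max_iters):
        # E-step: in each bag, pick the instance most likely to match t.
        picked = [max(bag, key=lambda x: instance_prob(x, t)) for bag in bags]
        t_new, cost = anneal(t, picked, labels, rng)
        if cost >= prev - 1e-6:  # no further improvement
            break
        t, prev = t_new, cost
    return t, prev


def predict_bag(bag, t, threshold=0.5):
    """A bag is flagged positive if any instance is close enough to t."""
    return max(instance_prob(x, t) for x in bag) >= threshold
```

As in standard EM-DD, the search would normally be started from instances taken from positive bags, for example em_dd(bags, labels, start=bags[0][0], rng=random.Random(0)) when the first bag is positive. An unseen image is then classified by converting it to a bag and calling predict_bag with the learned concept point.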
     Tests on 1500 pornographic images and 1500 normal images give a detection rate of 87.4% and a false alarm rate of 12.5%. These results indicate that the MIL-based pornographic image filtering algorithm proposed in this paper can effectively distinguish pornographic images from normal images.
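The abstract does not define the two rates. A minimal sketch, assuming the common definitions (detection rate as the fraction of pornographic images that are flagged, false alarm rate as the fraction of normal images that are flagged), is given below. The counts in the usage line are made up for illustration and are not the thesis's data.

```python
def detection_metrics(tp, fn, fp, tn):
    """Return (detection rate, false alarm rate) from confusion counts.

    tp: pornographic images flagged, fn: pornographic images missed,
    fp: normal images flagged,       tn: normal images passed.
    """
    detection_rate = tp / (tp + fn)
    false_alarm_rate = fp / (fp + tn)
    return detection_rate, false_alarm_rate


# Hypothetical counts for a 10 + 10 image test set.
dr, far = detection_metrics(tp=9, fn=1, fp=2, tn=8)
print(f"detection rate {dr:.1%}, false alarm rate {far:.1%}")
# prints "detection rate 90.0%, false alarm rate 20.0%"
```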
     Finally, a pornographic image monitoring system was developed in the Visual C++ 6.0 environment using object-oriented techniques. The system offers several detection modes and monitors the web browser in real time; it can also record, report, and rate pornographic websites.
