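Applications of Image Mining in Image Retrieval

English title: Image Mining in Image Retrieval
Author: 段曼妮 (Duan Manni)
Degree level: Doctoral
Discipline: Signal and Information Processing
Chinese keywords: image retrieval ; image mining ; near-duplicate image retrieval ; location-based image retrieval ; automatic image annotation
English keywords: image retrieval ; image mining ; near-duplicate image retrieval ; association rule mining ; location image retrieval ; geographical relevance ; style-based image annotation
Degree year: 2009
Advisors: 吴秀清 (Wu Xiuqing) ; 徐守时 (Xu Shoushi)
Discipline code: 081002
Degree-granting institution: University of Science and Technology of China
Thesis submission date: 2009-07-01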

Abstract

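In recent years, with the rapid development of multimedia technology and computer networks, the volume of digital images worldwide has been growing at an astonishing rate. Military and civilian devices alike produce images amounting to thousands of megabytes every day. These images contain all kinds of real-world entities, and the image collections formed from them carry information about how these entities change, how they relate to one another, and the patterns and evolutionary regularities hidden among them. For a person, however, processing a data set of tens of thousands of images and discovering knowledge in it is very difficult, if not impossible.
     Progress in data mining, information retrieval, multimedia databases, and related fields has made it possible to manage and analyze images and to discover useful information in them, and has also given rise to the research field of image mining. Image mining is the process of discovering implicit, previously unknown, and potentially useful knowledge and image data relationships in image databases. Because this implicit information and knowledge cannot be obtained by direct human inspection, image mining is expected to bring many related fields to a new stage. The main goal of this thesis is to explore how image mining techniques can be applied to image retrieval.
     Image retrieval techniques can broadly be divided into example-based image retrieval and text-based image retrieval. Example-based image retrieval can be divided into image category retrieval and near-duplicate image retrieval; text-based image retrieval can be divided into retrieval based on surrounding text and retrieval based on automatic image annotation. This thesis studies image mining techniques in depth. Its main work and contributions are summarized as follows:
     1. Chapter 2 first analyzes the main problem in near-duplicate retrieval based on the Bag-of-Words model, namely visual polysemy and synonymy, and proposes mining visual phrases with an association rule mining algorithm to eliminate them. The chapter uses a feature-centric method to construct the transaction database and applies the classic Apriori algorithm to that database to mine associated visual words, from which visual phrases are built. The chapter also compares several ways of using visual phrases and discusses their significance for near-duplicate image retrieval. Experiments on a standard data set demonstrate the effectiveness of the method.
     2. Chapter 3 studies the application of image mining to image retrieval based on surrounding text. In text-based image retrieval, using surrounding text is the most common practice in today's commercial search engines. However, because the surrounding information often contains considerable noise, the retrieval results do not satisfy users' needs well. Chapter 3 designs an online game that collects users' knowledge and analyzes the game logs with a co-location rule mining algorithm; the knowledge obtained is used to improve geographically relevant image retrieval results on Live Search. A large-scale user study confirms that the method improves image retrieval based on surrounding text.
     3. Chapter 4 proposes a method for style-based automatic image annotation using concept mining. Automatic image annotation is the process by which a system uses machine learning to generate annotation words related to an image's content. The key problem in this field is the semantic gap between low-level features and high-level semantics. The chapter proposes using concept mining to discover the concepts contained in images, together with the image styles of different users, so that personalized annotation improves the annotation results. A PLSA model is used to mine concepts from image content and user interests, and the different styles of different users are represented as different distributions of image features and annotation features given a concept. Experiments on a data set from a commercial website show that style-based annotation can substantially improve the precision of automatic image annotation, offering a possible solution for personalized image retrieval.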
Advances in image acquisition and storage technology have led to tremendous growth in very large and detailed image databases. If analyzed, these images can reveal useful information to human users. Image mining deals with the extraction of implicit knowledge, image data relationships, and other patterns not explicitly stored in the images.
     Image mining is rapidly gaining attention among researchers in data mining, information retrieval, and multimedia databases because of its potential for discovering useful image patterns that may push these research fields to new frontiers. The main purpose of this thesis is to explore the use of image mining techniques in image retrieval.
     Image retrieval can in general be divided into example-based image retrieval and text-based image retrieval.
     Within example-based image retrieval, image near-duplicate (IND) retrieval has a wide range of applications. We first identify the major problem in IND retrieval based on the Bag-of-Words model as the "visual polysemy and synonymy phenomenon". To eliminate this phenomenon, we propose using association rule mining to find "visual patterns". We propose and compare different ways of using visual patterns. Experiments on a benchmark dataset show that the proposed method outperforms the classic Bag-of-Words model.
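     The association rule mining step can be illustrated with a minimal Apriori sketch, since Apriori is the algorithm the thesis names for mining associated visual words. The transaction contents, the visual-word IDs, and the `min_support` value below are hypothetical toy values, and the function is a generic textbook Apriori, not the thesis's implementation.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support_count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count how many transactions contain each single visual word.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = set()
        for a, b in combinations(list(frequent), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset must itself be frequent.
            if all(frozenset(sub) in frequent
                   for sub in combinations(union, k - 1)):
                candidates.add(union)
        # Count step: support of each surviving candidate.
        counts = {c: sum(1 for t in transactions if c <= t)
                  for c in candidates}
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result


# Hypothetical toy data: each transaction is the set of visual-word IDs
# observed around one local feature (a feature-centric transaction).
transactions = [
    {1, 2, 3},
    {1, 2},
    {1, 2, 4},
    {2, 3},
    {1, 3, 4},
]
frequent = apriori(transactions, min_support=3)

# Itemsets with two or more words play the role of candidate visual phrases.
visual_phrases = {s: c for s, c in frequent.items() if len(s) >= 2}
```

     In this toy run only the pair of words 1 and 2 co-occurs often enough to be kept; in a retrieval setting such a pair would be treated as one unit alongside the single visual words.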
     Within text-based image retrieval, indexing images by keywords taken from their surrounding text is the method most commonly used by commercial image search engines. However, because surrounding text is often noisy, the retrieval results do not meet users' needs very well. To address a location image retrieval task, we first define a measure for images, namely geographical relevance, and then use it to rank the returned images. To obtain the geographical relevance of images, we designed an online game to gather users' knowledge about images and locations. We then use a co-location mining algorithm to find similar locations, the geographical relevance of images, and the geographical relevance of image regions. A comparison with a commercial search engine (Live Search) confirms that the proposed algorithm improves location image retrieval performance.
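     As a rough illustration of what co-location mining measures, the sketch below computes the participation index of a pair of feature types, a prevalence measure widely used in the spatial co-location literature. The abstract does not state which measure the thesis uses, so this choice is an assumption, as are the feature labels `"A"` and `"B"`, the coordinates, and the neighborhood `radius`.

```python
from math import hypot


def participation_index(points, type_a, type_b, radius):
    """Participation index of the co-location {type_a, type_b}.

    points is a list of (feature_type, x, y) tuples. An instance
    participates if at least one instance of the other type lies
    within `radius` of it.
    """
    a_pts = [(x, y) for t, x, y in points if t == type_a]
    b_pts = [(x, y) for t, x, y in points if t == type_b]
    if not a_pts or not b_pts:
        return 0.0

    def near(p, q):
        return hypot(p[0] - q[0], p[1] - q[1]) <= radius

    # Participation ratio of each type: fraction of its instances that
    # have a neighbor of the other type.
    a_ratio = sum(1 for p in a_pts if any(near(p, q) for q in b_pts)) / len(a_pts)
    b_ratio = sum(1 for q in b_pts if any(near(p, q) for p in a_pts)) / len(b_pts)

    # The participation index is the smaller of the two ratios.
    return min(a_ratio, b_ratio)


# Hypothetical toy data: two instances of type A, four of type B.
points = [
    ("A", 0.0, 0.0),
    ("A", 10.0, 10.0),
    ("B", 0.0, 1.0),
    ("B", 0.5, 0.5),
    ("B", 50.0, 50.0),
    ("B", 10.0, 11.0),
]
pi_ab = participation_index(points, "A", "B", radius=2.0)
```

     A pair of types is usually reported as a co-location pattern when this index exceeds a chosen threshold; the threshold itself would have to be tuned for the data at hand.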
     Another direction in text-based image retrieval is automatic image annotation. Automatic image annotation (also known as automatic image tagging) is the process by which a computer system automatically assigns metadata, in the form of captions or keywords, to a digital image. The key problem in the field is the semantic gap between low-level features and semantic concepts. Modeling the user's attention is one feasible way to narrow the semantic gap. We apply a concept mining technique to an image annotation task. We assume that the images in one group share one "style". Mining and using this "style" can improve annotation precision and consequently improve image retrieval based on automatic annotation. Experiments on real-world datasets support this assumption.
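     The thesis names PLSA as its concept mining model. The sketch below is a plain EM fit of standard PLSA on a small count matrix, written without external libraries; it does not include the style-dependent distributions the thesis describes. The count matrix, the number of topics, the iteration count, and the random seed are all hypothetical.

```python
import random


def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit PLSA by EM.

    counts[d][w] is the count of word w in document d.
    Returns (p_z_d, p_w_z): P(z|d) indexed [d][z] and P(w|z) indexed [z][w].
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])

    def normalize(row):
        s = sum(row)
        if s > 0:
            return [v / s for v in row]
        return [1.0 / len(row)] * len(row)  # fall back to uniform

    # Random positive initialization of both distributions.
    p_z_d = [normalize([rng.random() for _ in range(n_topics)])
             for _ in range(n_docs)]
    p_w_z = [normalize([rng.random() for _ in range(n_words)])
             for _ in range(n_topics)]

    for _ in range(n_iter):
        new_z_d = [[0.0] * n_topics for _ in range(n_docs)]
        new_w_z = [[0.0] * n_words for _ in range(n_topics)]
        for d in range(n_docs):
            for w in range(n_words):
                n = counts[d][w]
                if n == 0:
                    continue
                # E-step: posterior P(z | d, w) is proportional to
                # P(z | d) * P(w | z).
                post = normalize([p_z_d[d][z] * p_w_z[z][w]
                                  for z in range(n_topics)])
                # Accumulate expected counts for the M-step.
                for z in range(n_topics):
                    new_z_d[d][z] += n * post[z]
                    new_w_z[z][w] += n * post[z]
        # M-step: renormalize expected counts into distributions.
        p_z_d = [normalize(row) for row in new_z_d]
        p_w_z = [normalize(row) for row in new_w_z]
    return p_z_d, p_w_z


# Hypothetical toy data: 4 "documents" (e.g. images) over 4 "words"
# (e.g. quantized features or tags).
counts = [
    [4, 3, 0, 0],
    [5, 2, 0, 0],
    [0, 0, 3, 4],
    [0, 0, 2, 5],
]
doc_topics, topic_words = plsa(counts, n_topics=2)
```

     Each row of `doc_topics` is a distribution over latent concepts for one document, and each row of `topic_words` is a distribution over words for one concept. EM only finds a local optimum, so results can vary with the seed.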

