Image Retrieval Based on Sparse Nonnegative Matrix Factorization
Abstract
Social tagging systems have grown increasingly popular and are now one of the defining features of the Web 2.0 era. Users can share and freely annotate media across different social tagging systems, for example sharing and tagging photos on Flickr or sharing videos and adding tags on YouTube, and they can also search these systems for resources of interest. While these fast-growing systems enrich everyday life, they also bring both challenges and opportunities for research.
     User-supplied tags are often noisy, ambiguous, and subjective. How can such tags be denoised and disambiguated so as to improve annotation accuracy and, in turn, the precision of tag-based information retrieval? How can the rich information in multiple heterogeneous data sources be exploited so that the sources support one another through transfer learning? These questions have become active research topics both in China and internationally.
     Drawing on recent advances in sparse coding and shared subspace learning, this thesis proposes the Multi-source Boosting by Sparse Nonnegative Matrix Factorization (MtBSNMF) algorithm. MtBSNMF analyzes data from multiple sources through a joint sparse nonnegative matrix factorization, uncovering the substructure shared across the sources together with an individual substructure for each source, and thereby transfers knowledge between the data sources.
     Building on MtBSNMF, the thesis carries out two application studies. The first links two different types of data sources through tags and applies MtBSNMF to tag-based image retrieval. The second uses the visual features of image data, linking two different image data sources through visual words, and applies MtBSNMF to query by example and tag prediction. Experimental results demonstrate the effectiveness and extensibility of the proposed approach.
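     The abstract does not state the MtBSNMF objective or its solver, so the following NumPy sketch is only a rough illustration of the general idea of a joint sparse nonnegative factorization with a shared basis and a per-source basis. Everything in it is an assumption made for illustration: the function name `joint_sparse_nmf`, the parameters `k_shared`, `k_own`, and `lam`, the squared-error objective with an L1 penalty on the coefficients, and the choice of simple multiplicative updates (swapped in here for brevity; the thesis may well use a different optimization scheme).

```python
import numpy as np


def joint_sparse_nmf(X1, X2, k_shared, k_own, lam=0.1, n_iter=200, seed=0):
    """Illustrative joint sparse NMF (assumed formulation, not the thesis's own).

    X1 (m x n1) and X2 (m x n2) are nonnegative and share the same m rows,
    e.g. a common tag or visual-word vocabulary. The model is
        X1 ~ [Ws, W1] @ H1   and   X2 ~ [Ws, W2] @ H2,
    where Ws is the shared basis and W1, W2 are source-specific bases.
    Objective: 0.5 * squared reconstruction error + lam * (sum(H1) + sum(H2)).
    """
    rng = np.random.default_rng(seed)
    m, n1 = X1.shape
    n2 = X2.shape[1]
    eps = 1e-9  # guards against division by zero

    Ws = rng.random((m, k_shared))
    W1 = rng.random((m, k_own))
    W2 = rng.random((m, k_own))
    H1 = rng.random((k_shared + k_own, n1))
    H2 = rng.random((k_shared + k_own, n2))

    def objective():
        B1, B2 = np.hstack([Ws, W1]), np.hstack([Ws, W2])
        fit = np.linalg.norm(X1 - B1 @ H1) ** 2 + np.linalg.norm(X2 - B2 @ H2) ** 2
        return 0.5 * fit + lam * (H1.sum() + H2.sum())

    history = [objective()]
    for _ in range(n_iter):
        # Coefficient updates; lam in the denominator comes from the L1 penalty
        # and shrinks the codes towards zero.
        B1, B2 = np.hstack([Ws, W1]), np.hstack([Ws, W2])
        H1 *= (B1.T @ X1) / (B1.T @ B1 @ H1 + lam + eps)
        H2 *= (B2.T @ X2) / (B2.T @ B2 @ H2 + lam + eps)

        # Rows of H that multiply the shared basis vs. the source-specific basis.
        H1s, H1o = H1[:k_shared], H1[k_shared:]
        H2s, H2o = H2[:k_shared], H2[k_shared:]

        # Shared basis update: both sources contribute to Ws.
        R1 = np.hstack([Ws, W1]) @ H1
        R2 = np.hstack([Ws, W2]) @ H2
        Ws *= (X1 @ H1s.T + X2 @ H2s.T) / (R1 @ H1s.T + R2 @ H2s.T + eps)

        # Source-specific basis updates: each uses only its own source.
        R1 = np.hstack([Ws, W1]) @ H1
        W1 *= (X1 @ H1o.T) / (R1 @ H1o.T + eps)
        R2 = np.hstack([Ws, W2]) @ H2
        W2 *= (X2 @ H2o.T) / (R2 @ H2o.T + eps)

        history.append(objective())

    return Ws, W1, W2, H1, H2, history
```

     Under these assumptions, calling `joint_sparse_nmf(X1, X2, k_shared=3, k_own=2)` on two nonnegative matrices with the same number of rows returns the shared basis, the two source-specific bases, the two coefficient matrices, and the objective value recorded after each iteration. The first `k_shared` rows of `H1` and `H2` then place items from both sources in the same shared subspace, which is the kind of representation a cross-source retrieval step could compare.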
