Cross-Media Clustering by Share and Private Information Maximization



English title: Cross-Media Clustering by Share and Private Information Maximization
Authors: Yan Xiaoqiang (闫小强); Ye Yangdong (叶阳东)
Author affiliation: School of Information Engineering, Zhengzhou University
Keywords: cross-media; multi-source heterogeneous; shared and private information; information maximization; mutual information; clustering analysis
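Chinese journal title (code): JFYZ
English journal title: Journal of Computer Research and Development
Institution: School of Information Engineering, Zhengzhou University
Publication date: 2019-07-15
Publisher: Journal of Computer Research and Development (计算机研究与发展)
Year: 2019
Volume: 56
Funding: National Key Research and Development Program of China (2018YFB1201403); National Natural Science Foundation of China (61772475, 61502434)
Language: Chinese
Page: JFYZ201907002
Number of pages: 13
CN: 07
ISSN: 11-1777/TP
Classification number: 16-28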

Abstract

In recent years, the rapid emergence of cross-media data, which is typically multi-source and heterogeneous, has posed great challenges for data analysis. However, most existing cross-media analysis methods rely only on the information shared between modalities to discover the pattern structure in the data, and ignore the important information private to each modality. To address this problem, this paper proposes a cross-media clustering algorithm based on share and private information maximization (SPIM), which takes both the shared and the private information of cross-media data into account in order to obtain more reasonable clustering patterns. First, two models for constructing the shared information are presented: 1) Hybrid words (H-words) model. The low-level features of each modality are converted into a unified word-frequency representation (co-occurrence vectors of words or visual words), and a novel agglomerative information maximization method then builds the hybrid word space of all modalities bottom-up, preserving as much as possible the statistical similarity of the low-level features across modalities. 2) Clustering ensemble (CE) model. A clustering partition is built for each modality, and mutual information is used to measure the amount of information between the partitions of different modalities, which captures the correlation between their high-level clustering partitions. Second, an information-theoretic objective function is proposed that integrates the shared information of multiple modalities and the private information of individual modalities, so that both are considered while the clustering structure is extracted. Finally, the objective function of SPIM is optimized by a sequential "draw-and-merge" procedure, which guarantees convergence to a local optimum. Experimental results on 6 cross-media datasets show that SPIM compares favorably with existing state-of-the-art cross-media clustering methods.
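The clustering ensemble (CE) model described in the abstract uses mutual information to compare the clustering partitions of different modalities. The sketch below is only an illustration of that standard quantity, not the authors' implementation: the function name `partition_mutual_information` and the toy label lists are assumptions introduced here, and the estimate uses plain empirical frequencies.

```python
import math
from collections import Counter


def partition_mutual_information(labels_a, labels_b):
    """Empirical mutual information I(A;B), in nats, between two partitions.

    labels_a and labels_b assign a cluster label to the same items, in the
    same order (for example, one partition per modality).
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same items")
    n = len(labels_a)
    if n == 0:
        return 0.0

    count_a = Counter(labels_a)                 # cluster sizes in partition A
    count_b = Counter(labels_b)                 # cluster sizes in partition B
    count_ab = Counter(zip(labels_a, labels_b))  # joint co-occurrence counts

    mi = 0.0
    for (a, b), n_ab in count_ab.items():
        # p(a,b) * log( p(a,b) / (p(a) * p(b)) ), written with raw counts
        mi += (n_ab / n) * math.log(n_ab * n / (count_a[a] * count_b[b]))
    return mi


# Hypothetical toy partitions of four items from two modalities.
image_partition = [0, 0, 1, 1]
text_partition = [1, 1, 0, 0]   # same grouping, different label names
unrelated = [0, 1, 0, 1]        # statistically independent of image_partition

mi_same = partition_mutual_information(image_partition, text_partition)
mi_unrelated = partition_mutual_information(image_partition, unrelated)
```

Because mutual information depends only on how items are grouped, two partitions that agree up to a renaming of labels score the maximum here (log 2 for two equal-sized clusters), while independent partitions score zero.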

