Cross-media retrieval algorithm based on semantic correlation and topological relationship

Original title (Chinese): 基于语义相关性与拓扑关系的跨媒体检索算法
Authors: DAI Gang (代刚); ZHANG Hong (张鸿)
Affiliations: College of Computer Science and Technology, Wuhan University of Science and Technology; Hubei Province Key Laboratory of Intelligent Information Processing and Real-time Industrial System (Wuhan University of Science and Technology)
Keywords: cross-media retrieval; semantic information; nearest neighbor relationship; semi-supervised regularization; semantic correlation; sparse regularization
Journal code: JSJY
Journal: Journal of Computer Applications
Publication date: 2018-05-14 09:09
Publisher: Journal of Computer Applications (计算机应用)
Year: 2018
Volume/serial number: v.38; No.337
Language: Chinese
File name: JSJY201809016
Page count: 6
Issue: 09
CN: 51-1307/TP
Pages: 87-92

Abstract

To mine the intrinsic correlation between feature data that share the same semantics across different modalities, a cross-media retrieval algorithm based on Semantic Correlation and Topological Relationship (SCTR) is proposed. On the one hand, the latent correlation among multimedia data with the same semantics is used to construct a multimedia semantic correlation hypergraph. On the other hand, the topological relationship of the multimedia data is mined to build a multimedia nearest-neighbor relationship hypergraph. By combining the semantic correlation and the topological relationship, an optimal projection matrix is learned for each media type, and the feature vectors of the multimedia data are then projected into a common space in which cross-media retrieval is performed. On the XMedia dataset, the algorithm achieves an average precision of 51.73% over multiple cross-media retrieval tasks, which is 22.73, 15.23, 11.7 and 9.11 percentage points higher than Heterogeneous Metric Learning with Joint Graph Regularization (JGRHML), Cross Modality Correlation Propagation (CMCP), Heterogeneous Similarity measure with Nearest Neighbors (HSNN) and Joint Representation Learning (JRL), respectively. The experimental results show from several aspects that the proposed algorithm effectively improves the average precision of cross-media retrieval.
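
The final step described in the abstract, projecting each media type into a common space and retrieving across media there, can be illustrated with a minimal sketch. This is not the SCTR implementation: the abstract does not give the objective function or how the projection matrices are learned, so the matrices `P_IMAGE` and `P_TEXT`, the toy features, and the choice of cosine similarity below are all hypothetical assumptions made only for illustration.

```python
import math


def project(matrix, vector):
    """Map a feature vector into the common space.

    `matrix` holds one row per common-space dimension (assumed layout).
    """
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec, query_proj, items, item_proj):
    """Rank items of another media type by similarity to the query
    after both sides are projected into the common space."""
    q = project(query_proj, query_vec)
    scored = [(cosine(q, project(item_proj, feat)), name) for name, feat in items]
    return [name for _, name in sorted(scored, reverse=True)]


# Hypothetical, hand-picked projections (not learned): 3-d image features
# and 2-d text features are both mapped into a 2-d common space.
P_IMAGE = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
P_TEXT = [[0.0, 1.0], [1.0, 0.0]]

# Hypothetical text items: (name, feature vector).
TEXTS = [("text_a", [0.1, 0.9]), ("text_b", [0.9, 0.1])]

# Image query retrieving texts.
ranking = retrieve([1.0, 0.1, 0.0], P_IMAGE, TEXTS, P_TEXT)
print(ranking)
```

In the paper's setting the per-media projection matrices would instead be learned from the semantic correlation and nearest-neighbor hypergraphs; only the project-then-rank structure is carried over here.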

References

[1]ATREY P K,HOSSAIN M A,SADDIK A E,et al.Multimodal fusion for multimedia analysis:a survey[J].Multimedia Systems,2010,16(6):345-379.
[2]FU Y,HOSPEDALES T M,XIANG T,et al.Learning multimodal latent attributes[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2014,36(2):303-316.
[3]XU C,TAO D,XU C.Large-margin multi-view information bottleneck[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2014,36(8):1559-1572.
[4]PENG Y,QI J,HUANG X,et al.CCL:cross-modal correlation learning with multigrained fusion by hierarchical network[J].IEEE Transactions on Multimedia,2018,20(2):405-420.
[5]HARDOON D R,SZEDMAK S R,SHAWE-TAYLOR J R.Canonical correlation analysis:an overview with application to learning methods[J].Neural Computation,2004,16(12):2639-2664.
[6]MROUEH Y,MARCHERET E,GOEL V.Multimodal retrieval with asymmetrically weighted regularized canonical correlation analysis[EB/OL].[2018-01-05].http://xueshu.baidu.com/s?wd=paperuri%3A%2821c0d6790a49dece4ef4d84bc5b2c279%29&filter=sc_long_sign&tn=SE_xueshusource_2kduw22v&sc_vurl=http%3A%2F%2Farxiv.org%2Fpdf%2F1511.06267&ie=utf-8&sc_us=2302455906385530691.
[7]LI D,DIMITROVA N,LI M,et al.Multimedia content processing through cross-modal association[C]//MULTIMEDIA’03:Proceedings of the 11th ACM International Conference on Multimedia.New York:ACM,2003:604-611.
[8]ZHAI X,PENG Y,XIAO J.Heterogeneous metric learning with joint graph regularization for cross-media retrieval[C]//AAAI’13:Proceedings of the 27th AAAI Conference on Artificial Intelligence.Menlo Park,CA:AAAI Press,2013:1198-1204.
[9]ZHAI X,PENG Y,XIAO J.Cross-modality correlation propagation for cross-media retrieval[C]//Proceedings of the 2012 IEEE International Conference on Acoustics,Speech and Signal Processing.Piscataway,NJ:IEEE,2012:2337-2340.
[10]ZHAI X,PENG Y,XIAO J.Effective heterogeneous similarity measure with nearest neighbors for cross-media retrieval[C]//MMM’12:Proceedings of the 18th International Conference on Advances in Multimedia Modeling.Berlin:Springer,2012:312-322.
[11]ZHAI X,PENG Y,XIAO J.Learning cross-media joint representation with sparse and semisupervised regularization[J].IEEE Transactions on Circuits and Systems for Video Technology,2014,24(6):965-978.
[12]PENG Y,ZHAI X,ZHAO Y,et al.Semi-supervised cross-media feature learning with unified patch graph regularization[J].IEEE Transactions on Circuits and Systems for Video Technology,2016,26(3):583-596.
[13]WANG K,HE R,WANG L,et al.Joint feature selection and subspace learning for cross-modal retrieval[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2016,38(10):2010-2023.
[14]XIE L,PAN P,LU Y.Analyzing semantic correlation for cross-modal retrieval[J].Multimedia Systems,2015,21(6):525-539.
[15]ZHUANG Y T,YANG Y,WU F.Mining semantic correlation of heterogeneous multimedia data for cross-media retrieval[J].IEEE Transactions on Multimedia,2008,10(2):221-229.
[16]ZHUANG Y,ZHUANG Y T,WU F.An integrated indexing structure for large-scale cross-media retrieval[J].Journal of Software,2008,19(10):2667-2680.(in Chinese)
[17]ZHANG H,WU F,ZHUANG Y T,et al.Cross-media retrieval method based on content correlations[J].Chinese Journal of Computers,2008,31(5):820-826.(in Chinese)
[18]RASIWASIA N,PEREIRA J C,COVIELLO E,et al.A new approach to cross-modal multimedia retrieval[C]//MM’10:Proceedings of the 18th ACM International Conference on Multimedia.New York:ACM,2010:251-260.
[19]CHEN D,TIAN X,SHEN Y,et al.On visual similarity based 3D model retrieval[J].Computer Graphics Forum,2003,22(3):223-232.
