Supervised bag of word model for multimedia information retrieval


English title: Supervised bag of word model for multimedia information retrieval
Authors: YUAN Gui-xia (袁桂霞); ZHOU Xian-chun (周先春)
Affiliations: School of Information and Mechanical and Electrical Engineering, Jiangsu Open University; School of Electronic and Information Engineering, Nanjing University of Information Science and Technology
Keywords: bag of word model; multimedia information retrieval; text retrieval; image retrieval; minimum energy criterion
Chinese journal title: SJSJ
English journal title: Computer Engineering and Design
Publication date: 2018-09-16
Publisher: Computer Engineering and Design (计算机工程与设计)
Year: 2018
Issue: v.39; No.381
Funding: National Innovation Fund Project (435012C26244104350)
Language: Chinese
Page: SJSJ201809031
Page count: 6
CN: 09
ISSN: 11-1775/TP
Classification number: 181-186

Abstract

Training and encoding in the classical bag-of-words model are unsupervised and require no labeled data. This makes the approach broadly adaptable, but the model has high complexity and weak discriminative power. To address this, a supervised bag-of-words model is proposed on the basis of the classical one. The class of each training sample is labeled during training; on this basis, an objective function for the overall energy of the histograms is constructed, and the codebook is learned according to the minimum energy criterion. In two sets of multimedia information retrieval experiments, text retrieval and image retrieval, the supervised bag-of-words model achieves higher retrieval precision and lower retrieval time than the classical model.
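
For orientation, the classical (unsupervised) encoding step that the abstract takes as its baseline can be sketched as follows. This is a minimal illustration only: the function name `encode_histogram`, the toy two-dimensional descriptors, and the fixed two-word codebook are all assumptions made for the example, and the sketch does not reproduce the paper's supervised energy function, which the abstract does not specify.

```python
from math import dist


def encode_histogram(descriptors, codebook):
    """Return the normalized codeword histogram for one document or image.

    Hypothetical helper: hard-assigns each descriptor to its nearest
    codeword (Euclidean distance) and counts the votes per codeword.
    """
    counts = [0] * len(codebook)
    for d in descriptors:
        # Hard assignment: each descriptor votes for its nearest codeword.
        nearest = min(range(len(codebook)), key=lambda k: dist(d, codebook[k]))
        counts[nearest] += 1
    total = sum(counts)
    # Normalize so histograms of items with different sizes are comparable.
    return [c / total for c in counts] if total else counts


# Toy data (assumed for illustration): a two-word codebook and four descriptors.
codebook = [(0.0, 0.0), (1.0, 1.0)]
descriptors = [(0.1, 0.0), (0.9, 1.2), (1.1, 0.8), (0.0, 0.2)]
histogram = encode_histogram(descriptors, codebook)
```

In the classical model the codebook itself would typically come from unsupervised clustering of training descriptors; according to the abstract, the supervised variant instead learns the codebook by minimizing a histogram energy built from labeled training samples.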

