Abstract
【Purpose/significance】In the big-data environment, a well-designed multimedia information retrieval system is key to making digital libraries more interactive and to upgrading their knowledge services.【Method/process】A survey of the multimedia information retrieval systems of mainstream digital libraries reveals two main problems: cross-modal correlations are not fully exploited during retrieval, and multimedia resources are not organized effectively. To address these problems, optimization schemes are proposed from the perspectives of "cross-modal correlation analysis", "hierarchical knowledge reasoning", and others, and are verified through empirical analysis.【Result/conclusion】Retrieval performance improves. This indicates that applying deep learning, knowledge representation learning, and related techniques to optimize the multimedia information retrieval system can better satisfy users' knowledge needs and thereby improve the quality of digital library knowledge services.