Semantic context based refinement for news video annotation

详细信息查看全文

作者：Zhiyong Wang (1)
Genliang Guan (1)
Yu Qiu (1)
Li Zhuo (2)
Dagan Feng (1) (3)
关键词：Video annotation ; News video ; Semantic context ; Semantic similarity ; Random walk
刊名：Multimedia Tools and Applications
出版年：2013
出版时间：December 2013
年：2013
卷：67
期：3
页码：607-627
全文大小：658KB
参考文献：1. Ballan L, Bertini M, Bimbo AD, Serra G (2010) Video annotation and retrieval using ontologies and rule learning. IEEE Multimed 17(4):80鈥?8 CrossRef
2. Benitez AB, Chang SF (2003) Image classification using multimedia knowledge networks. In: IEEE International Conference on Image Processing (ICIP), vol聽2. Barcelona, Spain, pp聽613鈥?16
3. Bertini M, Bimbo AD, Torniai C (2006) Automatic annotation and semantic retrieval of video sequences using multimedia ontologies. In: ACM international conference on multimedia. Santa Barbara, CA, pp聽679鈥?82
4. Carneiro G, Chan A, Moreno P, Vasconcelos N (2007) Supervised learning of semantic classes for image annotation and retrieval. IEEE Trans Pattern Anal Mach Intell 29(3):394鈥?10 CrossRef
5. Chang SF, He J, Jiang YG, Khoury EE, Ngo CW, Yanagawa A, Zavesky E (2008) Columbia University/VIREO-CityU/IRIT TRECVID2008 high-level feature extraction and interactive video search. In: NIST TRECVID workshop (TRECVID鈥?8). Gaithersburg, MD
6. Chen Z, Cao J, Xia T, Song Y, Zhang Y, Li J (2011) Web video retagging. Multimed Tools Appl 55:53鈥?2 CrossRef
7. Cilibrasi R, Vitanyi PMB (2007) The Google similarity distance. IEEE Trans Knowl Data Eng 19(3):370鈥?83 CrossRef
8. Datta R, Joshi D, Li J, James聽Wang Z (2008) Image retrieval: ideas, influences, and trends of the new age. ACM Comput Surv 40(2):5:1鈥?:49 CrossRef
9. Deng J, Berg AC, Li K, Fei-Fei L (2010) What does classifying more than 10,000 image categories tell us? In: European Conference of Computer Vision (ECCV). Crete, Greece, pp聽71鈥?4
10. Deng J, Dong W, Socher R, Li LJ, Li K, Fei-Fei L (2009) ImageNet a large-scale hierarchical image database. In: IEEE international conference on Computer Vision and Pattern Recognition (CVPR). Miami, Florida, pp聽248鈥?55
11. Deschacht K, Moens MF (2007) Text analysis for automatic image annotation. The 45th annual meeting of the Association for Computational Linguistics (ACL) pp聽1000鈥?007
12. Fan J, Luo H, Gao Y, Jain R (2007) Incorporating concept ontology for hierarchical video classification, annotation, and visualization. IEEE Trans Multimedia 9(5):939鈥?57 CrossRef
13. Fellbaum C (ed) (1998) WordNet: an electronic lexical database. MIT Press
14. Feng D, Siu WC, Zhang HJ (2003) Multimedia information retrieval and management. Springer-Verlag, Germany CrossRef
15. Feng Y, Lapata M (2008) Automatic image annotation using auxiliary text information. In: The 46th annual meeting of the association for computational linguistics: human language technologies. Columbus, Ohio, pp聽272鈥?80
16. Fu H, Chi Z, Feng D (2006) Attention-driven image interpretation with application to image retrieval. Pattern Recogn 39(9):1604鈥?621 CrossRef
17. Fu H, Chi Z, Feng D (2009) An efficient algorithm for attention-driven image interpretation from segments. Pattern Recogn 42(1):126鈥?40 CrossRef
18. Fu H, Chi Z, Feng D (2010) Recognition of attentive objects with a concept association network for image annotation. Pattern Recogn 43:3539鈥?547 CrossRef
19. Fu H, Chi Z, Feng D, Song J (2004) Machine learning techniques for ontology-based leaf classification. In: International conference on control, automation, robotics and vision. Kunming, China, pp聽681鈥?86
20. Guan G, Wang Z, Tian Q, Feng D (2009) Improved concept similarity measuring in visual domain. In: IEEE international workshop on multimedia signal processing. Rio de Janeiro, Brazil, pp聽1鈥?
21. Hanbury A (2008) A survey of methods for image annotation. J Visual Lang Comput 19:617鈥?27 CrossRef
22. Hauptmann AG, Chen MY, Christel M, Lin WH, Yang J (2007) A hybrid approach to improving semantic extraction of news video. In: International conference on semantic computing. Irvine, CA, pp聽79鈥?6
23. Hollink L, Little S, Hunter J (2005) Evaluating the application of semantic inferencing rules to image annotation. In: International conference on knowledge capture. Banff, Alberta
24. Hoogs A, Rittscher J, Stein G, Schmiederer J (2003) Video content annotation using visual analysis and a large semantic knowledgebase. In: IEEE international conference on computer vision and pattern recognition. Madison, Wisconsin, pp聽327鈥?34
25. Jain R, Sinha P (2010) Content without context is meaningless. In: ACM international conference on multimedia. Firenze, Italy, pp聽1259鈥?268
26. Jiang W, Xie L, Chang SF (2009) Visual saliency with side information. In: IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP). Taiwan, pp聽1765鈥?768
27. Jiang YG, Wang J, Chang SF, Ngo CW (2009) Domain adaptive semantic diffusion for large scale content-based video annotation. In: IEEE International Conference on Computer Vision (ICCV). Kyoto Japan, pp聽1420鈥?427
28. Jiang YG, Yang J, Ngo CW, Hauptmann AG (2010) Representations of keypoint-based semantic concept detection: a comprehensive study. IEEE Trans Multimedia 12(1):42鈥?3 CrossRef
29. Jin Y, Khan L, Wang L, Awad M (2005) Image annotations by combining multiple evidence & WordNet. In: ACM international conference on multimedia. Singapore, pp聽706鈥?15
30. Lew MS, Sebe N, Djeraba C, Jain R (2006) Content-based multimedia information retrieval: State of the art and challenges. ACM TOMCCAP 2(1):1鈥?9 CrossRef
31. Li H, Shi Y, Chen MY, Hauptmann AG, Xiong Z (2010) Hybrid active learning for cross-domain video concept detection. In: ACM international conference on multimedia. Firenze, Italy, pp聽1003鈥?996
32. Li X, Snoek CGM, Worring M (2009) Learning social tag relevance by neighbor voting. IEEE Trans. Multimedia 11(7):1310鈥?322 CrossRef
33. Lin D (1998) An information-theoretic definition of similarity. In: International conference on machine learning, pp聽296鈥?04
34. Liu D, Hua XS, Wang M, Zhang HJ (2010) Image retagging. In: ACM international conference on multimedia. Firenze, Italy, pp聽491鈥?00
35. Liu J, Lai W, Hua XS, Huang Y, Li S (2007) Video search re-ranking via multi-graph propagation. In: ACM international conference on multimedia
36. Manning CD, Sch眉tze H (1999) Foundations of statistical natural language processing. MIT Press, MA
37. Naphade M, Kozinstev IV, Huang TS (2002) A factor graph ramework for semantic video indexing. IEEE Trans Circuits Syst Video Technol 12(1):40鈥?2 CrossRef
38. Naphade M, Smith JR, Tesic J, Chang SF, Hsu W, Kennedy L, Hauptmann A, Curtis J (2006) Large-scale concept ontology for multimedia. IEEE Multimed 13(3):86鈥?1 CrossRef
39. Page L, Brin S, Motwani R, Winograd T (1998) The PageRank citation ranking: bringing order to the web. Tech. rep., Stanford University
40. Qi GJ, Hua XS, Rui Y, Tang J, Mei T, Zhang HJ (2007) Correlative multi-label video annotation. In: ACM international conference on multimedia. ACM, New聽York, pp聽17鈥?6
41. Qiu Y, Guan G, Wang Z, Feng D (2010) Improving news video annotation with semantic context. In: International conference on Digital Image Computing: Techniques and Applications (DICTA). Sydney, Australia
42. Schreiber AT, Dubbeldam B, Wielemaker J, Wielinga B (2001) Ontology-based photo annotation. IEEE Intell Syst 16(3):66鈥?4 CrossRef
43. Shi R, Chua TS, Lee CH, Gao S (2006) Bayesian learning of hierarchical multinomial mixture models of concepts for automatic image annotation. In: ACM international conference on image and video retrieval. Arizona, USA, pp聽102鈥?12
44. Snoek CGM, Huurnink B, Hollink L, de聽Rijke M, Schreiber G, Worring M (2007) Adding semantics to detectors for video retrieval. IEEE Trans Multimedia 9(5):975鈥?86 CrossRef
45. Velivelli A, Huang TS (2006) Automatic video annotation by mining speech transcripts. In: IEEE conference on computer vision and pattern recognition workshop. New York, USA, pp聽115鈥?22
46. Wang C, Jing F, Zhang L, Zhang HJ (2006) Image annotation refinement using random walk with restarts. In: ACM international conference on multimedia. Santa聽Barbara, California, pp聽647鈥?50
47. Wang G, Chua TS (2009) Multimedia content analysis theory and applications, chap. capturing text semantics for concept detection in news video. Springer, US, pp聽1鈥?5
48. Wang M, Hua XS, Hong R, Tang J, Qi GJ, Song Y (2009) Unified video annotation via multigraph learning. IEEE Trans Circuits Syst Video Technol 19(5):733鈥?46 CrossRef
49. Wang M, Hua XS, Mei T, Hong R, Qi GJ, Song Y, Dai LR (2009) Semi-supervised kernel density estimation for video annotation. Comput Vis Image Underst 113(3):384鈥?96 CrossRef
50. Wang M, Hua XS, Tang J, Hong R (2009) Beyond distance measurement: constructing neighborhood similarity for video annotation. IEEE Trans Multimedia 11(3):465鈥?76 CrossRef
51. Wang XJ, Zhang L, Li X, Ma WY (2008) Annotating images by mining image search results. IEEE Trans Pattern Anal Mach Intell 30(11):1919鈥?932 CrossRef
52. Wang Y, Gong S (2007) Refining image annotation using contextual relations between words. In: ACM international conference on image and video retrieval. Amsterdam, The聽Netherlands, pp聽425鈥?32
53. Wang Z, Feng D (2010) Machine learning techniques for adaptive multimedia retrieval: technologies applications and perspectives, chap. discovering semantics from visual information. IGI Global
54. Wei XY, Jiang YG, Ngo CW (2009) Exploring inter-concept relationship with context space for semantic video indexing. In: ACM international conference on image and video retrieval. Island of Santorini, Greece
55. Wu Y, Tseng BL, Smith JR (2004) Ontology-based multi-classification learning for video concept detection. In: IEEE International Conference on Multimedia and Expo (ICME). Taipei, Taiwan, pp聽1003鈥?006
56. Xu H, Wang J, Hua XS, Li S (2009) Tag refinement by regularized LDA. In: ACM international conference on multimedia. Beijing, China, pp聽573鈥?76
57. Yan R, Chen MY, Hauptmann AG (2006) Mining relationship between video concepts using probabilistic graphical model. In: IEEE International Conference On Multimedia and Expo (ICME). Toronto, Canada, pp聽301鈥?04
58. Zha ZJ, Mei T, Wang Z, Hua XS (2007) Building a comprehensive ontology to refine video concept detection. In: International workshop on multimedia information retrieval. Augsburg, Bavaria, pp聽227鈥?36
59. Zhang D, MonirulIslam M, Lu G (2011) A review on automatic image annotation techniques. Pattern Recogn 45:346鈥?62 CrossRef
60. Zhuang J, Hoi SC (2011) A two-view learning approach for image tag ranking. In: ACM international conference on Web Search and Data Mining (WSDM). Hong Kong, China, pp聽625鈥?34
作者单位：Zhiyong Wang (1)
Genliang Guan (1)
Yu Qiu (1)
Li Zhuo (2)
Dagan Feng (1) (3)

1. School of Information Technologies, The University of Sydney, Sydney, NSW, Australia
2. Signal and Information Processing Laboratory, Beijing University of Technology, Beijing, People鈥檚 Republic of China
3. Department of Electronic and Information Engineering, Hong Kong Polytechnic University, Kowloon, Hong Kong
ISSN：1573-7721

文摘

Automatic video annotation is to bridge the semantic gap and facilitate concept based video retrieval by detecting high level concepts from video data. Recently, utilizing context information has emerged as an important direction in such domain. In this paper, we present a novel video annotation refinement approach by utilizing extrinsic semantic context extracted from video subtitles and intrinsic context among candidate annotation concepts. The extrinsic semantic context is formed by identifying a set of key terms from video subtitles. The semantic similarity between those key terms and the candidate annotation concepts is then exploited to refine initial annotation results, while most existing approaches utilize textual information heuristically. Similarity measurements including Google distance and WordNet distance have been investigated for such a refinement purpose, which is different with approaches deriving semantic relationship among concepts from given training datasets. Visualness is also utilized to discriminate individual terms for further refinement. In addition, Random Walk with Restarts (RWR) technique is employed to perform final refinement of the annotation results by exploring the inter-relationship among annotation concepts. Comprehensive experiments on TRECVID 2005 dataset have been conducted to demonstrate the effectiveness of the proposed annotation approach and to investigate the impact of various factors.

地址：北京市海淀区学院路29号邮编：100083

电话：办公室：(+86 10)66554848；文献借阅、咨询服务、科技查新：66554700