Max-margin adaptive model for complex video pattern recognition

Authors: Litao Yu (1)
Jie Shao (2)
Xin-Shun Xu (3)
Heng Tao Shen (1)

1. School of Information Technology and Electrical Engineering ; The University of Queensland ; Brisbane ; QLD ; 4072 ; Australia
2. Department of Computer Science ; National University of Singapore ; Singapore ; 117417 ; Singapore
3. School of Computer Science and Technology ; Shandong University ; Jinan ; 250101 ; Shandong ; People's Republic of China
Keywords: Video pattern recognition ; Max-margin adaptive model ; Event detection
Journal: Multimedia Tools and Applications
Publication year: 2015
Publication date: January 2015
Volume: 74
Issue: 2
Pages: 505-521
Full-text size: 1,613 KB
Journal category: Computer Science
Journal subjects: Multimedia Information Systems
Computer Communication Networks
Data Structures, Cryptology and Information Theory
Special Purpose and Application-Based Systems
Publisher: Springer Netherlands
ISSN：1573-7721

Abstract

Pattern recognition models are used in a variety of applications, ranging from video concept annotation to event detection. In this paper we propose a new framework, the max-margin adaptive (MMA) model, for complex video pattern recognition, which can use a large number of unlabeled videos to assist model training. From a statistical perspective, the MMA model accounts for the consistency of the data distribution between labeled training videos and unlabeled auxiliary videos by learning an optimal mapping function, which also widens the margin between positively and negatively labeled videos to improve the robustness of the model. Experiments are conducted on two public datasets: CCV for video object/event detection and HMDB for action recognition. The results demonstrate that the proposed MMA model is effective on complex video pattern recognition tasks and outperforms state-of-the-art algorithms.
