Scene categorization based on local–global feature fusion and multi-scale multi-spatial resolution encoding




Authors: Jianzhao Qin; Fuqin Deng; Nelson H. C. Yung
Keywords: Scene categorization; Local–global feature fusion; Multi-scale multi-spatial resolution encoding
Journal: Signal, Image and Video Processing
Publication year: 2014
Publication date: December 2014
Volume: 8
Issue: 1-supp
Pages: 145-154
Full-text size: 1,046 KB
Journal category: Engineering
Journal subjects: Signal, Image and Speech Processing; Image Processing and Computer Vision; Computer Imaging, Vision, Pattern Recognition and Graphics; Multimedia Information Systems
Publisher: Springer London
ISSN: 1863-1711

Abstract

Building on bag-of-contextual-visual-word (BOCVW) models, we propose a scene categorization method based on local–global feature fusion and multi-scale multi-spatial resolution encoding. First, the BOCVW models built on different feature types are mutually reinforced by fusing the other feature types within local regions. Then, spatial configuration information is exploited using a multi-scale multi-spatial resolution encoding approach. Finally, the encoded BOCVW models are globally fused using an improved maximum-margin optimization strategy, which simultaneously considers the margin between input vectors of different categories and the diameter of the smallest ball containing the feature vectors. The proposed method has been evaluated on three scene categorization datasets with 8, 15, and 67 scene categories, respectively, and the experimental results verify its effectiveness.
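
The abstract does not spell out the encoding step, so the following is only a rough illustration of what encoding visual words at several spatial resolutions can look like. It is a generic spatial-pyramid-style histogram, not the authors' scheme. The function name `spatial_histograms`, its parameters, and the grid sizes `(1, 2, 4)` are all hypothetical choices made for this sketch.

```python
import numpy as np

def spatial_histograms(word_ids, xs, ys, width, height, vocab_size, grids=(1, 2, 4)):
    """Concatenate per-cell visual-word histograms over several grid resolutions.

    Illustrative only: a generic spatial-pyramid-style encoding, assumed here
    as a stand-in for the multi-spatial resolution idea in the abstract.
    """
    word_ids = np.asarray(word_ids)
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    parts = []
    for g in grids:
        # Map each keypoint to a cell of a g x g grid (clamped at the border).
        col = np.minimum((xs / width * g).astype(int), g - 1)
        row = np.minimum((ys / height * g).astype(int), g - 1)
        cell = row * g + col
        # Count occurrences of each (cell, word) pair.
        hist = np.bincount(cell * vocab_size + word_ids,
                           minlength=g * g * vocab_size)
        parts.append(hist.astype(float))
    vec = np.concatenate(parts)
    total = vec.sum()
    # L1-normalize so images with different keypoint counts are comparable.
    return vec / total if total > 0 else vec

# Three keypoints, a two-word vocabulary, and two grid resolutions.
vec = spatial_histograms([0, 1, 1], [10, 90, 90], [10, 10, 90],
                         width=100, height=100, vocab_size=2, grids=(1, 2))
```

With these inputs the vector has 2 entries for the 1 x 1 grid and 8 for the 2 x 2 grid. Coarser grids keep global word statistics, while finer grids record where in the image each word occurs.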