State-of-the-art and future challenges in video scene detection: a survey
  • Authors: Manfred Del Fabro; Laszlo Böszörmenyi
  • Keywords: Video segmentation; Scene detection; Non-sequential video; Survey
  • Journal: Multimedia Systems
  • Publication year: 2013
  • Publication date: October 2013
  • Volume: 19
  • Issue: 5
  • Pages: 427-454
  • Full-text size: 821 KB
  • Author affiliations: Manfred Del Fabro (1)
    Laszlo Böszörmenyi (1)

    1. Institute of Information Technology (ITEC), Alpen-Adria-Universität Klagenfurt, Klagenfurt, Austria
  • ISSN:1432-1882
Abstract
In the last 15 years, much effort has gone into the segmentation of videos into scenes. We give a comprehensive overview of the published approaches and classify them into seven groups based on three basic classes of low-level features used for the segmentation process: (1) visual-based, (2) audio-based, (3) text-based, (4) audio-visual-based, (5) visual-textual-based, (6) audio-textual-based and (7) hybrid approaches. We try to make video scene detection approaches easier to assess and compare by categorizing the evaluation strategies used, including the size and type of the dataset as well as the evaluation metrics. Furthermore, to help the reader make use of the survey, we list eight possible application scenarios, including a dedicated section on interactive video scene segmentation, and identify the algorithms that can be applied to them. Finally, current challenges for scene segmentation algorithms are discussed. The appendix summarizes the most important characteristics of the algorithms presented in this paper in table form.
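The seven groups named in the abstract correspond to the seven non-empty combinations of the three feature classes (2^3 - 1 = 7). The following Python sketch illustrates that counting argument only. The names `MODALITIES`, `LABELS`, `categorize` and `all_categories` are hypothetical, and reading "hybrid" as the combination of all three classes is an assumption made here, not a definition taken from the paper.

```python
from itertools import combinations

# The three basic low-level feature classes named in the abstract.
MODALITIES = ("visual", "audio", "text")

# Category labels as listed in the abstract, keyed by the feature classes used.
# Assumption: "hybrid" is taken to mean that all three classes are combined.
LABELS = {
    frozenset({"visual"}): "visual-based",
    frozenset({"audio"}): "audio-based",
    frozenset({"text"}): "text-based",
    frozenset({"visual", "audio"}): "audio-visual-based",
    frozenset({"visual", "text"}): "visual-textual-based",
    frozenset({"audio", "text"}): "audio-textual-based",
    frozenset({"visual", "audio", "text"}): "hybrid",
}


def categorize(features_used):
    """Return the category label for a non-empty set of feature classes."""
    key = frozenset(features_used)
    if not key or not key <= set(MODALITIES):
        raise ValueError(f"expected a non-empty subset of {MODALITIES}")
    return LABELS[key]


def all_categories():
    """Enumerate every non-empty combination of the three feature classes."""
    result = []
    for size in range(1, len(MODALITIES) + 1):
        for combo in combinations(MODALITIES, size):
            result.append(categorize(combo))
    return result


if __name__ == "__main__":
    for index, label in enumerate(all_categories(), start=1):
        print(f"({index}) {label}")
```

Keying the lookup on a `frozenset` makes the result independent of the order in which the feature classes are given, so `categorize(["audio", "visual"])` and `categorize(["visual", "audio"])` return the same label.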