A multimodal approach for extracting content descriptive metadata from lecture videos
  • Authors: Vidhya Balasubramanian…
  • Keywords: Multimodal metadata extraction; Content descriptive metadata; Keyphrase extraction; Topic based segmentation; Lecture videos
  • Journal: Journal of Intelligent Information Systems
  • Publication year: 2016
  • Publication date: February 2016
  • Volume: 46
  • Issue: 1
  • Pages: 121-145
  • Full-text size: 5,488 KB
  • Author affiliations: Vidhya Balasubramanian (1)
    Sooryanarayan Gobu Doraisamy (2)
    Navaneeth Kumar Kanakarajan (2)

    1. Department of Computer Science and Engineering, Amrita School of Engineering, Amrita Vishwa Vidyapeetham University, Coimbatore, India
    2. Amrita E-Learning Research Center, Amrita Vishwa Vidyapeetham University, Coimbatore, India
  • Journal category: Computer Science
  • Journal subjects: Data Structures, Cryptology and Information Theory
    Artificial Intelligence and Robotics
    Document Preparation and Text Processing
    Business Information Systems
  • Publisher: Springer Netherlands
  • ISSN: 1573-7675
Abstract
The rapidly increasing availability of e-learning content and lecture videos over the internet has created a pressing need for effective content-based retrieval systems. Comprehensive metadata extraction and support for topic-level search within videos are key factors in developing such systems. In this paper, we propose a multimodal metadata extraction system which extracts an optimal set of keyphrases and topic-based segments that effectively summarize the content of a lecture video. The extraction process utilizes features from both audio transcripts and slide content in video streams. A hybrid approach combining a Naive Bayes classifier and a rule-based refiner is used for effective retrieval of the metadata in a lecture. The proposed content-descriptive metadata extraction technique has been evaluated using actual lecture videos from different sources, and our results show that our multimodal approach is effective in summarizing the lecture's content, potentially improving the user experience during retrieval and browsing.
