Research on Semantic Content Analysis Techniques for Sports Video

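English title: Sports Video Semantic Content Analysis
Author: Chen Jianyun (陈剑赟)
Thesis level: Doctoral
Discipline: Management Science and Engineering
Keywords: Semantic Content Analysis ; Mapping between Low-level Features and High-level Semantics ; Basic Semantic Unit (BSU) ; Relations between Basic Semantic Units (BSURelation) ; Sports Video
Degree year: 2005
Supervisor: Wu Lingda (吴玲达)
Discipline code: 1201
Degree-granting institution: National University of Defense Technology (国防科学技术大学)
Thesis submission date: 2005-04-01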

Abstract

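Traditional video content analysis extracts objectively existing perceptual features, whereas what users consume is usually semantic content. This creates a conflict between automatic computer analysis and user needs, which experts in multimedia information systems call the semantic gap. The semantic gap is the bottleneck holding back the next generation of video applications. Taking sports video as its subject, this thesis systematically studies the association between low-level video features and high-level semantics in terms of conceptual model, technical framework, and analysis methods, in order to bridge the semantic gap and obtain the semantic content of sports video.
     Building on the domain rules of sports competitions and on video shooting and editing practices, this thesis defines the Basic Semantic Unit (BSU) of sports video, the basic unit that represents its semantic content. Around the BSU, the thesis proposes a BSU-based framework for sports video semantic content analysis, then focuses on the semantic content analysis of the kinds of BSU within that framework (accompanying-audio-track BSUs, scene BSUs, and event BSUs), and designs and implements SCASP (Sports video Semantic Content Analysis and Summarization Platform). The main contributions of the thesis are as follows:
     ● A BSU-based framework for sports video semantic content analysis. The framework has two parts. The first is a BSU-based conceptual model, BSUCN (Basic Semantic Unit Composite Network): the relations between basic semantic units are defined as BSURelation, and BSUCN is the network of BSUs and BSURelations used to analyze the semantic content of sports video. BSUCN turns the open-ended problem of semantic understanding into the well-defined task of classifying and recognizing BSUs. The second is a technical framework based on probabilistic-statistical association models. It sets out the technical approach and basic methodology of sports video semantic content analysis, and points out that BSU semantic content analysis is a classification and recognition problem under uncertainty, so probabilistic-statistical models are needed to associate low-level features with high-level semantics.
     ● A method for semantic content analysis of accompanying-audio-track BSUs based on the Gaussian mixture model. Within the BSU-based framework, Gaussian mixture models are used to model the semantic types of the accompanying audio track of sports video, which turns the analysis of audio-track BSUs into semantic classification and segmentation of audio.
     ● A method for semantic content analysis of scene BSUs based on the hidden Markov model. Within the BSU-based framework, hidden Markov models are used to model the statistical temporal relations between views and scenes in sports video, which turns the analysis of scene BSUs into semantic classification and segmentation of scenes.
     ● A method for semantic content analysis of event BSUs based on the Bayesian network. Within the BSU-based framework, Bayesian networks are used to model the multi-feature fusion relations of semantic events in sports video, which turns the analysis of event BSUs into fusion analysis based on a probabilistic-statistical model.
     ● The design and implementation of SCASP, the sports video semantic content analysis and summarization platform, which applies and validates the BSU-based framework and the related techniques.
     In summary, this thesis proposes concepts, a framework, and methods for sports video semantic content analysis, and validates the approach by designing and implementing SCASP. This work offers a partial solution to the video semantic gap problem. As video semantic content analysis techniques continue to develop and mature, they will play an increasingly important role in areas such as the management and sharing of information resources.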
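As a rough illustration of the HMM-based scene analysis named in the contributions above, the sketch below decodes a sequence of per-shot view labels into scene labels with the Viterbi algorithm and then merges equal neighbours into segments. The state names (`play`, `break`), the view labels, and every probability are hypothetical values chosen for illustration. They are not taken from the thesis, whose actual states, features, and trained parameters are not given in this record.

```python
import math

# Hypothetical scene states and view labels (assumptions, not from the thesis).
STATES = ["play", "break"]

# Hypothetical model parameters; a real system would estimate them from data.
START = {"play": 0.6, "break": 0.4}
TRANS = {
    "play": {"play": 0.8, "break": 0.2},
    "break": {"play": 0.3, "break": 0.7},
}
EMIT = {
    "play": {"global": 0.7, "medium": 0.2, "closeup": 0.1},
    "break": {"global": 0.1, "medium": 0.3, "closeup": 0.6},
}


def viterbi(views):
    """Return the most likely scene-state sequence for a list of view labels."""
    if not views:
        return []
    # score[s] is the best log-probability of any state path ending in s.
    score = {s: math.log(START[s]) + math.log(EMIT[s][views[0]]) for s in STATES}
    backpointers = []
    for view in views[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            # Best predecessor of s at this step.
            prev = max(STATES, key=lambda p: score[p] + math.log(TRANS[p][s]))
            new_score[s] = (
                score[prev] + math.log(TRANS[prev][s]) + math.log(EMIT[s][view])
            )
            pointers[s] = prev
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def segments(labels):
    """Merge runs of equal labels into (label, start, end) with end exclusive."""
    runs = []
    for i, label in enumerate(labels):
        if runs and runs[-1][0] == label:
            runs[-1] = (label, runs[-1][1], i + 1)
        else:
            runs.append((label, i, i + 1))
    return runs


if __name__ == "__main__":
    shots = ["global", "global", "medium", "closeup", "closeup", "global"]
    print(segments(viterbi(shots)))
```

Decoding the whole sequence jointly, rather than labelling each shot on its own, is what lets the temporal model classify and segment in one pass: a single ambiguous `medium` view is assigned to whichever scene the neighbouring views support.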
One of the major challenges facing current content-based video analysis and its applications is the so-called "Semantic Gap" between the rich high-level semantics that a user desires and the shallow low-level features that automatic algorithms can extract from the media. In this thesis, we systematically explore the problem of bridging this gap in sports video. Drawing on domain-specific knowledge of sports video, we first define the periodic or semi-periodic important semantic parts of sports programs as "Basic Semantic Units", abbreviated to "BSU", which include AudioBSU, SceneBSU, EventBSU and so on. We then present a general BSU-based framework for sports video semantic content analysis. Within this framework, we develop methods of BSU semantic content analysis that map low-level features to high-level semantics. Finally, the framework and methods are validated by designing and implementing the Sports video Semantic Content Analysis and Summarization Platform (SCASP). The main contributions of this thesis are as follows:
· We propose a novel unified BSU-based framework for sports video semantic content analysis, which is composed of two parts: the concept model BSUCN (Basic Semantic Unit Composite Network) and the probabilistic technical framework. On the one hand, BSUCN defines the relations among BSUs as "BSURelation" and models the semantic content of sports video. To extract semantics from sports video, we convert the video indexing and understanding problem into a pattern classification and recognition problem. On the other hand, the technical framework clarifies the appropriate approach and methodology for this domain. Unlike previous approaches, we seek a feasible, general and effective technique for developing stochastic models rather than fine-tuning signal-based analytical procedures.
· We address AudioBSU semantic content analysis with a method based on the Gaussian Mixture Model (GMM). We model three kinds of AudioBSU in sports video using GMMs and treat AudioBSU semantic content analysis as audio classification and segmentation.
· We develop a method of SceneBSU semantic content analysis based on the Hidden Markov Model (HMM). We model the statistical temporal relations of views and scenes in sports video using HMMs and treat SceneBSU semantic content analysis as scene classification and segmentation.
· We devise a method of EventBSU semantic content analysis based on the Bayesian Network. We model the combined relations of low-level evidence in events and treat EventBSU semantic content analysis as fusion analysis in event detection.
· We design and implement SCASP, which provides sound support for the above framework and methods of sports video semantic content analysis.
In short, this thesis provides an in-depth investigation into the concepts, framework and methods of sports video semantic content analysis. The framework and methods are flexible and generic, and can therefore be applied in areas such as multimedia management and human-computer interaction.
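To make the idea of fusing low-level evidence in a Bayesian network more concrete, here is a minimal sketch under strong assumptions. It uses the simplest possible network, one binary event node with three binary evidence nodes as its children, and computes the event posterior by direct enumeration. The evidence names, the prior, and the conditional probability tables are all hypothetical; the network structure and parameters actually used in the thesis are not described in this abstract.

```python
# Hypothetical prior P(event = True); illustrative only.
PRIOR_EVENT = 0.1

# Hypothetical conditional tables: P(evidence = True | event).
# Each evidence node is assumed to depend only on the event node.
CPT = {
    "excited_audio": {True: 0.9, False: 0.2},
    "replay": {True: 0.8, False: 0.1},
    "goal_area_view": {True: 0.7, False: 0.3},
}


def posterior_event(evidence):
    """Return P(event = True | evidence).

    `evidence` maps an evidence name to True or False. Evidence nodes that
    are left out are unobserved leaves, so they marginalise to a factor of 1.
    """
    joint = {}
    for event in (True, False):
        p = PRIOR_EVENT if event else 1.0 - PRIOR_EVENT
        for name, observed in evidence.items():
            p_true = CPT[name][event]
            p *= p_true if observed else 1.0 - p_true
        joint[event] = p
    # Normalise over both values of the event node.
    return joint[True] / (joint[True] + joint[False])


if __name__ == "__main__":
    print(posterior_event({"excited_audio": True}))
    print(posterior_event(
        {"excited_audio": True, "replay": True, "goal_area_view": True}
    ))
```

With these made-up numbers, a single cue raises the posterior only modestly, while agreement among all three cues pushes it much higher, which is the behaviour one would want from multi-feature fusion.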

