Research on Big Data Technology for Video Streaming Media Based on Semantic Analysis Methods
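Abstract (translated from the Chinese)
Video semantic analysis describes and logically represents the content carried by video information, and it draws on many areas of information-processing research. Big data technology is strong in data storage and computation, while granular computing theory is strong in data description and feature classification, so applying both to a video semantic analysis system has clear practical promise.
     This dissertation addresses key problems in processing and analysing streaming-media big data. It studies in depth the storage model for streaming-media big data, the video semantic analysis model, and key video analysis algorithms. First, it discusses the key problems of video data processing and, to address storage and querying in video semantic analysis, proposes a new storage architecture for streaming-media big data that stores and retrieves video data by time feature. Second, it applies big data technology to video semantic analysis and uses granular computing theory to describe the structure of video data, establishing a layered video semantic analysis model across different granularities. Finally, for moving-object detection and shadow suppression in video processing, it proposes a shadow suppression algorithm based on motion vector detection and discusses how the algorithm can be implemented in a big data system.
     The specific research content and contributions are as follows:
     1. A new storage architecture is designed that addresses, retrieves and analyses streaming-media big data by time feature. The architecture supports encoded (compressed) streaming-media data and can store and address data frame by frame, which allows fast positioning in mass storage media and unified storage of analysis results.
     2. On top of this frame-addressed storage structure, a metadata-based semantic analysis framework is proposed and a multi-level database interface is established. A general-purpose database for storing and retrieving streaming-media big data is designed; on this basis, each application can build its own video analysis metadata model and database interface, which solves the problem of connecting many heterogeneous video capture modules and video analysis modules to the system.
     3. Using granular computing theory, physical structures in video data such as image features, image objects, video objects and semantic objects are mapped to granules, forming a granular model of video streaming media that includes a granule attribute structure. Processing the video granules of each granular layer with a different analysis method simplifies the structure of video analysis algorithms and also makes parallel computation with big data technology easier.
     4. A moving-object shadow suppression algorithm based on motion vector detection is proposed. The algorithm uses the motion vector information contained in the video coding data to suppress the shadows of moving objects and achieves good suppression results. Finally, the algorithm is modelled using the video streaming media granular model.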
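The time-feature addressing described in item 1 can be illustrated with a small sketch. It is not the dissertation's architecture: the class name `FrameIndex`, the millisecond timestamps and the byte offsets are all assumptions introduced here to show how frame-level lookup by time might work with a sorted index.

```python
import bisect
from dataclasses import dataclass, field


@dataclass
class FrameIndex:
    """Hypothetical time-keyed index: capture time -> byte offset of a stored frame."""
    timestamps: list = field(default_factory=list)  # capture times in ms, kept sorted
    offsets: list = field(default_factory=list)     # offset of each frame in the stored stream

    def append(self, timestamp_ms: int, offset: int) -> None:
        # Assumption: frames arrive in non-decreasing time order.
        if self.timestamps and timestamp_ms < self.timestamps[-1]:
            raise ValueError("frames must be appended in time order")
        self.timestamps.append(timestamp_ms)
        self.offsets.append(offset)

    def locate(self, timestamp_ms: int) -> int:
        """Return the offset of the last frame captured at or before timestamp_ms."""
        i = bisect.bisect_right(self.timestamps, timestamp_ms) - 1
        if i < 0:
            raise KeyError("no frame at or before the requested time")
        return self.offsets[i]

    def frames_between(self, start_ms: int, end_ms: int) -> list:
        """Return the offsets of all frames with start_ms <= time <= end_ms."""
        lo = bisect.bisect_left(self.timestamps, start_ms)
        hi = bisect.bisect_right(self.timestamps, end_ms)
        return self.offsets[lo:hi]
```

Because the timestamps stay sorted, each lookup is a binary search, so positioning cost grows logarithmically with the number of indexed frames. Analysis results could be stored under the same timestamps, which is one way to read the "unified storage" the abstract mentions.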
Video semantic analysis describes and logically represents the content of video information, a complex process that spans many areas of information-processing research. Big data technology is strong in data storage and computation, and granular computing theory is well suited to data representation and feature classification. Combining the two in a video semantic analysis system can therefore address key problems in data storage, computation, and representation.
     This dissertation focuses on structuring a storage framework for streaming-media big data, building a video semantic analysis model, and proposing key algorithms for video analysis. First, the key problems in video data processing are discussed, and a novel big data storage framework for streaming media is proposed to handle the storage and querying of video semantic analysis results; the design stores and retrieves video data by time features. Second, big data techniques are applied to video semantic analysis, and granular computing theory is applied to the structural description of video data. On this basis, a hierarchical video semantic analysis model is built across different granularities. Finally, the detection of moving objects and shadows in video is discussed, a shadow suppression algorithm based on motion vector detection is proposed, and its implementation in a big data system is discussed.
     The main research contents and contributions of the dissertation are:
     1. Structuring a novel storage framework for addressing, retrieving and analysing streaming-media big data by time features. The framework supports encoded (compressed) streaming-media data. It stores and addresses data frame by frame, which enables fast positioning in mass storage media and unified storage of analysis results.
     2. Structuring a metadata description framework for semantic analysis and defining a multi-level database interface. A general-purpose database for storing and retrieving streaming-media big data is designed, on which each application can build its own metadata model and database interface. This solves the problem of connecting heterogeneous video capture modules, video analysis modules and the system.
     3. Mapping image features, image objects, video objects and semantic objects to granules, and building a granular model of video streaming media that includes a granule attribute structure. Video granules in different granular layers can be processed with different analysis methods. This simplifies the structure of video analysis algorithms and favours parallel computing with big data techniques.
     4. Proposing a shadow suppression algorithm based on motion vector detection. Motion vectors are extracted from the video coding information and applied to suppress the shadows of moving objects, with good suppression results. The algorithm is then modelled using the video streaming media granular model.
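The layered granular model in item 3 can be sketched as a small tree structure. This is only an illustration under stated assumptions: the names `Layer` and `Granule`, the rule that a granule may contain only finer-layer granules, and the sample data are all hypothetical and are not taken from the dissertation.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Layer(IntEnum):
    # The four structures named in the abstract, ordered from fine to coarse.
    IMAGE_FEATURE = 0
    IMAGE_OBJECT = 1
    VIDEO_OBJECT = 2
    SEMANTIC_OBJECT = 3


@dataclass
class Granule:
    layer: Layer
    name: str
    attributes: dict = field(default_factory=dict)  # the granule attribute structure
    children: list = field(default_factory=list)

    def add(self, child: "Granule") -> "Granule":
        # Assumption: a granule may only contain granules from a finer layer.
        if child.layer >= self.layer:
            raise ValueError("child must belong to a finer layer")
        self.children.append(child)
        return child

    def at_layer(self, layer: Layer) -> list:
        """Collect every granule of the given layer in this subtree."""
        found = [self] if self.layer == layer else []
        for child in self.children:
            found.extend(child.at_layer(layer))
        return found


# Hypothetical example: one semantic event built from coarser to finer granules.
event = Granule(Layer.SEMANTIC_OBJECT, "person enters room")
track = event.add(Granule(Layer.VIDEO_OBJECT, "track-1"))
blob = track.add(Granule(Layer.IMAGE_OBJECT, "blob@frame-10"))
blob.add(Granule(Layer.IMAGE_FEATURE, "color-histogram"))
blob.add(Granule(Layer.IMAGE_FEATURE, "edge-map"))
```

Grouping granules with `at_layer` yields independent batches, one per layer, so each batch could be handed to a different analysis routine or processed in parallel, which is the benefit the abstract attributes to the layered model.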
