Research on Near Duplicate Video Clip Detection Technology Supporting Cartoon Video Analysis

Original title: 辅助动画视频分析的相似视频片段匹配技术研究
Author: Deng Liqiong (邓莉琼)
Degree level: Doctoral
Discipline: Control Science and Engineering
Keywords: Cartoon; Video Clip; Image Feature; Near Duplicate Video Clip Match; Video Analysis
Degree year: 2012
Supervisor: Wu Lingda (吴玲达)
Discipline code: 0811
Degree-granting institution: National University of Defense Technology (国防科学技术大学)
Submission date: 2011-10-01

Abstract

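With the rapid development of video technology and the Internet, cartoon video, an important branch of video data, is attracting growing attention, and the industry around it keeps expanding. The sheer volume of cartoon video, however, makes it hard for users to quickly find the information they need. Cartoon video clips are an important component of cartoon video data: they carry a large amount of reusable content and are reused frequently, which has made them one of the main objects of study in video research. In recent years, as the sources of cartoon video clips have broadened and their number has grown, near-duplicate cartoon video clip matching has become an active topic in multimedia retrieval. Finding an effective clip matching method that can assist users in automatically analyzing cartoon video has become one of the pressing problems in video analysis.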
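     The practical use of near-duplicate video clip matching is governed mainly by two factors: matching accuracy and matching speed. Because clip matching involves a large amount of video information across multiple structural levels and multiple semantic levels, its computational complexity is high, so achieving a good balance between accuracy and speed has long been a difficult problem in this area. This thesis therefore studies cartoon video clip matching in depth and proposes a series of related techniques and methods, with the aim of providing technical support for the assisted analysis of cartoon clips and helping users obtain the cartoon video information they need.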
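     To that end, this thesis studies the following key technologies from a system perspective: feature extraction and matching for cartoon images, feature extraction and matching for near-duplicate cartoon video clips, content-based real-time detection of cartoon video clips, annotation of cartoon video clips, and association of near-duplicate cartoon video clips. These technologies support assisted cartoon video analysis from different angles; the research proceeds step by step and the methods complement one another, forming a fairly complete body of theory and methods. Specifically, the main contributions are as follows: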
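     First, a method for extracting and matching cartoon image features that incorporates color information is proposed. Based on several characteristics that distinguish cartoon images from natural images, the conventional global feature extraction and matching method is improved by incorporating the correlation between the components of an image's color distribution into the global color descriptor. In addition, because local feature extraction methods often discard color information, color invariants are used as the input from which local features are extracted, so that the local features describe the detailed composition of cartoon images. Finally, a weighted fusion of global and local features is studied so that the two kinds of features complement each other.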
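     Second, several similarity matching methods for cartoon video clips at different structural and semantic levels are proposed. In keeping with the multi-semantic nature of the definition of near-duplicate video clips, measures of clip similarity are studied at the low feature level and the middle logical level. For low-level semantic features, the keyframe-based bag-of-words method and the keyframe-based edit distance method are both improved; they use a language model and an extended edit distance, respectively, to describe the similarity of the visual and temporal features of keyframe sequences. For middle-level semantic features, a keyframe-based temporal network method and a video-unit-based video distance trajectory method are proposed. The former fuses the temporal and visual features of a clip through a temporal network and thereby addresses the partial alignment problem between clips; the latter defines descriptive parameters for video-unit features and then applies optimal matching from graph theory to obtain the best alignment of near-duplicate clips. Finally, a method for fusing the similarities produced at the different semantic levels is studied; adjusting the weighting coefficients adapts it to different applications and task requirements.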
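     Third, an approach to assisted cartoon video analysis based on near-duplicate clip matching is proposed. To detect near-duplicate cartoon clips by content in real time, an index over video clips is first built with an improved locality-sensitive hashing function on top of the clip matching techniques; then, to improve the ranking accuracy of the detection results, the near-duplicate clips are clustered with an association graph and the results are re-ranked. Next, because conventional automatic clip annotation produces too many erroneous labels, a random walk is used to re-rank the annotations, which enriches the semantic information of the clips and yields retrieval-based automatic annotation of cartoon clips. Finally, visualization techniques are used to mine and present the relationships among near-duplicate cartoon clips in three respects: cluster association, feature association, and evolution history.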
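As an illustration of the hashing-based clip index described in this paragraph, the following Python sketch buckets clip feature vectors with random-hyperplane locality-sensitive hashing. The abstract mentions an improved hash function without specifying it, so the hash family used here, the `HyperplaneLSH` class, and every parameter name are assumptions chosen for illustration rather than the thesis's method.

```python
import random
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Tuple


class HyperplaneLSH:
    """Toy LSH index: vectors pointing in similar directions tend to share a bucket."""

    def __init__(self, dim: int, n_bits: int = 16, seed: int = 0) -> None:
        rng = random.Random(seed)
        # One random hyperplane (given by its normal vector) per signature bit.
        self.planes: List[List[float]] = [
            [rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)
        ]
        self.buckets: Dict[Tuple[bool, ...], List[Hashable]] = defaultdict(list)

    def signature(self, vec: Sequence[float]) -> Tuple[bool, ...]:
        # Each bit records on which side of a hyperplane the vector falls.
        return tuple(
            sum(p * v for p, v in zip(plane, vec)) >= 0.0 for plane in self.planes
        )

    def add(self, clip_id: Hashable, vec: Sequence[float]) -> None:
        self.buckets[self.signature(vec)].append(clip_id)

    def candidates(self, vec: Sequence[float]) -> List[Hashable]:
        # Only clips in the query's bucket are returned for exact comparison.
        return list(self.buckets.get(self.signature(vec), []))
```

A query vector is compared in full only against the clips returned by `candidates`, which is where the speed-up over a linear scan would come from; a practical index would typically use several hash tables to reduce missed matches.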
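     Finally, NCLIPs, a system for assisted analysis of cartoon video clips, is designed and implemented. Its design rationale, structural layers, and module functions are described in detail, and the interface of the prototype is presented. Built on the techniques and methods studied in this thesis, NCLIPs supports content-based analysis, retrieval, and personalized organization and presentation of cartoon video clips.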
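     In summary, this thesis achieves assisted analysis of cartoon video clips by studying near-duplicate cartoon clip matching. In practical terms, the proposed matching methods achieve high matching accuracy and speed, and applying them to assisted analysis provides technical support for the analysis, retrieval, and personalized organization and presentation of clips. In terms of research significance, the work offers an effective way to obtain information about cartoon video clips, and its results are significant in both theory and practice.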
English Abstract

With the rapid development of video technology and the spread of Internet media, cartoon video, an important branch of video data, is attracting more and more attention, and the cartoon industry keeps growing. However, the abundance of cartoon video resources makes it difficult for users to find the information they need in a video dataset. As an important component of cartoon video material, cartoon video clips carry a large amount of repeated content and have a high reuse rate, which makes them a main research object in the video domain. Recently, as the number of cartoon video clips has grown rapidly, near-duplicate cartoon video clip matching has become one of the hot topics in multimedia retrieval. How to obtain an effective technique for matching near-duplicate cartoon video clips, and use it to analyze cartoon video automatically, has become one of the problems in the video domain that urgently need solving.
     Near-duplicate video clip matching is governed by two factors: accuracy and speed. Because a large amount of video information is spread across the different structural and semantic levels of a video clip, the computational complexity is also high. How to trade off accuracy against speed has therefore always been the difficulty in near-duplicate video clip matching research. This thesis explores near-duplicate cartoon video clip matching and proposes a number of related technologies and methods. The goal is to support the analysis of video clips and help users obtain the cartoon video information they need.
     To achieve these goals, the following key technologies are studied from a system point of view: cartoon image feature extraction and matching, near-duplicate cartoon video clip matching, content-based online cartoon video clip detection, annotation of cartoon video clips, and the relevance among near-duplicate cartoon video clips. In detail, the main contributions of this thesis can be summarized as follows:
     First, a visual feature extraction and matching method for cartoon images that incorporates color is proposed. To account for the features that distinguish cartoon images from natural images, the correlation feature of an image's color distribution is embedded into the global color descriptor, which improves matching accuracy. Meanwhile, to address the loss of color information in local feature extraction, color invariants are used as the input to local feature extraction, so the component details of a cartoon image are well described. Finally, weighted fusion methods for the global and local features of cartoon images are studied so that the two kinds of feature complement each other.
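One simple way to read the weighted fusion of global and local features is as a convex combination of a global and a local similarity score. The Python sketch below assumes both scores lie in [0, 1]; histogram intersection stands in for the global color comparison, and the function names and the weight `w_global` are hypothetical rather than taken from the thesis.

```python
from typing import Sequence


def histogram_intersection(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Similarity of two normalised color histograms (1.0 means identical)."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def fuse_similarity(global_sim: float, local_sim: float, w_global: float = 0.5) -> float:
    """Linear weighted fusion of a global and a local similarity score."""
    if not 0.0 <= w_global <= 1.0:
        raise ValueError("w_global must lie in [0, 1]")
    # The local score receives the remaining weight, so the result stays in [0, 1].
    return w_global * global_sim + (1.0 - w_global) * local_sim
```

For example, `fuse_similarity(histogram_intersection(h_a, h_b), local_score, w_global=0.3)` would favor the local score, which might suit images whose overall palette is similar but whose details differ.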
     Second, matching methods for near-duplicate cartoon video clips at different structural and semantic levels are proposed. The research is carried out at the bottom feature level and the middle logical level. First, at the bottom level, the keyframe-based bag-of-words method and edit distance method are improved by using a language model and an extended edit distance to describe the visual features and sequence features of keyframes, respectively. Second, at the middle level, a keyframe-based sequence network method and a unit-based video track method are proposed. The former fuses visual and sequence features by building a network, which solves the partial alignment problem. The latter, based on descriptive parameters of video unit features, achieves the best alignment by employing optimal matching from graph theory, in order to balance accuracy and speed. Finally, an approach for fusing similarities across the different levels of cartoon video clip matching is studied; diverse application scenarios and task demands can be met by adjusting the weighting coefficients.
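To make the edit-distance idea concrete, the following Python sketch computes a weighted edit distance between two keyframe sequences, where aligning two keyframes costs their visual dissimilarity instead of a fixed 0 or 1. This is the textbook dynamic program with a pluggable substitution cost, not the thesis's extended edit distance; `keyframe_edit_distance`, `dissimilarity`, and `gap_cost` are hypothetical names.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def keyframe_edit_distance(
    a: Sequence[T],
    b: Sequence[T],
    dissimilarity: Callable[[T, T], float],
    gap_cost: float = 1.0,
) -> float:
    """Edit distance between keyframe sequences a and b.

    dissimilarity(x, y) is assumed to return a value in [0, 1],
    with 0 meaning the two keyframes are visually identical.
    """
    # prev[j] is the distance between the processed prefix of a and b[:j].
    prev: List[float] = [j * gap_cost for j in range(len(b) + 1)]
    for i, x in enumerate(a, start=1):
        curr = [i * gap_cost]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + gap_cost,                 # skip keyframe x
                curr[j - 1] + gap_cost,             # skip keyframe y
                prev[j - 1] + dissimilarity(x, y),  # align x with y
            ))
        prev = curr
    return prev[-1]
```

Because the recurrence preserves frame order, two clips that contain the same keyframes in a different order score as less similar than under a bag-of-words comparison.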
     Third, methods that support the analysis of cartoon video clips are proposed. First, a content-based online near-duplicate video clip detection method is realized, which employs an improved index structure to speed up detection and proposes a relevance-graph-based method for re-ranking the detection results. Second, an automatic annotation method for cartoon video clips that employs a random walk is proposed to reduce the excess of wrong labels and complete each clip's semantic information. Lastly, several visualization structures are proposed to present the relevance among near-duplicate cartoon clips, namely class relevance, feature relevance, and evolution process, in order to offer a new way of mining the deeper information in cartoon video clips.
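The random-walk re-ranking of labels can be pictured with a small Python sketch: candidate labels form a graph weighted by how related they are, and a random walk with restart lets mutually supporting labels reinforce each other while isolated labels lose score. The abstract does not give the formulation used in the thesis, so the restart scheme, the function `random_walk_rerank`, and its parameters are assumptions made for illustration.

```python
from typing import Dict, List


def random_walk_rerank(
    initial: Dict[str, float],
    affinity: Dict[str, Dict[str, float]],
    alpha: float = 0.85,
    iterations: int = 50,
) -> List[str]:
    """Re-rank candidate labels by a random walk with restart.

    initial  : positive initial confidence per label (for example from retrieval votes)
    affinity : non-negative relatedness between pairs of labels
    alpha    : probability of following a graph edge instead of restarting
    """
    labels = list(initial)
    total = sum(initial.values())
    if total <= 0:
        raise ValueError("initial scores must sum to a positive value")
    prior = {l: initial[l] / total for l in labels}

    # Row-normalise the affinities; a label without neighbours jumps to the prior.
    trans: Dict[str, Dict[str, float]] = {}
    for l in labels:
        row = affinity.get(l, {})
        s = sum(row.get(m, 0.0) for m in labels)
        trans[l] = {m: (row.get(m, 0.0) / s if s > 0 else prior[m]) for m in labels}

    score = dict(prior)
    for _ in range(iterations):
        score = {
            m: alpha * sum(score[l] * trans[l][m] for l in labels)
            + (1.0 - alpha) * prior[m]
            for m in labels
        }
    return sorted(labels, key=lambda l: score[l], reverse=True)
```

With `initial = {"car": 0.4, "cat": 0.3, "mouse": 0.3}` and an affinity that links only "cat" and "mouse", the unrelated label "car" drops to the end of the ranking even though it started with the highest score.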
     Lastly, a system that supports the analysis of cartoon video clips is designed and implemented. The design idea and each functional module of the NCLIPs system are described in detail, and the implementation of the prototype system is also presented. The NCLIPs system uses the technologies of this thesis to implement content-based analysis, retrieval, and personalized organization of cartoon video clips.
     In general, the assisted analysis of cartoon video clips is realized on the basis of near-duplicate video clip matching. In terms of practical effect, the proposed near-duplicate cartoon video clip matching method achieves good accuracy and speed. The technology can serve as technical support for video analysis, retrieval, and organization. In terms of research significance, the results of this thesis provide an effective method for obtaining cartoon video information, and they are significant for both theory and practical use.

引文

[1] A.G.Hauptmann, R.Yan, R.Jin, et al．Video Classification and Retrieval with theInformedia Digital Video Library System[C]//In Proceedings of TRECVID2002,November, Gaithersburg, USA.
    [2] Wikipedia.http://en.wikipedia.org/wiki/Youtube.
    [3] Wu X, Hauptmann A.G., Ngo C.W. Practical elimination of near-duplicates fromweb video search[C]//In Proc. of MM’07, New York: ACM,2007:218~227.
    [4] Shin’ichi Satoh. News video analysis based on identical shot detection[C]//Proc.Of ICME,2002.
    [5] Brown LG. A survey of image registration techniques[J]．ACM ComputingSurveys,1992,24(4):325~376.
    [6] B. Srinivasa Reddy, B. N. Chatterji. A FFT-based technique for translation,rotation and sacle-invirant image registration[J]. IEEE Transactions on imageprocessing,1996,8(5):1266~1271.
    [7] Jacqueline Le Moigne, William J. Campbell, Robert F. Cromp. An automatedparallel image registration technique based on the correlation of waveletfeatures[J]. IEEE Transactions on Geoscience and Romote Sensing,2002,40(8):1849~1864.
    [8] George Lazarids, Maria Petrou. Image registration using the walsh transform[J].IEEE Transactions on image processing,2006,15(8):2343~2357.
    [9] D-Q.Zhang, S-F.Chang. Detecting Image Near-duplicate by Stochastic AttributedRelational Graph Matching with Learning[C]//ACM International Conference onMultimedia (MM), New York, USA, October2004:877~884.
    [10] J R Smith, S F Chang. Tools and Techniques for Color Image Retrieval[C]//InProceeding of SPIE: Storage and Retrieval for Image and Video Database. SanJose, CA,1996,2670:426~437.
    [11] D.S. Guru, H.S. Nagendraswamy. Symbolic representation of tow-dimensionalshapes[J]. Pattern Recognition Letters,2007,28:144~155.
    [12] S K Chang, Q Y Shi, C W Yan. Iconic Indexing by22Dstrings[J]. IEEE Transon Pattern Analysis and Machine Intelligence,1987,9(3):413~428.
    [13] Wan-Lei Zhao, Chong-Wah Ngo, Hung-Khoon Tan, Xiao Wu. Near-DuplicateKeyframe Identification with Interest Point Matching and Pattern Learning[J],IEEE Transaction on Multimedia,2007,9(5):1037~1048.
    [14] Y.Ke, R.Sukthankar, L.Huston.Efficient Near-duplicate Detection and Sub-imageRetrieval[C]//ACM International Conference on Multimedia(MM), NewYork,USA, Oct.2004:869~876.
    [15] Chong-Wah Ng, et al. Fast Tracking of Near-Duplicate Keyframes in BroadcastDomain with Transitivity Propagation[C]//Proceedings of the14th annual ACMinternational conference on Multimedia(ACM MM2006), October23~27, SantaBarbara, USA,2006:845~854.
    [16] D.G.Lowe． Distinctive image features from scale invariant keypoints[J],International Journal of Computer Vision,2004,60(2):90~110.
    [17] Pinar Duygulu, Jia Yu Pan and David A.Forsyth．Towards auto-documentary:tracking the evolution of news stories[C]//Proceedings of the12th annual ACMinternational conference on Multimedia(ACM MM2004), October10~16, Newyork, USA:820~827.
    [18] W. Hsu, S.-F. Chang. Topic tracking across broadcast news videos with visualduplicates and semantic concepts[C]//IEEE International Conference on ImageProcessing, October8~11, Atlanta, USA,2006:141~144.
    [19] Y. Ke, R. Sukthankar．PCA-SIFT: A more distinctive representation for localimage descriptors[C]//Proceedings of the2004IEEE Computer SocietyConference on Computer Vision and Pattern Recognition(CVPR2004), June27~July2,2004, Washington DC, USA:506~513.
    [20]郝明非,张建秋,胡波．一种超复数鲁棒相关图像配准算法[J]．复旦学报(自然科学版),2007,46(1):91~95.
    [21]高富强,张帆．一种快速彩色图像匹配算法[J]．计算机应用,2005,25(11):2604~2611.
    [22] J.Sivic, A.Zisserman. Video Google: A Text Retrieval Approach to ObjectMatching in Videos[C]//IEEE International Conference on Computer Vision(ICCV),2003:1470~1477.
    [23] J.M.Ponte, W.B.Croft. A Language Modeling Approach to InformationRetrieval[C]//ACM SIGIR Conference on Research and Development inInformation Retrieval(SIGIR),2005.
    [24] C.Zhai, J.Lafferty. A Study of Smoothing Methods for Language ModelsApplied to Ad Hoc Information Retrieval[C]//ACM SIGIR Conference onResearch and Development in Information Retrieval(SIGIR), New Orleans, USA,September2001:334~342.
    [25] P.Pantel, D.Lin. Document Clustering with Committees[C]//ACM Conference onResearch and Development in Information Retrieval (SIGIR), Tampere, Finland,2002:199~206.
    [26] Y.Zhang, J.Callan, T.Minka. Novelty and Redundancy Detection in AdaptiveFiltering[C]//ACM Conference on Research and Development in InformationRetrieval (SIGIR), Tampere,Finland,August2002:81~88.
    [27] K.Mcdonald, A.F.Smeaton. A Comparison of Score,Rank and Probability-BasedFusion Methods for Video Shot Retrieval[C]//International Conference on Imageand Video Retrieval (CIVR), Singapore,2005:61~70.
    [28] T.Westerveld. Using Generative Probabilistic Models for Multimedia Retrieval.Ph.D thesis,2004.
    [29] T.Westerveld and A.P.Vries. Multimedia Retrieval Using Multiple Examples[C]//International Conference on Image and Video Retrieval(CIVR), Dublin,Ireland, July2004:344~352.
    [30] WuXiao. On clustering, detection and threading of topics for large scale videoswith multiple modalities[D], PhD Thesis, City University of HongKong,2008.
    [31] M.Davis, S.King, N.Good, R.Sarvas. From Context to Content: LeveragingContext to Infer Media Metadata[C]//ACM International Conference onMultimedia, New York, USA,2004:188~195.
    [32] CMU Informedia.http://www.informedia.cs.cmu.edu.
    [33] Ichiro Ide, Hiroshi Mo, Norio Katayama, et al．Threading news video topics
    [C]//Proceedings of the5th ACM SIGMM international workshop on MultimediaInformation Retrieval (MIR), November7,2003, California, USA:239~246.
    [34] Ichiro Ide, Hiroshi Mo, Norio Katayama, et al．Topic threading for structuring alarge-scale news video archive[C]//Third International Conference on Image andVideo Retrieval (CIVR2004), July21~23,2004, Dublin, Ireland:123~131.
    [35] Norio Katayama, Hiroshi Mo, Ichiro Ide, et al．Mining large-scale broadcastvideo archives towards inter-video structuring[C]//Pacific Rim Conference onMultimedia (PCM), November30~December3,2004, Tokyo, Japan:489~496.
    [36] Ichiro IDE, Hiroshi MO, Norio KATAYAMA, et al．Exploiting topic threadstructures in a news video archive for the semi-automatic generation of videosummaries[C]//IEEE International Conference on Multimedia and Expo (ICME),July9~12,2006, Toronto, Canada:1473~1476.
    [37]王鹏,蔡锐,杨士强．‘文本为主’的多模态特征融合的新闻视频分类算法[J]。清华大学学报(自然科学版),2005,45(4):475~478.
    [38]徐凤亚,罗震声．文本自动分类中权重算法的改进研究[J]。计算机工程与应用,2005(1):181~184.
    [39]文军,新闻视频故事单元跟踪关键技术研究[D],博士学位论文,国防科学技术大学,2008.
    [40] S. Cheung, A. Zakhor. Efficient video similarity measurement with videosignature[J]. IEEE Transactions on Circuits System Video Technology,2003,13(1):59~74.
    [41] H. T. Shen, B. C. Ooi, X. Zhou, Z. Huang. Towards effective indexing for verylarge video sequence database[C]//In SIGMOD, pages730~741,2005.
    [42] X. Zhu, X. Wu, J. Fan, et al. Aref. Exploring video content structure forhierarchical summarization[J]. Multimedia System,2004,10(2):98~115.
    [43] S.-L. Lee, S.-J. Chun, D.-H. Kim, J.-H. Lee, C.-W. Chung. Similarity search formultidimensional data sequences[C]//In ICDE, pages599~608,2000.
    [44] Yan-Tao Zheng, Shi-Yong Neo et al. Fast near-duplicate keyframe detection inlarge-scale video corpus for video search[C]//International Workshop onAdvanced Image Technology2007(IWAIT), January8~9, Bangkok, Thailand.
    [45] Marcel Worring, Cees Snoek, Ork de Rooij, et al．Mediamill: advanced browsingin news video archives[C]//International Conference on Image and VideoRetrieval(CIVR), July13~15,2006, Tempe, USA:533~536.
    [46] W. Dong, Z. Wang, M. Charikar, K. Li. Efficiently matching sets of features withrandom histograms[C]//Proceeding of ACM Multimedia,2008.
    [47] X. Wu, C.-W. Ngo, A. G. Hauptmann, H.-K. Tan. Real-time near-duplicateelimination for web video search with content and context[J]. TMM,2009,11(2):196~207.
    [48] Yan-Tao Zheng, Shi-Yong Neo et al. The Use of Temporal, Semantic and VisualPartitioning Model for Efficient Near Duplicate Detection in Large Scale NewsCorpus[C]//International Conference on Image and Video Retrieval(CIVR), July9~11, Amsterdam, The Netherlands,2007:409~416.
    [49] J. Law-To, O. Buisson, V. Gouet-Brunet and N. Boujemaa. Robust votingalgorithm based on labels of behavior for video copy detection. ACMMultimedia,2006.
    [50] X. Wu, M. Takimoto, S. Sato and J. Adachi. Scene duplicate detection based onthe pattern of discontinuities in feature point trajectories[C]//ACM Multimedia,2008.
    [51] W. Ren and S. Singh. Video sequence matching with spatio-temporal constraints
    [C]//In Proceedings of ICPR, pages834~837,2004.
    [52] L. Chen, M. T. Ozsu, V. Oria. Robust and fast similarity search for movingobject trajectories[C]//In Proceedings of SIGMOD,2005.
    [53] L. Chen and R. Ng. On the marriage of lp-norm and edit distance[C]//InProceedings of VLDB, pages792~803,2004.
    [54] M. Bertini, A. D. Bimbo, W. Nunziati. Video clip matching using mpeg-7descriptors and edit distance[C]//In Proceedings of CIVR, pages133~142,2006.
    [55] J. Zhou, X.-P. Zhang. Automatic identification of digital video based onshot-level sequence matching[C]//In ACM MM, pages515~518,2005.
    [56] A. Joly, O. Buisson, C. Frélicot. Content-based copy detection usingdistortion-based probabilistic similarity search[J]. IEEE Transactions onMultimedia,2007.
    [57] A. Joly, C. Frelicot, O. Buisson. Robust content based video copy identificationin a large reference database[C]//In Proceedings of CIVR,2003.
    [58] F. Yamagishi, S. Satoh, T. Hamada, M. Sakauchi, Identical Video SegmentDetection for Large-Scale Broadcast Video Archives[C]//International Workshopon Content-Based Multimedia Indexing (CBMI), Rennes,2003:135~142.
    [59] M.R. Naphade, T.S. Huang, Discovering recurrent events in video usingunsupervised methods[C]//Proceedings of ICIP2002.
    [60] YouTube. http://www.youtube.com/.
    [61] P. Over, W. Kraaij, A.F. Smeaton. TRECVID2008-Goals, Tasks, Data,EvalationMechanisms and Metrecs[C]//Proceedings of TRECVID2008.
    [62] C.G. Snoek, M. Worring. The challenge problem for automated detection of101semantic concepts in multimedia[C]//In Proceeding of ACM Multimedia,2006.
    [63] A. Yanagawa, S.-F. Chang, L. Kennedy, W. Hsu. Columbia University’sBaseline detectors for374LSCOM Semantic Visual Concepts[R]. ColumbiaUniversity ADVENT Technical Report222-2006-8, March20,2007.
    [64] Yu-Gang Jiang, Chong-Wah Ngo, Jun Yang. Towards Optimal Bag-of-Featuresfor Object Categorization and Semantic Video Retrieval[C]//ACM InternationalConference on Image and Video Retrieval (CIVR), The Netherlands,2007.
    [65] J.Allan． Topic detection and tracking:event-based information retrieval[J]．Norvell, Massachusetts, USA, KIuwer Academic Publishers,2002.
    [66]凌坚．新闻视频主题识别与跟踪的研究[D]．博士学位论文,浙江大学,2007.
    [67] R. Nallapati, A. Feng, F. Peng and J. Allan．Event threading within newstopics[C]//In Proceedings of the thirteenth ACM international conference onInformation and Knowledge Management,2004, Washington, USA:446~453.
    [68] X. Zhu, J. Fan, A.K. Elmagarmid, X. Wu. Hierarchical video content descriptionand summarization using unified semantic and visual similarity[J]. MultimediaSystem,2003,9(1):31~53.
    [69] Ork de Rooij, Cees G. M. Snoek, Marcel Worring．Query on demand videobrowsing[C]//Proceedings of the15th ACM international conference onMultimedia(MM), September24~29,2007, Augsburg, Germany:811~814.
    [70] Jelena Tesic, Apostol Natsev, et al．IBM multimodal interactive video threading
    [C]//International Conference on Image and Video Retrieval(CIVR), July9~11,2007, Amsterdam, The Netherlands:124~126
    [71] A.Hauptmann, R.V.Baron, M Y.Chen．Informedia at TRECVID2003analyzingand searching broadcast news video[C]//In Proceedings of TRECVID2003,November,2003, Gaithersburg, USA
    [72] Michael G. Christel, Alexander G. Hauptmann．The use and utility of high-levelsemantic features in video retrieval[C]//International Conference on Image andVideo Retrieval (CIVR), July20~22,2005, Singapore:134~144
    [73] NIST TREC Video Retrieval Evaluation. http://www-nlpir.nist.gov/projects/trecvid/
    [74] http://www-nlpir.nist.gov/projects/tv2010/tv2010.html
    [75] http://www.cdvp.dcu.ie/aboutfishclar.html
    [76] Swain, M. J.&Ballard, D. H.(1991) Color Indexing[J], International Journal ofComputer Vision,7,1,11-32.
    [77] M A Stricker, M Orengo. Similarity of Color Images[C]//In Proceedings of SPIE:Storage and Retrieval for Image and Video Databases III. San Jose, CA,1995,2420:381~392.
    [78] G Pass, R Zabih, J Miller. Comparing images Using Color Coherence Vectors
    [C]//In Proceedings of ACM Intern Conf Multimedia. Boston, MA,1996.
    [79] R Rickmann, S John1. Content Based Image Retrieval Using Color TupleHistograms[C]//In Proceedings of SPIE: Storage and Retrieval for Image andVideo Database. San Jose, CA,1996.
    [80]夏定元．基于内容的图像检索通用技术研究及应用[D]．博士学位论文,武汉,华中科技大学,2004.
    [81] Robert M Haralick, Shanmugam K, Its' hak Dinstein．Texture features for imageclassification[J]．IEEE Transaction on Systems, Man and Cybernetics,1973,SMC-3(6):610~621
    [82] Jing Li, Nigel M. Allinson．A comprehensive review of current local features forcomputer vision[J]．Neurocomputing71(10-12),2008:1771~1787
    [83] T. Tuytelaars, K. Mikolajczyk．Local Invariant Feature Detectors: A Survey[J]．Foundations and Trends in Computer Graphics and Vision,2008,3(3):177~280
    [84] H.P.Moravec．Towards automatic visual obstacle avoidance[C]//Proceedings ofthe5th International Joint Conference on Artificial Intelligence, August,1977:584~590
    [85] H.P. Moravec．Visual mapping by a robot rover[C]//Proceedings of the6thInternational Joint Conference on Artificial Intelligence, August1979, Tokyo,Japan:599~601
    [86] C. Harris, M. Stephens．A combined corner and edge detector[C]//Proceedings ofThe Fourth Alvey Vision Conference, Manchester,1988:147~151.
    [87] Shi.J, Tomasi C．Good Features to Track[C]//Proceedings of IEEE Conferenceon Computer Vision and Pattern Recognition,1994, Seattle, Washington, USA:593~600.
    [88] Crowley J．A Representation for Visual Information[D]．PhD Thesis, CarnegieMellon University,1981.
    [89] Mikolajczyk, Schmid． Scale Affine invariant interest point detectors[J]．International Journal of Computer Vision,2004,60(1):63~86.
    [90] Z. Zhang, R. Deriche, O. Faugeras, Q. Luong．A robust technique for matchingtwo uncalibrated images through the recovery of the unknown epipolargeometry[J]．Artificial Intelligence,1995,78(1-2):87~119.
    [91] C. Schmid and R. Mohr．Matching by local invariants[R]．Research report2644INRIA Rhone-Alpes,1995, Grenoble, France.
    [92] T. Lindeberg．Feature detection with automatic scale selection[J]．InternationalJournal of Computer Vision,1998,30(2):77~116.
    [93] Mikolajczyk, Schmid．A performance evaluation of local descriptors[J]．IEEETransactions on Pattern Analysis and Machine Intelligence,2005,10(27):1615~1630.
    [94] H. Bay, T. Tuytelaars, L. van Gool． SURF: Speeded up robust features
    [C]//Proceedings of the9th European Conference on Computer Vision,Springer LNCS,2006,3951(1):404~417.
    [95] J. van de Weijer, T. Gevers and A. Bagdanov．Boosting color saliency in imagefeature detection[J]． IEEE Transactions on Pattern Analysis and MachineIntelligence,2006,28(1):150~156.
    [96] A. Bosch, A. Zisserman and X. Munoz．Representing shape with a spatialpyramid kernel[C]//Proceedings of the6th ACM international conference onImage and video retrieval,2007, Amsterdam, The Netherlands:401~408.
    [97] M. Stark, B. Schiele．How good are local features for classes of geometricobjects[C]//In Proceedings of the11th International Conference on ComputerVision,2007, Rio de Janeiro, Brazil:1~8.
    [98] Koen E.A. van de Sande, Theo Gevers, Cees G.M. Snoek. A Comparison ofColor Features for Visual Concept Classification[C]//International Conference onImage and Video Retrieval (CIVR),2008, Niagara Falls, Canada:141~150.
    [99] C. Kim, B. Vasudev．Spatiotemporal sequence matching for efficient video copydetection[J]．IEEE Transactions on Circuits and Systems for Video Technology,2005,15(1):127~132
    [100] J. M. Gauch, A. Shivadas．Finding and identifying unknown commercials usingrepeated video sequence detection[J]．Computer vision and image understanding,2006,103(1):80~88
    [101] W-L. Zhao, C-W. Ngo. Scale-rotation invariant pattern entropy for keypointbased near-duplicate detection[J], IEEE. Transaction on Image Processing,2009.
    [102] Y. Zhuang, Y. Rui, T. S. Huang, S. Mehrotra. Adaptive key frame extractionusing unsupervised clustering[C]//Proceedings of IEEE ICIP’98, vol.1,1998:866~870.
    [103] H.S. Chang, S. Sull, S.U. Lee, Efficient video indexing scheme for content-basedretrieval[J], IEEE Trans. Circuits Syst. Video Technol, Dec1999.
    [104] M. Douze, A. Gaidon, H. Jegou, et al. Inria-lear’s video copy detection system
    [C]//Proceedings of TREVCID,2008.
    [105] Y. Ke, R. Sukthankar, L. Huston. Efficient near duplicate detection and subimage retrieval[C]//ACM Multimedia,2004.
    [106] H. Jegou, M. Douze, C. Schmid. Hamming embedding and weak geometricconsistency for large scale image search[C]//Proceedings of ECCV,2008.
    [107] H. T. Shen, X. Zhou, Z. Huang, J. Shao, E. Zhou. UQLIPS: A Real-time Nearduplicate Video Clip Detection System[C]//In Proceedings of Int'l Conf. on VeryLarge Data Bases, pages1374~1377,2007.
    [108] T. C. Hoad, J. Zobel. Fast video matching with signature alignment[C]//InProceedings of MIR, pages262~269,2003.
    [109] S. Poullot, M. Crucianu, O. Buisson. Scalable mining of large video databasesusing copy detection[C]//In Proceedings Of ACM Int'l Conf. on Multimedia,pages61~70,2008.
    [110] Tzvetanka Ianeva Ianeva. Detecting Cartoons: a Case Study in Automatic VideoGenre Classification[D], Ph.D. Research work, Valencia2003.
    [111] Wyszecki G, Stiles W S. Color Science: Concepts and Methods[J], QuantitativeData and Formulae. New York:Wiley,1982.
    [112] DimaiA, Stricker M. Spectral covariance and fuzzy regions for image indexing
    [R], Technical Report BIW I2TR2173. ETH, Zurich: Swiss Federal Institute ofTechnologies.1996.
    [113] Huang J, Kumar S R, Mitra M, et al. Image indexing using correlograms[C]//Proceedings of Computer Vision and Pattern Recognition CVPR’97, San Juan,Puerto Rico,1997.17~19.
    [114] Jianguo Li, Weixin Wu, Tao Wang, Yimin Zhang. One Step Beyond Histograms:Image Representation using Markov Stationary Features[C]//Proceedings ofCVPR2008.
    [115]林元烈.应用随机过程[M].北京:清华大学出版社,2002:78~130.
    [116] L. Breiman. Probability[M]. reprinted by SIAM,1992. Chapter7.3,4.
    [117] Yossi R, CAarlo T, Leonidas J. G. The Earth Mover’s Distance as a Metric forImage Retrieval[C]//Proceedings of the2000IEEE International Conference onComputer Vision, Bombay, India, January2000:59~66.
    [118] Peter J. Burt and Edward H. Adelson．The laplacian pyramid as a compact imagecode[J]．IEEE Transactions on Communication,1983,31(4):532~540.
    [119] Koenderink J．The structure of images[J]．Biological Cybernetics,1984(50):363~396.
    [120] M. Brown, D. G. Lowe. Invariant features from interest point groups[C]//BritishMachine Vision Conference,2002.656~665.
    [121] Aly A. Farag, Alaa E. Abdel-Hakim. Detection, categorization and recognition ofroad signs for autonomous navigation[C]//Proceeding of Advanced Concepts inIntelligent Vision Systems, Brussels, Belgium,2004,31(3):125~130.
    [122] Alaa E. Abdel-Hakim, Aly A.Farag. CSIFT: A SIFT descriptor with colorinvariant characteristics[C]//IEEE Computer Society Conference on ComputerVision and Pattern Recognition,2006,2:1978~1983.
    [123] J. M. Geusebroek, R. van den Boomgaard, A. W. M. Smeulders et al. Colorinvariance[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence,2001,23(12):1338~1350/
    [124] Lowe D. G． Object recognition from local scale-invariant features． TheProceedings of the Seventh IEEE International Conference on Computer Vision,September20~27,1999, Kerkyra, Greece:1150~1157.
    [125] M. Brown, D. G. Lowe. Unsupervised3D object recognition and reconstructionin unordered datasets[C]//International Conference on3-D Digital Imaging andModeling Ottawa, Canada,2005.1~8.
    [126] Fischler, M.A. and Bolles, R.C. Random Sample Consensus: A Paradigm forModel Fitting with Applications to Image Analysis and Automated Cartography[J]. Communications of the ACM,24(6):381~395,1981.
    [127] http://en.wikipedia.org/wiki/Gaussian_function.
    [128] Nezamabadi H, Kabir E. Image retrieval using histograms of uni2color andbi2color blocks and directional changes in intensity gradient[J]. PatternRecognition Letters,2004,25(14):1547~1557.
    [129] J. Law To, L. Chen, et al. Video copy detection: a comparative study[C]//inProceedings of CIVR,2007:371~378.
    [130] A. Joly, O. Buisson, C. Fr′elicot. Content-based copy retrieval using distortionbased probabilistic similarity search[J]. IEEE Transactions on Multimedia,2007,9(2):293~306.
    [131] Z. Huang, L. Wang, H. T. Shen, J. Shao, X. Zhou. Online near-duplicate videoclip detection and retrieval: An accurate and fast system[C]//Proceedings ofICDE,2009,1511~1514.
    [132] Mauro Cherubini, Rodrigo de Oliveira, Nuria Oliver, Understanding NearDuplicate Videos:A User-Centric Approach[C]//Proceedings of the17th annualACM international conference on Multimedia (MM),2009, Bei Jing, China.
    [133] Basharat A, Zhai Y, Shan M. Content based video matching using spatiotemporalvolumes[J], Journal of Computer Vision and Image Understanding110,3(June2008),360~377.
    [134] Kruitbosch G, Nack F. Broadcast yourself on YouTube: really?[C]//InProceedings of HCC’08, ACM, New York,2008:7~10.
    [135] http://www-nlpir.nist.gov/projects/trecvid/.
    [136]徐新文.基于内容的新闻视频挖掘方法研究[D],博士学位论文,长沙,国防科技大学研究生院,2009.
    [137]凌坚,练益群.新闻单元的自动快速分割方法[J],电视技术,2009,33(7),59~63.
    [138]刘远超,王晓龙,徐志明,关毅.文档聚类综述[J],中文信息学报,2006(3),55~62.
    [139] D. A. Adjeroh, M. C. Lee, I. King. A distance measure for video sequencesimilarity matching[C]//Proceedings of the International Workshop on MultiMedia DatabaseManagement Systems, pages72~79,1998.
    [140] O. Chum, J. Philbin, M. Isard, A. Zisserman. Scalable near identical image andshot detection[C]//Proceedings of CIVR,2007.
    [141] R. K. Ahuja, T. L. Magnanti, J. B. Orlin. Network Flows: Theory, Algorithmsand Applications[M]. PrenticeHall, Reading, Massachusetts,1993.
    [142] D. Goldfarb, Z. Jin. An o(nm)-time network simplex algorithm for the shortestpath problem[J]. Operations Research,47(3):445~448,1999.
    [143] Hampapur, A. Bolle, M. Rudolf, K.-H. Hyun. Comparison of sequencematching techniques for video copy detection[C]//Proceedings of Storage andRetrieval for Media Databases(SPIE),2001.
    [144]戴一奇,胡冠章,陈卫.图论与代数结构[M].北京:清华大学出版社,1995.89~91.
    [145] MYERs C.S, RABINER L.R．A comparative study of sevemI dynamic timewarping algorithms for connecte word recognition[J]．The Bell System TechnicaIJoumal,198l,60(7):1389~1409．
    [146] W. R. Pearson, D. J. Lipman. Improved tools for biological sequence comparison.
    [C]//Proceedings of the National Academy of Sciences of the United States ofAmerica,85(8):2444~2448,1988.
    [147] Chong-Wah Ngo, Yu-Gang Jiang, et al. VIREO/DVMM at TRECVID2009:High-Level Feature Extraction, Automatic Video Search and Content-BasedCopy Detection[C]//Proceedings of TRECVID2009,Oct23.
    [148] Zhou Xiangmin, Zhou Xiaofang, Shen Hengtao. A new similarity for nearduplicate video clip detection[C]//Proceedings of the9th Asia-Pacific web,Heidelberg: Springer,2007:176~187.
    [149]周献忠,史迎春,王韬．基于HSV颜色空间加权Hu不变矩的台标识别[J]．南京理工大学学报,29(3),2005:363~367
    [150] Q. Lv, W. Josephson, Z. Wang, M. Charikar, K. Li. Multi-probe LSH: Efficient indexing for high-dimensional similarity search[C]//In Proceedings of Int'l Conf. on Very Large Data Bases, pages 950~961, 2007.
    [151] P. Indyk, R. Motwani. Approximate nearest neighbors: Towards removing the curse of dimensionality[C]//In Proceedings of ACM Symposium on Theory of Computing, pages 604~613, 1998.
    [152] M. Ryynänen, A. Klapuri. Query by humming of MIDI and audio using locality sensitive hashing[C]//Proceedings of IEEE Int'l Conf. on Acoustics, Speech and Signal Processing, pages 2249~2252, 2008.
    [153] Y. Ke, R. Sukthankar, L. Huston. An efficient parts-based near-duplicate and sub-image retrieval system[C]//Proceedings of ACM Int'l Conf. on Multimedia, pages 869~876, 2004.
    [154] X. Wang, L. Zhang, et al. AnnoSearch: Image Auto-Annotation by Search[C]//Proceedings of CVPR 2006.
    [155] X. Li, L. Chen, L. Zhang, et al. Image Annotation by Large-Scale Content-based Image Retrieval[C]//Proceedings of ACM MM’06.
    [156] X. Wang, L. Zhang, et al. Annotating Images by Mining Image Search Results[C]//Proceedings of PAMI’08.
    [157] Page L, Brin S, Motwani R, Winograd T. The PageRank Citation Ranking: Bringing Order to the Web[R]. Technical report, Stanford University, Stanford, CA, 1998.
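    [158] Feng Yidong, Wang Guoping, Dong Shihai. Information visualization[J]. Department of Computer Science and Technology, Peking University, Journal of Peking University, 2001. (in Chinese)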
    [159] Jingdong Wang, Xian-Sheng Hua, Yinghai Zhao. Color-Structured Image Search[R]. Technical report, MSR-TR-2009-82, 2009.
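    [160] Guo Qinglin, Li Yanmei, Tang Qi. Research on text similarity computation based on VSM[J]. Application Research of Computers, 2008, No. 11. (in Chinese)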
    [161] Ying Li, et al. An Overview of Video Abstraction Techniques[R]. Image Systems Laboratory, HP Laboratory Palo Alto, HPL-2001-191, 2001.
    [162] P. Schulze-Wollgast, H. Schumann, C. Tominski. Visual Analysis of Human Health Data[C]//Proceedings 14th International Conference of the Information Resources Management Association (IRMA’03), Philadelphia, USA, 2003.
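    [163] Deng Liqiong, Wu Lingda, Chen Danwen, Yuan Zhimin. Design and implementation of an OpenGL-based spatio-temporal information visualization system[C]//The 9th National Conference on Virtual Reality and Visualization (supplement to Journal of System Simulation), 2009.11. (in Chinese)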
    [164] Jin Y, Khan L, Wang L, Awad M. Image Annotations By Combining Multiple Evidence & Wordnet[C]//Proceedings of ACM Multimedia, Singapore, 2005.
    [165] Jeon J, Lavrenko V, Manmatha R. Automatic Image Annotation and Retrieval Using Cross-media Relevance Models[C]//Proceedings of SIGIR, Toronto, 2003.
    [166] Hung-Khoon Tan, Chong-Wah Ngo, Richang Hong, et al. Scalable Detection of Partial Near-Duplicate Videos by Visual-Temporal Consistency[C]//Proceedings of MM’09, October 19~24, 2009, Beijing, China.
    [167] Peng Yuxin, Ngo Chong-Wah, Dong Qingjie, et al. An approach for video retrieval by video clip[J]. Journal of Software, 2003, 14(8). (in Chinese)
    [168] Changhu Wang, Feng Jing, Lei Zhang, Hong-Jiang Zhang. Image Annotation Refinement using Random Walk with Restarts[C]//Proceedings of MM'06, October 23~27, 2006, Santa Barbara, California, USA.
    [169] Smith Temple F, Waterman Michael S. Identification of Common Molecular Subsequences[J]. Journal of Molecular Biology, 1981, 147: 195~197.
    [170] Zi Huang, Bo Hu, Hong Cheng, et al. Mining Near-Duplicate Graph for Cluster-Based Reranking of Web Video Search Results[J]. ACM Transactions on Information Systems, Vol. 28, No. 4, Article 22.
    [171] Hung-Khoon Tan, Xiao Wu, Chong-Wah Ngo, et al. Accelerating Near Duplicate Video Matching by Combining Visual Similarity and Alignment Distortion[C]//Proceedings of MM, October, 2008, Vancouver, British Columbia.
    [172] Hangzai Luo, Jianping Fan, Yuli Gao, et al. Large-Scale News Video Retrieval via Visualization[C]//Proceedings of MM, October 23~27, 2006, Santa Barbara, California, USA.
    [173] Fuminori Yamagishi, Shin’ichi Satoh, Masao Sakauchi. A News Video Browser Using Identical Video Segment Detection[C]//Proceedings of PCM 2004, LNCS 3332, 2004: 205~212.
    [174] Ork de Rooij, Cees G. M. Snoek, Marcel Worring. Balancing Thread Based Navigation for Targeted Video Search[C]//Proceedings of CIVR'08, July 7~9, 2008, Niagara Falls, Ontario, Canada.
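    [175] Deng Zhidan, Jia Kebin. A video similarity matching algorithm supporting different temporal scales[J]. Application Research of Computers, 2009, 26(1). (in Chinese)
    [176] Wu Demin, Chen Jun. Research on pairwise sequence alignment algorithms[J]. Computer Engineering and Applications, 2008, 44(36). (in Chinese)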
    [177] Sakrapee Paisitkriangkrai, Tao Mei, Jian Zhang, et al. Scalable Clip-based Near-duplicate Video Detection with Ordinal Measure[C]//International Conference on Image and Video Retrieval (CIVR), July 5~7, China, 2010.
    [178] Werner Bailer. Evaluating Detection of Near Duplicate Video Segments[C]//International Conference on Image and Video Retrieval (CIVR), July 5~7, China, 2010.
