Bridging semantic gap between high-level and low-level features in content-based video retrieval using multi-stage ESN–SVM classifier
Abstract
A content-based video retrieval system aims to help a user retrieve a targeted video sequence from a large database. Most search engines rely on textual annotations to retrieve videos; such engines offer only a low-level abstraction, while the user seeks high-level semantics. Bridging this semantic gap in video retrieval remains an important challenge. In this paper, colour, texture and shape are treated as low-level features and motion as a high-level feature. For colour, frames are converted from the RGB colour space to YCbCr and hue and saturation values are extracted into colour histograms. After colour extraction, a filter mask is applied and the gradient value is computed; the gradient is compared against a threshold to draw the edge map. The edges are then smoothed to remove unnecessary connected components, and the resulting shapes are extracted and stored in shape feature vectors. An SVM classifier is used to classify the low-level features. For the high-level features, depth images are extracted for motion feature identification, and classification is performed with echo state neural networks (ESN). ESN are a supervised learning technique based on the principle of recurrent neural networks; they are well known for time-series classification and have also shown effective performance in gesture detection. By combining these existing algorithms, a high-performance multimedia event detection system is constructed. The effectiveness and efficiency of the proposed event detection mechanism are validated on the MSR 3D action pairs dataset, and experimental results show that the detection accuracy of the proposed combination is better than that of the other algorithms considered.
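
As an illustration of the low-level stage, the sketch below shows one way the colour and edge feature extraction described above could be implemented, assuming OpenCV and NumPy. The function name `extract_frame_features`, the histogram bin counts, the gradient threshold and the 8x8 grid descriptor are illustrative assumptions, not the authors' implementation.

```python
import cv2
import numpy as np

def extract_frame_features(frame_bgr, grad_threshold=100.0):
    """Colour histogram plus a grid-based edge descriptor for one frame (sketch)."""
    # Colour: convert to YCbCr (OpenCV's YCrCb ordering) and to HSV so that
    # hue and saturation can be binned into a 2-D histogram.
    ycrcb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2YCrCb)
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    hue_sat_hist = cv2.calcHist([hsv], [0, 1], None, [16, 16],
                                [0, 180, 0, 256]).flatten()
    hue_sat_hist /= hue_sat_hist.sum() + 1e-8          # normalise to sum 1

    # Shape: gradient magnitude via Sobel filter masks on the luma channel,
    # thresholded into an edge map and median-filtered to drop small
    # spurious connected components.
    luma = ycrcb[:, :, 0]
    gx = cv2.Sobel(luma, cv2.CV_32F, 1, 0, ksize=3)
    gy = cv2.Sobel(luma, cv2.CV_32F, 0, 1, ksize=3)
    grad_mag = cv2.magnitude(gx, gy)
    edge_map = (grad_mag > grad_threshold).astype(np.uint8) * 255
    edge_map = cv2.medianBlur(edge_map, 3)

    # Shape descriptor: fraction of edge pixels in each cell of an 8x8 grid.
    h, w = edge_map.shape
    cells = edge_map[:h - h % 8, :w - w % 8].reshape(8, h // 8, 8, w // 8)
    shape_desc = cells.mean(axis=(1, 3)).flatten() / 255.0

    return np.concatenate([hue_sat_hist, shape_desc])
```

The resulting per-frame vectors could then be fed to a standard SVM (for instance `sklearn.svm.SVC`) to realise the low-level classification stage.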
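
For the high-level stage, the following is a minimal echo state network classifier over sequences of per-frame motion descriptors, assuming NumPy only. The reservoir size, spectral radius and ridge-regression readout are conventional ESN choices used here as assumptions; they are not the paper's reported configuration.

```python
import numpy as np

class EchoStateClassifier:
    """Basic ESN: fixed random reservoir, trainable linear readout (sketch)."""

    def __init__(self, n_inputs, n_reservoir=200, n_classes=2,
                 spectral_radius=0.9, ridge=1e-4, seed=0):
        rng = np.random.default_rng(seed)
        self.W_in = rng.uniform(-0.5, 0.5, (n_reservoir, n_inputs))
        W = rng.uniform(-0.5, 0.5, (n_reservoir, n_reservoir))
        # Rescale recurrent weights so the echo state property is likely to hold.
        W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))
        self.W = W
        self.ridge = ridge
        self.n_classes = n_classes
        self.W_out = None

    def _final_state(self, sequence):
        # Drive the reservoir with one motion feature vector per frame.
        x = np.zeros(self.W.shape[0])
        for u in sequence:
            x = np.tanh(self.W_in @ u + self.W @ x)
        return x

    def fit(self, sequences, labels):
        X = np.stack([self._final_state(s) for s in sequences])
        Y = np.eye(self.n_classes)[labels]          # one-hot targets
        # Ridge-regression readout: only W_out is trained (supervised step).
        A = X.T @ X + self.ridge * np.eye(X.shape[1])
        self.W_out = np.linalg.solve(A, X.T @ Y)
        return self

    def predict(self, sequences):
        X = np.stack([self._final_state(s) for s in sequences])
        return np.argmax(X @ self.W_out, axis=1)
```

Here each training sequence would be a `(T, n_inputs)` array of depth-derived motion features for one clip; only the readout weights are learned, which is what makes ESN training fast relative to fully trained recurrent networks.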
