A comparative study of video-based object recognition from an egocentric viewpoint


Authors: Mang Shao (ms2308@ic.ac.uk, corresponding author) ; Danhang Tang ; Yang Liu ; Tae-Kyun Kim
Keywords: Object instance recognition ; Egocentric video ; Comparative study
Journal: Neurocomputing
Publication year: 2016
Publication date: 1 January 2016
Volume: 171
Issue: Complete
Pages: 982-990
Full-text size: 5489 K

Abstract

Videos tend to yield a more complete description of their content than individual images, and egocentric vision often provides a more controllable and practical perspective for capturing useful information. In this study, we presented new insights into different methods for video-based recognition of rigid object instances. To better exploit egocentric videos as training and query sources, diverse state-of-the-art techniques were categorised, extended and evaluated empirically on a newly collected video dataset consisting of complex sculptures in cluttered scenes. In particular, we investigated how to use the geometric and temporal cues provided by egocentric video sequences to improve object recognition performance. Based on the experimental results, we analysed the pros and cons of these methods and reached the following conclusions. For geometric cues, the 3D object structure learnt from a training video dataset dramatically improves the average video classification performance. By contrast, for temporal cues, tracking visual fixation across video sequences has little impact on accuracy, but it significantly reduces memory consumption by yielding a better signal-to-noise ratio for the feature points detected in the query frames. Furthermore, we proposed a method that integrates these two cues to exploit the advantages of both.
