Learning Contextual Dependencies for Optical Flow with Recurrent Neural Networks
  • Series: Lecture Notes in Computer Science
  • Year: 2017
  • Volume: 10114
  • Issue: 1
  • Pages: 68-83
  • Book title: Computer Vision – ACCV 2016
  • ISBN: 978-3-319-54190-7
Abstract
Pixel-level prediction tasks, such as optical flow estimation, play an important role in computer vision. Recent approaches have attempted to use the feature learning capability of Convolutional Neural Networks (CNNs) to tackle dense per-pixel predictions. However, CNNs have not been as successful in optical flow estimation as they are in many other vision tasks, such as image classification and object detection. It is challenging to adapt CNNs designed for high-level vision tasks to handle pixel-level predictions. First, CNNs do not have a mechanism to explicitly model contextual dependencies among image units. Second, the convolutional filters and pooling operations produce reduced feature maps and hence coarse outputs when upsampled to the original resolution. These two aspects limit the ability of CNNs to delineate object details, which often results in inconsistent predictions. In this paper, we propose a recurrent neural network to alleviate this issue. Specifically, a row convolutional long short-term memory (RC-LSTM) network is introduced to model contextual dependencies of local image features. This recurrent network can be integrated with CNNs, giving rise to an end-to-end trainable network. The experimental results demonstrate that our model can learn context-aware features for optical flow estimation and achieve accuracy competitive with state-of-the-art algorithms at a frame rate of 5 to 10 fps.
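To illustrate the row-recurrent idea described above, the following is a minimal NumPy sketch of an LSTM that sweeps over a feature map one row at a time, with 1-D convolutions along the row direction in both the input-to-state and state-to-state transitions. The kernel names, shapes, and gate layout here are illustrative assumptions for exposition, not the paper's actual RC-LSTM implementation.

```python
import numpy as np

def conv1d_same(x, w):
    """Naive 1-D convolution along the width axis with 'same' padding.
    x: (C_in, W) one row of features; w: (C_out, C_in, K) kernel."""
    c_out, c_in, k = w.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad)))
    width = x.shape[1]
    out = np.zeros((c_out, width))
    for o in range(c_out):
        for i in range(c_in):
            for j in range(width):
                out[o, j] += np.sum(xp[i, j:j + k] * w[o, i])
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def rc_lstm(features, params):
    """Sweep an LSTM over rows of a (H, C, W) feature map, top to bottom.
    Each row is treated as one time step; gates use row convolutions.
    params holds input kernels Wx* (hid, C, K) and hidden kernels Wh*
    (hid, hid, K) for the input/forget/output/candidate gates."""
    H, C, W = features.shape
    hid = params["Wxi"].shape[0]
    h = np.zeros((hid, W))   # hidden state carried between rows
    c = np.zeros((hid, W))   # cell state carried between rows
    out = np.zeros((H, hid, W))
    for t in range(H):
        x = features[t]
        i = sigmoid(conv1d_same(x, params["Wxi"]) + conv1d_same(h, params["Whi"]))
        f = sigmoid(conv1d_same(x, params["Wxf"]) + conv1d_same(h, params["Whf"]))
        o = sigmoid(conv1d_same(x, params["Wxo"]) + conv1d_same(h, params["Who"]))
        g = np.tanh(conv1d_same(x, params["Wxg"]) + conv1d_same(h, params["Whg"]))
        c = f * c + i * g            # standard LSTM cell update, per pixel
        h = o * np.tanh(c)
        out[t] = h                   # context-aware features for this row
    return out
```

Because each row's hidden state is a convolution of the previous row's state, information propagates both vertically (through the recurrence) and horizontally (through the kernel support), which is one way contextual dependencies among image units can be modeled; the output keeps the input's spatial resolution, so it can be stacked with CNN layers end to end.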
