Learning Contextual Dependencies for Optical Flow with Recurrent Neural Networks
  • Series: Lecture Notes in Computer Science
  • Year: 2017
  • Volume: 10114
  • Issue: 1
  • Pages: 68-83
  • Book title: Computer Vision – ACCV 2016
  • ISBN: 978-3-319-54190-7
Abstract
Pixel-level prediction tasks, such as optical flow estimation, play an important role in computer vision. Recent approaches have attempted to use the feature learning capability of Convolutional Neural Networks (CNNs) to tackle dense per-pixel predictions. However, CNNs have not been as successful in optical flow estimation as they are in many other vision tasks, such as image classification and object detection. It is challenging to adapt CNNs designed for high-level vision tasks to handle pixel-level predictions. First, CNNs do not have a mechanism to explicitly model contextual dependencies among image units. Second, the convolutional filters and pooling operations produce reduced feature maps and hence coarse outputs when upsampled to the original resolution. These two aspects limit the ability of CNNs to delineate object details, which often results in inconsistent predictions. In this paper, we propose a recurrent neural network to alleviate this issue. Specifically, a row convolutional long short-term memory (RC-LSTM) network is introduced to model contextual dependencies of local image features. This recurrent network can be integrated with CNNs, giving rise to an end-to-end trainable network. The experimental results demonstrate that our model can learn context-aware features for optical flow estimation and achieve accuracy competitive with state-of-the-art algorithms at a frame rate of 5 to 10 fps.
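To illustrate the row-recurrent idea described above, the following is a minimal NumPy sketch of an LSTM that sweeps over a feature map one row at a time, with 1-D convolutions along the row direction in both the input-to-state and state-to-state transitions. The kernel names, shapes, and gate layout here are illustrative assumptions for exposition, not the paper's actual RC-LSTM implementation.

```python
import numpy as np

def conv1d_same(x, w):
    """Naive 1-D convolution along the width axis with 'same' padding.
    x: (C_in, W) one row of features; w: (C_out, C_in, K) kernel."""
    c_out, c_in, k = w.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad)))
    width = x.shape[1]
    out = np.zeros((c_out, width))
    for o in range(c_out):
        for i in range(c_in):
            for j in range(width):
                out[o, j] += np.sum(xp[i, j:j + k] * w[o, i])
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def rc_lstm(features, params):
    """Sweep an LSTM over rows of a (H, C, W) feature map, top to bottom.
    Each row is treated as one time step; gates use row convolutions.
    params holds input kernels Wx* (hid, C, K) and hidden kernels Wh*
    (hid, hid, K) for the input/forget/output/candidate gates."""
    H, C, W = features.shape
    hid = params["Wxi"].shape[0]
    h = np.zeros((hid, W))   # hidden state carried between rows
    c = np.zeros((hid, W))   # cell state carried between rows
    out = np.zeros((H, hid, W))
    for t in range(H):
        x = features[t]
        i = sigmoid(conv1d_same(x, params["Wxi"]) + conv1d_same(h, params["Whi"]))
        f = sigmoid(conv1d_same(x, params["Wxf"]) + conv1d_same(h, params["Whf"]))
        o = sigmoid(conv1d_same(x, params["Wxo"]) + conv1d_same(h, params["Who"]))
        g = np.tanh(conv1d_same(x, params["Wxg"]) + conv1d_same(h, params["Whg"]))
        c = f * c + i * g            # standard LSTM cell update, per pixel
        h = o * np.tanh(c)
        out[t] = h                   # context-aware features for this row
    return out
```

Because each row's hidden state is a convolution of the previous row's state, information propagates both vertically (through the recurrence) and horizontally (through the kernel support), which is one way contextual dependencies among image units can be modeled; the output keeps the input's spatial resolution, so it can be stacked with CNN layers end to end.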
