Video compression artifact removal algorithm based on adaptive separable convolution network
  • Chinese title: 基于自适应可分离卷积核的视频压缩伪影去除算法
  • Authors: NIE Kehui; LIU Wenzhe; TONG Tong; DU Min; GAO Qinquan
  • Affiliations: School of Physics and Information Engineering, Fuzhou University; Key Laboratory of Medical Instrumentation & Pharmaceutical Technology of Fujian Province (Fuzhou University); Imperial Vision Technology Company
  • Keywords: video quality enhancement; optical flow estimation; motion compensation; adaptive separable convolution; video compression artifact removal
  • Journal: Journal of Computer Applications (计算机应用); CNKI journal code: JSJY
  • Publication date: 2019-05-10
  • Year: 2019
  • Volume/Issue: Vol. 39, cumulative No. 345, Issue 05
  • Pages: 233-239 (7 pages)
  • CN: 51-1307/TP
  • ISSN: 1001-9081
  • Language: Chinese
  • CNKI record ID: JSJY201905040
Abstract
Optical flow estimation algorithms, widely used in video quality enhancement and super-resolution reconstruction tasks, can only estimate linear motion between pixels. To address this problem, a new multi-frame compression artifact removal network architecture was proposed, consisting of a motion compensation module and a compression artifact removal module. The motion compensation module replaces traditional optical flow estimation with adaptive separable convolution, which can handle the curvilinear motion between pixels that optical flow methods cannot resolve. For each video frame, the motion compensation module predicts convolution kernels that match the image structure and the local displacement of pixels, and applies them by local convolution to estimate motion offsets and compensate pixels in the following frame. The resulting motion-compensated frame is concatenated with the original following frame as input to the compression artifact removal module, which fuses the distinct pixel information of the two frames to produce the artifact-removed result. Trained and tested on the same datasets as the state-of-the-art Multi-Frame Quality Enhancement (MFQE) algorithm, the proposed network improves peak signal-to-noise ratio (ΔPSNR) over MFQE by up to 0.44 dB and by 0.32 dB on average, demonstrating that it removes video compression artifacts effectively.
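As a concrete illustration of the pipeline described in the abstract, the following is a minimal PyTorch sketch, not the authors' implementation: a small kernel-prediction network estimates a vertical and a horizontal 1-D kernel for every pixel, the previous frame is warped by applying the outer product of the two kernels as a local convolution (the adaptive separable convolution step), and the compensated frame is concatenated with the current frame for a restoration network. The layer widths, the kernel length K, the softmax normalization of the kernels, the residual output, and the delta_psnr helper are all illustrative assumptions.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SeparableLocalConv(nn.Module):
        # Warps a frame with per-pixel separable kernels: for every output
        # pixel, the outer product of a vertical and a horizontal 1-D kernel
        # is applied to the KxK neighbourhood of that pixel.
        def __init__(self, kernel_size=25):   # K=25 is an assumed size
            super().__init__()
            self.k = kernel_size

        def forward(self, frame, kv, kh):
            # frame: [B, C, H, W]; kv, kh: [B, K, H, W] per-pixel 1-D kernels
            b, c, h, w = frame.shape
            pad = self.k // 2
            patches = F.unfold(F.pad(frame, [pad] * 4, mode='replicate'),
                               kernel_size=self.k)      # [B, C*K*K, H*W]
            patches = patches.view(b, c, self.k, self.k, h, w)
            # Sum over the two kernel axes: separable local convolution.
            return torch.einsum('bcklhw,bkhw,blhw->bchw', patches, kv, kh)

    class ArtifactRemovalNet(nn.Module):
        def __init__(self, k=25):
            super().__init__()
            self.k = k
            # Kernel-prediction branch of the motion compensation module
            # (placeholder conv stack; the paper's subnetwork is deeper).
            self.kernel_net = nn.Sequential(
                nn.Conv2d(6, 64, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(64, 2 * k, 3, padding=1))
            self.warp = SeparableLocalConv(k)
            # Compression artifact removal module, fed with the fused pair.
            self.restore = nn.Sequential(
                nn.Conv2d(6, 64, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(64, 3, 3, padding=1))

        def forward(self, prev_frame, cur_frame):
            kernels = self.kernel_net(torch.cat([prev_frame, cur_frame], 1))
            # Softmax keeps each 1-D kernel non-negative and summing to 1
            # (a common normalization choice, assumed here).
            kv = F.softmax(kernels[:, :self.k], dim=1)
            kh = F.softmax(kernels[:, self.k:], dim=1)
            compensated = self.warp(prev_frame, kv, kh)
            # Fuse the motion-compensated frame with the original frame and
            # predict a residual correction (residual learning assumed).
            fused = torch.cat([compensated, cur_frame], dim=1)
            return cur_frame + self.restore(fused)

    def delta_psnr(enhanced, compressed, raw):
        # ΔPSNR = PSNR(enhanced, raw) - PSNR(compressed, raw), in dB;
        # inputs are assumed to be normalized to [0, 1].
        def psnr(x, y):
            return 10 * torch.log10(1.0 / F.mse_loss(x, y))
        return psnr(enhanced, raw) - psnr(compressed, raw)

The delta_psnr helper mirrors how the paper's headline numbers are defined: the PSNR gain of the enhanced frame over the compressed frame, both measured against the uncompressed original.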
References
[1] DONG C,DENG Y,CHEN C L,et al.Compression artifacts reduction by a deep convolutional network[C]// Proceedings of the 2015 IEEE International Conference on Computer Vision.Piscataway,NJ:IEEE,2015:576-584.
    [2] GUO J,CHAO H.Building dual-domain representations for compression artifacts reduction [C]// ECCV 2016:Proceedings of the 2016 European Conference on Computer Vision.Berlin:Springer,2016:628-644.
    [3] GOODFELLOW I J,POUGET-ABADIE J,MIRZA M,et al.Generative adversarial networks[J/OL].arXiv Preprint,2014:arXiv:1406.2661[2014-06-10].https://arxiv.org/abs/1406.2661.
    [4] GUO J,CHAO H.One-to-many network for visually pleasing compression artifacts reduction [C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition.Piscataway,NJ:IEEE,2017:4867-4876.
    [5] GALTERI L,SEIDENARI L,BERTINI M,et al.Deep generative adversarial compression artifact removal [C]// Proceedings of the 2017 IEEE International Conference on Computer Vision.Piscataway,NJ:IEEE,2017:4836-4845.
    [6] YANG L L,SHENG G.A mine video image denoising method based on convolutional neural network[J].Mining Research and Development,2018,38(2):106-109.(in Chinese)
    [7] REN W,PAN J,CAO X,et al.Video deblurring via semantic segmentation and pixel-wise non-linear kernel[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision.Piscataway,NJ:IEEE,2017:1086-1094.
    [8] SAJJADI M S M,VEMULAPALLI R,BROWN M.Frame-recurrent video super-resolution[C]// Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition.Piscataway,NJ:IEEE,2018:6626-6634.
    [9] TAO X,GAO H,LIAO R,et al.Detail-revealing deep video super-resolution[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision.Piscataway,NJ:IEEE,2017:4472-4480.
    [10] LI L H,DU J P,LIANG M Y,et al.Video super resolution algorithm based on spatiotemporal features and neural networks[J].Journal of Beijing University of Posts and Telecommunications,2016,39(4):1-6.(in Chinese)
    [11] WANG T,CHEN M,CHAO H.A novel deep learning-based method of improving coding efficiency from the decoder-end for HEVC[C]// Proceedings of the 2017 Data Compression Conference.Piscataway,NJ:IEEE,2017:410-419.
    [12] YANG R,XU M,WANG Z.Decoder-side HEVC quality enhancement with scalable convolutional neural network[C]// Proceedings of the 2017 IEEE International Conference on Multimedia and Expo.Piscataway,NJ:IEEE,2017:817-822.
    [13] YANG R,XU M,WANG Z,et al.Enhancing quality for HEVC compressed videos[J/OL].arXiv Preprint,2018:arXiv:1709.06734(2017-09-20)[2018-07-06].https://arxiv.org/abs/1709.06734.
    [14] YANG R,XU M,LIU T,et al.Multi-frame quality enhancement for compressed video [C]// Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition.Piscataway,NJ:IEEE,2018:6664-6673.
    [15] DOSOVITSKIY A,FISCHERY P,ILG E,et al.FlowNet:learning optical flow with convolutional networks [C]// Proceedings of the 2015 IEEE International Conference on Computer Vision.Piscataway,NJ:IEEE,2015:2758-2766.
    [16] BAILER C,TAETZ B,STRICKER D.Flow fields:dense correspondence fields for highly accurate large displacement optical flow estimation[C]// Proceedings of the 2015 IEEE International Conference on Computer Vision.Piscataway,NJ:IEEE,2015:4015-4023.
    [17] REVAUD J,WEINZAEPFEL P,HARCHAOUI Z,et al.EpicFlow:edge-preserving interpolation of correspondences for optical flow [C]// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition.Piscataway,NJ:IEEE,2015:1164-1172.
    [18] ILG E,MAYER N,SAIKIA T,et al.FlowNet 2.0:evolution of optical flow estimation with deep networks[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition.Piscataway,NJ:IEEE,2017:2462-2470.
    [19] MAHAJAN D,HUANG F C,MATUSIK W,et al.Moving gradients:a path-based method for plausible image interpolation [J].ACM Transactions on Graphics,2009,28(3):Article No.42.
    [20] JADERBERG M,SIMONYAN K,ZISSERMAN A,et al.Spatial transformer networks[C]// Proceedings of the 28th International Conference on Neural Information Processing Systems.Cambridge,MA:MIT Press,2015:2017-2025.
    [21] NIKLAUS S,MAI L,LIU F.Video frame interpolation via adaptive separable convolution [C]// Proceedings of the 2017 IEEE International Conference on Computer Vision.Washington,DC:IEEE Computer Society,2017:261-270.
    [22] HE K,ZHANG X,REN S,et al.Deep residual learning for image recognition [C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition.Piscataway,NJ:IEEE,2016:770-778.
    [23] HE K,ZHANG X,REN S,et al.Identity mappings in deep residual networks [C]// ECCV 2016:Proceedings of the 2016 European Conference on Computer Vision.Berlin:Springer,2016:630-645.
    [24] DROZDZAL M,VORONTSOV E,CHARTRAND G,et al.The importance of skip connections in biomedical image segmentation [M]// Deep Learning and Data Labeling for Medical Applications.Berlin:Springer,2016:179-187.
    [25] BOSSEN F.Common test conditions and software reference configurations[S/OL].[2013-06-20].http://wftp3.itu.int/av-arch/jctvc-site/2010_07_B_Geneva/JCTVC-B300.doc.
    [26] GLOROT X,BENGIO Y.Understanding the difficulty of training deep feedforward neural networks [C]// Proceedings of the 13th International Conference on Artificial Intelligence and Statistics.Sardinia,Italy:JMLR,2010:249-256.
    [27] KINGMA D P,BA J.Adam:a method for stochastic optimization[EB/OL].[2018-03-20].https://arxiv.org/abs/1412.6980.
    [28] BARRON J T.A more general robust loss function[J/OL].arXiv Preprint,2017:arXiv:1701.03077(2017-01-11)[2017-01-11].https://arxiv.org/abs/1701.03077.
    [29] LAI W S,HUANG J B,AHUJA N,et al.Deep Laplacian pyramid networks for fast and accurate super-resolution[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition.Piscataway,NJ:IEEE,2017:5835-5843.
