Optimization of an H.264-Based Screen Video Encoder for a Live Course Broadcasting System

English title: Optimized H.264 Encoder for Live Courses Broadcasting
Author: Jin Lei (金磊)
Thesis level: Master's
Discipline: Educational Technology
Keywords: distance education; screen encoding; video encoding; H.264 standard; compound image compression; rate control
Degree year: 2011
Advisor: Xie Weikai (谢伟凯)
Discipline code: 040110
Degree-granting institution: Shanghai Jiao Tong University
Chair of the defense committee: Yu Yong (俞勇)

Abstract

A screen codec is a specialized video codec for computer-generated screen image sequences. It captures the content shown on a computer screen in real time, compresses it, and either transmits it in real time to remote terminals for decoding and display or stores it in a file for later playback. Traditional screen codecs are built on compound image compression and add changed-region detection and a simple motion search for the screen video. Their compression efficiency is low once the screen video contains complex motion.
     The general-purpose video encoder with the highest compression ratio at present is the H.264 encoder, and using it for screen encoding has become a new research direction. x264 is one of the best open-source implementations of this standard. This thesis first analyzes the x264 framework and its encoding algorithms in depth, and then builds a screen encoding system on top of this H.264 video encoder.
     Applying x264 directly to real-time screen video compression, however, raises two main problems: (1) the encoder is computationally complex and its CPU usage is high, which interferes with other operations on the sender's computer and reduces the smoothness of the screen operations being recorded; (2) the real-time rate control algorithm of the H.264 encoder cannot effectively limit the peak bitrate, and the picture quality of the compressed video fluctuates considerably, which degrades the experience of users at the receiving end.
     To address these problems, this thesis implements two optimizations on top of x264. First, it adds an encoding mode decision optimization based on Mirror Driver changed-region detection, which makes fast mode decisions for macroblocks in unchanged regions and thereby speeds up encoding. Intel VTune profiling shows that the method reduces CPU execution time by about 30%. Second, it proposes a frame-rate-adaptive peak bitrate control method, frame rate adaptive constant quantization parameter (FRACQP), which builds on x264's existing constant quantization parameter (QP) mode and limits the peak bitrate by lowering the local frame rate. In comparison experiments against x264's original peak bitrate control, video produced by FRACQP strictly respects the peak bitrate limit, its picture quality is stable, and its average PSNR is 3 to 8 dB higher.
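The two optimizations can be illustrated with a minimal Python sketch. It is not the thesis's implementation, whose details the abstract does not give. The names `changed_macroblocks` and `PeakBitrateLimiter`, the rectangle format, and the sliding-window rule are all assumptions made for illustration. The first function maps changed rectangles (such as those a mirror driver might report) onto 16x16 macroblocks, so that every other macroblock can be treated as unchanged and given a fast mode decision. The class admits a frame only if the bits sent within the current window stay under a cap, and otherwise skips the frame, which lowers the local frame rate while QP stays fixed.

```python
from collections import deque

MB_SIZE = 16  # an H.264 macroblock covers 16x16 luma samples


def changed_macroblocks(width, height, dirty_rects):
    """Return the set of (mb_x, mb_y) macroblocks touched by any dirty rectangle.

    dirty_rects holds (left, top, right, bottom) tuples in pixels, with right
    and bottom exclusive. This rectangle format is an assumption of the sketch.
    """
    changed = set()
    for left, top, right, bottom in dirty_rects:
        # Clip the rectangle to the frame; ignore empty results.
        left, top = max(left, 0), max(top, 0)
        right, bottom = min(right, width), min(bottom, height)
        if left >= right or top >= bottom:
            continue
        for mb_y in range(top // MB_SIZE, (bottom - 1) // MB_SIZE + 1):
            for mb_x in range(left // MB_SIZE, (right - 1) // MB_SIZE + 1):
                changed.add((mb_x, mb_y))
    return changed


class PeakBitrateLimiter:
    """Hypothetical sliding-window limiter: skip frames instead of raising QP."""

    def __init__(self, cap_bits, window_ms):
        self.cap_bits = cap_bits    # maximum bits allowed within one window
        self.window_ms = window_ms  # window length in milliseconds
        self.sent = deque()         # (timestamp_ms, bits) of admitted frames
        self.sent_bits = 0

    def admit(self, timestamp_ms, frame_bits):
        """Return True if the frame may be sent, False if it should be skipped."""
        # Forget frames that have left the window.
        while self.sent and self.sent[0][0] <= timestamp_ms - self.window_ms:
            _, bits = self.sent.popleft()
            self.sent_bits -= bits
        if self.sent_bits + frame_bits > self.cap_bits:
            return False  # sending would exceed the peak: drop the frame
        self.sent.append((timestamp_ms, frame_bits))
        self.sent_bits += frame_bits
        return True
```

For a 64x32 frame, `changed_macroblocks(64, 32, [(10, 10, 20, 20)])` marks the four macroblocks that the rectangle overlaps, and the remaining four are candidates for a fast decision such as skip. The limiter takes frame sizes as given. In a real encoder, a frame that follows skipped frames is coded against an older reference and would likely be larger, and a single frame bigger than the cap would need separate handling.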

