Abstract
This paper presents the algorithm and architecture of a high-throughput motion estimation system for the H.265/HEVC encoder. The design can process 2160p@30fps video at a clock frequency of 400 MHz. The architecture embeds two parallel processing paths, one for integer-pel and one for fractional-pel motion estimation, and both paths share the same memories. Access conflicts are avoided by using dual-port modules and register buffers for reused samples. In each clock cycle, the integer-pel path can evaluate one motion vector and the fractional-pel path four motion vectors for an 8 × 8 luma block. A separate chroma interpolator further increases the throughput. The integer-pel path supports test zone search for 8 × 8 prediction blocks. Motion estimation for larger blocks reuses the results of the 8 × 8 search. The search for rectangular prediction units (PUs) is performed only at the fractional-pel level and reuses partial costs computed for square PUs, which saves a significant amount of computation. Synthesis results show that the design can operate at 200 MHz when implemented in an Arria II FPGA and at 400 MHz in TSMC 90 nm technology. The implemented algorithm is verified in the HM16 software. When 2160p@30fps videos are encoded with the low-delay configuration, the BD-PSNR and BD-rate are −0.026 dB and 1.64 %, respectively.

Keywords: Video coding, Motion estimation, Interpolation, H.265/HEVC, FPGA, Very large-scale integration (VLSI)
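
The reuse of 8 × 8 costs for larger and rectangular blocks can be illustrated with a minimal software sketch. It assumes the cost is a plain sum of absolute differences (SAD) and that all sub-blocks are evaluated at the same motion vector; the abstract only speaks of "costs", so the metric, the function names (`sad_8x8`, `partial_costs_16x16`, `pu_costs`), and the synthetic frames are illustrative assumptions and not the authors' design.

```python
def sad_8x8(cur, ref, bx, by, mvx, mvy):
    """SAD between the 8x8 block of `cur` at (bx, by) and the block of
    `ref` displaced by the integer motion vector (mvx, mvy)."""
    total = 0
    for y in range(8):
        for x in range(8):
            total += abs(cur[by + y][bx + x] - ref[by + y + mvy][bx + x + mvx])
    return total


def partial_costs_16x16(cur, ref, bx, by, mvx, mvy):
    """Four 8x8 SADs covering one 16x16 block: [[top-left, top-right],
    [bottom-left, bottom-right]]."""
    return [
        [sad_8x8(cur, ref, bx + 8 * i, by + 8 * j, mvx, mvy) for i in range(2)]
        for j in range(2)
    ]


def pu_costs(parts):
    """Combine the 8x8 partial costs into costs for the square 16x16 PU and
    the two symmetric rectangular partitions, without touching any samples."""
    (a, b), (c, d) = parts
    return {
        "2Nx2N": a + b + c + d,      # one 16x16 PU
        "2NxN": (a + b, c + d),      # top and bottom 16x8 PUs
        "Nx2N": (a + c, b + d),      # left and right 8x16 PUs
    }


# Synthetic 32x32 frames, used only to exercise the functions.
cur = [[(7 * x + 13 * y) % 256 for x in range(32)] for y in range(32)]
ref = [[(5 * x + 11 * y + 3) % 256 for x in range(32)] for y in range(32)]

parts = partial_costs_16x16(cur, ref, bx=8, by=8, mvx=1, mvy=2)
costs = pu_costs(parts)
```

Because SAD is additive over disjoint sample sets, the combined values equal what a direct 16 × 16, 16 × 8, or 8 × 16 computation would give for the same motion vector, so only additions are needed once the 8 × 8 costs exist. A full encoder cost would typically also include a motion vector rate term, which this sketch omits.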