Abstract
This paper presents a parallel Laplacian pyramid algorithm implemented in CUDA. The algorithm is designed to avoid shared-memory bank conflicts and to satisfy the coalescing requirements of global memory access. To maximize parallel efficiency, the streaming multiprocessor (SM) occupancy of the kernel is calculated and justified with formulas. On an NVIDIA GTX 480 GPU, the CUDA implementation achieves a speedup of tens of times. Finally, the algorithm is applied to image fusion and to image detail enhancement, which demonstrates its broad effectiveness and necessity.
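For readers unfamiliar with the underlying decomposition, the following is a minimal sequential sketch of a Laplacian pyramid, not the CUDA kernel described in the paper. It assumes a 1D signal instead of an image, a 5-tap binomial filter, and clamped borders; the function names (`blur`, `downsample`, `upsample`, `build_laplacian_pyramid`, `reconstruct`) are hypothetical and chosen only for illustration. Each output sample depends only on a small neighborhood of input samples, which is the property that allows a GPU implementation to assign one thread per output element.

```python
# Assumption: 5-tap binomial kernel, a common choice for Gaussian pyramids.
KERNEL = [1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16]


def blur(signal):
    """Convolve with KERNEL, clamping indices at the borders."""
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(KERNEL):
            j = min(max(i + k - 2, 0), n - 1)  # clamp to a valid index
            acc += w * signal[j]
        out.append(acc)
    return out


def downsample(signal):
    """Blur, then keep every second sample."""
    return blur(signal)[::2]


def upsample(signal, size):
    """Zero-stuff to length `size`, blur, and scale by 2 to restore gain."""
    expanded = [0.0] * size
    for i, v in enumerate(signal):
        expanded[2 * i] = v
    return [2.0 * v for v in blur(expanded)]


def build_laplacian_pyramid(signal, levels):
    """Return `levels` band-pass layers followed by the coarsest low-pass layer."""
    pyramid = []
    current = [float(v) for v in signal]
    for _ in range(levels):
        smaller = downsample(current)
        predicted = upsample(smaller, len(current))
        # Each Laplacian layer stores what the coarser level cannot predict.
        pyramid.append([c - p for c, p in zip(current, predicted)])
        current = smaller
    pyramid.append(current)  # low-pass residual
    return pyramid


def reconstruct(pyramid):
    """Invert the decomposition by adding each layer back, coarse to fine."""
    current = pyramid[-1]
    for band in reversed(pyramid[:-1]):
        predicted = upsample(current, len(band))
        current = [b + p for b, p in zip(band, predicted)]
    return current


if __name__ == "__main__":
    data = [float((i * 7) % 11) for i in range(16)]
    pyr = build_laplacian_pyramid(data, 3)
    restored = reconstruct(pyr)
    print([len(layer) for layer in pyr])
    print(max(abs(a - b) for a, b in zip(data, restored)))
```

Because every layer stores the exact difference between a level and its prediction from the next coarser level, reconstruction recovers the input up to floating-point rounding. Applications such as fusion or detail enhancement would modify the layers between the two steps.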