Acceleration Structure Design for High-Performance Graphics Rendering
Abstract
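With the rapid development of graphics hardware and the growing demand for photorealistic images in virtual reality systems, film and game production, and related fields, research on high-performance graphics rendering has become increasingly important and urgent. High-performance rendering requires increasing rendering speed while preserving rendering quality. Generating high-quality images with ray tracing requires a large number of visibility computations; an effective acceleration structure can markedly reduce this computation, lower the complexity of rendering a scene, and shorten rendering time. High-performance rendering in turn places higher demands on the quality, construction speed, and access speed of the acceleration structure, which makes the structure harder to design.
     This dissertation studies effective design methods for acceleration structures for high-performance graphics rendering from two angles. First, it studies fast parallel construction and efficient traversal of high-quality acceleration structures, in particular how to handle incoherent computation and irregular, dynamic computation effectively on the GPU (Graphics Processing Unit) parallel architecture. Second, it improves existing rendering algorithms: for several advanced photorealistic effects, it designs dedicated acceleration structures based on the structural characteristics of those effects and their impact on the acceleration structure, reducing computational complexity so that the effects can be rendered more efficiently.
     Specifically, this dissertation studies acceleration structure design in four areas: parallel construction and efficient traversal of high-quality acceleration structures, support for dynamic scenes, support for secondary-ray effects, and support for motion blur. The main contributions are: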
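     ·A new acceleration structure, the MKD (Multi-KD) tree, is proposed; it addresses the problems that existing hierarchical structures face on the GPU architecture in both construction speed and access speed. A multi-dimensional SAH (surface area heuristic) parallel construction method is designed to build high-quality MKD trees quickly. A fast traversal algorithm for the MKD tree is designed that achieves efficient ordered access through a progressive ordered combination method and maintains computational coherence through adaptive organization of ray packets, which dynamically adjusts how data is processed. In addition, an effective queue communication mechanism is designed to balance the distribution of work across the processing cores.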
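     ·A phased, fast parallel construction method is proposed for the bounding volume hierarchy (BVH), an irregular data structure, enabling fast ray tracing of dynamic scenes. Based on the characteristics of the GPU parallel architecture, the method uses different granularities of parallelism in the early, middle, and late phases of construction to build the hierarchy quickly and effectively.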
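     ·A traversal method for secondary rays is proposed that effectively reduces the impact on computation and memory-access efficiency of the dynamic, irregular execution behavior that secondary rays can exhibit during traversal. A data-driven method for scheduling thread execution and a dynamic node-visit strategy are designed. Taking the GPU memory hierarchy into account, the data is reorganized to reduce the number of memory accesses and to maintain coherence among rays, exposing latent parallelism and optimizing bandwidth usage.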
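     ·A new acceleration structure, MBBVH, and a corresponding traversal method are proposed to accelerate motion blur rendering in scenes containing many primitives with irregular motion. A motion classification method for motion blur rendering is proposed that divides primitive motion into regular and irregular motion according to the characteristics of the motion vectors. On this basis, primitives with irregular motion are tracked during construction, and adjustment computations are performed dynamically during traversal to keep the acceleration structure efficient. To control the cost of these adjustments, a split cost model based on the time dimension is designed, allowing nodes to be constructed and traversed according to temporal characteristics, together with two evaluation strategies that adaptively switch between linear interpolation and adjustment computation during traversal. In addition, large or elongated irregular primitives that may be present in the scene receive special handling to reduce the overlap between node bounding boxes.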
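The motion blur contribution distinguishes regular from irregular motion and switches between linear interpolation and an adjustment computation. As a rough illustration of why both paths are useful, the Python sketch below interpolates a bounding box between shutter open (t = 0) and shutter close (t = 1). This is not the dissertation's MBBVH code: the names `AABB`, `box_at_time`, `bounds_of`, and `contains` are hypothetical, and treating "adjustment" as recomputing the box from the actual vertex positions is an assumption made only for this example.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class AABB:
    """Axis-aligned bounding box given by its min and max corners."""
    lo: Vec3
    hi: Vec3


def lerp(a: Vec3, b: Vec3, t: float) -> Vec3:
    """Component-wise linear interpolation between a (t=0) and b (t=1)."""
    return tuple((1.0 - t) * x + t * y for x, y in zip(a, b))


def box_at_time(box0: AABB, box1: AABB, t: float) -> AABB:
    """Interpolate the shutter-open and shutter-close boxes at time t."""
    return AABB(lerp(box0.lo, box1.lo, t), lerp(box0.hi, box1.hi, t))


def bounds_of(points: Sequence[Vec3]) -> AABB:
    """Tight box around a set of points (the assumed 'adjustment' step)."""
    xs, ys, zs = zip(*points)
    return AABB((min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs)))


def contains(box: AABB, p: Vec3, eps: float = 1e-9) -> bool:
    """True if point p lies inside box, with a small tolerance."""
    return all(l - eps <= c <= h + eps for l, c, h in zip(box.lo, p, box.hi))


# Regular motion: every vertex moves along a straight line, so the
# interpolated box still bounds the triangle at any time in [0, 1].
tri0 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
tri1 = [(2.0, 0.0, 0.0), (3.0, 0.0, 0.0), (2.0, 1.0, 0.0)]
mid_box = box_at_time(bounds_of(tri0), bounds_of(tri1), 0.5)
tri_mid = [lerp(a, b, 0.5) for a, b in zip(tri0, tri1)]
regular_ok = all(contains(mid_box, p) for p in tri_mid)

# Irregular motion: one vertex follows a curved path and leaves the
# interpolated box, so the box has to be recomputed from actual positions.
tri_mid_irregular = [tri_mid[0], tri_mid[1], (1.0, 5.0, 0.0)]
irregular_ok = all(contains(mid_box, p) for p in tri_mid_irregular)
adjusted_box = bounds_of(tri_mid_irregular)
```

For vertices that move linearly, interpolating the two end boxes is conservative because each vertex coordinate is interpolated between values that the end boxes already bound. That guarantee is lost when the motion is not linear, which is one plausible reason a traversal scheme would need a fallback for irregular primitives.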
With the rapid development of graphics hardware and the growing demand for photorealistic images in virtual reality systems, games, and film production, research on high-performance graphics rendering has become more important and urgent. High-performance graphics rendering requires reducing rendering time while preserving rendering quality. Ray tracing, used to generate photorealistic images, requires intensive visibility computation. An efficient acceleration structure significantly reduces this computation, lowers the complexity of rendering, and shortens rendering time. However, high-performance graphics rendering demands acceleration structures of higher quality with faster construction and traversal, which increases the design complexity.
     This dissertation discusses a methodology for designing efficient acceleration structures for high-performance graphics rendering. Our first focus is on which fast parallel construction algorithms and efficient traversal algorithms are fundamental to a high-quality acceleration structure, especially how to handle irregular, dynamic, and incoherent computation. Our second focus is on how current graphics algorithms should evolve to render some advanced photorealistic effects efficiently. Based on the structural features of these effects and their influence on the acceleration structure, we propose specialized acceleration structures that reduce computational complexity and render the scene more efficiently.
     In summary, this dissertation addresses four problem areas: a highly parallel construction algorithm and an efficient traversal algorithm for high-quality acceleration structures, support for dynamic scenes, support for secondary-ray effects, and support for motion blur. The main contributions are as follows:
     ●We propose a novel GPU data structure, the MKD tree, to solve the problems that hierarchical structures cause on the GPU at both construction time and traversal time. We introduce a multi-dimensional SAH (surface area heuristic) parallel construction method to rapidly build a high-quality acceleration structure. We design a fast MKD traversal algorithm that uses an incremental ordered combination to implement highly efficient ordered traversal. We offer an adaptive ray packet organization that dynamically adjusts the data processing method. We also describe an efficient queue communication mechanism that distributes the work evenly across the available processing units.
     ●To enable fast ray tracing of dynamic scenes, we propose an approach for fast BVH (bounding volume hierarchy) construction based on the features of the tree structure and of the multi-core architecture. This technique uses fine-grained parallelism and adopts different construction strategies for the early, middle, and late construction phases.
     ●We describe an efficient traversal algorithm for secondary rays. Considering the dynamic and irregular behavior of secondary rays and the GPU memory hierarchy, we schedule tasks over coherent data in a data-driven manner and design a dynamic node-visit strategy. These schemes reorganize the data, exploit inherent parallelism, reduce memory accesses, increase ray coherence, and optimize bandwidth usage.
     ●We present a construction and ray traversal algorithm designed to accelerate the rendering of scenes with high-quality motion blur, focusing on capturing irregular motion. We classify primitive motion as regular or irregular based on motion vectors. At build time, we track primitives with significantly irregular motion, and we handle these difficult primitives at traversal time. We design a time-based cost model to construct and traverse nodes according to the time attribute. We also describe two heuristics that switch automatically between shrinking and updating during traversal. Together, these two schemes efficiently control the adjustment cost. Finally, we present a build technique for scenes containing large triangles that reduces ray shooting costs.
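The surface area heuristic (SAH) named in the first contribution can be illustrated in its standard textbook form, which estimates the cost of a split from the surface areas of the two child boxes relative to the parent. The Python sketch below evaluates every object split along one axis for a small set of boxes. It is only a serial illustration under stated assumptions: the cost constants `c_trav` and `c_isect`, the centroid ordering, and the function names are hypothetical, and the dissertation's multi-dimensional parallel variant is not reproduced here.

```python
from functools import reduce
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
Box = Tuple[Vec3, Vec3]  # (min corner, max corner)


def surface_area(box: Box) -> float:
    """Surface area of an axis-aligned box."""
    dx, dy, dz = (h - l for l, h in zip(box[0], box[1]))
    return 2.0 * (dx * dy + dy * dz + dz * dx)


def union(a: Box, b: Box) -> Box:
    """Smallest box enclosing both a and b."""
    lo = tuple(min(x, y) for x, y in zip(a[0], b[0]))
    hi = tuple(max(x, y) for x, y in zip(a[1], b[1]))
    return (lo, hi)


def sah_cost(parent: Box, left: Box, n_left: int, right: Box, n_right: int,
             c_trav: float = 1.0, c_isect: float = 1.0) -> float:
    """Standard SAH estimate: traversal cost plus the intersection cost of
    each child weighted by its surface area relative to the parent.
    Assumes the parent box has non-zero surface area."""
    weighted = surface_area(left) * n_left + surface_area(right) * n_right
    return c_trav + c_isect * weighted / surface_area(parent)


def best_split(boxes: List[Box], axis: int) -> Tuple[float, int]:
    """Try every split of the centroid-sorted boxes along one axis and
    return (lowest cost, number of boxes in the left child)."""
    order = sorted(boxes, key=lambda b: b[0][axis] + b[1][axis])
    parent = reduce(union, order)
    best = None
    for i in range(1, len(order)):
        left = reduce(union, order[:i])
        right = reduce(union, order[i:])
        cost = sah_cost(parent, left, i, right, len(order) - i)
        if best is None or cost < best[0]:
            best = (cost, i)
    return best


# Two pairs of unit cubes far apart along x: the cheapest split separates
# the pairs, because both child boxes stay small.
cubes = [((0.0, 0.0, 0.0), (1.0, 1.0, 1.0)),
         ((1.0, 0.0, 0.0), (2.0, 1.0, 1.0)),
         ((10.0, 0.0, 0.0), (11.0, 1.0, 1.0)),
         ((11.0, 0.0, 0.0), (12.0, 1.0, 1.0))]
cost, left_count = best_split(cubes, axis=0)
```

Running `best_split` on each of the three axes and keeping the cheapest result is the usual way to extend this to a full split decision; how that search is parallelized on a GPU is the part the abstract attributes to the dissertation.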
