An optimization for limited workspace on the GPUs.
Allowing trade-off between workspace size and redundant computation.
Automatic translators to automate the optimizations.
Delivered 1.7 to 2.4 times speedups compared to the CPU systems.
Superior or comparable performance compared to other CFDs running on GPUs.