Abstract
In a complex processor landscape dominated by multi- and many-core processors, simplifying programming plays a crucial role in enhancing developers' productivity. One way to do so is to use highly tuned library functions. In this paper we present fastsg, an optimized library for the sparse grid technique with support for dimensional truncation. With optimizations for cache use and vectorization, we improve performance on a single processor core by up to a factor of 10. Parallelization using OpenMP scales almost linearly on a 12-core system.