Abstract
We present a GPU implementation, using the compute unified device architecture (CUDA), of the Swendsen-Wang multi-cluster algorithm for two-dimensional classical spin systems. We adapt two recently proposed CUDA connected-component labeling algorithms to the cluster-assignment step of the Swendsen-Wang algorithm. Starting with the q-state Potts model, we extend our implementation to a system of vector spins, the q-state clock model, using the embedded-cluster idea. We test the performance: the computation time on a GTX580 is 2.51 ns per spin flip for the Potts model (Ising model) and 2.42 ns per spin flip for the clock model, at the critical temperature for the linear size tested. The computational speed for the Potts model on the GTX580 is 12.4 times that on a current CPU core; for the clock model it is 35.6 times that on a current CPU core.
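For orientation, one Swendsen-Wang update of the q-state Potts model can be sketched serially as follows. This is a minimal CPU illustration, not the CUDA implementation described above: it uses a plain union-find structure in place of the GPU connected-component labeling algorithms, and it assumes the Hamiltonian convention H = -J Σ δ(s_i, s_j) with J = 1, so that a bond between equal neighbouring spins is activated with probability 1 - exp(-β). All function and variable names (`find`, `union`, `swendsen_wang_step`, `p_bond`) are hypothetical.

```python
import math
import random


def find(parent, i):
    """Return the root label of site i, shortening the path along the way."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def union(parent, a, b):
    """Merge the clusters of a and b; the smaller root index becomes the root."""
    ra, rb = find(parent, a), find(parent, b)
    if ra < rb:
        parent[rb] = ra
    elif rb < ra:
        parent[ra] = rb


def swendsen_wang_step(spins, L, q, beta, rng):
    """One Swendsen-Wang update on an L x L periodic lattice (in place).

    Assumed convention: H = -sum over bonds of delta(s_i, s_j), i.e. J = 1.
    Returns the cluster label (smallest site index in the cluster) per site.
    """
    n = L * L
    p_bond = 1.0 - math.exp(-beta)
    parent = list(range(n))

    # Step 1: activate bonds between equal neighbours with probability p_bond.
    for y in range(L):
        for x in range(L):
            i = y * L + x
            right = y * L + (x + 1) % L
            down = ((y + 1) % L) * L + x
            for j in (right, down):
                if spins[i] == spins[j] and rng.random() < p_bond:
                    union(parent, i, j)

    # Step 2: label each site by its cluster root.
    # Step 3: draw one new random state per cluster and assign it.
    new_state = {}
    labels = [0] * n
    for i in range(n):
        root = find(parent, i)
        labels[i] = root
        if root not in new_state:
            new_state[root] = rng.randrange(q)
        spins[i] = new_state[root]
    return labels
```

A call such as `swendsen_wang_step(spins, L, q, beta, random.Random(0))` updates `spins` in place, where `spins` is a flat list of length `L * L` with entries in `0 .. q-1`. In a GPU version the cluster labeling in step 2 is the part that has to be parallelized, which is where the connected-component labeling algorithms enter.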