Three approaches to developing multi-device heterogeneous applications are proposed. Easy, efficient, and coherent use of subarrays in kernels and data movements is implemented. Simple argument annotations make it easy to split kernels and arrays among devices. Accurate automatic workload balancing is provided through a user-friendly API. The results are very promising in terms of both performance and programmability.