GPU programming coming to low-power and mobile devices – from EU Mont Blanc supercomputer to smartphones

Supercomputing and low-power computing are not normally associated; but at the SC11 Supercomputing conference the Barcelona Supercomputing Center (BSC) has announced a new supercomputer, called the called the Mont-Blanc Project, which will combine the ARM-based NVIDIA Tegra SoC with separate CUDA GPUs. CUDA is NVIDIA’s parallel computing architecture, enabling general purpose computing on the GPU.

The project’s publicity says this enables power saving of 15 to 30 times, versus today’s supercomputers:

The analysis of the performance of HPC systems since 1993 shows exponential improvements at the rate of one order of magnitude every 3 years: One petaflops was achieved in 2008, one exaflops is expected in 2020. Based on a 20 MW power budget, this requires an efficiency of 50 GFLOPS/Watt. However, the current leader in energy efficiency achieves only 1.7n GFLOPS/Watt. Thus, a 30x improvement is required.

NVIDIA is also creating a new hardware and software development kit for Tegra + CUDA, to be made available in the first half of 2012.

image

The combination of fast concurrent processing, low power draw and mobile devices is enticing. Features like speech recognition and smart cameras depend on rapid processing, and the technology has the potential to make smart devices very much smarter.

NVIDIA has competition though. ARM, which designs most of the CPUs in use on smartphones and tablets today, has recently started designing mobile GPUs as well, and its Mali series supports OpenCL, an open alternative to CUDA for general-purpose computing on the GPU. The Mali-T604 has 1 to 4 cores while the recently announced Mali-T658 has 1 to 8 cores. ARM specifically optimises its GPUs to work alongside its CPUs, which must be a concern for GPU specialists such as NVIDIA. However, we have yet to see devices with either T604 or T658: the first T604 devices are likely to appear in 2012, and T658 in 2013.