China’s Tianhe-2 Supercomputer takes top ranking, a win for Intel vs Nvidia

The International Supercomputing Conference (ISC) is under way in Leipzig, and one of the announcements is that China’s Tianhe-2 is now the world’s fastest supercomputer according to the Top 500 list.

This has some personal interest for me, as I visited its predecessor Tianhe-1A in December 2011, on a press briefing organised by NVidia which was, I guess, on a diplomatic mission to promote Tesla, the GPU accelerator boards used in Tianhe-1A (which was itself the world’s fastest supercomputer for a period).

It appears that the mission failed, insofar as Tianhe-2 uses Intel Phi accelerator boards rather than Nvidia Tesla.

Tianhe-2 has 16,000 nodes, each with two Intel Xeon IvyBridge processors and three Xeon Phi processors for a combined total of 3,120,000 computing cores.

says the press release. Previously, the world’s fastest was the US Titan, which does use NVidia GPUs.

Nvidia has reason to worry. Tesla boards are present on 39 of the top 500, whereas Xeon Phi is only on 11, but it has not been out for long and is growing fast. A newly published paper shows Xeon Phi besting Tesla on sparse matrix-vector multiplication:

we demonstrate that our implementation is 3.52x and 1.32x faster, respectively, than the best available implementations on dual IntelR XeonR Processor E5-2680 and the NVIDIA Tesla K20X architecture.

In addition, Intel has just announced the successor to Xeon Phi, codenamed Knight’s Landing. Knight’s Landing can function as the host CPU as well as an accelerator board, and has integrated on-package memory to reduce data transfer bottlenecks.

Nvidia does not agree that Xeon Phi is faster:

The Tesla K20X is about 50% faster in Linpack performance, and in terms of real application performance we’re seeing from 2x to 5x faster performance using K20X versus Xeon Phi accelerator.

says the company’s Roy Kim, Tesla product manager. The truth I suspect is that it depends on the type of workload and I would welcome more detail on this.

It is also worth noting that Tianhe-2 does not better Titan on power/performance ratio.

  • Tianhe-2: 3,120,00 cores, 1,024,000 GB Memory, Linpack perf 33,862.7 TFlop/s, Power 17,808 kW.
  • Titan: 560,640 cores, 710,144 GB Memory, Linpack perf 17,590 TFlop/s, Power 8,209 kW.