Tag Archives: nvidia

NVIDIA’s GPU in the cloud: will you still want an Xbox or PlayStation?

NVIDIA’s GPU Technology Conference is an unusual event, in part a get-together for academic researchers using HPC, in part a marketing pitch for the company. The focus of the event is GPU computing, in other words using the GPU for purposes other than driving a display, such as processing simulations to model climate change or fluid dynamics, or crunching huge amounts of data to calculate where best to drill for oil. However, NVIDIA also uses the event to announce its latest GPU innovations, and CEO Jen-Hsun Huang used this morning’s keynote to introduce the company’s GPU in the cloud initiative.

This takes two forms, though both are based on a feature of the new “Kepler” wave of NVIDIA GPUs which allows them to render graphics to a stream rather than to a display. It is the world’s first virtualized GPU, he claimed.

image

The first target is enterprise VDI (Virtual Desktop Infrastructure). The idea is that in the era of BYOD (Bring Your Own Device) there is high demand for the ability to run Windows applications on devices of every kind, perhaps especially Apple iPads. This works fine via virtualisation for everyday applications, but what about GPU-intensive applications such as AutoCAD or Adobe Photoshop? Using a Kepler GPU, you can run up to 100 virtual desktop instances with GPU acceleration. NVIDIA calls this the VGX Platform.

image

What actually gets sent to the client is mostly H.264 video, which means most current devices have good support, though of course you still need a remote desktop client.

The second target is game streaming. The key problem here – provided you have enough bandwidth – is minimising the lag between when a player moves or clicks Fire, and when the video responds. NVIDIA has developed software called the GeForce GRID which it will supply along with specially adapted Kepler GPUs to cloud companies such as Gaikai. Using the GeForce GRID, lag is reduced, according to NVIDIA, to something close to what you would get from a game console.

image

We saw a demo of a new Mech shooter game in which one player is using an Asus Transformer Prime, an Android tablet, and the other an LG television which has a streaming client built in. The game is rendered in the cloud but streamed to the clients with low latency.

image

“This is your game console,” said NVIDIA CEO Jen-Hsun Huang, holding the Ethernet cable that connected the TV to the internet.

image

The concept is attractive for all sorts of reasons. Users can play games without having to download and install them, or can connect instantly to a game being played by a friend. Game companies are protected from piracy, because the game code runs in the cloud, not on the device.

NVIDIA does not plan to run its own cloud services, but is working with partners, as the following slide illustrates. On the VDI side, Citrix, Microsoft, VMware and Xen were mentioned as partners.

image

If cloud GPU systems take off, will they cannibalise the market for powerful GPUs in client devices, whether PCs, game consoles or tablets? I put this to Huang in the press Q&A after the keynote, and he denied it, saying that people like designers hate to share their PCs. It was an odd and unsatisfactory answer. After all, if Huang is saying that your games console is now an Ethernet cable, he is also saying that there is no longer any need for game consoles which contain powerful NVIDIA GPUs. The same might apply to professional workstations, with the logic that cloud computing always presents: that shared resources have better utilisation and therefore lower cost.

NVIDIA Nsight comes to Eclipse for Mac, Linux GPU programming

NVIDIA has ported its Nsight development tools, previously a plug-in for Visual Studio, to run within the open source Eclipse IDE for use on Mac and Linux.

image

The Nsight tools include profiling, refactoring, syntax highlighting and auto-completion, as well as a bunch of code samples.

The Windows version for Visual Studio has also been updated, and now supports local GPU debugging as well as new support for DirectX frame debugging and analysis.

Although Eclipse of course runs on Windows, Windows developers should continue to use the Visual Studio version; NVIDIA is not supporting Nsight for Eclipse on Windows.

The tools are in preview and you can sign up to try them here.

Another significant development is the availability of the CUDA LLVM Compiler. NVIDIA has contributed CUDA compiler code to the open source LLVM project. This means that other languages which compile to LLVM’s intermediate representation can be adapted to support parallel processing on NVIDIA GPUs. The CUDA Compiler SDK will be made available this week at the NVIDIA GPU Technology Conference in San Jose.

Adobe turns to OpenCL rather than NVIDIA CUDA for Mercury Graphics Engine in Creative Suite 6

Adobe has just announced Creative Suite 6. CS 5.5 used the Mercury Playback Engine in Premiere Pro, which takes advantage of NVIDIA’s CUDA library in order to accelerate processing when an NVIDIA GPU is present. Just to be clear, this is not just graphics acceleration, but programming the GPU to take advantage of its many processor cores for general-purpose computing.

Premiere Pro CS6 also uses the Mercury Playback Engine, and while CUDA is still recommended there is new support for OpenCL:

The Mercury Playback Engine brings performance gains to all the GPUs supported in Adobe Creative Suite 6 software, but the best performance comes with specific NVIDIA® CUDA™ enabled GPUs, including support for mobile GPUs and NVIDIA Maximus™ dual-GPU configurations. New support for the OpenCL-based AMD Radeon HD 6750M and 6770M cards available with certain Apple MacBook Pro computers running OS X Lion (v10.7x), with a minimum of 1GB VRAM, brings GPU-accelerated mobile workflows to Mac users.

Photoshop CS6 also uses the GPU to accelerate processing, using the new Mercury Graphics Engine. The Mercury Graphics Engine uses the OpenCL framework, which is not specific to any one GPU vendor, rather than CUDA:

The Mercury Graphics Engine (MGE) represents features that use video card, or GPU, acceleration. In Photoshop CS6, this new engine delivers near-instant results when editing with key tools such as Liquify, Warp, Lighting Effects and the Oil Paint filter. The new MGE delivers unprecedented responsiveness for a fluid feel as you work. MGE is new to Photoshop CS6, and uses both the OpenGL and OpenCL frameworks. It does not use the proprietary CUDA framework from nVidia.

It seems to me that this amounts to a shift by Adobe from CUDA to OpenCL, which is a good thing for users of non-NVIDIA GPUs.
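
For kernel writers the shift is less dramatic than it might sound, because the two kernel dialects are close cousins. Below is a minimal sketch using a hypothetical “brighten” filter (the names and the example are mine, not Adobe’s): the same operation written once as a CUDA kernel, which nvcc compiles for NVIDIA hardware only, and once as OpenCL C source, which is handed to the driver at run time via clCreateProgramWithSource and built for whatever GPU is present.

// CUDA version: compiled by nvcc, runs only on NVIDIA GPUs.
__global__ void brighten_cuda(float *pixels, float gain, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        pixels[i] *= gain;
}

// OpenCL C version: kept as source and built at run time for any
// conforming GPU (AMD, NVIDIA, Intel), which is the portability point.
static const char *brighten_opencl =
    "__kernel void brighten(__global float *pixels, float gain, int n) \n"
    "{                                                                 \n"
    "    int i = get_global_id(0);                                     \n"
    "    if (i < n)                                                    \n"
    "        pixels[i] *= gain;                                        \n"
    "}                                                                 \n";

The host-side plumbing differs rather more than the kernels do, and that is where most of any porting effort goes.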

This also suggests to me that NVIDIA will need to ensure excellent OpenCL support in its GPU cards, as well as continuing to evolve CUDA, since Creative Suite is a key product for the designers whose workstations form a substantial part of the market for high-end GPUs.

Multicore processor wars: NVIDIA squares up to Intel

I first became aware of NVIDIA’s propaganda war against Intel at the 2011 GPU Technology conference in Beijing. CEO Jen-Hsun Huang stated that CPUs are remarkably inefficient for multicore processing:

The CPU is fast and is terrific at single-threaded performance, but because so much of the electronics inside the CPU is dedicated to out of order execution, branch prediction, speculative execution, all of the technology that has gone into sustaining instruction throughput and making the CPU faster at single-threaded applications, the electronics necessary to enable it to do that has grown tremendously. With four cores, in order to execute an operation, a floating point add or a floating point multiply, 50 times more energy is dedicated to the scheduling of that operation than the operation itself. If you look at the silicon of a CPU, the floating point unit is only a few percent of the overall die, and it is consistent with the usage of the energy to sequence, to schedule the instructions running complicated programs.

That figure of 50 times surprised me, and I asked Intel’s James Reinders for a comment. He was quick to respond, noting that:

50X is ridiculous if it encourages you to believe that there is an alternative which is 50X better.  The argument he makes, for a power-efficient approach for parallel processing, is worth about 2X (give or take a little). The best example of this, it turns out, is the Intel MIC [Many Integrated Core] architecture.

Reinders went on to say:

Knights Corner is superior to any GPGPU type solution for two reasons: (1) we don’t have the extra power-sucking silicon wasted on graphics functionality when all we want to do is compute in a power efficient manner, and (2) we can dedicate our design to being highly programmable because we aren’t a GPU (we’re an x86 core – a Pentium-like core for “in order” power efficiency). These two turn out to be substantial advantages that the Intel MIC architecture has over GPGPU solutions that will allow it to have the power efficiency we all want for highly parallel workloads, but able to run an enormous volume of code that will never run on GPGPUs (and every algorithm that can run on GPGPUs will certainly be able to run on a MIC co-processor).

So Intel is evangelising its MIC against GPGPU solutions such as NVIDIA’s Tesla line. Yesterday NVIDIA’s Steve Scott spoke up to put the other case. If Intel’s point is that a Tesla is really a GPU pressed into service for general computing, then Scott’s first point is that the cores in MIC are really CPUs, albeit of an older, simpler design:

They don’t really have the equivalent of a throughput-optimized GPU core, but were able to go back to a 15+ year-old Pentium design to get a simpler processor core, and then marry it with a wide vector unit to get higher flops per watt than can be achieved by Xeon processors.

Scott then takes on Intel’s most compelling claim, compatibility with existing x86 code. It does not matter much, says Scott, since you will have to change your code anyway:

The reality is that there is no such thing as a “magic” compiler that will automatically parallelize your code. No future processor or system (from Intel, NVIDIA, or anyone else) is going to relieve today’s programmers from the hard work of preparing their applications for the future.

What is the real story here? It would, of course, be most interesting to compare the performance of MIC vs Tesla, or against the next generation of NVIDIA GPGPUs based on Kepler; and may the fastest and most power-efficient win. That will have to wait though; in the meantime we can see that Intel is not enjoying seeing the world’s supercomputers install NVIDIA GPGPUs – the Oak Ridge National Laboratory Jaguar/Titan (the most powerful supercomputer in the USA) being a high profile example:

In addition, 960 of Jaguar’s 18,688 compute nodes now contain an NVIDIA graphical processing unit (GPU). The GPUs were added to the system in anticipation of a much larger GPU installation later in the year.

Equally, NVIDIA may be rattled by the prospect of Intel offering strong competition for Tesla. It has not had a lot of competition in this space.

There is an ARM factor here too. When I spoke to Scott in Beijing, he hinted that NVIDIA would one day produce GPGPUs with ARM chips embedded for CPU duties, perhaps sharing the same memory.

NVIDIA releases CUDA Toolkit 4.1 with LLVM compiler

NVIDIA has released version 4.1 of its CUDA Toolkit for general purpose GPU computing.

image

There is a lot in this release, including a compiler based on LLVM, which will make it easier to support other programming languages; 1000 new imaging functions; and a re-designed visual profiler.

There is also an update to Parallel Nsight, for debugging and profiling CUDA applications in Visual Studio. This is free, though you have to register as an NVIDIA developer. You need this update to work with the 4.1 toolkit.
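
To give an idea of the sort of code the toolkit compiles and Parallel Nsight debugs, here is a minimal, hypothetical CUDA program (my own sketch, not one of NVIDIA’s samples): a vector add that nvcc builds and that you can step through on the GPU.

#include <cstdio>
#include <cuda_runtime.h>

// Each thread adds one pair of elements.
__global__ void vectorAdd(const float *a, const float *b, float *c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        c[i] = a[i] + b[i];
}

int main()
{
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);

    float *a, *b, *c;                 // device (GPU) buffers
    cudaMalloc(&a, bytes);
    cudaMalloc(&b, bytes);
    cudaMalloc(&c, bytes);
    cudaMemset(a, 0, bytes);          // dummy data, enough for a sketch
    cudaMemset(b, 0, bytes);

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    vectorAdd<<<blocks, threads>>>(a, b, c, n);
    cudaDeviceSynchronize();

    printf("kernel status: %s\n", cudaGetErrorString(cudaGetLastError()));

    cudaFree(a);
    cudaFree(b);
    cudaFree(c);
    return 0;
}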

You do have to update your graphics card driver:

image

using a new build which NVIDIA has not gotten around to signing:

image

Still, lots of goodies here and a must-have for developers wishing to put their NVIDIA GPU to work for more than just games.

ITWriting.com awards 2011: ten key happenings, from Nokia’s burning platform to HP’s nightmare year

2011 felt like a pivotal year in technology. What was pivoting? Well, users are pivoting away from networks and PCs and towards cloud and devices. The obvious loser is Microsoft, which owns PCs and networks but is a distant follower in devices and has mixed prospects in the cloud. Winners include Apple, Google, Amazon, and Android vendors. These trends have been obvious for some time, but in 2011 we saw dramatic evidence of their outcome. As 2011 draws to a close, here is my take on ten happenings, presented as the first ever ITWriting.com annual awards.

1. Most dramatic moment award: Nokia’s burning platform and alliance with Microsoft

In February Nokia’s Stephen Elop announced an alliance with Microsoft and a commitment to Windows Phone 7. In October we saw the first results in terms of product: the launch of the Lumia smartphone. It is a lovely phone, though with some launch imperfections such as too-short battery life. We also saw greatly improved marketing, following the dismal original Windows Phone 7 launch a year earlier. Enough? Early indications are not too good. Simply put, most users want iOS or Android, and the app ecosystem, which Elop stated as a primary reason for adopting Windows Phone, is not there yet. Both companies will need to make some smart moves in 2012 to fix these issues, if that is possible. But how much time does Nokia have?

2. Riskiest technology bet: Microsoft unveils Windows 8

In September 2011 Microsoft showed a preview of Windows 8 to developers at its BUILD conference in California. It represents a change of direction for the company, driven by competition from Apple and Android. On the plus side, the new runtime in Windows 8 is superb and this may prove to be the best mobile platform from a developer and technical perspective, though whether it can succeed in the market as a late entrant alongside iOS and Android is an open question. On the minus side, Windows 8 will not drive upgrades in the same way as Windows 7, since the company has chosen to invest mainly in creating a new platform. I expect much debate about the wisdom of this in 2012.

Incidentally, amidst all the debate about Windows 8 and Microsoft generally, it is worth noting that the other Windows 8, the server product, looks like being Microsoft’s best release for years.

3. Best cloud launch: Office 365

June 2011 saw the launch of Office 365, Microsoft’s hosted collaboration platform based on Exchange and SharePoint. It was not altogether new, since it is essentially an upgrade of the older BPOS suite. Microsoft is more obviously committed to this approach now though, and has built a product that has both the features and the price to appeal to a wide range of businesses, who want to move to the cloud but prefer the familiarity of Office and Exchange to the browser-based world of Google Apps. Bad news though for Microsoft partners who make lots of money nursing Small Business Server and the like.

4. Most interesting new cross-platform tool: Embarcadero Delphi for Windows, Mac and iOS

Developers, at least those who have still heard of Embarcadero’s rapid application development tool, were amazed by the new Delphi XE2, which lets you develop for Mac and Apple iOS as well as for Windows. This good news was tempered by the discovery that the tool was seemingly patched together in a bit of a hurry, and that most existing applications would need extensive rewriting. Nevertheless, an interesting new entrant in the world of cross-platform mobile tools.

5. Biggest tech surprise: Adobe shifts away from its Flash Platform

image

This one caught me by surprise. In November Adobe announced a shift in its business model away from Flash and away from enterprise development, in favour of HTML5, digital media and digital marketing. It also stated that Flash for mobile would no longer be developed once existing commitments were completed. The shift is not driven by poor financial results, but rather reflects the company’s belief that this will prove a better direction in the new world of cloud and device. Too soon and too sudden? Maybe 2012 will show the impact.

6. Intriguing new battle award: NVIDIA versus Intel as GPU computing catches on

In 2011 NVIDIA announced a number of wins in the supercomputing world as many of these huge machines adopted GPU computing, and I picked up something of a war of words with Intel over the merits of what NVIDIA calls heterogeneous computing. Intel is right to be worried, in that NVIDIA is seeing a future based on its GPUs combined with ARM CPUs. NVIDIA should worry too though, not only as Intel readies its “Knights Corner” MIC (Many Integrated Core) chips, but also as ARM advances its own Mali GPU; there is also strong competition in mobile GPUs from Imagination, used by Apple and others. The GPU wars will be interesting to watch in 2012.

7. Things that got worse award: Spotify. Runners up: Twitter, Google search

Sometimes internet services come along that are so good within their niche that they can only get worse. Spotify is an example, a music player that for a while let you play almost anything almost instantly with its simple, intuitive player. It is still pretty good, but Spotify got worse in 2011, with limited plays on free accounts, more intrusive ads, and sign-up now requiring a Facebook login. Twitter is another example, with URLs now transformed to t.co shortcuts whether you like it or not, and annoying promoted posts and recommended follows. Both services are desperately trying to build a viable business model on their popularity, so I have some sympathy. I have less sympathy for Google. I am not sure when it started making all its search results into Google links that record your click before redirecting you, but it is both annoying and slow, and I am having another go with Bing as a result.

8. Biggest threat to innovation: Crazy litigation from Lodsys, Microsoft, Apple

There has always been plenty of litigation in the IT world: Apple vs Microsoft regarding graphical user interfaces in 1994; Sun vs Microsoft regarding Java in 1997; SCO vs IBM regarding UNIX in 2003; and countless others. However, many of us thought that the biggest companies exercised restraint on the grounds that all have significant patent banks and trench warfare over patent breaches helps nobody but lawyers. But what if patent litigation is your business model? The name Lodsys sends a chill through any developer’s spine, since if you have an app that supports in-app purchases you may receive a letter from them, and your best option may be to settle, though others disagree. Along with Lodsys and the like, 2011 also brought Microsoft vs several OEMs over Android, Apple vs Samsung over Android, and much more.

9. Most horrible year award: HP

If any company had an Annus Horribilis it was HP. It invested big in WebOS, acquired with Palm; launched the TouchPad in July 2011; announced in August that it was ceasing WebOS development and considering selling off its Personal Systems Group; and fired its CEO Leo Apotheker in September 2011.

10. Product that deserves better award: Microsoft LightSwitch

On reflection maybe this award should go to Silverlight; but it is all part of the same story. Visual Studio LightSwitch, released in July 2011, is a model-driven development tool that generates Silverlight applications. It is nearly brilliant, and does a great job of making it relatively easy to construct business database applications, locally or on Windows Azure, complete with cross-platform Mac and Windows clients, and without having to write much code. Several things are unfortunate though. First, usual version 1.0 problems like poor documentation and odd limitations. Second, it is Silverlight, when Microsoft has made it clear that its future focus is HTML 5. Third, it is Windows and (with limitations) Mac, at a time when something which addresses the growing interest in mobile devices would be a great deal more interesting. Typical Microsoft own-goal: Windows Phone 7 runs Silverlight, LightSwitch generates Silverlight, but no, your app will not run on Windows Phone 7.  Last year I observed that Microsoft’s track-record on modelling in Visual Studio is to embrace in one release and extinguish in the next. History repeats?

On Supercomputers, China’s Tianhe-1A in particular, and why you should think twice before going to see one

I am just back from Beijing courtesy of Nvidia; I attended the GPU Technology conference and also got to see not one but two supercomputers:  Mole-8.5 in Beijing and Tianhe-1A in Tianjin, a coach ride away.

Mole-8.5 is currently at no. 21 and Tianhe-1A at no. 2 on the top 500 list of the world’s fastest supercomputers.

There was a reason Nvidia took journalists along, of course. Both are powered partly by Nvidia Tesla GPUs, and it is part of the company’s campaign to convince the world that GPUs are essential for supercomputing, because of their greater efficiency than CPUs. Intel says we should wait for its MIC (Many Integrated Core) CPU instead; but Nvidia has a point, and increasing numbers of supercomputers are plugging in thousands of Nvidia GPUs. That does not include the world’s current no. 1, Japan’s K Computer, but it will include the USA’s Titan, currently no. 3, which will add up to 18,000 GPUs in 2012 with plans that may take it to the top spot; we were told that it aims to be twice as fast as the K Computer.

Supercomputers are important. They excel at processing large amounts of data, so typical applications are climate research, biomedical research, simulations of all kinds used for design and engineering, energy modelling, and so on. These efforts are important to the human race, so you will never catch me saying that supercomputers are esoteric and of no interest to most of us.

That said, supercomputers are physically little different from any other datacenter: rows of racks. Here is a bit of Mole-8.5:

image 

and here is a bit of Tianhe-1A:

image

In some ways Tianhe-1A is more striking from outside.

image

If you are interested in datacenters, how they are cooled, how they are powered, how they are constructed, then you will enjoy a visit to a supercomputer. Otherwise you may find it disappointing, especially given that you can run an application on a supercomputer without any need to be there physically.

Of course there is still value in going to a supercomputing centre to talk to the people who run it and find out more about how the system is put together. Again, though, I should warn you that physically a supercomputer is repetitive. They achieve their mighty flop/s (floating point operations per second) counts by having lots and lots of processors (whether CPU or GPU) running in parallel. You can make a supercomputer faster by adding another cupboard with another set of racks with more boards with CPUs

image

or GPUs

image

and provided your design is right you will get more flop/s.

Yes there is more to it than that, and points of interest include the speed of the network, which is critical in order to support high performance, as well as the software that manages it. Take a look at the K Computer’s Tofu Interconnect. But the term “supercomputer” is a little misleading: we are talking about a network of nodes rather than a single amazing monolithic machine.

Personally I enjoyed the tours, though the visit to Tianhe-1A was among the more curious visits I have experienced. We visited along with a bunch of Nvidia executives. The execs sat along one side of a conference table, the Chinese hosts along the other side, and they engaged in a diplomatic exercise of being very polite to each other while the journalists milled around the room.

image

We did get a tour of Tianhe-1A but unfortunately little chance to talk to the people involved, though we did have a short group interview with the project director, Liu Guangming.

image

He gave us short, guarded but precise answers, speaking through an interpreter. We asked about funding. “The way things work here is different from how it works in the USA,” he said. “The government supports us a lot; the building and infrastructure, all the machines, are all paid for by the government. The government also pays for the operational cost.” Nevertheless, users are charged for their time on Tianhe-1A, but this is to promote efficiency. “If users pay they use the system more efficiently, that is the reason for the charge,” he said. However, the users also get their funding from the government’s research budget.

Downplayed on the slides, but mentioned here, is the fact that the supercomputer was developed by the “National team of defence technology.” Food for thought.

We also asked about the usage of the GPU nodes as opposed to the CPU nodes, having noticed that many of the applications presented in the briefing were CPU-only. “The GPU stage is somewhat experimental,” he said, though he is “seeing increasing use of the GPU, and such a heterogeneous system should be the future of HPC [High Performance Computing].” Some applications do use the GPU and the results have been good. Overall the system has 60-70% sustained utilisation.

Another key topic: might China develop its own GPU? Tianhe-1A already includes 2048 China-designed “Galaxy FT” CPUs, alongside 14,336 Intel CPUs and 7168 NVIDIA GPUs.

We already have the technology, said Liu.

From 2005-7 we designed a chip, a stream processor similar to a GPU. But the peak performance was not that good. We tried AMD GPUs, but they do not have ECC [error-correcting code memory], so that is why we went to NVIDIA. China does have the technology to make GPUs. Also the technology is growing, but what we implement is a commercial decision.

Liu Guangming closed with a short speech.

Many of the people from outside China might think that China’s HPC experienced explosive development last year. But China has been involved in HPC for 20 years. Next, the Chinese government is highly committed to HPC. Third, the economy is growing fast and we see the demand for HPC. These factors have produced the explosive growth you witnessed.

The Tianjin Supercomputer is open and you are welcome to visit.

NVIDIA plans to merge CPU and GPU – eventually

I spoke to Dr Steve Scott, NVIDIA’s CTO for Tesla, at the end of the GPU Technology Conference which has just finished here in Beijing. In the closing session, Scott talked about the future of NVIDIA’s GPU computing chips. NVIDIA releases a new generation of graphics chips every two years:

  • 2008 Tesla
  • 2010 Fermi
  • 2012 Kepler
  • 2014 Maxwell

Yes, it is confusing that the Tesla brand, meaning cards for GPU computing, has persisted even though the Tesla architecture itself is now obsolete.

image
Dr Steve Scott showing off the power efficiency of GPU computing

Scott talked a little about a topic that interests me: the convergence or integration of the GPU and the CPU. The background here is that while the GPU is fast and efficient for parallel number-crunching, it is of course still necessary to have a CPU, and there is a price to pay for the communication between the two. The GPU and the CPU each have their own memory, so data must be copied back and forth, which is an expensive operation.
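
In today’s model that cost is explicit in the code. The sketch below (a hypothetical example of my own, not NVIDIA’s) shows the round trip Scott is describing: allocate a separate device buffer, copy the data across to the GPU, run the kernel, and copy the result back.

#include <vector>
#include <cuda_runtime.h>

// Trivial kernel: scale every element in place on the GPU.
__global__ void scale(float *data, float factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        data[i] *= factor;
}

int main()
{
    const int n = 1 << 20;
    std::vector<float> host(n, 1.0f);          // CPU memory

    float *device = 0;                         // separate GPU memory
    cudaMalloc(&device, n * sizeof(float));

    // Copy in: CPU -> GPU, the first half of the expense.
    cudaMemcpy(device, &host[0], n * sizeof(float), cudaMemcpyHostToDevice);

    scale<<<(n + 255) / 256, 256>>>(device, 2.0f, n);

    // Copy out: GPU -> CPU, the second half. Shared memory, with a single
    // pointer valid on both sides, would remove both transfers.
    cudaMemcpy(&host[0], device, n * sizeof(float), cudaMemcpyDeviceToHost);

    cudaFree(device);
    return 0;
}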

One solution is for GPU and CPU to share memory, so that a single pointer is valid on both. I asked CEO Jen-Hsun Huang about this, and he did not give much hope:

We think that today it is far better to have a wonderful CPU with its own dedicated cache and dedicated memory, and a dedicated GPU with a very fast frame buffer, very fast local memory, that combination is a pretty good model, and then we’ll work towards making the programmer’s view and the programmer’s perspective easier and easier.

Scott on the other hand was more forthcoming about future plans. Kepler, which is expected in the first half of 2012, will bring some changes to the CUDA architecture which will “broaden the applicability of GPU programming, tighten the integration of the CPU and GPU, and enhance programmability,” to quote Scott’s slides. This integration will include some limited sharing of memory between GPU and CPU, he said.

What caught my interest though was when he remarked that at some future date NVIDIA will probably build CPU functionality into the GPU. The form that might take, he said, is that the GPU will have a couple of cores that do the CPU functions. This will likely be an implementation of the ARM CPU.

Note that this is not promised for Kepler nor even for Maxwell but was thrown out as a general statement of direction.

There are a couple of further implications. One is that NVIDIA plans to reduce its dependence on Intel. ARM is a better partner, Scott told me, because its designs can be licensed by anyone. It is not surprising then that Intel’s multi-core evangelist James Reinders was dismissive when I asked him about NVIDIA’s claim that the GPU is far more power-efficient than the CPU. Reinders says that the forthcoming MIC (Many Integrated Core) processors codenamed Knights Corner are a better solution, referring to the:

… substantial advantages that the Intel MIC architecture has over GPGPU solutions that will allow it to have the power efficiency we all want for highly parallel workloads, but able to run an enormous volume of code that will never run on GPGPUs (and every algorithm that can run on GPGPUs will certainly be able to run on a MIC co-processor).

In other words, Intel foresees a future without the need for NVIDIA, at least in terms of general-purpose GPU programming, just as NVIDIA foresees a future without the need for Intel.

Incidentally, Scott told me that he left Cray for NVIDIA because of his belief in the superior power efficiency of GPUs. He also described how the Titan supercomputer operated by the Oak Ridge National Laboratory in the USA will be upgraded from its current CPU-only design to incorporate thousands of NVIDIA GPUs, with the intention of achieving twice the speed of Japan’s K computer, currently the world’s fastest.

This whole debate also has implications for Microsoft and Windows. Huang says he is looking forward to Windows on ARM, which makes sense given NVIDIA’s future plans. That said, the impression I get from Microsoft is that Windows on ARM is not intended to be the same as Windows on x86 save for the change of processor. My impression is that Windows on ARM is Microsoft’s iOS, a locked-down operating system that will be safer for users and more profitable for Microsoft as app sales are channelled through its store. That is all very well, but it suggests that we will still need x86 Windows if only to retain open access to the operating system.

Another interesting question is what will happen to Microsoft Office on ARM. It may be that x86 Windows will still be required for the full features of Office.

This means we cannot assume that Windows on ARM will be an instant hit; much is uncertain.

3D games: gimmick or the next generation?

I’m attending NVIDIA’s GPU technology conference, and at the exhibition here I took the opportunity to view some 3D images on some Lenovo (and of course NVIDIA) kit.

image

I was impressed; yes you have to wear the special specs, but the results are superb. The images are more immersive and more realistic, and I can see the appeal.

I am still not sure though whether 3D games will take off. The screens are substantially more expensive, the specs are inconvenient, and there are not many games.

We have also seen how the 3D support in Nintendo’s 3DS was insufficient to generate much momentum.

The question then is whether 3D gaming will ever be mainstream. Looking at a high quality 3D display makes you think that it must catch on eventually; but it has a lot stacked against it.

NVIDIA CEO Jen-Hsun Huang beats the drum for GPU computing

In his keynote at the GPU Technology Conference here in Beijing, NVIDIA CEO Jen-Hsun Huang presented the simple logic of GPU computing. The main constraint on computing is power consumption, he said:

Power is now the limiter of every computing platform, from cellphones to PCs and even datacenters.

CPUs are optimized for single-threaded computing and are relatively inefficient. According to Huang, a CPU spends 50 times as much power scheduling instructions as it does executing them. A GPU, by contrast, is formed of many simple processors and is optimized for parallel processing, making it far more power-efficient when measured in FLOP/s (floating point operations per second) per watt. Therefore it is inevitable that computers make use of GPU computing in order to achieve the best performance. Note that this does not mean dispensing with the CPU, but rather handing off processing to the GPU when appropriate.

This point is now accepted in the world of supercomputers. The computer at the Chinese National Supercomputing Center in Tianjin has 14,336 Intel CPUs, 7168 Nvidia Tesla GPUs, and 2048 custom-designed 8-core CPUs called Galaxy FT-1000, and can achieve 4.7 Petaflop/s for a power consumption of 4.04 megawatts (million watts), as presented this morning by the center’s Vice Director Xiaoquian Zhu. This is currently the 2nd fastest supercomputer in the world.

Huang says that without GPUs the world would wait until 2035 for the first Exascale (1 Exaflop/s) supercomputer, presuming a power constraint of 20MW and current levels of performance improvement year by year, whereas by combining CPUs with GPUs this can be achieved in 2019.
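
A back-of-envelope check, using only the figures quoted above, shows the size of the gap Huang is talking about: an Exaflop/s within 20MW needs around 50 GFlop/s per watt, while Tianhe-1A today delivers roughly 1.2.

#include <cstdio>

int main()
{
    const double tianhe_flops = 4.7e15;   // 4.7 Petaflop/s
    const double tianhe_watts = 4.04e6;   // 4.04 MW
    const double exa_flops    = 1e18;     // 1 Exaflop/s
    const double exa_watts    = 20e6;     // 20 MW budget

    const double today  = tianhe_flops / tianhe_watts;  // ~1.16 GFlop/s per watt
    const double needed = exa_flops / exa_watts;        // 50 GFlop/s per watt

    printf("Today:  %.2f GFlop/s per watt\n", today / 1e9);
    printf("Needed: %.2f GFlop/s per watt (about %.0f times better)\n",
           needed / 1e9, needed / today);
    return 0;
}

That forty-odd-fold efficiency gap is what the argument about CPU versus GPU (and MIC) power efficiency is really about.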

Supercomputing is only half of the GPU computing story. More interesting for most users is the way this technology trickles down to the kind of computers we actually use. For example, today Lenovo announced several workstations which use NVIDIA’s Maximus technology to combine a GPU designed primarily for driving a display (Quadro) with a GPU designed primarily for GPU computing (Tesla). These workstations are aimed at design professionals, for whom the ability to render detailed designs quickly is important. The image below shows a Lenovo S20 on display here. Maybe these are not quite everyday computers, but they are still PCs. Approximate price to follow soon when I have had a chance to ask Lenovo. Update: prices start at around $4500 for an S20, with most of the cost being for the Tesla board.

image