Tag Archives: intel

Programming NVIDIA GPUs and Intel MIC with directives: OpenACC vs OpenMP

Last month I was at Intel’s software conference learning about Many Integrated Core (MIC), the company’s forthcoming accelerator card for HPC (High Performance Computing). This month I am in San Jose for NVIDIA’s GPU Technology Conference, learning about the latest developments in NVIDIA’s platform for accelerated massively parallel computing using GPU cards and the CUDA architecture. The approaches taken by NVIDIA and Intel have much in common – focus on power efficiency, many cores, accelerator boards with independent memory space controlled by the CPU – but also major differences. Intel’s boards have familiar x86 processors, whereas NVIDIA’s have GPUs, which require developers to learn CUDA C or an equivalent such as OpenCL.

To simplify this, NVIDIA and partners Cray, CAPS and PGI announced OpenACC last year: a set of directives which, when added to C/C++ code, instruct the compiler to parallelise that code on the GPU, or potentially on other accelerators such as Intel MIC. The OpenACC folk have stated from the outset their hope and intention that OpenACC will converge with OpenMP, an existing standard for directives enabling shared memory parallelisation. OpenMP as it stands is not suitable for accelerators, since these have their own memory space.

One thing that puzzled me though: Intel clearly stated at last month’s event that it would support OpenMP (not OpenACC) on MIC, due to go into production at the end of this year or early next. How can this be?

I took the opportunity here at NVIDIA’s conference to ask Duncan Poole, who is NVIDIA’s Senior Manager for High Performance Computing and also the President of OpenACC, about what is happening with these two standards. How can Intel implement OpenMP on MIC, if it is not suitable for accelerators?

“I think OpenMP in the form that’s being discussed inside of the sub-committee is suitable. There’s some debate about some of the specific features that continues. Also, in the OpenMP committee they’re trying to address the concerns of TI and IBM so it’s a broader discussion than just the Intel architecture. So OpenMP will be useful on this class of processor. What we needed to do is not wait for it. That standard, if we’re lucky it will be draft at the end of this year, and maybe a year later will be ratified. We want to unify this developer base now,” Poole told me.

How similar will this adapted OpenMP be to what OpenACC is now?

“It’s got the potential to be quite close. The guy that drafted OpenACC is the head of that sub-committee. There’ll probably be changes in keywords, but there’s also some things being proposed now that were not conceived of. So there’s good debate going on, and I expect that we’ll benefit from it.

“Some of the features for example that are shared by Kepler and MIC with respect to nested parallelism are very useful. Nested parallelism did not exist at the time that we started this work. So there’ll be an evolution that will happen and probably a logical convergence over time.

If OpenMP is not set to support accelerators until two years hence, what can Intel be doing with it?

“It will be a vendor implementation of a pre-release standard. Something like that,” said Poole, emphasising that he cannot speak for Intel. “To be complementary to Intel, they have some good ideas and it’s a good debate right now.”

Incidentally, I also asked Intel about OpenACC last month, and was told that the company has no plans to implement it on its compilers. OpenMP is the standard it supports.

The topic is significant, in that if a standard set of directives is supported across both Intel and NVIDIA’s HPC platforms, developers can easily port code from one to the other. You can do this today with OpenCL, but converting an application to use OpenCL to enhance performance is a great deal more effort than adding directives.

Windows 8 to be called Windows 8, no Outlook on ARM

Microsoft has announced the range of editions planned for Windows 8, which is now the official name (previously it was a code name).

Here is what I found interesting. Windows on ARM (WOA) is now called Windows RT and ships with Office included. However, Outlook is not included, confirming my suspicion that Outlook may gradually get de-emphasised in favour of separate email, calendar and task managers built into the operating system but with strong Exchange support – a good move since Outlook is perhaps the most confusing and over-complex application that Microsoft ships.

Windows RT is missing some features which are in the Intel versions, not least the ability to install desktop software, but has a unique feature of its own: device encryption.

I consider Windows RT critical to the success of the Windows 8 project, and the only edition that may compete effectively with the Apple iPad in terms of price, convenience, battery life and usability. That said, the market will see the Intel version as primary, since it is the one that can run all our existing apps, but all the legacy baggage will also weigh it down. Users will suffer the disjunction between Metro and Desktop, and will need a mouse or stylus and keyboard to use desktop applications. The danger is that Windows RT will get lost in the noise.

Multicore processor wars: NVIDIA squares up to Intel

I first became aware of NVIDIA’s propaganda war against Intel at the 2011 GPU Technology Conference in Beijing. CEO Jen-Hsun Huang stated that CPUs are remarkably inefficient for multicore processing:

The CPU is fast and is terrific at single-threaded performance, but because so much of the electronics inside the CPU is dedicated to out of order execution, branch prediction, speculative execution, all of the technology that has gone into sustaining instruction throughput and making the CPU faster at single-threaded applications, the electronics necessary to enable it to do that has grown tremendously. With four cores, in order to execute an operation, a floating point add or a floating point multiply, 50 times more energy is dedicated to the scheduling of that operation than the operation itself. If you look at the silicon of a CPU, the floating point unit is only a few per cent of the overall die, and it is consistent with the usage of the energy to sequence, to schedule the instructions running complicated programs.

That figure of 50 times surprised me, and I asked Intel’s James Reinders for a comment. He was quick to respond, noting that:

50X is ridiculous if it encourages you to believe that there is an alternative which is 50X better.  The argument he makes, for a power-efficient approach for parallel processing, is worth about 2X (give or take a little). The best example of this, it turns out, is the Intel MIC [Many Integrated Core] architecture.

Reinders went on to say:

Knights Corner is superior to any GPGPU type solution for two reasons: (1) we don’t have the extra power-sucking silicon wasted on graphics functionality when all we want to do is compute in a power efficient manner, and (2) we can dedicate our design to being highly programmable because we aren’t a GPU (we’re an x86 core – a Pentium-like core for “in order” power efficiency). These two turn out to be substantial advantages that the Intel MIC architecture has over GPGPU solutions that will allow it to have the power efficiency we all want for highly parallel workloads, but able to run an enormous volume of code that will never run on GPGPUs (and every algorithm that can run on GPGPUs will certainly be able to run on a MIC co-processor).

So Intel is evangelising its MIC vs GPGPU solutions such as NVIDIA’s Tesla line. Yesterday NVIDIA’s Steve Scott spoke up to put the other case. If Intel’s point is that a Tesla is really a GPU pressed into service for general computing, then Scott’s first point is that the cores in MIC are really CPUs, albeit of an older, simpler design:

They don’t really have the equivalent of a throughput-optimized GPU core, but were able to go back to a 15+ year-old Pentium design to get a simpler processor core, and then marry it with a wide vector unit to get higher flops per watt than can be achieved by Xeon processors.

Scott then takes on Intel’s most compelling claim, compatibility with existing x86 code. It does not matter much, says Scott, since you will have to change your code anyway:

The reality is that there is no such thing as a “magic” compiler that will automatically parallelize your code. No future processor or system (from Intel, NVIDIA, or anyone else) is going to relieve today’s programmers from the hard work of preparing their applications for the future.

What is the real story here? It would, of course, be most interesting to compare the performance of MIC vs Tesla, or against the next generation of NVIDIA GPGPUs based on Kepler; and may the fastest and most power-efficient win. That will have to wait though; in the meantime we can see that Intel is not enjoying seeing the world’s supercomputers install NVIDIA GPGPUs – the Oak Ridge National Laboratory Jaguar/Titan (the most powerful supercomputer in the USA) being a high profile example:

In addition, 960 of Jaguar’s 18,688 compute nodes now contain an NVIDIA graphical processing unit (GPU). The GPUs were added to the system in anticipation of a much larger GPU installation later in the year.

Equally, NVIDIA may be rattled by the prospect of Intel offering strong competition for Tesla. It has not had a lot of competition in this space.

There is an ARM factor here too. When I spoke to Scott in Beijing, he hinted that NVIDIA would one day produce GPGPUs with ARM chips embedded for CPU duties, perhaps sharing the same memory.

Just three Windows 8 on ARM tablets at launch? Not good for Microsoft


Bloomberg reports unknown sources stating that only three Windows on ARM (WOA) tablets will be available at launch:

There will be fewer ARM-based devices in the rollout because Microsoft has tightly controlled the number and set rigorous quality-control standards, said one of the people. The new version of Windows will be the first to use ARM processors, which are most commonly found in smartphones. Windows 7, the current version, only works with Intel’s technology. Three of the Windows 8 ARM devices will be tablets, the people said.

This may be nonsense but I can see this playing out badly for Microsoft. I am making several assumptions here:

1. The design of Windows 8 is all about tablets. If it fails on tablets, then it has failed.

2. Windows 8 Intel tablets will not compete with the Apple iPad and will probably not do well. The main reason is the old one: the Windows desktop is mostly unusable with touch alone. You can get it to work, but it is not much fun, and that will not change. Supplementary reasons: Intel CPUs are less power-efficient than ARM, which means shorter battery life; traditional Windows applications expect lots of disk space and RAM; and OEMs will want to pre-install anti-malware and other foistware, repeating the mistakes of the past that are driving users with relief towards iPads.

I can also imagine Windows 8 Intel tablets being sold with add-on styluses and keyboards that are necessary to operate desktop applications, but a nuisance in all sorts of ways.

3. Windows on ARM has more potential to be a compelling iPad alternative. Metro-style apps are designed for tablets and will work well with touch alone. ARM devices may be lightweight and with long battery life. The locked-down Windows Store is some protection against excessive OEM interference. With Microsoft Office compatibility thrown in, these might appeal to a business user who would otherwise buy an iPad.

Despite the above, my guess is that Microsoft’s OEM partners will instinctively put most of their effort into Windows 8 on Intel tablets, because that is the way it has always been, and because of an assumption that someone buying a Windows 8 device will want to run Windows applications, and not just Metro-style apps.

The problem is that such people will try Windows 8 on Intel tablets, hate them because of the reasons in (2) above, and end up buying iPads anyway.

The counter argument? That Apple conquered the tablet market with just one model, so perhaps three is more than enough.

ITWriting.com awards 2011: ten key happenings, from Nokia’s burning platform to HP’s nightmare year

2011 felt like a pivotal year in technology. What was pivoting? Well, users are pivoting away from networks and PCs and towards cloud and devices. The obvious loser is Microsoft, which owns PCs and networks but is a distant follower in devices and has mixed prospects in the cloud. Winners include Apple, Google, Amazon, and Android vendors. These trends have been obvious for some time, but in 2011 we saw dramatic evidence of their outcome. As 2011 draws to a close, here is my take on ten happenings, presented as the first ever ITWriting.com annual awards.

1. Most dramatic moment award: Nokia’s burning platform and alliance with Microsoft

In February Nokia’s Stephen Elop announced an alliance with Microsoft and a commitment to Windows Phone 7. In October we saw the first results in terms of product: the launch of the Lumia smartphone. It is a lovely phone, though with some launch imperfections such as short battery life. We also saw greatly improved marketing, following the dismal original Windows Phone 7 launch a year earlier. Enough? Early indications are not too good. Simply put, most users want iOS or Android, and the app ecosystem, which Elop cited as a primary reason for adopting Windows Phone, is not there yet. Both companies will need to make some smart moves in 2012 to fix these issues, if that is possible. But how much time does Nokia have?

2. Riskiest technology bet: Microsoft unveils Windows 8

In September 2011 Microsoft showed a preview of Windows 8 to developers at its BUILD conference in California. It represents a change of direction for the company, driven by competition from Apple and Android. On the plus side, the new runtime in Windows 8 is superb and this may prove to be the best mobile platform from a developer and technical perspective, though whether it can succeed in the market as a late entrant alongside iOS and Android is an open question. On the minus side, Windows 8 will not drive upgrades in the same way as Windows 7, since the company has chosen to invest mainly in creating a new platform. I expect much debate about the wisdom of this in 2012.

Incidentally, amidst all the debate about Windows 8 and Microsoft generally, it is worth noting that the other Windows 8, the server product, looks like being Microsoft’s best release for years.

3. Best cloud launch: Office 365

June 2011 saw the launch of Office 365, Microsoft’s hosted collaboration platform based on Exchange and SharePoint. It was not altogether new, since it is essentially an upgrade of the older BPOS suite. Microsoft is more obviously committed to this approach now though, and has built a product that has both the features and the price to appeal to a wide range of businesses that want to move to the cloud but prefer the familiarity of Office and Exchange to the browser-based world of Google Apps. Bad news though for Microsoft partners who make lots of money nursing Small Business Server and the like.

4. Most interesting new cross-platform tool: Embarcadero Delphi for Windows, Mac and iOS

Developers, at least those who have still heard of Embarcadero’s rapid application development tool, were amazed by the new Delphi XE2 which lets you develop for Mac and Apple iOS as well as for Windows. This good news was tempered by the discovery that the tool was seemingly patched together in a bit of a hurry, and that most existing applications would need extensive rewriting. Nevertheless, an interesting new entrant in the world of cross-platform mobile tools.

5. Biggest tech surprise: Adobe shifts away from its Flash Platform


This one caught me by surprise. In November Adobe announced a shift in its business model away from Flash and away from enterprise development, in favour of HTML5, digital media and digital marketing. It also stated that Flash for mobile would no longer be developed once existing commitments were completed. The shift is not driven by poor financial results, but rather reflects the company’s belief that this will prove a better direction in the new world of cloud and device. Too soon and too sudden? Maybe 2012 will show the impact.

6. Intriguing new battle award: NVIDIA versus Intel as GPU computing catches on

In 2011 NVIDIA announced a number of wins in the supercomputing world as many of these huge machines adopted GPU computing, and I picked up something of a war of words with Intel over the merits of what NVIDIA calls heterogeneous computing. Intel is right to be worried, in that NVIDIA is seeing a future based on its GPUs combined with ARM CPUs. NVIDIA should worry too though, not only as Intel readies its “Knights Corner” MIC (Many Integrated Core) chips, but also as ARM advances its own Mali GPU; there is also strong competition in mobile GPUs from Imagination, used by Apple and others. The GPU wars will be interesting to watch in 2012.

7. Things that got worse award: Spotify. Runners up: Twitter, Google search

Sometimes internet services come along that are so good within their niche that they can only get worse. Spotify is an example, a music player that for a while let you play almost anything almost instantly with its simple, intuitive player. It is still pretty good, but Spotify got worse in 2011, with limited plays on the free account, more intrusive ads, and sign-up now requiring a Facebook login. Twitter is another example, with URLs now transformed to t.co shortcuts whether you like it or not, and annoying promoted posts and recommended follows. Both services are desperately trying to build a viable business model on their popularity, so I have some sympathy. I have less sympathy for Google. I am not sure when it started making all its search results into Google links that record your click before redirecting you, but it is both annoying and slow, and I am having another go with Bing as a result.

8. Biggest threat to innovation: Crazy litigation from Lodsys, Microsoft, Apple

There has always been plenty of litigation in the IT world: Apple vs Microsoft regarding graphical user interfaces in 1994; Sun vs Microsoft regarding Java in 1997; SCO vs IBM regarding UNIX in 2003; and countless others. However, many of us thought that the biggest companies exercised restraint, on the grounds that all have significant patent banks and trench warfare over patent breaches helps nobody but lawyers. But what if patent litigation is your business model? The name Lodsys sends a chill through any developer’s spine, since if you have an app that supports in-app purchases you may receive a letter from them, and your best option may be to settle, though others disagree. Along with Lodsys and the like, 2011 also brought Microsoft vs several OEMs over Android, Apple vs Samsung over Android, and much more.

9. Most horrible year award: HP

If any company had an Annus Horribilis it was HP. It invested big in WebOS, acquired with Palm; launched the TouchPad in July 2011; announced in August that it was ceasing WebOS development and considering selling off its Personal Systems Group; and fired its CEO Leo Apotheker in September 2011.

10. Product that deserves better award: Microsoft LightSwitch

On reflection maybe this award should go to Silverlight; but it is all part of the same story. Visual Studio LightSwitch, released in July 2011, is a model-driven development tool that generates Silverlight applications. It is nearly brilliant, and does a great job of making it relatively easy to construct business database applications, locally or on Windows Azure, complete with cross-platform Mac and Windows clients, and without having to write much code. Several things are unfortunate though. First, usual version 1.0 problems like poor documentation and odd limitations. Second, it is Silverlight, when Microsoft has made it clear that its future focus is HTML 5. Third, it is Windows and (with limitations) Mac, at a time when something which addresses the growing interest in mobile devices would be a great deal more interesting. Typical Microsoft own-goal: Windows Phone 7 runs Silverlight, LightSwitch generates Silverlight, but no, your app will not run on Windows Phone 7.  Last year I observed that Microsoft’s track-record on modelling in Visual Studio is to embrace in one release and extinguish in the next. History repeats?

On Supercomputers, China’s Tianhe-1A in particular, and why you should think twice before going to see one

I am just back from Beijing courtesy of Nvidia; I attended the GPU Technology conference and also got to see not one but two supercomputers:  Mole-8.5 in Beijing and Tianhe-1A in Tianjin, a coach ride away.

Mole-8.5 is currently at no. 21 and Tianhe-1A at no. 2 on the top 500 list of the world’s fastest supercomputers.

There was a reason Nvidia took journalists along, of course. Both are powered partly by Nvidia Tesla GPUs, and it is part of the company’s campaign to convince the world that GPUs are essential for supercomputing, because of their greater efficiency than CPUs. Intel says we should wait for its MIC (Many Integrated Core) CPU instead; but Nvidia has a point, and increasing numbers of supercomputers are plugging in thousands of Nvidia GPUs. That does not include the world’s current no. 1, Japan’s K Computer, but it will include the USA’s Titan, currently no. 3, which will add up to 18,000 GPUs in 2012, with plans that may take it to the top spot; we were told that it aims to be twice as fast as the K Computer.

Supercomputers are important. They excel at processing large amounts of data, so typical applications are climate research, biomedical research, simulations of all kinds used for design and engineering, energy modelling, and so on. These efforts are important to the human race, so you will never catch me saying that supercomputers are esoteric and of no interest to most of us.

That said, supercomputers are physically little different from any other datacenter: rows of racks. Here is a bit of Mole-8.5:


and here is a bit of Tianhe-1A:


In some ways Tianhe-1A is more striking from outside.


If you are interested in datacenters, how they are cooled, how they are powered, how they are constructed, then you will enjoy a visit to a supercomputer. Otherwise you may find it disappointing, especially given that you can run an application on a supercomputer without any need to be there physically.

Of course there is still value in going to a supercomputing centre to talk to the people who run it and find out more about how the system is put together. Again though I should warn you that physically a supercomputer is repetitive. They achieve their mighty flop/s (floating point operations per second) counts by having lots and lots of processors (whether CPU or GPU) running in parallel. You can make a supercomputer faster by adding another cupboard with another set of racks with more boards with CPUs


or GPUs


and provided your design is right you will get more flop/s.

Yes there is more to it than that, and points of interest include the speed of the network, which is critical in order to support high performance, as well as the software that manages it. Take a look at the K Computer’s Tofu Interconnect. But the term “supercomputer” is a little misleading: we are talking about a network of nodes rather than a single amazing monolithic machine.

Personally I enjoyed the tours, though the visit to Tianhe-1A was among the more curious visits I have experienced. We visited along with a bunch of Nvidia executives. The execs sat along one side of a conference table, the Chinese hosts along the other side, and they engaged in a diplomatic exercise of being very polite to each other while the journalists milled around the room.


We did get a tour of Tianhe-1A but unfortunately little chance to talk to the people involved, though we did have a short group interview with the project director, Liu Guangming.


He gave us short, guarded but precise answers, speaking through an interpreter. We asked about funding. “The way things work here is different from how it works in the USA,” he said, “The government supports us a lot, the building and infrastructure, all the machines, are all paid for by the government. The government also pays for the operational cost.” Nevertheless, users are charged for their time on Tianhe-1A, but this is to promote efficiency. “If users pay they use the system more efficiently, that is the reason for the charge,” he said. However, the users also get their funding from the government’s research budget.

Downplayed on the slides, but mentioned here, is the fact that the supercomputer was developed by the National University of Defense Technology. Food for thought.

We also asked about the usage of the GPU nodes as opposed to the CPU nodes, having noticed that many of the applications presented in the briefing were CPU-only. “The GPU stage is somewhat experimental,” he said, though he is “seeing increasing use of the GPU, and such a heterogeneous system should be the future of HPC [High Performance Computing].” Some applications do use the GPU and the results have been good. Overall the system has 60-70% sustained utilisation.

Another key topic: might China develop its own GPU? Tianhe-1A already includes 2,048 Chinese-designed “Galaxy FT” CPUs, alongside 14,336 Intel CPUs and 7,168 NVIDIA GPUs.

We already have the technology, said Guangming.

From 2005 to 2007 we designed a chip, a stream processor similar to a GPU. But the peak performance was not that good. We tried AMD GPUs, but they do not have ECC [Error Correcting Code memory], so that is why we went to NVIDIA. China does have the technology to make GPUs. Also the technology is growing, but what we implement is a commercial decision.

Liu Guangming closed with a short speech.

Many of the people from outside China might think that China’s HPC experienced explosive development last year. But China has been involved in HPC for 20 years. Next, the Chinese government is highly committed to HPC. Third, the economy is growing fast and we see the demand for HPC. These factors have produced the explosive growth you witnessed.

The Tianjin Supercomputer is open and you are welcome to visit.

NVIDIA plans to merge CPU and GPU – eventually

I spoke to Dr Steve Scott, NVIDIA’s CTO for Tesla, at the end of the GPU Technology Conference which has just finished here in Beijing. In the closing session, Scott talked about the future of NVIDIA’s GPU computing chips. NVIDIA releases a new generation of graphics chips every two years:

  • 2008 Tesla
  • 2010 Fermi
  • 2012 Kepler
  • 2014 Maxwell

Yes, it is confusing that the Tesla brand, meaning cards for GPU computing, has persisted even though the Tesla family is now obsolete.

Dr Steve Scott showing off the power efficiency of GPU computing

Scott talked a little about a topic that interests me: the convergence or integration of the GPU and the CPU. The background here is that while the GPU is fast and efficient for parallel number-crunching, it is of course still necessary to have a CPU, and there is a price to pay for the communication between the two. The GPU and the CPU each have their own memory, so data must be copied back and forth, which is an expensive operation.

One solution is for GPU and CPU to share memory, so that a single pointer is valid on both. I asked CEO Jen-Hsun Huang about this, and he did not hold out much hope:

We think that today it is far better to have a wonderful CPU with its own dedicated cache and dedicated memory, and a dedicated GPU with a very fast frame buffer, very fast local memory, that combination is a pretty good model, and then we’ll work towards making the programmer’s view and the programmer’s perspective easier and easier.

Scott on the other hand was more forthcoming about future plans. Kepler, which is expected in the first half of 2012, will bring some changes to the CUDA architecture which will “broaden the applicability of GPU programming, tighten the integration of the CPU and GPU, and enhance programmability,” to quote Scott’s slides. This integration will include some limited sharing of memory between GPU and CPU, he said.

What caught my interest though was when he remarked that at some future date NVIDIA will probably build CPU functionality into the GPU. The form that might take, he said, is that the GPU will have a couple of cores that do the CPU functions. This will likely be an implementation of the ARM CPU.

Note that this is not promised for Kepler nor even for Maxwell but was thrown out as a general statement of direction.

There are a couple of further implications. One is that NVIDIA plans to reduce its dependence on Intel. ARM is a better partner, Scott told me, because its designs can be licensed by anyone. It is not surprising then that Intel’s multi-core evangelist James Reinders was dismissive when I asked him about NVIDIA’s claim that the GPU is far more power-efficient than the CPU. Reinders says that the forthcoming MIC (Many Integrated Core) processors codenamed Knights Corner are a better solution, referring to the:

… substantial advantages that the Intel MIC architecture has over GPGPU solutions that will allow it to have the power efficiency we all want for highly parallel workloads, but able to run an enormous volume of code that will never run on GPGPUs (and every algorithm that can run on GPGPUs will certainly be able to run on a MIC co-processor).

In other words, Intel foresees a future without the need for NVIDIA, at least in terms of general-purpose GPU programming, just as NVIDIA foresees a future without the need for Intel.

Incidentally, Scott told me that he left Cray for NVIDIA because of his belief in the superior power efficiency of GPUs. He also described how the Titan supercomputer operated by the Oak Ridge National Laboratory in the USA will be upgraded from its current CPU-only design to incorporate thousands of NVIDIA GPUs, with the intention of achieving twice the speed of Japan’s K computer, currently the world’s fastest.

This whole debate also has implications for Microsoft and Windows. Huang says he is looking forward to Windows on ARM, which makes sense given NVIDIA’s future plans. That said, the impression I get from Microsoft is that Windows on ARM is not intended to be the same as Windows on x86 save for the change of processor. My impression is that Windows on ARM is Microsoft’s iOS, a locked-down operating system that will be safer for users and more profitable for Microsoft as app sales are channelled through its store. That is all very well, but suggests that we will still need x86 Windows if only to retain open access to the operating system.

Another interesting question is what will happen to Microsoft Office on ARM. It may be that x86 Windows will still be required for the full features of Office.

This means we cannot assume that Windows on ARM will be an instant hit; much is uncertain.

Fixing a Windows 7 blue screen with Driver Verifier

A recent annoyance was a blue screen when I was in the middle of typing a Word document. “Memory management” it said.

You might think faulty RAM, but I did not think so as I had tested it extensively with the excellent Memtest86. So what was causing it? And no, I do not regard Windows as an unstable operating system, not any more (not really since Windows 98 days).

I started troubleshooting. The first step is to install the Debugging Tools for Windows, if you have not already, run Windbg, and load the minidump which Windows usually creates when it crashes. Minidumps are saved in the \Windows\Minidump folder.


It said VISTA_DRIVER_FAULT and identified the SearchProtocol process, but I was not convinced that this process was really to blame. My reasoning is that it is a Microsoft process running on most Windows boxes, so it is unlikely to be badly broken.

I decided to look for a faulty driver. You can do this by running the Driver Verifier Manager, summoned by running verifier.exe (this lives in \Windows\System32 but you can start it from anywhere).


This application enables a debugging mode in Windows that will scrutinise the drivers you specify for errors. This slows down Windows so it is not something you want to leave enabled, but it is great for finding problems.

I elected to check all drivers and continued. Reboot, and as expected, an immediate blue screen.

While Driver Verifier is enabled and causing a crash you can only boot into safe mode. However Windbg works OK in safe mode. I took a look at the new minidump. The process name this time was services.exe. That means any of the services could be at fault, so not all that illuminating.
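Incidentally, the same settings can be applied and cleared from the command line, which is handy when you are stuck in safe mode. These are standard verifier.exe switches:

```shell
:: Enable the standard set of checks on all installed drivers
:: (roughly equivalent to the choice I made in the GUI)
verifier /standard /all

:: Show which drivers are currently being verified
verifier /query

:: From safe mode: clear all Driver Verifier settings to stop the boot-time crashes
verifier /reset
```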

I ran msconfig and disabled all non-Microsoft services. Restarted and verifier was happy. Now it was a matter of “hunt the service”.

Eventually I discovered, through trial and error and a hunch (it had to be a service which I had recently installed or updated), which service failed to verify. The guilty party: Intel Desktop Utilities. This application monitors sensors on an Intel motherboard for temperature and fan speed, and fires alerts if the readings go outside safe limits.

I uninstalled the desktop utilities. No more blue screens since.

I find it hard to believe that an Intel utility distributed with all its motherboards is causing Windows blue screens; on the other hand, in my case it seems clear-cut. And yes, I did have the latest version “for Intel Desktop Boards with 5 or 6 Series chipsets.” My board is the DH67CL. I would be interested to know if others with the same version can successfully boot with Driver Verifier enabled.

Windows 8 Tablet in June 2012? If so, I am betting ARM not Intel x86

An interview with Paul Amsellem, new boss at Nokia France, includes this remark:

Et en juin 2012, nous aurons une tablette fonctionnant sous Windows 8

which even my schoolboy French can translate:

and in June 2012 we will have a tablet running Windows 8

Now, that is sooner than I had expected based on what we saw at the BUILD conference in September, and on past experience of Windows beta cycles. Windows 7, for example, was previewed in October 2008 and went into public beta in January 2009. A release candidate arrived in May 2009, and the gold release (the first production release) was towards the end of July 2009.

Although that does not sound much different from September 2011 to June 2012, bear in mind that the gold release is the moment when PC manufacturers can test their hardware with the production code. They still have to manufacture, package and distribute the machines, which is why the first machines with Windows 7 pre-installed did not arrive until October 2009. Hence the “general availability” date for Windows 7 of October 22 – three months after the gold release.

In order to achieve a June release for Windows 8 then, you would expect Microsoft to be done by March 2012. We have yet to see the first beta (the BUILD version is a preview), and a gold release for x86 Windows 8 in March seems to me most unlikely. Of course it could be done, but only by compromising quality. The quality of the Windows 7 first release was excellent, and Microsoft is smart enough not to jeopardise its Windows 8 launch with a sub-standard product.

Is the Nokia man then either mis-informed or mis-quoted? Either is possible; but I also wonder whether Windows 8 on ARM will play by different rules. Microsoft said little about the ARM release at BUILD, though it was on show in the exhibition.

My impression is that the ARM release will be locked-down and that the only way to install apps will be via the app store. It will also be designed for specific hardware, unlike Windows x86 where people may grab an install CD and set it up on any old PC they can find; it is not guaranteed to work, but often it does.

That means Microsoft has much less to do in terms of compatibility testing, both for hardware and applications.

It follows that, despite being a new platform for Windows, the ARM release might actually be quicker to build than the x86 release. I can just about believe that Microsoft could be ready to hand over a gold build to Nokia in March 2012.

If that is the case, then the big risk is that apps will be scarce. It would give developers little time to create apps for the new platform, and it would also be interesting to see if the Office team at Microsoft could deliver something of real value by then.

Microsoft is under intense pressure from Apple’s iPad as well as Android competitors in tablets. Although it will want to get to market quickly, the company must also realise that a botched first release makes recovery hard. This will be interesting to watch.

Hassles with Intel RAID – Rapid Storage Technology

I have recently fitted a new Intel DH67CL motherboard and decided to use the on-board RAID controller to achieve resiliency against drive failure. I have four 1TB Sata drives, and chose to create two separate mirrors. This is not the most efficient form of RAID, but mirroring is the simplest and easiest for recovery, since if one drive fails you still have a complete copy ready to go on its mirror.

I thought this would be a smooth operation, especially since I have two pairs of identical drives. Everything was fine at first, but then I started to get system freezes. “Freeze” is not quite the right word; it was more an extreme slowdown. The mouse still moved but the Windows 7 64-bit GUI was unresponsive. I discovered that it was possible eventually to get a clean though time-consuming shutdown by summoning a command prompt, waiting patiently for it to appear, then typing shutdown /s. After reboot, everything was fine until the next time, which was typically only a few hours away.

I was suspicious of the RAM at first and removed 8GB of my 16GB. Then I discovered that others had reported problems with Intel RAID (also known as RST) when you have two separate arrays enabled. The symptoms sounded similar to mine:

When the second RAID array is enabled (tried both RAID1 and 0), Windows (Win 7 Ultimate 64bit) will freeze after 10+ minutes of use. This initially manifests itself as my internet “going out”. While I can open new tabs in the browser, I cannot connect. I can’t ping via CMD either. I can’t open Task Manager, but I can open Event Viewer (and nothing really is shown in there re: this). If I try to Log Off or Restart the PC via Start Menu, Windows hangs on the “Logging Off” or “Shutting Down” screen for at least 10 minutes, up to several hours (or indefinitely).

There is no solution given in the thread other than to remove one of the arrays.

The system is 100% stable when I remove the second RAID1.

says one user.

I broke both of the mirrors and used the system for a while; everything was fine. I found an updated driver on Intel’s site (version, dated 17th October 2011) and decided to re-try the RAID. Now I had another problem though. Note that I was using the Windows management utility, not the embedded utility which you get to by pressing a special key during boot, since it is only with the Windows utility that you can preserve your data when creating a new array. My problem: I could not recreate the arrays.

Problem number one was that the drive on Sata port 0 disappeared when you tried to create an array. All four drives looked fine in the Status view:


but when you went to create an array, only three drives appeared:


Following a tip from the Intel community discussion board, I removed and reinstalled the RST utility, following which I also had to reinstate the updated driver. Now the drive reappeared, but I still could not recreate the arrays. I could start creating one, but got an “unknown error.” Looking in the event log, I could see errors reported by IAStorDataMgrSvc: FailedToClaimDisks and FailedVolumeSizeCheck. Curious, especially as I had used this very same utility to create the arrays before, with the same drives and without any issues.

Just as an experiment, I booted into Windows XP 64-bit, which I still have available using Windows multiboot. I installed the latest version of the Intel storage driver and utility, and tried to create a mirror. It worked instantly. I created the second mirror. That worked instantly too. Then I booted back into Windows 7 and checked out the RST utility. Everything looks fine.


The further good news is that I have been running with this for a few days now, without any freezes.

Is it possible that the latest driver fixed a problem? There is no way of knowing, especially since Intel itself appears not to participate in these “community” discussions. I find that disappointing; community without vendor participation is never really satisfactory.

Postscript: Note that I am aware that Intel’s embedded RAID is not a true RAID controller; it is sometimes called “fakeraid” since the processing is done by the CPU. Using Intel RST is a convenience and cost-saving measure. An alternative is Windows RAID which works well in my experience, though there are two disadvantages:

1. Intel RAID performs slightly better in my tests.

2. Windows RAID requires converting your drives to Dynamic Disks. Not a big problem, but it is one more thing to overcome if you end up doing disaster recovery.
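For what it is worth, the Dynamic Disk conversion and software mirror can be scripted with diskpart rather than done in the Disk Management GUI. A sketch, with example disk and volume numbers that you would need to check against the output of list disk and list volume on your own system:

```shell
:: Run diskpart from an elevated command prompt, then:
list disk

:: Convert the two disks that will hold the mirror to Dynamic
select disk 1
convert dynamic
select disk 2
convert dynamic

:: Mirror an existing simple volume on disk 1 onto disk 2
select volume 2
add disk=2
```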