How Microsoft changed its mind about Office XML standardization

My interview with Microsoft’s XML Architect Jean Paoli back in April was not the first time I had spoken to him. I also talked to him in February 2005. At that time Microsoft had no intention of submitting its Office XML specification to a standards body. I thought it should do so, and asked Paoli why not:

Backward compatibility. We have today 400 million users of Office, which means billions of documents. So we went and did a huge job of documenting electronically all these features and we put that into this WordML format. Well we need to maintain this damn thing, and we need to maintain this big format, we have like 1500 tags. Who is going to maintain that? A standard body? It doesn’t know what is inside of Word. That’s the problem. So we said we are going to give you a license, open and free… [Jean Paoli, February 2005].

Microsoft was forced to change its mind, because important customers (mostly governments) indicated their preference for standardised document formats. The quote remains relevant, because it says a lot about the goals of Office Open XML, which is an evolution of WordML and SpreadsheetML.

While on the subject, I also want to mention Simon Jones’ piece in the August 2007 PC Pro (article not online). It is perhaps a little one-sided, but he does a good job of debunking some of the common objections to OOXML and exposing some of the politics in the standardisation process. He adds:

I’m not saying there aren’t any problems with the ECMA-376 standard. Nor am I saying ODF is bad. I do, however, believe OOXML is technically superior to ODF in many ways, and I want to see both as ISO standards so people can have the choice.


The version problem of today: browser compatibility

David Berlind reports on a case where 35% of developer time is spent on browser compatibility issues.

It’s a huge problem, though I’m cautious about attaching too much weight to a single anecdotal report. Of course it’s nothing new. Browser compatibility issues are as old as the Web; things were getting better, until AJAX and a new focus on the web-as-platform meant greater stress on advanced browser features. For that matter, version issues are as old as computing. Yesterday, DLL Hell. Today, web browsers.

What’s the solution? All use the same browser? Not realistic. The browser developers could fix the incompatibilities? It’s happening to some degree, but even if Microsoft came out with a 100% FireFox-compatible IE8 tomorrow, there’s still a big legacy problem. My web site stats for this month:

IE7 24%

IE6 22%

IE5 4%

FireFox 2.x 22%

FireFox 1.x 3%

Opera 3.9%

Safari 2.3%


Interesting that the FireFox folk seem to upgrade more quickly than those on IE – but even so, there are a lot of older browsers still in use. I suspect a lot of those IE6 users are corporates with conservative upgrade policies.
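The upgrade comparison can be checked with a little arithmetic. This is only a sketch over the figures listed above; the helper names are made up for illustration, and Opera and Safari are left out because the stats do not break them down by version.

```python
# Browser shares (percent) copied from the site stats above.
SHARES = {
    "IE7": 24.0,
    "IE6": 22.0,
    "IE5": 4.0,
    "FireFox 2.x": 22.0,
    "FireFox 1.x": 3.0,
}

IE = ["IE7", "IE6", "IE5"]
FIREFOX = ["FireFox 2.x", "FireFox 1.x"]
LEGACY = ["IE6", "IE5", "FireFox 1.x"]  # everything that is not the latest release


def family_total(names):
    """Sum the percentage share of the named browser versions."""
    return sum(SHARES[name] for name in names)


def share_on_latest(latest, family):
    """Fraction of a browser family's visitors who run its latest version."""
    return SHARES[latest] / family_total(family)


if __name__ == "__main__":
    print("IE users on IE7:", share_on_latest("IE7", IE))
    print("FireFox users on 2.x:", share_on_latest("FireFox 2.x", FIREFOX))
    print("Visitors on older versions (%):", family_total(LEGACY))
```

On these numbers, 48% of IE visitors are on the latest version against 88% of FireFox visitors, and 29% of all visitors are on an older IE or FireFox release.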

Another idea is to use AJAX libraries that hide the incompatibilities. That makes a lot of sense, though if you stress the libraries you might still find compatibility issues.

Finally, you can bypass the browser and use some other runtime, most likely Java or Flash. Unfortunately this doesn’t remove all version issues, but at least it means you are mainly dealing with one vendor’s evolving platform (Sun or Adobe). Silverlight could help as well, though its “cross-platform” only means Windows or Intel Mac at the moment, which is not broad enough.

This will be an important factor in the RIA (Rich Internet Application) wars.

Office Open XML vs COM automation

Looking at the new Open XML API, introduced by Kevin Boske here, makes you realise that old-style COM automation wasn’t so bad after all.

There are two distinct aspects to working programmatically with OOXML. First, there’s the Packaging API, which deals with how the various XML files which make up a document get stored in a ZIP archive. Second, there’s the XML specification itself, which defines the schema of elements and attributes that form the content of an OOXML document.

The new wrapper classes really only deal with the packaging aspect. You still have to work out how to parse and/or generate the correct XML content using your favourite XML parser. And it’s a lot more complex than HTML.
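To illustrate the split between packaging and content, here is a minimal sketch in Python using only the standard library, not Microsoft's API. It builds a toy ZIP archive holding a single word/document.xml part and reads the paragraph text back out. The function names are hypothetical, and a real OOXML package also needs content-type and relationship parts, which are omitted here.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by the elements in word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# A deliberately tiny document body: two paragraphs, the first split into two runs.
DOCUMENT_XML = (
    f'<w:document xmlns:w="{W_NS}"><w:body>'
    "<w:p><w:r><w:t>Hello, </w:t></w:r><w:r><w:t>packaging</w:t></w:r></w:p>"
    "<w:p><w:r><w:t>Second paragraph</w:t></w:r></w:p>"
    "</w:body></w:document>"
)


def build_toy_package():
    """Packaging step: store the XML part inside a ZIP archive (toy, incomplete package)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("word/document.xml", DOCUMENT_XML)
    return buf.getvalue()


def paragraph_texts(package):
    """Content step: parse the XML yourself and join the text runs of each paragraph."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    paragraphs = []
    for p in root.iter(f"{{{W_NS}}}p"):
        runs = [t.text or "" for t in p.iter(f"{{{W_NS}}}t")]
        paragraphs.append("".join(runs))
    return paragraphs


if __name__ == "__main__":
    print(paragraph_texts(build_toy_package()))
```

Getting at the archive takes a few lines; everything after that is hand-written XML handling, and even this toy has to know about paragraphs, runs and text elements.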

By contrast, the old COM automation API for Office presents a programmatic object model for the content, and you don’t have to worry much about how the document gets stored – you just tell Word or Excel to save it.

The (very big) downside of the COM object model is that it depends on the presence of Microsoft Office. That means high resource requirements and version problems; it is also Windows-only and inappropriate for server apps.

We seem to have traded one problem for another. What Microsoft needs to provide is wrapper classes for the content, rather than just its packaging.


Why doesn’t Adobe’s AIR dev guide mention SQLite?

I’ve been trying out the Adobe AIR (formerly Apollo) SDK. It’s a confusing business. There are two varieties of AIR app: Flex and HTML. The HTML kind is essentially a browser app that runs in WebKit, as wrapped by the AIR runtime, instead of in the browser, while the Flex kind compiles Adobe’s MXML into a Flash SWF which again runs within AIR. The AIR SDK only supports HTML AIR apps, so for the full experience you also need the Flex 3 beta SDK.

But I digress. I have a long-standing interest in SQLite so one of the first things I looked for was how Adobe is using this in AIR. It is there: it’s mentioned in the press release, which emphasizes that AIR has some of that open source fairy dust:

Key elements of Adobe AIR are open source, including the WebKit HTML engine, the ActionScript™ Virtual Machine (Tamarin project) and SQLite local database functionality.

However, you wouldn’t know it from the docs. The word SQLite does not appear in either the Flex or the HTML developer guides. Here’s how the documentation introduces the “local SQL databases” section:

Adobe Integrated Runtime (AIR) includes the capability of creating and working with local SQL databases. The runtime includes a SQL database engine with support for many standard SQL features.

The SQLite library itself appears to be compiled into the main AIR runtime library, Adobe AIR.dll.

Why do I mention this? A few reasons.

First, it stinks. Let me emphasize: Adobe is entirely within its rights in not crediting SQLite in its docs. The main author of SQLite, Dr D Richard Hipp, has disclaimed copyright. So it is not illegal, but it is discourteous. By contrast, here’s how the Google Gears docs introduce the database module:

The Database module provides browser-local relational data storage to your JavaScript web application. Google Gears uses the open source SQLite database system.

Second, it’s unhelpful. As a developer familiar with SQLite, I want to see an explanation of how Adobe’s build of SQLite differs from what I am used to – what is added, what if anything is taken away. I also need to know how easily I can access the same database from both AIR and from another application, using the standard SQLite library.
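The kind of check I have in mind looks something like the sketch below, which uses Python's standard sqlite3 module to list the tables in a database file. It is only an illustration: the function names and the stand-in database are hypothetical, and whether a file written by AIR actually opens cleanly with a stock SQLite build is exactly what the docs leave unanswered.

```python
import os
import sqlite3
import tempfile


def list_tables(db_path):
    """Open a database file with the stock SQLite library and list its tables."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


def make_stand_in_database():
    """Create a throwaway database standing in for one written by another application."""
    fd, path = tempfile.mkstemp(suffix=".db")
    os.close(fd)
    conn = sqlite3.connect(path)
    with conn:  # commits the schema changes on success
        conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")
        conn.execute("CREATE TABLE tags (id INTEGER PRIMARY KEY, label TEXT)")
    conn.close()
    return path


if __name__ == "__main__":
    path = make_stand_in_database()
    print(list_tables(path))
    os.remove(path)
```

If the runtime's build changes the file format or depends on custom functions, this is where it would show up, which is why the differences need documenting.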

Third, I’m increasingly sceptical of Adobe’s claim that it is somehow “aligning” its API in AIR with that in Gears. Here’s what Michele Turner, Adobe’s VP of developer relations, told me:

Adobe, Google, Mozilla and others will be working to align the APIs used to access local database storage for offline applications, so this functionality will be consistent for developers both in the browser and via Apollo on the desktop.

Perhaps, but there’s really no sign of this in the current beta. The AIR database API and the Gears API are totally different. The full text search extension which is part of Gears seems to be missing in AIR. Another key difference is that unlike Gears, AIR makes no attempt to isolate databases based on the origin of the application. In AIR, a SQLite database may be anywhere in the file system, and it’s equally available to any AIR application – a big hole in the AIR sandbox.

This is all beta, of course, so it can change. I hope it does. Here’s my wish list:

  • Proper credit for SQLite in the docs.
  • Use the Gears code – full text search could be very useful – and deliver on the promise of aligning the API.
  • Failing that, set out exactly how AIR’s SQLite differs from the standard build.

The problem of old Java runtimes

The August PC Pro arrived this morning, and I enjoyed Steve Cassidy’s rant (page 174) on old versions of Java that typically litter PCs:

I’ve made it my habit to go round all the LAN’s I visit removing all older versions of Java from the machines, because the Java updater doesn’t remove them automatically.

It reminded me that I’d intended to post about this dialog, encountered when installing AccuRev for a short review:

The decision here is whether to let AccuRev install its own version of the JRE (Java Runtime Environment), or to use one you already have, in which case you have to identify it. It’s a tough decision. If you follow the recommendation to install a private version, you end up with multiple different versions of Java which will likely never get updated except by the application vendor, if indeed you choose to upgrade. I understand why vendors do this: it simplifies testing and installation, and gives apps a predictable platform on which to run.

Unfortunately the downside is substantial too. In the AccuRev case it was slightly unfortunate, since the supplied JRE was incompatible with Vista and broke Aero graphics. A more painful example was when the JRE installed with APC’s PowerChute utility failed because of an expired cryptographic certificate; the consequences were extreme, and in many cases affected systems would no longer boot. See here for the gory details.

I prefer the way Microsoft handles the .NET runtime, where more than one version can be installed, but they are system files for which Microsoft takes responsibility through Windows Update. Sun installs an updater with its JRE that works for web browsers and other applications that use a shared JRE, but there are still many apps like AccuRev that install private versions.


How to speed up Vista: disable the slow slow search

What’s the biggest problem with Vista? Not the buggy drivers, which are gradually getting sorted. Not the evil DRM, which I haven’t encountered directly, though it may be a factor in increasing the complexity and therefore the bugginess of video and audio drivers. Not User Account Control security, which I think is pretty good. Not the user interface, which I reckon improves on Windows XP though there are annoyances.

No, my biggest complaint is performance. This morning I noticed that if I clicked the Start button and then Documents, it took around 15 seconds for the explorer window to display, fully populated. Doing this with Task Manager monitoring performance, I could see CPU usage spike from below 10% to between 55% and 60% while Explorer did its stuff.

Explorer gets blamed for many things that are not really its fault. Applications which integrate with the desktop, such as file archive utilities, hook into Explorer and can cause problems. I tried to figure out what was slowing it down. I opened up Services (in Administrative Tools) and looked at what was running. It didn’t take long to find the main culprit – Windows Search:

Windows Search in Services

You will notice that the above dialog shows that the service is not running. That’s because I stopped it. The difference is amazing. The Documents folder now shows in less than a second. When I click the Start button, the menu displays immediately instead of pausing for thought. Everything seems faster.

Looking at the description above, it is not surprising that there is a performance impact. The indexer gets notified every time you change a file or receive an email (if you are using Outlook or Windows Mail). The same service creates virtual folder views in Explorer, a poor man’s WinFS that should make the real location of files less important. Notice that the explanatory text warns me that by stopping the service I lose these features and have to “fall back to item-by-item slow search”.

I think it should say, “If the service is started, Explorer will take fifteen times longer to open and your system will run more slowly.”

Desktop search is a great feature, but only if it is unobtrusive. In Vista, that’s not the case.

This kind of thing will vary substantially from one system to another. Another user may say that Windows Search causes no problems. I also believe that the system impact is much greater if the indexer has many outstanding tasks – such as indexing a large Outlook mailbox, for example. Further, disabling Windows search really does slow down the search function in Explorer.

Turning off Windows search is therefore not something to do lightly. It breaks an important part of Vista.

Still, sometimes you need to get your work done. That fifteen-second delay soon adds up when repeated many times.

In truth, we should not be faced with this decision. Microsoft should know better – it has plenty of database expertise, after all. There’s no excuse for a system service that slows things down to this extent.

By the way, if you have understood all the caveats and still want to run without Windows Search, until Microsoft fix it, then you must set the service to disabled. Otherwise applications like Outlook will helpfully restart it for you.


See comments below – a couple of others have reported (as I expected) that search works fine for them. So what is the issue here? In my case I think it is related to Outlook 2007, known to have performance problems especially with large mailboxes like mine. But what’s the general conclusion? If you are suffering from performance problems with Vista, I recommend experimenting with Search – stop and disable it temporarily, to see what effect it has. If there’s no improvement, you can always enable it again.

It strikes me that there is some unfortunate interaction between Explorer, Search, and Outlook; it’s possible that there are other bad combinations as well.


Apple iPhone needs Google Gears

At its developer conference Apple announced that the forthcoming iPhone will support Web 2.0 applications. In this context, “Web 2.0” means at a minimum an embedded web browser (Safari) that runs JavaScript, but that’s no big deal; we expected nothing less. It’s at least a little more than that though:

Developers can create Web 2.0 applications which look and behave just like the applications built into iPhone, and which can seamlessly access iPhone’s services, including making a phone call, sending an email and displaying a location in Google Maps.

The emphasis is mine. This implies some sort of hole in the sandbox, but web apps on the iPhone need more than just the ability to make phone calls if they are going to be useful. They need to work offline. In fact, a mobile phone (ironically) is one environment where offline web apps will be particularly useful. Nobody is always-on when travelling; it varies from mostly on (urban travel) to mostly off (trains, planes). As a regular train traveller, I find attempting to run web apps on a mobile utterly frustrating.

Fortunately Google has come up with an answer to this with its Gears initiative. Here’s how you write a good Gears app:

  1. Write your app to work offline.
  2. Add synchronization with the server that happens transparently when connected.

This is perfect for a mobile app. Running web apps rather than local apps also bypasses one of the main obstacles to mobile development: the need to get your binaries approved by a telecom provider before they can be installed.
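The two-step recipe above can be sketched in a few lines. This is not the Gears API, which is JavaScript; it is a language-neutral illustration written in Python, and the class name, the method names and the send_to_server callable are all hypothetical.

```python
class OfflineFirstStore:
    """Toy offline-first store: writes land locally, then sync when connected."""

    def __init__(self, send_to_server):
        # send_to_server is a hypothetical callable that uploads one (key, value) change.
        self._send = send_to_server
        self.local = {}    # step 1: the app reads and writes this, online or not
        self.pending = []  # changes not yet pushed to the server

    def put(self, key, value):
        """Write locally and remember the change for a later sync."""
        self.local[key] = value
        self.pending.append((key, value))

    def get(self, key):
        """Read from the local copy only; never blocks on the network."""
        return self.local.get(key)

    def sync(self, connected):
        """Step 2: push queued changes in order when a connection is available."""
        if not connected:
            return 0
        sent = 0
        while self.pending:
            key, value = self.pending[0]
            self._send(key, value)  # if this raises, the change stays queued
            self.pending.pop(0)
            sent += 1
        return sent


if __name__ == "__main__":
    server = {}
    store = OfflineFirstStore(server.__setitem__)
    store.put("draft", "written on the train")
    store.sync(connected=False)  # nothing happens, the app keeps working
    store.sync(connected=True)   # the queued change reaches the server
    print(server)
```

The point of the design is that the user-facing code only ever touches the local copy; the network is a background detail.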

Now, I have no idea whether Apple plans to include Google Gears, or an equivalent, in the iPhone (I’m not at WWDC). But I do think it is a great idea, for this or any mobile device. Combine it with Flash or Silverlight and we will wonder why we ever wanted more.

Lennon and McCartney, Yin and Yang

There’s a discussion over at the Hoffman forums on the value of the post-Beatles solo efforts.

There is a touch of magic in the Beatles’ work though I am not really a diehard fan. Generally I find the solo albums less satisfactory with a couple of exceptions. The reason I think is to do with yin and yang, a Chinese philosophy of complementary opposites. Exactly how Lennon and McCartney differ is going to be hard to put into words, but it is to do with light and dark, where McCartney is content mainly to stay in the light, and Lennon is driven to explore the dark. In the solo work, McCartney is always in danger of descending to nursery rhyme, while Lennon is always in danger of descending to primal scream. They are better together.

Personally I tend toward introspection so my preference is for Lennon, and in particular Plastic Ono Band I regard as a work of genius, dark though it is. As for McCartney, I love Band on the Run though it is lightweight, and Flowers in the Dirt where Elvis Costello seemed to play Lennon’s role to some extent, giving the music an edge that McCartney’s work often lacks.

PCs that shut themselves down

I was asked to look at a misbehaving laptop recently (a hazard of this profession). “It seems to work fine, then shuts itself down without any warning,” I was told.

Yesterday the same happened to me. I was typing away when bam! the PC turned itself off.

The reason in both cases was the same. Dust. When I took the back off the PC I saw that the fins on the heat sink were completely clogged and it was no longer cooling the CPU effectively. When the CPU overheats its thermal protection kicks in and turns it off.

The laptop was a variation of the same problem. Some of the vents in the case were filled with dust and dirt, impeding the flow of air. Apparently this is a common problem with some Acer models.

Part of the problem lies with Intel for making CPUs that run so hot. Cooling is critical.

Fortunately the fix is easy, though you normally need to take the back off. I used one of those aerosols that squirts out compressed air. Clean out the dust and it all runs sweetly again. If only all system failures had such simple solutions.


Google’s new model of app development

I was fascinated by this slide shown at the recent global developer day, which I’m reproducing with Google’s permission:

Four blocks captioned Ads, Standards, Mashups, Open Source

The image doesn’t make sense without the caption, which I’ve used as the title of this post: The New Model of App Development. You can see the slide in context in this Register piece. Two things in particular interest me. One is the appearance of ads as an integral part of the development model. This makes sense for Google’s own development, but does it make sense for others? Given that much of the software industry is slogging away at internal business applications, that seems a stretch. It may be true for consumer apps. Ad-funded applications have not been a big success on the desktop, but we have somehow become tolerant of ads flashing round the screen when working on the Web.

Another issue is one we tried to capture in the caption for this image at the Reg. The main goal of developer day was to get developers to integrate Google services into their applications, by using Google Maps and the other APIs on show at Google code. The company is even keen to host your gadgets on its own servers. Google wants to be an indispensable building block in app development, even though it left itself out of the illustration.

How about open source? Google uses and sponsors open source software, and has posted the code for Gears, but where’s the code for Docs & Spreadsheets? Closed source is an important part of Google’s own app development model, as it is for most others.