Office Open XML vs COM automation

Looking at the new Open XML API, introduced by Kevin Boske here, makes you realise that old-style COM automation wasn’t so bad after all.

There are two distinct aspects to working programmatically with OOXML. First, there’s the Packaging API, which deals with how the various XML files which make up a document get stored in a ZIP archive. Second, there’s the XML specification itself, which defines the schema of elements and attributes that form the content of an OOXML document.

The new wrapper classes really only deal with the packaging aspect. You still have to work out how to parse and/or generate the correct XML content using your favourite XML parser. And it’s a lot more complex than HTML.
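To make the split concrete, here is a minimal sketch in Python using only the standard library. It is not the Microsoft SDK, just an illustration of the two layers: `build_package` handles the packaging side (an XML part stored in a ZIP archive) and `read_paragraphs` handles the content side (walking the XML by hand). The function names and the sample text are made up for this example, and the package is deliberately incomplete: a real document also needs `[Content_Types].xml` and relationship parts, so Word would not open this one.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Main WordprocessingML namespace, as defined in the OOXML specification.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# A tiny main document part with two paragraphs (sample content, made up).
DOCUMENT_XML = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    f'<w:document xmlns:w="{W_NS}"><w:body>'
    "<w:p><w:r><w:t>Hello</w:t></w:r></w:p>"
    "<w:p><w:r><w:t>World</w:t></w:r></w:p>"
    "</w:body></w:document>"
)


def build_package() -> bytes:
    """Packaging aspect: store the XML part inside a ZIP archive.

    Incomplete on purpose: no [Content_Types].xml or .rels parts.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("word/document.xml", DOCUMENT_XML)
    return buf.getvalue()


def read_paragraphs(package: bytes) -> list[str]:
    """Content aspect: pull the part out and interpret the XML yourself."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    ns = {"w": W_NS}
    # Each w:p is a paragraph; its text is spread across w:t elements.
    return [
        "".join(t.text or "" for t in p.iterfind(".//w:t", ns))
        for p in root.iterfind(".//w:p", ns)
    ]


if __name__ == "__main__":
    print(read_paragraphs(build_package()))
```

Getting the part in and out of the archive takes a few lines. Everything about what the elements mean is left to you, and that is the part a packaging API does not help with.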

By contrast, the old COM automation API for Office presents a programmatic object model for the content, and you don’t have to worry much about how the document gets stored – you just tell Word or Excel to save it.

The (very big) downside of the COM object model is that it depends on the presence of Microsoft Office. That means high resource requirements and version problems, and it makes the approach Windows-only and inappropriate for server apps.

We seem to have traded one problem for another. What Microsoft needs to provide is wrapper classes for the content, rather than just its packaging.
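As a rough idea of what a content wrapper might look like, here is a hypothetical sketch in Python. `SimpleDocument`, `add_paragraph` and `to_xml` are invented names, not part of any Microsoft library, and the class covers only plain-text paragraphs. The point is that the caller talks about paragraphs while the wrapper takes care of the element names.

```python
import xml.etree.ElementTree as ET

# Main WordprocessingML namespace, as defined in the OOXML specification.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W_NS)


def _w(tag: str) -> str:
    """Qualify a tag name with the WordprocessingML namespace."""
    return f"{{{W_NS}}}{tag}"


class SimpleDocument:
    """Hypothetical content wrapper: callers add paragraphs, not XML elements."""

    def __init__(self) -> None:
        self._root = ET.Element(_w("document"))
        self._body = ET.SubElement(self._root, _w("body"))

    def add_paragraph(self, text: str) -> "SimpleDocument":
        # A paragraph (w:p) holds a run (w:r), which holds the text (w:t).
        p = ET.SubElement(self._body, _w("p"))
        r = ET.SubElement(p, _w("r"))
        t = ET.SubElement(r, _w("t"))
        t.text = text
        return self

    def to_xml(self) -> str:
        """Serialise the main document part as a string."""
        return ET.tostring(self._root, encoding="unicode")


if __name__ == "__main__":
    doc = SimpleDocument().add_paragraph("Hello").add_paragraph("World")
    print(doc.to_xml())
```

A real wrapper would need to cover styles, tables, images and the rest of the schema, which is exactly why it would be worth having one supplied rather than writing it yourself.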


2 thoughts on “Office Open XML vs COM automation”

  1. Jonathan,

    Well, the library I linked to is called Microsoft SDK for Open XML Formats, but it deals mainly with packaging, not content.

    It looks like OpenXML4J is the same, judging by this FAQ item:

    Can I work at document level now ? (Open Packaging Convention is considered as structure level)
    Unfortunately not yet, but we are working hard for it !
