Microsoft has lost its appeal in a case where a small company called i4i claims that Office 2003 and 2007 infringe its patent on embedding custom XML within a Word document. This is not the XML that defines the content and layout of the document. It is XML contained within the document that Word itself does not understand, because it conforms to a custom schema, and which will not be displayed unless you write code to parse it and output some sort of result to the document.
Microsoft now says:
With respect to Microsoft Word 2007 and Microsoft Office 2007, we have been preparing for this possibility since the District Court issued its injunction in August 2009 and have put the wheels in motion to remove this little-used feature from these products. Therefore, we expect to have copies of Microsoft Word 2007 and Office 2007, with this feature removed, available for U.S. sale and distribution by the injunction date. In addition, the beta versions of Microsoft Word 2010 and Microsoft Office 2010, which are available now for downloading, do not contain the technology covered by the injunction.
The key phrase here is “little-used feature”. It is true, in that the vast majority of Word documents do not use it; the only users affected will be those who have built custom solutions that use it in some kind of workflow or for data analysis.
Why did Microsoft lose? Here I have to admit my lack of legal knowledge, though I’m aware that Microsoft’s track record in court is not good. One interesting aspect of the case reported here is that Microsoft was proven, by an email from January 22 2003, to have been aware of the patent and products from i4i:
we saw [i4i’s products] some time ago and met its creators. Word 11 will make it obsolete
says the internal email; Word 11 is another name for Word 2003.
That said, intuitively both the patent and the decision seem odd to me, in that XML is specifically designed to allow data with a custom schema to be embedded within a document defined by another schema. But does the i4i patent cover every XML document out there that does this – such as, for example, XHTML documents that include microformats? The answer, as I understand it, is no, because the patent is about how the custom XML is stored, not that it exists. Here’s a quote from the patent itself:
The present invention is based on the practice of separating encoding conventions from the content of a document. The invention does not use embedded metacoding to differentiate the content of the document, but rather the metacodes of the document are separated from the content and held in distinct storage in a structure called a metacode map, whereas document content is held in a mapped content area … delivering a complete document would entail delivering both the content and a metacode map which describes it.
In other words, the custom XML is not stored directly within the containing document, but in a separate file, together with an instruction that says “please insert me at location x”.
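To make the idea concrete, here is a toy sketch in Python of content held apart from a "metacode map" of insertion points. It is only an illustration of the concept as I read it in the quote above, not the patent's actual data structure and certainly not Word's implementation; the names `Metacode`, `render` and the sample data are all invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metacode:
    """One hypothetical map entry: insert `tag` at character offset `position`."""
    position: int
    tag: str


def render(content: str, metacode_map: list[Metacode]) -> str:
    """Recombine raw content with its separately stored metacodes."""
    pieces = []
    cursor = 0
    # sorted() is stable, so codes that share an offset keep their listed order.
    for code in sorted(metacode_map, key=lambda m: m.position):
        pieces.append(content[cursor:code.position])  # text up to the insertion point
        pieces.append(code.tag)                       # the markup stored in the map
        cursor = code.position
    pieces.append(content[cursor:])                   # whatever text remains
    return "".join(pieces)


# The content area holds plain text only; all markup lives in the map.
content = "Jane Doe owes 42 dollars"
metacode_map = [
    Metacode(0, "<customer>"),
    Metacode(8, "</customer>"),
    Metacode(14, "<amount>"),
    Metacode(16, "</amount>"),
]

print(render(content, metacode_map))
```

The point of the sketch is that neither half is a complete document on its own: the content carries no markup, and the map carries no text.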
Is that really any different? Intuitively, I doubt it. What we think of as single files are often in reality a number of sections bundled together, such as a header part and a content part. Further, what we think of as a single file may be stored in several locations, with metadata that defines how to get from one part to the next.
An Office 2007 document such as .docx is in reality a ZIP archive which contains several separate files, organised according to the Open Packaging Conventions (OPC); if the i4i patent has wider implications, it strikes me that they would be for the OPC rather than for XML itself.
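The "ZIP archive of separate files" point is easy to see with Python's standard zipfile module. The sketch below builds a stub package in memory so that it runs without a real document; the part names mimic a typical layout but are assumptions for illustration, and the XML bodies are placeholders rather than valid Word markup.

```python
import io
import zipfile


def build_stub_package() -> bytes:
    """Build an in-memory ZIP whose layout mimics a .docx package (stub contents only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        # Placeholder parts: a content-types listing, the main document,
        # and one custom XML part stored as its own file.
        package.writestr("[Content_Types].xml", "<Types/>")
        package.writestr("word/document.xml", "<document/>")
        package.writestr("customXml/item1.xml", "<invoice><total>42</total></invoice>")
    return buffer.getvalue()


def list_parts(package_bytes: bytes) -> list[str]:
    """Return the names of the files bundled inside the package."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        return package.namelist()


def read_part(package_bytes: bytes, name: str) -> str:
    """Return one part of the package decoded as UTF-8 text."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        return package.read(name).decode("utf-8")


stub = build_stub_package()
for part_name in list_parts(stub):
    print(part_name)
print(read_part(stub, "customXml/item1.xml"))
```

To look inside a real document instead, open it directly with zipfile.ZipFile and call namelist() on it; the exact parts you see will depend on the document.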
I don’t claim any expertise in whether or not i4i has a valid claim against Microsoft or others. I do have an opinion though, which is that this kind of patent litigation does not benefit either the industry or the general public. This particular case concerns me, because the patent strikes me as generic, and one that could be applied elsewhere, which means more effort expended on working around legal issues rather than on improving the software we use; and because even if the feature in Word is “little used”, the concept is an important one that still has great potential – though now probably not in Microsoft Office.