Wednesday, January 10, 2007

Game Time for OpenDocument


The contradictions between ODF and EooXML are beyond real. They are beyond repair. There are two questions concerning these intractable contradictions. The first is, what will ISO do with EooXML? As of the January 5th, 2007 submission, EooXML has entered the critical 30 day period known as the ISO Contradiction Review Phase. Any allegation of a contradiction between EooXML and other ISO products (ODF) and it's back to the drawing board for MSECMA. Second, given the seriousness of these contradictions, how does the marketplace resolve them? How do users get from where they are today, to where they need to be tomorrow?

Today they are stuck with a vast investment in legacy MSOffice productivity environments, their information bound to MSOffice proprietary formats, their information processes and critical day to day business activities strapped, bound and locked into that same application – API dependent base. Moving to ODF isn't a file format switch or simple application swap out affair. No, it's taking your digital business, information processing and daily workflows out of Microsoft's control, and putting your future into your own hands.

Lots of stuff happening this week. ECMA-376 is only a few weeks old, and the forces of Redmond were quick to throw down the gauntlet with as smug a sneer as one would expect. Three articles caught my attention:

... originally posted as "MS Winning Office Doc Battle"

Great stuff. But what interests me most is not the obvious contradictions between our two universal file format contenders, ODF and EooXML. No, what interests me most is how to break the monopolist's iron fisted grip on our digital lives. Mandating ODF isn't enough. There has to be a bridge between where we are today, locked into the MSOffice productivity environment, and where we really want to be tomorrow – the free flow of information between loosely coupled applications that adhere to open standards and the universal access and exchange of the open Internet.

Given that objective, it is Matt Asay's InfoWorld Blog posting "The Future of Lock-in" that has caught my attention today.

May 17, 2006 :: The Future of Lock-in

Hi Matt,

Right on! You nailed it. Although I tend to think in terms of an information processing chain where EooXML is both the file format and the transport. Microsoft gets the portable document model, where content, data and streaming media traverse in highly exchangeable XML document containers.

The Microsoft chain runs MSOffice <> EooXML <> VSTO <> IE 7.0 <> the Exchange/SharePoint Hub. From the E/S Hub, it splashes out into a galaxy of server and device side services such as MSLive, MS SQL Server, and things like MS ERP.

The core of the chain is that everything speaks EooXML, even applications written with VSTO 2005 (they dropped MSXML in favor of EooXML).

What most people fail to understand is that there are two barriers to entry/migration from MSOffice. The first barrier is that of the billions of binary file formats bound to MSOffice.

The second barrier is that of MSOffice bound business processes, and it's near impossible to overcome. Most people don't even get this far, giving up in frustration with the non interoperable file formats and lossy conversions that plague the first barrier. However, for those who do get this far, as they did in Massachusetts, the second barrier is near impossible. The barrier of MSOffice bound LoBs (line of business applications), business processes, and assistive add-ons seems impregnable.

Since 1995, MSOffice has evolved to become the platform of critical day to day business processes. You can't replace these workgroup – workflow processes with OpenOffice. Nor can OOo "participate" in existing processes unless there is perfect fidelity of file conversions - otherwise known as perfect roundtrip fidelity.

The good news is that in the very near future every one of these MSOffice bound business processes is going to migrate to an Internet "XML ready" Hub of some sort. The bad news is that Microsoft is killing everything in sight with the MSXML-EooXML E/S Hub. Including Lotus Notes.

As you might guess, the E/S Hub has great "integration" with the MSOffice desktop productivity environment. A level of integration that can't be touched by any other vendor, be they Oracle, IBM, CA, BEA, SalesForce.com, JBoss, or Apache OSS. What makes the migration of business processes to E/S Hubs inevitable is the incredible bump in productivity any process gets when moved to an integration Hub.

Interestingly, the MSOffice point of integration in this emerging processing chain is that of the XML file format; EooXML. Because the portable document model is so critically important to these Internet enabled processing chains (you can't do this kind of sprawling data binding - workflow routing with binaries), many governments are seeking an ODF plugin for MSOffice.

The idea behind the ODF plugins for MSOffice is to turn MSOffice into an ODF pump instead of a pump for EooXML. The advantages are twofold - both the first and second barriers to migration are broken. And broken without any disruption to the current business process flow, or the cost of re-engineering to an MSOffice alternative (if that were even possible).

Once MSOffice is converted, and an ODF pump is in the anchor position, those MSOffice bound business processes can be migrated to anything that can speak ODF. All the server side services (hello IBM and Oracle) that get cut out of the Microsoft chain can cut into an ODF one without a problem. To get "great" integration and perfect interoperability though, applications in the chain have to break with the long standing traditions of being information "end points", and become routers of information - adding value but not breaking the flow.

Sadly, few applications understand the emerging processing chains and the new demands for "routing" information. For instance, Google Writely supports ODF, but only as an "end point". Information might go into Writely, but it doesn't come out in a useful ODF structure. The law of these emerging processing chains seems to be that of perfect "roundtrip" fidelity on transformation. Applications must assume that the information flows they link into never stop. Even enterprise publication, content and archive management systems must kneel to the law of interoperability if they are to compete against the MSOffice <> EooXML <> E/S Hub Juggernaut.
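To make the "roundtrip" law a little more concrete, here is a minimal Python sketch of how one might test it. Everything named here is an assumption for illustration: `lossless_hop` and `lossy_hop` are hypothetical stand-ins for applications in the chain, not real converters, and the tiny sample document is neither real ODF nor real EooXML. The check simply sends a document through one hop and back, then compares canonical XML so that harmless formatting differences are ignored.

```python
import xml.etree.ElementTree as ET


def canonical(xml_text):
    """Return the canonical form of the XML so formatting noise is ignored."""
    return ET.canonicalize(xml_text, strip_text=True)


def survives_roundtrip(xml_text, to_other, from_other):
    """True if converting away and back yields an equivalent document."""
    returned = from_other(to_other(xml_text))
    return canonical(returned) == canonical(xml_text)


# Hypothetical hops, for illustration only.
def lossless_hop(xml_text):
    # A "router" adds value without breaking the flow: nothing is dropped.
    return xml_text


def lossy_hop(xml_text):
    # An "end point" silently drops what it does not understand.
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        elem.attrib.pop("style", None)
    return ET.tostring(root, encoding="unicode")


sample = '<doc><p style="bold">hello</p></doc>'
print(survives_roundtrip(sample, lossless_hop, lossless_hop))
print(survives_roundtrip(sample, lossy_hop, lossless_hop))
```

The first call reports a faithful roundtrip; the second does not, because the `style` attribute never comes back. A real test would of course need real converters and a far richer notion of equivalence than byte-for-byte canonical XML.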

So what I think we need most at this moment in time is ODF ready Hubs; where content, eMail, scheduling, workflow management, data binding, workgroup management, and project management merge with the ODF – XForms document model.

If we can intercept the migration of MSOffice bound business processes, using ODF plugins and ODF ready Hubs, the final step to breaking the monopolist grip is within reach. Once the business processes are in ODF and residing at the Hubs, where all kinds of server and device systems can integrate as needed, it's easy to replace MSOffice on the desktop with an OpenOffice – Mozilla one-two punch. Yes, the desire of the plugin makers is to eventually replace themselves with OOo. Before that can be done though, we have to work through a period of mixed applications whose only point of interoperability is that of speaking perfect ODF.

You are right about the long-term lockin. I watched this happen last year with the real estate industry. It's a good story, but too long for a blog comment. The takeaway point however is that the real estate industry bought into the E/S Hub big time, and many vertical applications of extraordinary productivity swept into the marketplace. The E/S verticals quickly replaced nearly every desktop productivity shrinkwrap app, even those with over fifteen years of dominant marketshare. Vertical vendors fought to replace their own shrinkwrap stuff with their own E/S products. It was the only way to survive. The E/S Hubs automagically converted binary bound documents to MSXML, making them unreadable with anything other than MSOffice 2003 and IE 7.0. Every Windows 2000 desktop was EOL'd one fine Friday afternoon last year when the mothership sent out a security upgrade to E/S. An upgrade that required IE 7.0, for which there is no W2K version available. A Friday afternoon in real estate. Fry's was hopping.

The Realtors moved to E/S Hubs en masse because the productivity gains truly are extraordinary. One could easily argue that the industry is locked in for the next 15 years.

Even though Massachusetts was mandating ODF, they were buying E/S servers, thinking perhaps it's just an easy to administer eMail system. Right. The Commonwealth is just one court docket system written to E/S away from having to convert all those ODF docs back to EooXML. Oh wait, with the MS Translator, E/S will automagically do that for you :)

The Linux desktop:

Another point to make is that the ODF processing chain opens the way for Linux Desktops. As the MSOffice bound business processes are first moved to an ODF footing (the plugin) and then migrated to ODF ready Hubs (Zimbra/Alfresco), the way is cleared for ODF ready Linux desktops to crack the monopolist's stronghold.

It's this lifting of the MSOffice bound business processes into the Internet that opens up the way for Linux desktops. Most Linux distro providers are totally unaware that the second barrier is also the "Linux desktop barrier".

Furthermore, the ODF processing chain opportunity is a "replacement - upgrade - next workstation to be purchased" opportunity. It's not about wiping out existing Windows installs as much as it's about the next purchase made for a particular work group.

This fact ought to impact how Linux providers like Linspire go about the business of selling their systems. First of all, it's a box business opportunity, not one based on rip out and replace. Second, the desktops must be ODF - XForms - Java ready, with OOo and Mozilla holding down the application core. Third, the Linspires of the world really need the plugin and ODF Hubs in place and the migration underway before the monopoly base is truly up for grabs. All 485 million of them. This begs the question of why Linux Desktop and Server providers missed the high stakes involved with the ODF - EooXML choice. IMHO, they missed this primarily because Linux users are rarely if ever "workgroup - workflow" bound. They tend instead to be individuals working outside the tight constraints of information exchange and perfect roundtrip interop workflows. And if they do participate in a "workflow", it's more like that of the GrokLaw process. A process based on forum publication in simple HTML. As long as they can get their information into HTML, most Linux users have no problem entering the workflows important to them.

Funny, but for Linspire to get their breakthrough moment, the ODF processing chain must first succeed in breaking Microsoft's grip on those MSOffice bound business processes.

Thanks Matt for bringing this issue forward,

~ge~

Background on the above comments ::::

Over the years Microsoft has perfected many ways to lock digital consumers into their Windows and MSOffice application platform. The forced march - upgrade treadmill Redmond has perfected is perhaps the most extraordinary profit machinery ever unleashed. The most innovative mesh of lockin schemes ever conceived. At the core of this machinery is the user digital dilemma of user owned information and information processes being bound to the Microsoft applications and platform services used to "work" that information.

When the OASIS Open Office XML file format, now known as ODF or OpenDocument, was first ratified as an international standard, many thought it was the beginning of the end for the Redmond monopolist and the great treadmill machinery they had established. XML is after all the language of the future Internet. It's the way data, content, and streaming media traverse the Web 2.0, with a surrounding galaxy of solutions and services from SaaS to SOA to desktop productivity environments, to backend transaction and information processing systems, to enterprise publication, content and archive management systems.

The promise of XML is that of application and platform independent global transport, exchange, and crystal clean transformation. In this maw of emerging universal connectivity and collaborative computing solutions, the OpenDocument XML standard promised to merge the legacy of desktop productivity information and processes into the larger stream of Internet ready services and systems. The future looked good.

Microsoft however was not to be denied or left behind. The threat of the Internet to their monopolist enterprise was and is very real. First they stopped Netscape from establishing the ubiquitous cross platform browser as an Internet application platform. Then they stopped Java from becoming an Internet - enterprise systems application platform. It took a few years, much litigation, some time served, but finally Microsoft is ready to make their run at the many Internet interlopers who dare challenge their empire. Brace yourselves for a whole new era of massive lockin. A stack based lockin machinery that, if successful, will bind business processes to the Vista stack of desktop, server and device systems for years to come.

Matt Asay's article is one of the first to sound the alarm and describe the early parameters of what's at stake. He cites the rising dominance of the Microsoft SharePoint server, pointing out that behind it all is a new lockin strategy at work. One far more insidious and breathtaking in its reach than anything ever imagined with binary file formats. File formats bind information to specific applications and platform services. SharePoint is designed to bind information processes to the Vista Stack.

At the heart of the Vista Stack is the recently approved MS – ECMA Open Office XML file format. Oooops. That's MS-ECMA Office Open XML file format, not to be confused with the OASIS Open Office XML file format now known as OpenDocument; ODF and EooXML for short.

What Microsoft has done is to fully leverage their application control over the billions of legacy binary documents locked into the applications used to work this user owned content. EooXML was written to provide backwards compatibility with the BoBs (billions of binary documents). In 2002, when asked why they didn't join the OASIS Open Office XML standardization effort, Microsoft responded that OpenDocument (then Open Office XML) was unable to handle the application specific vagaries of the BoBs. ODF was said by Redmond to be inadequate. And that was before the technical committee work on ODF even began.

Without Microsoft's participation, most felt it would be impossible for ODF to “accommodate” and fully reflect the application specific BoBs. After all, only Microsoft understood the years of arbitrary and treadmill accelerating changes to the binary file formats causing users to upgrade to ever new versions of MSOffice applications. The version madness we are all so familiar with. They, and they alone had the blueprint able to unlock the BoBs.

So Microsoft set out to develop a whole new generation of application/platform bound XML file formats. The Vista generation. OpenDocument was of course developed as an application and platform independent XML file format. Internal document dependencies are based on other open and internationally recognized XML standards (XHTML, CSS, XForms, SVG, SMIL, XSLT, etc.). MSXML and MS-ECMA OOXML on the other hand were developed in the great treadmill traditions of being bound to Microsoft applications and platforms, bound through a cascading entanglement of system specific and very proprietary dependencies. Nothing new to report here except that the documentation for this mash is over 6,000 pages.

The short description for EooXML is that it's an XML wrapper of binary encodings unique to the legacy of Microsoft Office applications.

Still, many held out hope that with EooXML, Microsoft would finally release the secret blueprint and fully document the conversion of BoBs to clean XML. With the EooXML ratification and release in December of 2006, perfectly timed by the way to coincide with the release of Vista, MSOffice 2007, VSTO 2005, and a new version of the Exchange/SharePoint Hub, we can now see that this was anything but full disclosure and documentation. With EooXML, Microsoft concedes nothing while leveraging the BoBs into a launch of a future lockin strategy that is breathtaking in scope and ambition.

Here's something interesting. We now know that EooXML wraps the difficult to transform binary encodings of BoBs in exactly the same way as the OASIS Open Office XML Technical Committee described in February of 2003! That month, Phil Boutros, the legendary reverse engineering expert from Stellent, submitted to the technical committee the XML binary and processing instruction wrapping model affectionately known as the [...] tags. To avoid any hint of application specific references, in the ODF specification (section 1.5) these tags are referred to as [...]. Other casual references include [...] and [...] tags.

Another interesting point, courtesy of the inexhaustible Rob Weir, is that MSOffice 2007 has the unique ability to produce two kinds of EooXML. I've never seen an application do this before, and one has to wonder why.

What Rob discovered is that if you import a legacy BoB into MSOffice 2007, the application will convert it to EooXML fully preserving the originating application binary encodings – even doing so within laughably, descriptively and colorfully named XML tags. Fine. We can easily do that with ODF using the infamous tag model. No inadequacy to be found here. (Damn, if only we had patented that technique. Phil Boutros must be lapping this up :) One of the examples Rob pointed out is the use of the long since deprecated VML encoding. Good work MSOffice 2007!

Next Rob re-created that same legacy document “natively” in MSOffice 2007. Exactly the same! Saved it as EooXML. Then examined the XML, comparing the two EooXML files. Well well well. They are substantially different! Same application. Same file format. Same document content and presentation. Different EooXML! Interestingly, for one thing there is no VML encoding. It's been replaced by the proprietary, application/platform dependent but forward looking, Vista ready DrawingML.
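For anyone who wants to repeat this kind of comparison, here is a rough Python sketch. It is not Rob's actual method, just one way it might be approached. It relies on the fact that EooXML files are ZIP packages of XML parts, lists every element name used in each package, and reports which names appear on only one side. The `make_package` helper and the element names `legacyShape` and `newShape` are made up for illustration; with real files you would read the saved documents from disk instead.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def element_names(package_bytes):
    """Collect every element name used in the XML parts of a zipped package."""
    names = set()
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        for part in package.namelist():
            if not part.endswith(".xml"):
                continue  # skip images, relationship files and other parts
            root = ET.fromstring(package.read(part))
            names.update(elem.tag for elem in root.iter())
    return names


def markup_difference(imported, native):
    """Return (names only in the imported file, names only in the native file)."""
    a, b = element_names(imported), element_names(native)
    return sorted(a - b), sorted(b - a)


def make_package(parts):
    """Build a tiny in-memory zip; a stand-in for a real saved file."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        for name, xml_text in parts.items():
            package.writestr(name, xml_text)
    return buffer.getvalue()


# Hypothetical content: same document, two different markups.
imported = make_package({"doc.xml": "<doc><legacyShape/><p/></doc>"})
native = make_package({"doc.xml": "<doc><newShape/><p/></doc>"})
print(markup_difference(imported, native))
```

Comparing element names is a blunt instrument. It says nothing about attributes or content, but it is enough to show whether two files that look identical on screen are built from different vocabularies.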

Some will argue that this is the only way to preserve backward compatibility. I would argue that this will result in an information nightmare. Only one of the EooXML files is backwards compatible. The other is ready for the Vista bound information processing chain centered on the Exchange/SharePoint Hub. How are organizations going to keep things straight?

IMHO, the better way of handling this backward compatibility dilemma is to write a plugin for the legacy applications. Let the plugin do the nasty dark work of conversions, and do so without disruption or confusion to end users. All the files should of course be converted to clean, open and highly transportable/transformable XML, and stored in that state. (uh, that would be ODF instead of EooXML :) This is the optimal workflow and exchange “state”. Let the plugins do the application specific “user interface” only conversion back to the proper in-memory-binary-representation when needed.

So why did Microsoft choose this strange dual XML file format role for MSOffice 2007? Why not provide a useful binary <> EooXML plugin for legacy applications instead of mashing up an XML file format specification with legacy application version specific madness?

IMHO, Matt is right. SharePoint is the future lockin point. The Vista Stack and Vista information processing chain model will probably convert without exception all those legacy specific EooXML files to Vista specific EooXML. You may not have to upgrade MSOffice 97, 98, 2000, or 2003 to MSOffice 2007, unless and until you find yourself participating in an Exchange/SharePoint Hub based process. So maybe this isn't a traditional forced march to MSOffice 2007? Although it remains to be seen, this may in fact be a lockin and upgrade strategy designed to strongly encourage users to move from Win2k and XP to Vista. A move designed to demand both the desktop and server versions of Vista applications and systems.

For more information, try this web site: Office Business Applications Developer Portal, with special attention to this document: White Paper: Building Office Business Applications

Brace yourselves my freedom loving friends. It's game time.

~ge~

4 comments:

Anonymous said...

EooXML should be blocked on this basis. ODF already exists and EooXML is therefore a duplication of an established ISO standard for no good reason. The argument that it is a de facto standard is complete nonsense - nobody uses it yet. If Microsoft decided to document and publish its binary .xls, .doc format versions as ISO standards, then there would be some justification for that because they are de facto standards, but not EooXML.

If EooXML is passed, then you know there is corruption at work within ISO since they prevented the same thing being done with the WAPI standard by the Chinese.

Unknown said...

Hi arb,

IMHO, it all comes down to one question:

*... Is ODF able to handle everything EOOXML was designed for? Is there something you can do in EOOXML that can't be done with ODF?

Microsoft insists that the reason they developed EOOXML is that ODF is inadequate and unable to handle the advanced features of MSOffice, and, most importantly, the billions of binary legacy documents produced by the many versions of MSOffice still in production.

The answer to this question is that ODF can handle everything MSOffice can throw at it.

There are two ways of proving this.

The first is to do an external mapping between an EOOXML file and ODF. This is what the MNC Translator Project tries to do using an industry standard XSLT - XPath approach. The results are interesting and instructive. They are unable to do a transform from ODF to EOOXML - which not surprisingly is their primary project objective. The XSL Transform from EOOXML to ODF is much easier to accomplish, but they are trying to do something near impossible. The reason is that EOOXML is inadequate and unable to handle the advanced presentation features (styles) of ODF applications.

Now, be careful here with what I just said. ODF is able to handle these advanced presentation features so common to ODF ready applications, but EOOXML is not. Yet, ODF can handle everything EOOXML can do. Which is to say, everything MSOffice and the legacy binaries can throw at it.

Another way of stating this would be to say that EOOXML is a subset of ODF. But note you can't say that ODF is a subset of EOOXML.

The second way of proving that ODF can handle MSOffice is to mount a native MSOffice ODF plugin and start converting and producing ODF natively in MSOffice.

This has been proven by the OpenDocument Foundation's daVinci plugin. daVinci is able to handle the entire legacy of billions of binaries, the nuances and unique needs of assistive technology add-ons, the infamous LOB's - those line of business applications developed on top of MSOffice, and the critically important day to day business processes bound to years of MSOffice use.

These two arguments are application neutral.

The first argument isolates the file formats from application influences and examines them purely on terms of XML. Here we find that EOOXML breaks the most fundamental promise of XML; the transformation promise.

The second argument, an ODF plugin for MSOffice, pits the two file formats against each other within the same application. Again neutralizing the influence of application specific features.

In their arguments to prove that ODF is inadequate, Microsoft drags us into an unfair and disingenuous application bound comparison model. What they end up arguing is that MSOffice has different features than OpenOffice. Which has nothing to do with the XML file format comparison.

Maybe they just don't get what ODF was designed to be? Maybe after so many years of binding proprietary file formats to specific application versions, they can't see their way out of the application bound swamp?

The truth is that ODF was designed to be an open and universal file format, application and platform independent, timeless in its usefulness across all kinds of information domains, including those we have not yet dreamed of.

The information domains that were considered during the ODF 1.0 phase of development include desktop productivity environments; enterprise publication, content and archive management systems; SaaS and SOA implementations (the universal transformation layer); and the Open Internet with all that the Web 2.0 can throw at it.

The only way ODF can achieve this universal reach is to stay as close as possible to the continuing development of open standards, maintaining compatibility with the protocols, methods and open means of implementation.

EOOXML on the other hand was designed to meet the needs of one vendor's application and platform. It is entirely bound to and dependent on that vendor's application interfaces and platform system calls. By design.

Which is okay, except that now Microsoft wants the world to accept this as an approved ISO Standard - an approval that would totally contradict and introduce problematic inconsistencies with existing ISO products. That's not okay!

If Microsoft wants to introduce EOOXML to ISO as a recognized subset of ODF, that would require some serious harmonization work between the OASIS ODF TC and the MSECMA TC. That would be okay, but there is no indication that Microsoft would be willing to pursue this route. Even though Microsoft joined the original OASIS Open Office XML TC (ODF), they declined to actively participate in the work. They have however maintained their membership and observer status in the OASIS ODF TC. Observer status provides them with direct access to all documents, discussions, meetings, wikis, listservs, subcommittee work, commentaries, etc. The only thing is that they don't qualify to vote unless they actually attend the conference calls.

The harmonizing process between ODF and Chinese UOF is under way. There is no reason why the same can't be done with EOOXML.

Hope this helps, and thanks for the comment,
~ge~

Anonymous said...

And ACME 376 is the end-run around that.

Microsoft can consent to tie itself in knots explaining how da vinci won't work, while it's already up-and-running, and also how their own ODF-MSOOXML plugin will work, when it is so far behind schedule so as not to be funny.

A good number of important and useful developments in the computer industry came about because of individuals insisting on doing things behind bosses' backs. I suspect that with this double MSOOXML hassle, the growth of Microsoft's control will cause more trouble than it's worth for so many businesses that they'll look the other way when da vinci gets installed, just because they would rather have their systems working than not or intermittently.

Anonymous said...

When we talk about data conversion, India has been a great place to get those jobs done at a cheaper price.