Thursday, October 1, 2009

Analysis: Microsoft’s plan to open the PST format


By David Gewirtz

This has been an interesting week for Microsoft and Outlook. Oh, sure, all the big news has been about the consumer availability of Windows 7, but that's not the really big news. The really big news is Microsoft's plan to open up the PST format.

We've written extensively about PST files here in OutlookPower, and I even discussed the format at length in my book Where Have All The Emails Gone?, which covers the email that went missing during the Bush White House years.

If you're not familiar with the term PST file, here's a short introduction: it's Outlook's file format. Nearly everything that Outlook stores in your email file is stored in the PST file. In fact, your email file is your PST file. Oh, sure, Outlook dumps a pile of other detritus all over your hard drive (see Where Outlook hides its secret stuff), but the real meat of your email is stored in PST files.

Up until now, the format for those files has been Microsoft-proprietary. Some developers have hacked the format to create file repair utilities, but there's very little public knowledge about a format that stores so much of our life data.

But this week, Paul Lorimer, Group Manager, Microsoft Office Interoperability, announced that Microsoft will be fully documenting the format. This is big. And while some pundits theorize this all came about because of pressure from the European Union, the motivation doesn't really matter.

What matters is this is good for us all.

It is never good for a file format to be closed and hidden from view. First, of course, it means we don't truly own our data. But second, it prevents others from making improvements, adding capabilities, debugging problems, and helping to smooth migration.

Initially, Microsoft's reward for opening the format is likely to be a small migration away from Outlook to something like Gmail. We can certainly expect Google to brag about its ability to read PST files within minutes of the format's full documentation.

And while it's always good to see lock-in reduced, there's more to it than that, especially for those of us who live within Outlook. First, although the PST file stores nearly all of our mail-related information, it's not without its reliability problems. Microsoft has some tools for fixing the files, but with an open format, new opportunities will be available for outside developers to build more powerful diagnostic and repair products.

Second, this is also a win for developers. While MAPI and the plug-in architecture, along with VBScript, have created a vibrant market for Outlook add-ons, there's always more that can be done, and having access to the data format is likely to make that possible. It's also likely to let developers improve the performance of code that previously relied on cruder ways of scanning the full data set.

I would be remiss, though, if I didn't mention a downside. This format is very complex and it is possible for developers to act upon the data file and corrupt it in ways previously unseen. This is wizard-level programming and we hope that less-than-seasoned developers won't try to go mucking around in the deep, dark corners of our mail files.

Even so, we're thrilled that Microsoft has taken this step. Whether they were compelled to do so by an external authority, by competitive desire, or simply by a wish to be good digital citizens, we all gain -- and Outlook is made stronger by this action.

And so, we officially send a hearty OutlookPower atta-boy in Redmond's direction.