Anyone who has ever dealt with Microsoft Outlook will know the .pst file format – it’s the binary, undocumented file in which all data from Outlook is stored – emails, contacts, calendar, you name it, it’s in there. Microsoft has announced that it will release detailed technical documentation on the Outlook .pst data format.
The news was announced by Paul Lorimer, Group Manager at Microsoft Office Interoperability, on the Interoperability@Microsoft weblog. Microsoft says that data portability has become very important, especially now that all sorts of crucial information is stored in closed formats, and more specifically, in Outlook .pst files.
To meet the needs of customers and partners, Microsoft will release detailed technical documentation about the Outlook .pst format.
This will allow developers to read, create, and interoperate with the data in .pst files in server and client scenarios using the programming language and platform of their choice. The technical documentation will detail how the data is stored, along with guidance for accessing that data from other software applications. It also will highlight the structure of the .pst file, provide details like how to navigate the folder hierarchy, and explain how to access the individual data objects and properties.
This documentation is still in its early stages, and the company is gathering feedback from partners and interested customers about the documentation, to ensure that it will be “clear and useful”. A wild guess is that it might arrive alongside the release of Office 2010, but I’m not basing that on anything other than common sense.
The documentation will be released under the Open Specification Promise, meaning that anyone can implement the .pst format in any way, on any platform, using any tool, free from any fears of patents. You do not need to contact Microsoft in any way.
A good move by Microsoft, but long overdue. Better late than never, I suppose, and it will also benefit open source Outlook equivalents like Evolution.
When I migrated away from Windows, both at work and at home, I spent something like a week trying to get my mail and contacts out of those darn PST files. I wound up losing a lot of data. It was about time MS did something about interoperability with Outlook, considering how widely it is used. Btw, I have a new job now and am forced to use Outlook…
I’ll trade your Outlook for my Lotus Notes.
I still have my old PST from my days of primarily Windows/Outlook. I’d love to import that directly into a Thunderbird install, which would let me use it across platforms.
Heck, even now I have a personal.pst on my work flashdrive for personal email that ends up at my work address. It’d be great to get home and open it directly under Thunderbird so the contents can merge into my bulk email.
Microsoft looking at opening another user data file format; I gotta admit support for that idea.
You could have done this ages ago. Thunderbird when used under Windows has supported importing all of your Outlook email and contacts for a heck of a long time.
Once you have your data out, you can then export your Thunderbird address book or emails to any platform where it runs.
If this is true, kudos to Microsoft. (I reserve final judgement until they have shown whether they intend to fully open it and/or migrate to a new, closed version for the next version of Outlook.)
I have a few PST files lying around. Before I installed my own IMAP server, I used Outlook to store all my email. I still have to export those messages from Outlook to my IMAP store. PST, however, mostly fulfilled its promise. Older versions of Outlook/PST suffered from a 2 GB limit, which caused some “silent” data loss. Newer versions do much better. I like that Outlook can store regular files in any folder (not as an attachment, but really as, for example, a Word document between some mail messages).
Anyway, enough cents – hopefully Microsoft will continue on this enlightened path. It’s about time.
“Newer” versions are not .pst files. They are .ost, a completely different format.
This is no different from opening the Word 95 .doc format … considering the newer versions are nothing like it.
I’m afraid you are mistaken.
Outlook uses PST, and PST files created by Outlook 2003 SP2 and up support sizes over 2 GB.
OST files are “offline PST files” created by Outlook if you enable “Cached Exchange Mode” when connecting to an Exchange mail server.
You’ve gotta be kidding me. Are they serious? :O
Having public specifications for the “.pst” format, or more accurately, having open tools to parse “.pst” files would be a blessing for many sysadmins.
PSTs are a headache. Not only do they continuously increase in size (because users never “compact” and the automatic compacting seems to be a myth) and in number, but they also change just by being opened, which does wonders for the backup systems.
Pain in the neck they are.
I spent some time last week moving past year’s email into PST files on a user’s machine. They retain access to older email and I don’t have to store it on the server. The user also doesn’t have to sit through long resync times for that multiple-gig sized data blob.
In this case, I was watching the targeted PST file size go down to get an idea of when I could come back to compact the next in line. I noticed the size of other PST files was decreasing as well.
This seemed most prevalent when compacting a file or during the pause between its end and my return. When Outlook was sitting unattended without an options box open, I didn’t see any file size decrease. I can’t confirm what event triggers Outlook’s automagic compacting. I have seen Outlook compacting files other than the one specified though.
They know the areas where they may be losing ground.
I’ve recently migrated employees from Outlook to Google and Thunderbird because the files are compatible and info transfer was easy.
Eventually they will be more open than they were in the past. This is the direction technology is going. The days of being stingy are over. It’s a give and take unless it’s worth paying for.
…long term move would be to use accepted standard file formats for the various components to ensure interoperability with anything else. We saw an article yesterday questioning a company’s faith in their products but when Microsoft continually refuse to adopt industry standards it certainly has to raise that question about them.
A step in the right direction at least, but a lot more to be done…
Edited 2009-10-26 21:55 UTC
Nice, never thought I would see the day. This is a definite win for developers!
There is some debate about whether this “Open Specification Promise” is compatible with the GPL or other free software licences:
http://en.wikipedia.org/wiki/Microsoft_Open_Specification_Promise#S…
Microsoft leaves this ambiguous with the statement, “we can’t give anyone a legal opinion about how our language relates to the GPL or other OSS licenses (sic)”.
But tell me, what is compatible with the GPL? Just BSD and other GPL-like licenses. It is hard to please them.
But opening up this file format is not an attempt to please anyone anyway.
You’re talking as if it’s absurdly difficult to make something GPL compatible. The MS-Samba deal is a good example of a deal done right.
Microsoft have plenty of legal power and could have made a clearer agreement if they had wished. I’m not saying that they are doing this on purpose, but it does raise eyebrows when they claim that this can be used by everyone when it probably can’t be used in software released under the most prevalent open source licences.
What you are wrongly assuming with your “Think twice” title is that we all care about being completely GPL compatible or not. I assume it is, but I don’t care if it is not.
I’m not assuming anything, and I’m not sure why you’re so keen to put words in my mouth.
The fact is that some of the best and most important software out there may not be compatible with this deal, meaning less choice for users. Maybe you don’t care about others, but I do.
This also contrasts to the “anyone can use it” marketing we are hearing from Microsoft.
Surely, for this story, it’s of no importance? After all, MS are talking about documenting how you would access a pst file, not releasing code that would enable you to do so.
I suppose that maybe the patent indemnification promised by the Open Specification Promise could be questioned in light of using the GPL to license any software created to access pst files but I just don’t see it.
What’s your opinion? I’m not up to scratch on the arguments of both sides and would love to hear what you think.
It is very important.
The documentation and the patent indemnifications are both part of the Open Specification Promise. I’m not a lawyer, but it looks like using the documentation is fine but the patent indemnification may not be (according to the SFLC). Without patent indemnification, the patent holder (in this case, Microsoft) reserves the right to demand royalties be paid for the use of software that uses the patented portions of the spec.
The GPL is one licence that this may not be compatible with. There are probably other licences with this problem too. This does need to be resolved on a licence-by-licence basis.
I see what you mean. I’ve done a bit of reading up and from what I can find out, the argument boils down to the patent indemnification promise only applying to the original software created and no other upstream software created on the strength of the original. This would make MS’s OSP incompatible with the GPL version 3, as far as I can see. Version 2 of the GPL should be fine though, as long as you don’t mind leaving yourself open to a potential patent suit on the part of MS.
Obviously, for GPL version 3 software this is a problem, but surely for this kind of thing BSD would be just as good?
Which means that it is not fine at all. Nobody wants to be at the mercy of a patent lawsuit. They are far more expensive and time consuming than most people or companies can afford.
You could purchase a licence, but that protects just you. Nobody else who uses the code is protected. Why is it GPL in the first place if nobody can use the code without being sued? That certainly removes the “freedom” from the concept of “Free(dom) Software”.
Novell have an agreement like this with Microsoft regarding Mono and .NET patents. Novell and paying Novell customers are supposedly covered, but nobody else is. So if you use Mono in Ubuntu or Fedora for example, you are potentially liable.
Licensing is a matter of choice for the originating developer(s). If you or they want to use code from elsewhere, it must be under a licence compatible to your/their own.
The fact is that most open source code out there is (L)GPL, and there are other licences compatible with it. You can’t exclude most open source code and then claim that anyone can use the specs. Projects aren’t going to change their licences simply to please Microsoft.
If you want an example of how to do this right, look at the Protocol Freedom Information Foundation deal with Microsoft, brokered by the SFLC and members of the Samba team:
http://samba.org/samba/PFIF/
They were very careful to ensure that all Free Software implementations would be covered, including GPLv3 ones like Samba.
So there’s nothing impossible about this. It all can be done right.
Microsoft may be moving to a new mail archive format
They already have. Outlook 2003 uses OST by default. No idea what Outlook 2007 uses. PST died with Outlook 2002 (XP).
Incorrect. See previous reply.
OST is an offline file that Outlook uses to keep a cache of an Exchange mailbox; it is not the same as PST!!! If any sane person clicks File->New->Outlook data file in Outlook 2007, you can choose between “Office Outlook Personal Folder File (.pst)” and “Outlook 97-2002 Personal Folder File (.pst)”. The new PST supports file sizes over 2 GB and Unicode; there are probably a lot more changes, but these are the most visible. So stop giving plusses to a person who spreads false information.
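For anyone who wants to check which kind of file they actually have, the PST/OST and ANSI/Unicode distinctions can be read from the first few bytes of the file. The following is only a minimal sketch, assuming the header layout described in Microsoft's published [MS-PST] documentation: a 4-byte magic `!BDN`, a 2-byte client magic at offset 8 (`SM` for PST, and as far as I know `SO` for OST), and a little-endian 2-byte version at offset 10 (14 or 15 for the old ANSI format, 23 or higher for Unicode). The function names `describe_pst_header` and `describe_pst_file` are made up for illustration and are not part of any Microsoft tool or API.

```python
import struct

# Assumed header constants, taken from the published [MS-PST] layout.
PST_MAGIC = b"!BDN"
CLIENT_KINDS = {b"SM": "pst", b"SO": "ost"}


def describe_pst_header(header: bytes) -> dict:
    """Classify a PST/OST file from the first 12 bytes of its header."""
    if len(header) < 12:
        raise ValueError("need at least 12 bytes of header")
    if header[:4] != PST_MAGIC:
        raise ValueError("bad magic: not a PST/OST file")

    # Bytes 4..7 hold a partial CRC, which this sketch does not verify.
    kind = CLIENT_KINDS.get(header[8:10], "unknown")
    (version,) = struct.unpack_from("<H", header, 10)

    if version in (14, 15):
        encoding = "ansi"      # old format, subject to the 2 GB limit
    elif version >= 23:
        encoding = "unicode"   # newer format, larger files supported
    else:
        encoding = "unknown"
    return {"kind": kind, "version": version, "encoding": encoding}


def describe_pst_file(path: str) -> dict:
    """Hypothetical convenience wrapper: read the header from a file on disk."""
    with open(path, "rb") as handle:
        return describe_pst_header(handle.read(12))
```

Calling `describe_pst_file("personal.pst")` on a real file should report whether it is an old-style ANSI file or a Unicode one, which is useful to know before trusting it with more than 2 GB of mail. The sketch reads only the header and never writes to the file.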
As soon as Evolution has full import of pst files (including contacts, calendar, etc) for the Windows port of it that is working fully stable, then I can start telling people to get rid of the piece of crap that is Outlook.
Seriously, the only time I have NOT had an issue with Outlook is when you are also using an exchange server. They want to fix their compatibility, then fix the bugs that Outlook has.
Can’t tell you how many clients I’ve had to tell that they have to reboot their entire computer, so that Outlook will work again. Simply closing it and opening it won’t work. Not to mention it’s a completely random reason why it stops working.
Not to mention the weird bugs of changing the port settings on an account, then clicking Ok, then it still doesn’t work, so you go into the settings again and they had reverted back to what they were! Seriously, when you have to change a setting 3 times to get it to stick, there is something wrong with your software.
I will rate Outlook as being the most unstable piece of software I have ever had the displeasure of troubleshooting.
I’m still waiting for the CIO to approve Mutt under Cygwin.
Support ticket, 16 months and counting .. 🙂
Alpine is the way to go
OSTs are pst format, but the ost signifies that it is an offline copy of what is on your exchange server.
Edited 2009-10-27 02:58 UTC
I hope this also helps Linux developers to understand the format and make the Thunderbird and Evolution mail clients better at importing data from PST, and also in some way at communicating with Exchange Server.
I finally had time to look at Evolution as I try to replicate all work functions outside of Windows. The week after upgrading the exchange server of course which is also when I learned that Evolution does not support direct connections with the new Exchange. I don’t think this will help non-Outlook client apps talk to Exchange Server.
Now, having native PST import or mounting a PST file as is done with Outlook; that is a possibility now and one I will watch for.
I don’t get the need for this fake interoperability. It’s my data, how come software companies are allowed to hold it hostage ?
The software industry ran wild long enough, time to protect the users/consumers (I’m a developer but a consumer as well).
File formats which hold my data should be published – it’s not yours. There’s no need for MS’ Johnny-come-lately or Apple’s harassing of BlueWiki.
You want to publish software which stores data in a custom format ? You’ll have to publish the format as well.
My data shouldn’t guarantee your business model.
Rubbish, you took the easy way and chose something that makes your life easy; nobody forced you. I’m sure when leftist econazis come to power they will force everyone to do as you say, but until then I’m glad that companies have the freedom to decide what to do. And so do consumers, who don’t need to buy stuff they don’t want; there is always pen and paper.
Unfortunately, IT managers are dumb asses most of the time and people have to live with idiotic software like Outlook. It’s not like you have a lot of choice in there.
Huh ? I meant “my” as a general case. My OS, development tools and apps are FLOSS (with the exception of flash).
In that case s/my/user/g, and look at the bigger picture.
I’m glad Microsoft are finally releasing the specifications, but perhaps this has something to do with some developers having taken a major effort already deciphering the (highly complex, mind you!) format and providing tools to recover your emails?
http://www.google.com/search?hl=en&q=recover+.pst
You could take your info into Outlook Express and then migrate it into Windows Live Mail.
For most small business and user desktops Outlook is overkill and the archaic email store file system used needs to be killed off. MS has done this with Windows Live Mail although I don’t know why they can’t use a decent email naming system for the email files that are stored. They still do something obscure with it but at least you have a better way of storing your messages akin to the way BeOS did it back over a decade ago.
Geez, even scripting automation in Outlook 2007 is a hit-and-miss affair. MS needs to scrap the Outlook code, start again from scratch, and provide decent free utilities for admins to migrate their data away from Outlook. It’s the least they can do for subjugating the computing world to Outlook for so long.