Something has become a bit worrisome in my computing life. Since I got my PowerBook last fall, I've allowed myself to become increasingly dependent on closed software applications.
One of the things I treasure about open standards and simple file formats is that I can still easily read and edit my emails and documents going back to September 1991, to the day I typed my first command at a Unix prompt. That's nearly 15 years ago!
Many people fret about bit rot: put your newborn's pictures in some seemingly ‘archival’ format like JPEGs on CD-ROMs, and your ability to revisit them when your kid turns 21 is very much not assured. There are two issues here: the physical media, and the file formats. In the net-centric, GNUish world I first entered as an undergrad, neither issue seems to be much of a problem.
With ‘offline’ physical media – cartridges, cards, disks, etc. – you must be extremely vigilant to copy all your stuff from one dominant form to the next during the narrow window of time when both are available. Copy your stack of 5¼″ floppy disks onto 3½″ disks. Copy those onto Zip disks. Copy those onto CD-RWs. Copy those onto DVD-Rs. Copy those onto external USB 2.0 hard disks. Copy those onto whatever the hell is next. Who has the patience for all that? But if you get lax and skip a step, you end up with valuable stuff on a 5¼″ floppy but a computer that only supports 3½″ and CD. Now what do you do? (Substitute the latest technologies as needed.)
This is one reason why I never recommend any kind of offline storage medium, including today's popular USB sticks. Many folks in personal computing thought it was a major coup when Apple released the first home computer with no floppy drive – the iMac in 1998 – but the DECstation 3100s we used at Hopkins had no external storage facility at all, and they were produced in 1989. They had internal hard disks and Ethernet, and that's it. That's still basically all I need; some of my machines can write CDs or DVDs, but I hardly ever use that functionality.
So, my strategy for keeping data alive through the years is just to copy it over the network from one Unix machine to the next, whenever I change institutions and workstations. Nowadays, I always keep multiple copies alive (home, work, laptop) as a backup strategy as well. It helps that whenever you buy a new machine or disk, it generally has 10 times or more the capacity of the previous one, so bringing along all the old stuff every time costs very little space.
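One thing this approach leaves open is how to know that a copy really arrived intact. Here is a minimal sketch of one way to check, assuming Python 3 is available on both machines; the function names `manifest` and `compare` are made up for illustration, and this is not a description of any tool I actually use for the copying itself.

```python
import hashlib
from pathlib import Path


def manifest(root):
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    root = Path(root)
    result = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256()
            with path.open("rb") as handle:
                # Read in 1 MiB chunks so large files need not fit in memory.
                for chunk in iter(lambda: handle.read(1 << 20), b""):
                    digest.update(chunk)
            result[path.relative_to(root).as_posix()] = digest.hexdigest()
    return result


def compare(original, copy):
    """Return (missing, changed): files absent from the copy, and files that differ."""
    missing = sorted(name for name in original if name not in copy)
    changed = sorted(
        name for name in original
        if name in copy and original[name] != copy[name]
    )
    return missing, changed
```

Calling `compare(manifest(old_dir), manifest(new_dir))`, where `old_dir` and `new_dir` stand for wherever the two copies happen to live, returns two lists: files that never made it across, and files whose contents no longer match. Both lists empty means the new copy holds everything the old one did.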
As for the file formats themselves, keeping them readable has so far been very easy as well. On Unix-y systems, the plain text file is still king, and the few binary formats tend to be open, stable, and supported by multiple applications (think JPEG, PS, PDF). There have always been exceptions: xfig is one that I used way back. And with the more desktop-oriented applications of Gnome, KDE, and Mac OS, there are even more exceptions: I currently rely on gnumeric and gnucash. But as long as the apps are portable, open-source, and provide a variety of export formats, I'm not too worried.
Incidentally, XML-based formats are often touted as a solution here, but they only get you so far. Sure, a text-based format is going to be easier to decode than an opaque, arbitrary, binary format. But open up the .apxl (XML) file used by Apple Keynote in your text editor and tell me with a straight face that it would help you reconstruct your presentation if you no longer had access to Keynote.
I started out writing this with the intent to think ‘out loud’ about which Mac applications I've come to depend on, and how I might reduce that dependence and transition back to mostly open source stuff. (Then I could make use of my GNU/Linux desktops at home and work again, instead of always carrying the Mac laptop back and forth.) But this post has become long already, and I'm ready to head home and seek out dinner, so maybe it's best just to publish this and restart that brainstorm another day.