Sunday 30 December 2007 @1:02
Yesterday was a bad GNU/Linux day. I was getting angry, and I rarely get angry. Every once in a while, I type “apt-get upgrade” and everything goes to hell. Like the print server’s ability to print from PDF. Right when I’m trying to print last-minute boarding passes, confirmation pages for hotels and rental cars, directions from the airport, etc.
I of course found workarounds, then in some cases had to work around the workarounds. But in all it took about 3 hours to figure out what was going on and get 10 pages printed. Something was broken in how CUPS talks to ‘pdftops’, causing it to hang. There were recent security updates to CUPS, but when I downgraded it, the problem persisted. It’s still not fixed, but I found and subscribed to a probably-relevant entry on bugs.debian.org.
I guess I use Debian ‘stable’ so that this sort of thing happens very rarely. The frustration had me questioning my commitment to GNU/Linux and other free software. But of course it’s not that simple: what I treasure is the hackability, and so some instability is inevitable. Even on a Mac, I’m likely to access unsanctioned functionality with “sudo vi /etc/cups/cupsd.conf” or whatever, but this kind of customization rarely survives updates.
I suppose the solution is to take updates more seriously, evaluating them the way a business would: apply them only when there’s sufficient time to run a systematic set of regression tests.
Thursday 9 August 2007 @11:41
I think I was a tinkerer even before I encountered computer programming or computer science. And even though my research is pretty theoretical, I still enjoy breaking out the screwdrivers and anti-static wrist strap.
Our main home PC was a midrange Dell Optiplex that Art bought when he started med school in 2000. About two years later he switched to a Mac laptop and I converted the PC to a GNU/Linux workstation. We mainly use it as a home file and backup server, but it’s also my desktop when I’m working at home. Over the years I added substantially more disk capacity, bought a DVD writer, swiped a video card from an older machine for a dual-head display, etc. By this summer, I was itching for a more substantial upgrade. Software builds were pretty slow, Firefox was struggling with memory limitations, and occasionally I had trouble helping out friends with USB backup drives because the system didn’t support USB2.
One of the great things about the PC architecture (as opposed to laptops and small form-factor consumer systems like the Mac mini and iMac) is that it’s entirely possible to upgrade it piecemeal. I had two high-capacity disks and an optical drive that were newer than the base system — no point in replacing those. So I went onto Newegg and did some research on the latest specifications. I generally don’t like buying the very latest stuff because the price/performance ratio is too high. The economical sweet spot on the curve is usually a generation or two back.
I went for the AMD Athlon 64-bit X2 dual-core processor. I got a compatible ASUS mini-ATX motherboard, 2G RAM, and a new mini-tower case. One thing I knew I needed out of the motherboard was two IDE (PATA) buses: one for the optical drive, another for the two legacy disks. The newer I/O bus is called SATA. The prices on drives are so good that I bought a 320G SATA disk too. The new PC will have well over half a terabyte of storage over the 3 disks.
Putting it all together went okay. I scraped my fingers to bleeding twice.
The case seemed roomy — I chose it because having four hard drive bays was fairly rare for inexpensive cases — until I started putting the components inside. Before installing the disks, I booted an Ubuntu live CD to check that all the other components would work.
It turns out that three disks in a case this size is not ideal, even though it physically would hold four. After a few days of use, the disks were running hot. Really hot — the SMART temperature sensors reported 54°C (129°F)! Manufacturers are not always very precise about max operating temperatures. According to some numbers I was finding, this was on the high end, but not out of range. Also, the lifetime of the drive seems to depend more on the ambient case temperature, and ACPI was reporting 40°C on the motherboard.
Ultimately, I found a small fan in an older, unused system that I’ve been too lazy to take for recycling. I managed to secure it in between two of the drives and bring some air across them from the vents in the front of the case. Now the drive temperatures are in the range 45-47°C and reach 50°C only during heavy use (such as backups with rsync). I guess I’m satisfied with that, for now. But next time I think I will choose a case with more drive space, and with better front cooling facilities.
This is the first time I built a PC from the motherboard up, and overall, it has been a good experience.
Tuesday 26 June 2007 @8:47
Hm, this space has been quiet for a while, but for justifiable reasons: I have two journal manuscripts submitted since the summer break began.
I’m never thrilled about writing for journals, because it often means that the key problem is already solved, and I would usually rather work on new problems than “dot all the i’s” on old ones. On the other hand, it’s liberating to escape the strict space constraints of a conference paper. On the third hand, constraints are sometimes cited as catalysts for creativity. I’m reminded of the proverb “I wrote you a long letter because I didn’t have time to write a short one.”
I have also been ‘sharpening the saw’, also known as… Emacs hacking! Version 22.1 was finally released, and I took it as an opportunity to run through the manual and look for all the great little features and tweaks that have become available since the last time I studied the manual so intently. For example, just one thing that I adore for Java programming is glasses-mode (o^o). On-screen, it inserts some customizable little character in between LongCamelCaseWords so that you see them as Long·Camel·Case·Words. Ha!
Now I’d like to ‘sharpen my shell’ too. Zsh has lots of great stuff that I’m not currently using. I learned shell scripting in the early 90s on straight Bourne shell and tcsh, and only recently learned I could do concise parameter-frobbing things like ${file/foo/bar} rather than `echo $file | sed 's/foo/bar/'` or whatever. Tab-completion for sub-commands (of svn, darcs, etc.) and host names (for ssh) would be great, and I know there are some directory-hopping features (beyond pushd/popd) that would help me. But one thing I’m grappling with is that I currently use zsh both in regular xterms and inside Emacs shell-mode. In the latter case, a lot of the fancy stuff in zsh won’t work. So do I avoid running shells inside Emacs, or hack shell-mode, or get term-mode working instead? Or, maybe forget zsh and do everything with eshell? Am I prepared to run always in Emacs, even when logged in to remote machines? I’m stuck.
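For the record, here is a small side-by-side of the two substitution styles mentioned above. It is only a sketch: the file name is made up for illustration, and it assumes bash or zsh, since plain POSIX sh does not have the ${var/pattern/replacement} form.

```shell
#!/usr/bin/env bash
# Assumes bash or zsh; ${var/pattern/replacement} is not part of POSIX sh.
file="notes-foo-2007.txt"     # hypothetical file name, for illustration only

# Old habit: pipe through sed for a one-word substitution.
via_sed=$(echo "$file" | sed 's/foo/bar/')

# Built-in parameter expansion: replaces the first match, no external process.
via_expansion=${file/foo/bar}

# A doubled slash replaces every match instead of only the first.
all_matches=${file//o/0}

echo "$via_sed"          # notes-bar-2007.txt
echo "$via_expansion"    # notes-bar-2007.txt
echo "$all_matches"      # n0tes-f00-2007.txt
```

The expansion form also avoids the quoting headaches of building a sed expression from variables, which is probably the bigger win in real scripts.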
Meanwhile, I cleaned up /usr/local/ on most of my machines. I try to avoid installing anything that’s not managed by apt, even if I have to backport it myself (such as with emacs22 on Debian etch). But sometimes it’s inevitable: either it’s something impossibly obscure, or I need a newer version than what’s available already, or it’s something I have hacked on myself and I need my version installed. So now what I do is keep a branch in /usr/local/src/, install it to /usr/local/stow/, and everything else in /usr/local/ is a symbolic link managed by GNU Stow. This should solve the problem of discovering some problematic file or library in local that I make-installed six years ago, and can’t remember what package it’s from or why it’s there.
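To make the layout concrete, here is a rough sketch that rebuilds it under a scratch directory so it can run without root. The package name "hello-1.0" is hypothetical, and the symlink is created by hand with ln to show what running "stow hello-1.0" from the stow directory would set up; it is not a substitute for Stow itself.

```shell
#!/usr/bin/env bash
# Scratch directory standing in for /usr/local (assumption: run as a normal user).
prefix=$(mktemp -d)
pkg=hello-1.0                 # hypothetical package name

# The package is installed into its own tree under stow/.
mkdir -p "$prefix/stow/$pkg/bin" "$prefix/bin"
printf '#!/bin/sh\necho hello\n' > "$prefix/stow/$pkg/bin/hello"
chmod +x "$prefix/stow/$pkg/bin/hello"

# Done by hand here; "cd $prefix/stow && stow $pkg" would create this
# relative symlink when bin/ already exists as a real directory.
ln -s "../stow/$pkg/bin/hello" "$prefix/bin/hello"

# Years later, the link target says which package a file came from.
owner=$(readlink "$prefix/bin/hello")
echo "$owner"
```

The payoff is that last readlink: any mystery file in the tree points straight back at the package directory it belongs to.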
Wednesday 19 July 2006 @18:51
Something has become a bit worrisome in my computing life. Since I got my Powerbook last Fall, I’ve allowed myself to become increasingly dependent on closed software applications.
One of the things I treasure about open standards and simple file formats is that I can still easily read and edit my emails and documents going back to September 1991, to the day I typed my first command at a Unix prompt. That’s nearly 15 years ago!
Many people fret about bit rot: put your newborn’s pictures in some seemingly ‘archival’ format like JPEGs on CD-ROMs, and your ability to revisit them when your kid turns 21 is very much not assured. There are two issues here: the physical media, and the file formats. In the net-centric, GNUish world I first entered as an undergrad, neither issue seems to be much of a problem.
With ‘offline’ physical media — cartridges, cards, disks, etc. — you must be extremely vigilant to copy all your stuff from one dominant form to the next during the narrow window in time when both are available. Copy your stack of 5¼″ floppy disks onto 3½″ disks. Copy those onto Zip disks. Copy those onto CD-RWs. Copy those onto DVD-Rs. Copy those onto external USB-2 hard disks. Copy those onto whatever the hell is next. Who has the patience for all that? But if you get lax and skip a step, you end up with valuable stuff on a 5¼″ floppy but a computer that only supports 3½″ and CD. Now what do you do? (Substitute the latest technologies as needed.)
This is one reason why I never recommend any kind of offline storage medium, including today’s popular USB sticks. Many folks in personal computing thought it was a major coup when Apple released the first home computer with no floppy drive — the iMac in 1998 — but the DECstation 3100s we used at Hopkins had no external storage facility at all, and they were produced in 1989. They had internal hard disks and ethernet, and that’s it. It’s still basically all I need; some of my machines can write CDs or DVDs, but I really hardly ever use that functionality.
So, my strategy for keeping data alive through the years is just to copy it over the network from one Unix machine to the next, whenever I change institutions and workstations. Nowadays, I always keep multiple copies alive (home, work, laptop) as a backup strategy as well. This has the added advantage that whenever you buy a new machine or disk, it generally has 10 times or more the capacity of the previous one. So bringing along all the old stuff every time costs very little space.
Now, as for the file formats themselves, this has until now been very easy as well. On Unix-y systems, the plain text file is still king, and the few binary formats tend to be open, stable, and supported by multiple applications (think JPEG, PS, PDF). There have always been exceptions: xfig is one that I used way back. And with the more desktop-oriented applications of Gnome, KDE, and Mac OS, there are even more exceptions: I currently rely on gnumeric and gnucash. But as long as the apps are portable, open-source, and provide a variety of export formats, I’m not too worried.
Incidentally, XML-based formats are often touted as a solution here, but they only get you so far. Sure, a text-based format is going to be easier to decode than an opaque, arbitrary, binary format. But open up the .apxl (XML) file used by Apple Keynote in your text editor and tell me with a straight face that it would help you reconstruct your presentation if you no longer had access to Keynote.
I started out writing this with the intent to think ‘out loud’ about what Mac applications I’ve come to depend on, and how I might reduce that dependence and transition back to mostly open source stuff. (Then I can make use of my GNU/Linux desktops at home and work again, instead of carrying the Mac laptop back and forth always.) But this post has become long already, and I’m ready to head home and seek out dinner, so maybe it’s best just to publish this and restart that brainstorm another day.
Sunday 21 May 2006 @19:00
I’m working on a rather sophisticated program in Java, where I ended up building on lots of libraries and other programs. Java is never my first language choice, but with the new features in 5.0 (or 1.5 or Java 2 or whatever the hell they’re calling it)—generics, for-each loop, auto-boxing, enums, assertions, variadic methods—I find it almost usable. And it is fairly easy to bring in external libraries: download the jar file, set your class path, browse the javadoc site, and you’re on your way.
The system is not ready for release yet, but I did start contemplating the licensing terms today. Because of all the libraries I’m depending on, it’s kind of complicated. I’m a big advocate of free software. For end users, the license doesn’t matter very much; any of the ‘open source’ ones will do. For lone developers, you just choose one that matches your goals. For me, that’s GNU GPL or LGPL, depending on the situation.
Unfortunately, if you’re building a system that incorporates the work of many other people, it gets complicated. Here are the libraries and programs I’m embedding… so far:
- org.kohsuke.bali — BSD-new
- com.colloquial.arithcode — BSD-new
- org.apache.tools.bzip2 — Apache 1.1
- xerces — Apache 1.1
- junit — Common Public License 1.0
- gnu.bytecode — GPL 2
- gnu.getopt — LGPL 2
So, under what license can I distribute my own code, which is combined with all of these systems? The main complication seems to be that the Apache and Common licenses are not strictly compatible with the GPL, even though they are meant to be, “in spirit.”
Some people would blame this on the GPL. It does seem to be the odd man out; if gnu.bytecode were released under Apache/BSD or even LGPL, then I could distribute the whole thing under Apache/BSD and be done with it. But I’m a big supporter of the GPL, and I can’t blame Per Bothner for selecting it for the bytecode library (part of Kawa, a Scheme-to-Java system, by the way).
I see a few options here:
1. Remove dependencies on the Apache/Common components, making them optional at build time, then release under GPL. End users can still grab and link those libraries, but I don’t have to redistribute them for a basic configuration.
2. Remove dependence on gnu.bytecode, perhaps using Byte-Code Engineering Library instead (it has an Apache license). Then release everything under Apache or BSD. Regardless of the relative merits of the two libraries, I’d hate to reject one just because it’s GPL.
3. Perhaps it’s possible to redistribute Apache/BSD-licensed software under the GPL, even though I’m not the copyright holder? That seems to be the opinion of Roy T. Fielding, of the Apache Software Foundation:
Whether or not [Apache and GPL] are considered compatible by the FSF is an opinion only they can make, but given that a derivative work consisting of both Apache Licensed code and GPL code can be distributed under the GPL (according to our opinion), there really isn’t anything to be discussed. — 24 Jan 2004
Option 3 would be ideal, if it turns out to be legal. I usually assume that I’m bound to redistribute under the same license the author used, but really it’s just the GPL that requires that… ?