contrapunctus, by Christopher League
archive of July 2006

Subversion, the honeymoon is over

Really, I wanted to love you. You seem cleanly designed. I adore your model of a persistent file system, where even branches and tags are just sub-directories. Your commands mostly make sense. I appreciate that many of them work without repository access, so I don’t have to wait long to get a status or a diff.

But now you’re screwing me over. All I wanted to do was take the WordPress 2.0.4 upgrade for a spin in your vendor branch. I know it has been a few weeks since I last spoke to you. But now all you can tell me on my PowerBook is “Bad database version: compiled with 4.4.16, running against 4.3.29.” And on Debian, you say only “svn: bdb: Program version 4.2 doesn’t match environment version.” What did I do to deserve this?

You’re jealous of darcs, aren’t you? How petty. Anyway, it’s your affair with Berkeley DB that got us into this mess. I know, you’re seeing ‘FSFS’ now, whatever that is. You say things are better this way, but where does that leave me?

I suppose I should accept some blame too. Some of my repositories are private (read and written only by me) and I wanted them available for commits when I’m offline. So I put them in my home directory and synchronized with unison to other architectures and operating systems. I know, I know. To say this is ‘not recommended’ is an understatement. But somehow it seemed to work okay for a while.

I wish I could quit you.

But I need access to my files first.
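
For the record, the usual escape route from a Berkeley DB repository is a dump and reload into FSFS. A minimal sketch follows, assuming hypothetical repository paths; the dump step has to run on a machine whose Berkeley DB version can still open the repository (possibly after an ‘svnadmin recover’). By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: migrate a Berkeley DB-backed repository to FSFS via dump/load.
# OLD, NEW and DUMP are hypothetical paths. Set DRY_RUN=0 to run for real.
OLD=${OLD:-$HOME/svn/wordpress-bdb}
NEW=${NEW:-$HOME/svn/wordpress-fsfs}
DUMP=${DUMP:-/tmp/wordpress.dump}

migrate_to_fsfs() {
  if [ "${DRY_RUN:-1}" = 1 ]; then
    # Dry run: show the three steps without touching anything.
    echo "svnadmin dump $OLD > $DUMP"
    echo "svnadmin create --fs-type fsfs $NEW"
    echo "svnadmin load $NEW < $DUMP"
  else
    # Serialize every revision, create an FSFS repository, replay into it.
    svnadmin dump "$OLD" > "$DUMP" &&
    svnadmin create --fs-type fsfs "$NEW" &&
    svnadmin load "$NEW" < "$DUMP"
  fi
}
migrate_to_fsfs
```

The new repository is plain files on disk, which should also be friendlier to being synchronized between machines than a Berkeley DB environment.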

If crypto is outlawed…

Last week, I bought a pair of 200G IDE disks, just because they were dirt cheap. Probably I’ll use one at home and one at work. I already have a 150G at home for music and such.

I used to be a SCSI snob — and I guess in some ways I still am — but I just can’t afford that habit anymore! Although I miss the performance of SCSI, the price differential per GB is enormous. The drop in performance is definitely noticeable, particularly since I now have two large disks on the same bus. Any disk-intensive activity also drives up the CPU load, which it never would do with SCSI. And forget running more than one disk-intensive process at a time. If I’m still on the computer when ‘updatedb’ starts running, it’s time for bed.

Anyway, in rearranging my file systems at home, I decided to try something new. I now have my root and /home file systems on encrypted partitions. Why? Just because I can, I guess. It might be a fairly valuable technique on a laptop, which is more easily lost or stolen. At least then, you can be reasonably confident the thief can’t access your data.

On a home desktop machine though, crypto seems admittedly frivolous. Am I part of the tinfoil hat set, who thinks the FBI (or some darker, more sinister organization) is going to sneak in and confiscate or clone my drives? Do I have anything on there to hide anyway? Not really. But I do believe strongly in a right to privacy. And if we don’t exercise the rights we do have, we are likely to lose them.

The Disk Encryption HOWTO by David Braun was essential reading, although I didn’t follow its prescriptions precisely. You will need a Linux 2.6 kernel with ‘cryptoloop’ and ‘aes’ compiled in, and the ‘loop-aes-utils’ package that provides crypto-aware versions of ‘mount’ and ‘losetup’.

What happens, essentially, is this: I keep a small unencrypted boot partition near the beginning of the disk. It contains the kernel, the aforementioned ‘loop-aes-utils’, some scripts, a set of keys, and a few other essential binaries: sh, ls, and pivot_root. I configure grub to boot and root from this partition, and provide the kernel with a custom init script. This script prompts the console for a master password (must be 20 or more characters), and uses this to unlock an image containing the keys to each partition. The keys themselves are totally random 60-character strings.

Once the keys are available, the init script uses ‘losetup’ to configure crypto-enhanced loop-back devices for each partition. Then it can unmount the keys, mount the soon-to-be root partition, pivot_root to it, and invoke the real /sbin/init. The remaining partitions will be mounted automatically later on, so long as you use the /dev/loop devices in /etc/fstab, or better yet, refer to them by filesystem label.

  LABEL=debian-root  /      ext3  defaults 0 1
  LABEL=linux-home   /home  ext3  defaults 0 2
  LABEL=linux-swap   none   swap  sw       0 0
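
The boot sequence described above can be sketched roughly as follows. This is a reconstruction from the description, not the real script: every device name, loop number, mount point, and key path is made up, and the cipher name passed to ‘-e’ depends on which loop crypto flavor is installed. It prints the steps by default, since running them for real only makes sense as init on a machine laid out this way.

```shell
#!/bin/sh
# Sketch of the custom init script. All device names, loop numbers and key
# paths are hypothetical. DRY_RUN=1 (the default) only prints each step.
dry() { [ "${DRY_RUN:-1}" = 1 ]; }
run() { if dry; then echo "$*"; else "$@"; fi; }

# attach LOOPDEV PARTITION KEYFILE: set up an encrypted loop device, reading
# the passphrase from the key file on stdin (-p 0) instead of the terminal.
attach() {
  if dry; then
    echo "losetup -e aes -p 0 $1 $2 < $3"
  else
    losetup -e aes -p 0 "$1" "$2" < "$3"
  fi
}

boot_sequence() {
  # 1. Unlock the key image; losetup prompts for the master password.
  run losetup -e aes /dev/loop7 /keys.img
  run mount -r /dev/loop7 /keys
  # 2. One crypto loop device per encrypted partition.
  attach /dev/loop0 /dev/hda2 /keys/root.key
  attach /dev/loop1 /dev/hda3 /keys/home.key
  attach /dev/loop2 /dev/hda4 /keys/swap.key
  # 3. Put the keys away again.
  run umount /keys
  run losetup -d /dev/loop7
  # 4. Switch to the real root; the old root (the boot partition) ends up
  #    mounted on /boot inside the new one. Then hand over to the real init.
  run mount /dev/loop0 /newroot
  run cd /newroot
  run pivot_root . boot
  run exec chroot . /sbin/init
}
boot_sequence
```

Only the root file system is mounted here; /home and swap are attached to their loop devices but left for /etc/fstab to pick up later.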

It sounds fancy, but once I was familiar with the tools and their capabilities, it wasn’t that bad to set up. The HOWTO describes booting off of a USB stick that contains the keys and kernel; this way authentication is based on something you know (master password) and something you have (the USB stick). This was too much of a pain for my home setup, plus my BIOS is too old to boot from USB.

What took the most work was allaying my fears that I’d be totally hosed when something goes wrong with the boot process. It turns out that the current Ubuntu Live CD (6.06) includes a kernel with the required modules. So I can boot from the Live CD, mount my /boot partition, and then manually use the keys to mount my encrypted partitions. The only important thing is to keep a safe backup of the boot partition, especially the keys. If I lose those 60-character keys I really am hosed. Currently I have /boot mirrored on both disks, and the keys file copied on various other machines.

Why not just encrypt /home, or even just $HOME for the current user? Encrypting the root filesystem is something of a pain, involving as it does a pivot_root and delicate boot-time hacking. As the HOWTO points out, a GNU/Linux system really makes no guarantees about information flow; there’s no telling what stuff from /home may show up in /var/log or wherever. So it’s simplest just to encrypt everything, including the swap partition.

So maybe this display of Linux wizardry makes up for my gaffe about iptables earlier in the week. :)

(Funny that I started this post complaining about the performance of IDE drives, and then proceeded to add a layer of encryption on top of that. I haven’t done extensive benchmarking, but I did run ‘iozone’ a few times, and as far as I can tell the crypto only slows down reads and writes by 1 or 2%.)

Unison inode inadequacy

Okay, I resolved my performance problems with Unison. It seemed like it was taking far too long to search for changes; I noticed that it paused for significant lengths of time on extremely big files. If there are no changes and the archive files are intact, then it should just have to stat the file, so the time per file should be constant, not proportional to file size.

But this wasn’t happening. Then I realized that my recent disk reorganization probably had something to do with it. I installed a new disk, then repartitioned and moved file systems around on my home machine recently (more on that later), and this of course changed the inode numbers on all the files, which unison tracks in its archive.

Now, my expectation was that unison would be slow the first time around, but after noticing all the inode changes, it would be fast thereafter. This didn’t seem to happen. After a full sync (which was painful because I had to do it in pieces to avoid the dropped ssh connection), I had to delete the archive files, and then resync (again, in pieces). And now the new archive file has the new inode numbers and the normal sync is fast again. Yay!

Or, at least that’s my model of what happened.
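
That model can be illustrated with a toy version of the fast check. This is only a sketch of the idea (remember a cheap stamp per file, and re-read the contents only when the stamp differs); it is not Unison’s code, the function names are invented, and the ‘stat -c’ format is GNU-specific.

```shell
#!/bin/sh
# stamp FILE: print inode number, size and mtime. The -c format is GNU stat
# syntax (an assumption; BSD stat spells this differently).
stamp() { stat -c '%i %s %Y' "$1"; }

# check FILE SAVED_STAMP: say whether the contents would need re-reading.
check() {
  if [ "$(stamp "$1")" = "$2" ]; then
    echo "fast path"
  else
    echo "rescan"
  fi
}

demo() {
  dir=$(mktemp -d) || return 1
  echo "pretend this is a huge file" > "$dir/old"
  saved=$(stamp "$dir/old")
  echo "untouched file: $(check "$dir/old" "$saved")"
  # Same bytes and same mtime, but a new inode, as after moving everything
  # onto a freshly made file system.
  cp -p "$dir/old" "$dir/new"
  echo "copied file: $(check "$dir/new" "$saved")"
  rm -rf "$dir"
}
demo
```

The copied file has identical contents and timestamp, yet it fails the stamp comparison purely because of its inode number, which is the situation every file was in after the disk shuffle.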

Port forwarding

I’m trying to understand iptables on Linux 2.4 or 2.6, for a fairly simple task, but it doesn’t seem to do anything.

My desktop machine at work is behind a firewall. As a department, we have control over just one server that has a range of ports open to the wider internet. So I either ssh twice, or tunnel as needed to access services on my own machine. I also wrote a tiny C program to open a socket and forward all traffic back and forth; I run it from inetd on the server, so now if I ssh to port XYZ on the server, that will be forwarded directly to my desktop.

Unfortunately, things aren’t working so well lately, and I think it might be the fault of my little C program. I will ssh through it, only to have the connection reset after a few idle minutes. This doesn’t happen when I ssh directly to the server, only through my forwarder program.

It’s so bad that when trying to synchronize between home and work with unison, my home machine takes much longer to look for changes, and by the time it tries to communicate changes to the work machine, the connection has died.

So it seems like the right way to do this is throw away my crummy C program and just do port forwarding in the kernel. The tutorials and FAQs make it seem easy:

  iptables -t nat -A PREROUTING -p tcp --dport PORT \
      -j DNAT --to ADDRESS:22

The rule shows up in the tables just fine, but it doesn’t seem to change anything when I try to connect to the specified port. I wrote a 1 to /proc/sys/net/ipv4/ip_forward, as suggested. I checked that the modules ip_tables and iptable_nat were loaded. I’ve tried it both on 2.4 and 2.6 kernels. Still no changes.

My Linux kernel knowledge is fairly good these days, but networking is certainly the weak spot. In fact, in obtaining my 3 degrees in CS, I don’t think I ever once took a networking course! Not that it would necessarily help me now…

Update (23:35) Some tracing with tcpdump revealed what was happening. I compared the packet-slinging going on with a successful connection to that with my faulty iptables rules. It turns out (I presume) that the DNAT (destination network address translation) needs a corresponding SNAT (source NAT). I thought I gathered from the various tutorials and docs that iptables does the reverse translation for you. Ah, but now I understand… this isn’t really the reverse translation; it’s the same packet, but now the other end will know where to send the ACK. Blah, at least it works now. Here’s the successful configuration, with the actual IP addresses replaced by SERVER and DESKTOP.

# Generated by iptables-save v1.2.11 on Wed Jul 26 23:30:21 2006
*nat
:PREROUTING ACCEPT [7757:741448]
:POSTROUTING ACCEPT [4166:279366]
:OUTPUT ACCEPT [4175:279906]
-A PREROUTING -p tcp -m tcp --dport 2000 -j DNAT --to-destination DESKTOP:22
-A POSTROUTING -d DESKTOP -p tcp -m tcp --dport 22 -j SNAT --to-source SERVER
COMMIT
# Completed on Wed Jul 26 23:30:21 2006
# Generated by iptables-save v1.2.11 on Wed Jul 26 23:30:21 2006
*filter
:INPUT ACCEPT [939121233:266118375723]
:FORWARD ACCEPT [316:50624]
:OUTPUT ACCEPT [1194943253:1032490859121]
COMMIT
# Completed on Wed Jul 26 23:30:21 2006
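
The same two nat rules, written out as individual commands with a note on what each does to a packet’s addresses. SERVER and DESKTOP are the placeholders from the saved configuration, CLIENT stands for whatever machine is connecting from outside, and the ‘run’ wrapper is just a convenience so the script prints the commands unless DRY_RUN=0 is set (running them needs root).

```shell
#!/bin/sh
# The nat rules above as plain commands. SERVER and DESKTOP are placeholders,
# not real addresses. DRY_RUN=1 (the default) only prints the commands.
run() { if [ "${DRY_RUN:-1}" = 1 ]; then echo "$*"; else "$@"; fi; }

forward_ssh() {
  # CLIENT -> SERVER:2000 becomes CLIENT -> DESKTOP:22.
  run iptables -t nat -A PREROUTING -p tcp --dport 2000 \
      -j DNAT --to-destination DESKTOP:22
  # CLIENT -> DESKTOP:22 becomes SERVER -> DESKTOP:22, so the desktop
  # answers the server, which undoes both translations on the way back.
  run iptables -t nat -A POSTROUTING -d DESKTOP -p tcp --dport 22 \
      -j SNAT --to-source SERVER
  # Kernel forwarding has to be switched on as well.
  run sysctl -w net.ipv4.ip_forward=1
}
forward_ssh
```

Without the second rule the desktop would see CLIENT as the source and try to reply to it directly, bypassing the server that holds the translation state.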

So now I can throw away my little clunky inetd-spawned C program. But still the connection isn’t staying alive long enough for unison to do its thing. ARGH.

Dependence on proprietary software

Something has become a bit worrisome in my computing life. Since I got my PowerBook last Fall, I’ve allowed myself to become increasingly dependent on closed software applications.

One of the things I treasure about open standards and simple file formats is that I can still easily read and edit my emails and documents going back to September 1991, to the day I typed my first command at a Unix prompt. That’s nearly 15 years ago!

Many people fret about bit rot: put your newborn’s pictures in some seemingly ‘archival’ format like JPEGs on CD-ROMs, and your ability to revisit them when your kid turns 21 is very much not assured. There are two issues here: the physical media, and the file formats. In the net-centric, GNUish world I first entered as an undergrad, neither issue seems to be much of a problem.

With ‘offline’ physical media — cartridges, cards, disks, etc. — you must be extremely vigilant to copy all your stuff from one dominant form to the next during the narrow window in time when both are available. Copy your stack of 5¼″ floppy disks onto 3½″ disks. Copy those onto Zip disks. Copy those onto CD-RWs. Copy those onto DVD-Rs. Copy those onto external USB-2 hard disks. Copy those onto whatever the hell is next. Who has the patience for all that? But if you get lax and skip a step, you end up with valuable stuff on a 5¼″ floppy but a computer that only supports 3½″ and CD. Now what do you do? (Substitute the latest technologies as needed.)

This is one reason why I never recommend any kind of offline storage medium, including today’s popular USB sticks. Many folks in personal computing thought it was a major coup when Apple released the first home computer with no floppy drive — the iMac in 1998 — but the DECstation 3100s we used at Hopkins had no external storage facility at all, and they were produced in 1989. They had internal hard disks and ethernet, and that’s it. It’s still basically all I need; some of my machines can write CDs or DVDs, but I really hardly ever use that functionality.

So, my strategy for keeping data alive through the years is just to copy it over the network from one Unix machine to the next, whenever I change institutions and workstations. Nowadays, I always keep multiple copies alive (home, work, laptop) as a backup strategy as well. This has the added advantage that whenever you buy a new machine or disk, it generally has 10 times or more the capacity of the previous one. So bringing along all the old stuff every time costs very little space.

Now, as for the file formats themselves, this has until now been very easy as well. On Unix-y systems, the plain text file is still king, and the few binary formats tend to be open, stable, and supported by multiple applications (think JPEG, PS, PDF). There have always been exceptions: xfig is one that I used way back. And with the more desktop-oriented applications of Gnome, KDE, and Mac OS, there are even more exceptions: I currently rely on gnumeric and gnucash. But as long as the apps are portable, open-source, and provide a variety of export formats, I’m not too worried.

Incidentally, XML-based formats are often touted as a solution here, but they only get you so far. Sure, a text-based format is going to be easier to decode than an opaque, arbitrary, binary format. But open up the .apxl (XML) file used by Apple Keynote in your text editor and tell me with a straight face that it would help you reconstruct your presentation if you no longer had access to Keynote.

I started out writing this with the intent to think ‘out loud’ about what Mac applications I’ve come to depend on, and how I might reduce that dependence and transition back to mostly open source stuff. (Then I can make use of my GNU/Linux desktops at home and work again, instead of carrying the Mac laptop back and forth always.) But this post has become long already, and I’m ready to head home and seek out dinner, so maybe it’s best just to publish this and restart that brainstorm another day.

Restaurant week

It’s restaurant week in NYC, which means two weeks of prix fixe menus at some of the higher-end establishments in the city. They do this roughly 3 times per year.

Last night, a group of us met at Nice Matin, on 79th and Amsterdam. The restaurant takes its name from the most popular newspaper on Côte d’Azur, so we hoped to relive some of our lovely experiences in Southern France last year.

We split the cheapest Côtes du Rhône red (still yummy), and I ordered the soupe de poissons for a starter and the arctic char in a mussel sauce for the main. The former was served with croutons and sides of aïoli and gruyère. All very tasty.

The place was a bit louder and more boisterous than I expected, especially for 9 p.m. on a weeknight, but that’s what to expect during restaurant week. We were a group of 5, and were seated at a table for 4 with an extra chair on the end, so it was a bit tight… er… intimate.

There have been a number of complaints on Chowhound about the service, but I thought they did fine, particularly given the restaurant week crowds. I’d go back.

We need to find out if there’s a way to consolidate our OpenTable accounts. For some reason, we opened two separate ones, and so we’re not accumulating points as fast as we should!

Ani

Ani DiFranco was great last night at SummerStage in Central Park. We had hoped to see her in Montréal while we were there, but that show was sold out.

Ani and her accompanist coaxed a lot of varied sound out of just an acoustic guitar and upright bass. But the highlight for me was her lyrics: very clever and complex, full of word play and irony. We have nearly all her albums, but I’m mostly familiar with Little Plastic Castle and Up⁶. The performance encouraged me to listen more carefully to the others as well.

One of the strongest segments was the poem called ‘Reprieve’ from her upcoming album of the same name. Now, I’m generally not very big on poetry, believing as I do that eight of the scariest words in the English language are, “I’m going to read this poem I wrote.” But this was pretty amazing.

It did allude to the concept that war is a product of the patriarchy, and if women ran the world there wouldn’t be any. I hear this sentiment voiced periodically as part of the new anti-war movement, where it dovetails nicely with modern adaptations of Aristophanes’ Lysistrata. Although it might be a heartening thing to believe, I am aware of precious little evidence that it’s actually true. Admittedly, all women throughout recorded history have been products of the patriarchy themselves, and so their behavior can be understood only as reaction to it. We have yet to perform controlled experiments that could confirm or refute the hypothesis.

I guess I basically believe that all humans are engaged in a constant struggle to mollify their inner barbarians. And while there may be differences between the sexes, this isn’t one of them.

The idea of North

I just returned from a week in Montréal. This was an opportunity to visit (the tail end of) the Jazz Festival, and to reconnect with my friend, co-author, and former office-mate.

Unfortunately, getting there turned into a nightmare. We argued about whether to go by car, train, or plane, but we ultimately found a reasonable fare on Priceline, so we flew. It was US Airways, with a longish layover in Philadelphia. Shortly after landing at PHL, we learned that our Montréal flight was canceled, due to traffic congestion. So we rerouted through Boston. But then one leg or the other of that trip was delayed. And at the airport — as in the city itself — restaurants start closing at 8:00.

Anyway, here’s the Reader’s Digest condensed version: we left home at 10 in the morning, and got to my friend’s house in Montréal at 1:00 the next morning. By my reckoning, we could have driven there and back in that time.

Coming home was slightly better, although there was another moment of panic at PHL when the monitor listed our flight to LGA as delayed until half past midnight. It turned out to be a fluke. Still, we left my friend’s house at 3 in the afternoon and got home at 1 in the morning.

Sad, but true: the overhead of getting to/from the airport and the risk of being delayed or canceled are too high to justify flying anywhere within a 10-hour drive. (Non-stop flights minimize the risk, of course.)

Just as I started to write this post, iTunes shuffled its way onto Glenn Gould playing Partita #6 in E minor. So I chose a suitably Canadian (though not specifically Québécois) title for this post.

More on the trip later.

The more you know

I just watched a 45-minute ACLU video on how to assert your rights during police encounters. It reviewed the 4th, 5th, and 6th amendments, complete with reenactments and alternate re-reenactments. It was fairly cheesy production-wise, but an important message in my opinion.

One thing that bothered me, though, is that in most of the scenarios presented, the targets of police attention actually did appear to be guilty of some crime. The white kids driving to the concert did have pot in the car, for example. They had everything to lose by consenting to a search. And so the video could have been titled “How to get away with doing illegal stuff.”

This is unfortunate, because one could easily come away from watching this video with the all-too-common sentiment “if I’m not doing anything wrong, then I have nothing to fear.” What is much more interesting to me is persuading people that it’s vital that we assert our rights even when we’re not doing anything wrong.

Lately I’ve been trying to promote signed and encrypted email again, among less technical friends. And I routinely encounter the similar sentiment, “My email is just not that personal or interesting.”

I first used email in 1991, and first learned about strong cryptography (PGP) in about 1992. I was thrilled, and I immediately dashed off encrypted messages to my good friends Alice and Bob. If you had told me then that in 15 years, people would still be sending plain text messages out in the open where anyone could read or alter them, I’d have thought you were nuts.

Oh sure, it’s fairly common for computer nerds to have PGP or GPG keys, but in most cases they’re not routinely used for email; it’s just too inconvenient. (They are routinely used in some quarters for signing code; cf. Debian.) But isn’t it strange that my bank would send me an email with a URL where I can read my latest statement? Why not send the statement directly through the email? Answer: because we have reasonably good, widespread encryption standards for the web, but still not for email.

So I tried to look into what widespread standards do exist for email, because it certainly isn’t PGP/GPG. I haven’t quite straightened out all the acronyms yet, but it seems like X.509/PKCS #7 is fairly common. Thawte offers free personal certificates, so I got myself one. I even met with a network of enthusiasts to get notarized (at a Starbucks on the upper west side). This just means that, in exchange for showing my passport to a couple of strangers, I can now put my real name in my certs, rather than just my email address.

All this stuff is supported fairly well in Apple Mail and the Keychain Assistant. My partner and I now routinely exchange encrypted messages. And now I’ve started signing messages I send to others, to see how their systems deal with it. The results to date are not encouraging.
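
For anyone curious what an X.509/PKCS #7 signed message looks like outside a mail client, the openssl command line can produce and check one. This is a sketch under stated assumptions: the key, the self-signed certificate, and the address are throwaway stand-ins for a CA-issued personal certificate, and everything happens in a temporary directory.

```shell
#!/bin/sh
# Sketch: sign and verify an S/MIME message with the openssl command line.
# The self-signed certificate and the address are hypothetical stand-ins.
smime_demo() {
  dir=$(mktemp -d) || return 1
  (
    cd "$dir" || exit 1
    printf 'Meet me at the usual place.\n' > msg.txt
    # Throwaway key and self-signed certificate, valid for one day.
    openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
        -subj '/CN=Alice/emailAddress=alice@example.org' \
        -keyout key.pem -out cert.pem 2>/dev/null || exit 1
    # Wrap the text in a multipart/signed message with a PKCS #7 signature.
    openssl smime -sign -text -in msg.txt \
        -signer cert.pem -inkey key.pem -out signed.eml || exit 1
    # Check the signature; -noverify skips the certificate chain check,
    # which a self-signed certificate would otherwise fail.
    openssl smime -verify -noverify -in signed.eml -out verified.txt 2>/dev/null
  )
  status=$?
  rm -rf "$dir"
  return $status
}
smime_demo && echo "signature ok"
```

The message body stays readable in signed.eml; the signature travels as a separate MIME part, which is the part that mail clients without S/MIME support tend to show as a mystery attachment.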

Anyway, our esteemed president can take credit for my resurgent interest in the bill of rights. One day last month I got so pissed off by some executive transgression or another (sad that I don’t even remember which one) that I joined three organizations on the same day: the ACLU, the Electronic Frontier Foundation, and Americans United for the Separation of Church and State. Do I hear an ‘Amen’?

Oh yeah, the president has even made me appreciate the 2nd amendment more, which I interpret as being primarily about the ability (and responsibility) of the citizenry to overthrow a tyrannical government. ;) But YMMV, as IANACS.*

*CS = Constitutional Scholar