Subversion vs. distributed version control

Yesterday I started playing with GNU arch, version 1.x. (I know there's a version 2.x now, but it seems to have a very different interface, and its documentation is even sparser.)

This is my first experience with a version control system that is not based on a centralized repository. And I must admit I find it a bit strange conceptually. Through the years I've tried SCCS, RCS, CVS, and now my default is Subversion. Mostly I'm quite happy with Subversion, particularly when working by myself or when there are just one or two other collaborators who need write access. It is indeed a vast improvement over CVS.

I'm using Subversion in a project course with about 10 M.S. students. The project platform is LAMP (Linux, Apache, MySQL, PHP). They log in to the Linux server to do most of their programming, and use the svn command-line utility. Since everyone is on the same machine, I set up the repository using a file:// URL, made everyone part of the same group, and gave that group write access to the repository.

Wow, this was a mistake. Periodically (like every three days or so) Subversion reports that the repository is corrupted, and I have to go in and run the recovery process, fix up the permissions again, etc. I'm not sure what it will take to make this work… perhaps just making sure that everyone's umask is always set exactly right. But sometimes other things seem to go wrong. We also use websvn for browsing the repository over HTTP, and sometimes it seems to leave behind weird files owned by www-data (the user that runs the web server).
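To make the umask suspicion concrete, here is a minimal Python sketch (not anything Subversion itself runs) of how the umask of whoever creates a file decides whether the rest of the group can write to it afterwards. The function name and the file name `rev-file` are made up for illustration, and the sketch assumes a POSIX system.

```python
import os
import stat
import tempfile


def group_writable_after_create(umask_value):
    """Create a file under the given umask; report whether the group may write to it."""
    old = os.umask(umask_value)  # os.umask returns the previous mask
    try:
        with tempfile.TemporaryDirectory() as d:
            path = os.path.join(d, "rev-file")  # hypothetical file name
            # Request rw for everyone; the kernel clears the bits set in the umask.
            fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o666)
            os.close(fd)
            return bool(os.stat(path).st_mode & stat.S_IWGRP)
    finally:
        os.umask(old)  # always restore the caller's umask


if __name__ == "__main__":
    for mask in (0o022, 0o002):
        print(oct(mask), group_writable_after_create(mask))
```

With a umask of 022, which is a common default, the file comes out without the group-write bit, so a single commit from one such user could leave behind a file that the other group members cannot modify. A umask of 002 keeps the bit set.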

Next time I will do this with an svn:// URL and let the svn user be the only one who ever touches the repository. This means having a separate set of passwords, stored (in the clear) somewhere in the repository and managed manually. So it's okay for a handful of users, and that's generally how I collaborate with co-authors and individual students.

Anyway, one version control strategy I rely on for myself is keeping a vendor branch. For example, I made a few hacks on WordPress 2.0.2, which I use to power this site. For this I keep a local Subversion repository within my home directory. It has two main branches (actually just directories in Subversion's persistent file system): the vendor branch and my development branch. When WordPress 2.0.3 is released, I load it into the vendor branch, look at the changes they made since the last release, and merge those changes into my own hacked version. So I spent a good part of my day yesterday figuring out how to do something like this with GNU arch.
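The decision behind a vendor-branch merge can be sketched in a few lines of Python. This is only a rough, file-level illustration: Subversion's own merge works on line-level differences, and the function name and the dictionaries mapping paths to contents are assumptions made up for the example.

```python
def merge_vendor_drop(base, theirs, mine):
    """File-level three-way merge of {path: content} dictionaries.

    base   -- previous vendor release (e.g. 2.0.2)
    theirs -- new vendor release (e.g. 2.0.3)
    mine   -- my hacked tree, derived from base
    Returns (merged_tree, conflicting_paths).
    """
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(theirs) | set(mine)):
        b, t, m = base.get(path), theirs.get(path), mine.get(path)
        if m == b:
            result = t  # I never touched it: take the vendor's version (or deletion)
        elif t == b or t == m:
            result = m  # vendor left it alone, or made the same change I did
        else:
            conflicts.append(path)  # both sides changed it differently
            result = m  # keep my version for manual review
        if result is not None:
            merged[path] = result
    return merged, conflicts
```

Files that only one side changed merge automatically. The files that both the vendor and I changed are the ones that need a human to look at them.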

Some concepts from arch are appealing: (1) creating clean deltas (change sets) for adding particular features or fixing particular bugs, (2) cherry-picking which deltas to apply to a particular tree, (3) mirroring and branching from projects where you don't have write access, (4) publishing a repository (archive) without needing any special software on the server, and so on.

But I'm not sure I have the hang of it yet. Version control is, without a doubt, a very complex problem. If we think we can solve it with simple tools, we're probably kidding ourselves. So why do I find Subversion much simpler than arch? Am I just more accustomed to its perspective on the problem? Or, as Tom Lord might say, is it because Subversion doesn't actually solve the problem at all?

© 2002–2015 Christopher League