Sunday 6 April 2008 @14:15
Many students are mystified, at first, by my ignorance of and apathy toward the Microsoft environment. To many of them, Microsoft is the computer. They’re younger than the IBM PC architecture itself, and didn’t experience the great diversity of home computers (Commodore, TI, Atari, Tandy, pre-Macintosh Apple) that my friends and I had in the early-mid ’80s. Nor did they witness the following decade of dueling proprietary Unix workstation vendors (DEC, Sun, SGI, HP).
The simple story is that I started using these Unix workstations in college in 1991, and never even bothered to install the clunky contemporary Windows 3.0 on my own PC. By late 1992, I was installing early distributions of Linux. I stayed in that academic Unix-using bubble until 2002. Sure, I had seen Windows 95 briefly at a summer job. But it wasn’t anything to take seriously. I’m a bit younger than that condescending bearded hacker in the Dilbert strip, but I’m clearly his heir.
I bring up this story because I read a number of interesting comments on Slashdot about resistance to Vista, and the extended end-of-support deadlines for XP. I’ll just pull out some examples here, first about programming languages:
As for J#, C#, VB and WebDev, we’re back to the same “How do I keep giving Microsoft money” question again. Those are not standards. They’re proprietary solutions and stuff you build on them will obsolete every time Microsoft decides it needs more of your money. It’s a trap. Don’t fall into it. If you must program in those soon-to-be dead languages then you’ve created your own predicament and nobody can help you. —symbolset
I guess that articulates why I resist VB and C# in our program, even though there is clearly demand for them in industry, and thus students want to learn them. Incidentally, another prof (who is not an anti-MS bigot) teaches those languages on occasion.
Here’s a provocative call for Microsoft to save themselves by turning Linux into Windows version 7:
This is going to sound crazy, but bear me out. So here’s what Microsoft does. They take the [Linux] OS and develop a Windows GUI for it. They pour a billion dollars or so into WINE development and research (while providing WINE’s coders with full access to existing Windows APIs) and they bring WINE’s performance and compatibility to dizzying heights. And then they sell it. Call it Windows, sell it as Windows and do what Apple’s done with Darwin. Keep the proprietary stuff proprietary and the OSS stuff OSS. You’d wind up with a rock-solid OS, and your users could run their old software until their apps received an update to the new system. Eventually WINE would no longer be needed.
This all sounds a lot like Apple, MacOS X and Classic, doesn’t it?
Anyway, there we go. I’m sure there are a thousand valid reasons why this couldn’t/wouldn’t work and naturally it will never happen. I understand that. I can dream though, can’t I? —penginkun
How delusional am I, that this sounds like a perfectly reasonable strategy? Problem is that Microsoft is completely allergic to it.
What Apple has done fantastically well with the OS X transition is to maintain basic compatibility between the GUI frameworks and the underlying command-line tools and system calls. Apple adopted CUPS as its print server and added some new GUIs and such, but continues to maintain ‘lpr’ and ‘cupsd.conf’. When some new technology like Spotlight is added, it comes with command-line and file-system support. Meanwhile, on Windows, you create a shortcut (its stand-in for a symbolic link) on the Desktop, and you can’t use it from the command line or Perl scripts or normal C code. It’s not part of the file system; it’s just a façade that the (bloated) API provides.
I think this is why Apple has achieved significant market share among CS types and other scientists. I’m not even certain it was an explicit design goal for OS X, but now that we’re a market segment, let’s hope they keep it up. One thing that Linux distributions are doing better is package management. APT is great. You can run APT on Mac of course (and even on the iPhone), but if Apple blessed it and you started seeing most applications and demos installed that way, it would be an improvement.
Anyway, I’m not sure I’d go as far as to say that Vista represents the complete downfall of Microsoft, but the mind-share monopoly is certainly fading.
Tuesday 1 April 2008 @9:28
A short video (1:22) of me venting my frustrations about email formats:
Transcript: In the old days, email was always plain text. And hard-core techies like me liked it that way.
Grudgingly, we came to accept HTML email. With hypertext markup, you can have bold-face, fonts, images.
But this is still okay because you can resize the text if you need to, the text is searchable, and Apple Mail on Leopard even recognizes dates and integrates well with the calendar program.
Where I have to draw the line is emails where the entire message is an image. It’s extremely common in our institution to create a full-page flier and then distribute it via email as an image. It doesn’t resize. It’s not searchable. The dates are not recognized. The MIME standard for multimedia email allows for plain-text alternatives in this case, but they are rarely available.
Here’s an even more egregious example, about the availability of our schedule of classes online. The link is blue. It’s underlined. But it’s not clickable. I can’t even copy and paste. I have to type it in from sight.
Just say no to image-based email. (And never mind the inconsistency of distributing a 4 MB Flash applet to demonstrate this simple point! At least I provided a plain text alternative.)
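For the curious, here is a minimal sketch of what a flier message with a plain-text alternative could look like, using Python's standard email package. The function name build_flier_message and the sample text are made up for illustration; this is not how any particular mail client or campus system does it.

```python
from email.message import EmailMessage


def build_flier_message(subject, plain_text, image_bytes):
    """Build a multipart/alternative message: plain text first, image last."""
    msg = EmailMessage()
    msg["Subject"] = subject
    # The plain-text part: resizable, searchable, and dates can be recognized.
    msg.set_content(plain_text)
    # The image rendition of the same flier. In multipart/alternative, later
    # parts are the preferred ones, so capable clients display the image.
    msg.add_alternative(image_bytes, maintype="image", subtype="png")
    return msg


if __name__ == "__main__":
    # Placeholder bytes stand in for a real PNG file (hypothetical data).
    demo = build_flier_message(
        "Spring schedule of classes",
        "The spring schedule of classes is now online.",
        b"placeholder image bytes",
    )
    print(demo.get_content_type())
    for part in demo.iter_parts():
        print(" ", part.get_content_type())
```

A text-only reader (or a search index) can then fall back to the first part instead of getting nothing at all.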
Wednesday 27 February 2008 @11:14
Seems unlikely, but who am I to argue with the algorithm?
Dear Amazon.com Customer,
We’ve noticed that customers who have purchased or rated Computers & Typesetting, Volume B: TeX: The Program (Computers and Typesetting, Vol B) by Donald E. Knuth have also purchased Learning Microsoft Publisher 2007 Student Edition by Faithe Wempen. For this reason, you might like to know that Learning Microsoft Publisher 2007 Student Edition will be released on March 11, 2008. You can pre-order yours by following the link below.
Tuesday 15 January 2008 @14:19
Apple’s new sub-notebook looks incredible, although I haven’t checked the specs yet. I was personally rooting for a tablet, but anyway…
I’m happy with my MBP. I thought it might be too large and heavy for my taste, given my affinity for my previous tiny Powerbook. What I have found is that having the iPhone around for quick email checking and minor googling means that I’m not as bothered with digging out the laptop for trivial stuff or leaving it behind while out and about.
And now I sound like a complete Apple-whore fan-boy. Whatever, it’s Tuesday!
Sunday 30 December 2007 @1:02
Yesterday was a bad GNU/Linux day. I was getting angry, and I rarely get angry. Every once in a while, I type “apt-get upgrade” and everything goes to hell. Like the print server’s ability to print from PDF. Right when I’m trying to print last-minute boarding passes, confirmation pages for hotels and rental cars, directions from the airport, etc.
I of course found workarounds, then in some cases had to work around the workarounds. But in all it took about 3 hours to figure out what was going on and get 10 pages printed. Something was broken in how CUPS talks to ‘pdftops’, causing it to hang. There were recent security updates to CUPS, but when I downgraded it, the problem persisted. It’s still not fixed, but I found and subscribed to a probably-relevant entry on bugs.debian.org.
I guess I use Debian ‘stable’ so that this sort of thing happens very rarely. The frustration had me questioning my commitment to GNU/Linux and other free software. But of course it’s not that simple: what I treasure is the hackability, and so some instability is inevitable. Even on a Mac, I’m likely to access unsanctioned functionality with “sudo vi /etc/cups/cupsd.conf” or whatever, but this kind of customization rarely survives updates.
I suppose the solution is to take updates more seriously, evaluating them the way a business would: apply them only when there’s sufficient time to run a systematic set of regression tests.
Friday 21 December 2007 @14:43
A few months back, I was at a workshop on campus where some other staff member noticed my Mac and remarked (disapprovingly) on how quickly some hackers had gotten the latest OS X release running on stock (non-Apple) x86 hardware. “Why do people do that… why not just buy a Mac?” he asked. “They’re good computers.” I offered that I might like to have such skilled hackers as students (or employees). This seemed to surprise him, but I think it’s actually true.
For the moment, let’s set aside arguments supporting the morality of digital restrictions circumvention (legal or not). Just assume that the hack is ethically dubious. Assume also that it’s non-trivial. Then, I maintain that I would aim to recruit the hacker. Why? Ethics can be instilled, but raw technical talent is so rare that it’s still a net win.
Tuesday 25 September 2007 @9:18
I keep getting emails and postcards reminding me to update my information for the Johns Hopkins alumni directory. How quaint, to print a bunch of alumni data on dead trees. So last century. Have these folks never heard of web sites for professional and social networking, or for that matter, Google?
Thursday 9 August 2007 @11:41
I think I was a tinkerer even before I encountered computer programming or computer science. And even though my research is pretty theoretical, I still enjoy breaking out the screwdrivers and anti-static wrist strap.
Our main home PC was a midrange Dell Optiplex that Art bought when he started med school in 2000. About two years later he switched to a Mac laptop and I converted the PC to a GNU/Linux workstation. We mainly use it as a home file and backup server, but it’s also my desktop when I’m working at home. Over the years I added substantially more disk capacity, bought a DVD writer, swiped a video card from an older machine for a dual-head display, etc. By this summer, I was itching for a more substantial upgrade. Software builds were pretty slow, Firefox was struggling with memory limitations, and occasionally I had trouble helping out friends with USB backup drives because the system didn’t support USB2.
One of the great things about the PC architecture (as opposed to laptops and small form-factor consumer systems like the Mac mini and iMac) is that it’s entirely possible to upgrade it piecemeal. I had two high-capacity disks and an optical drive that were newer than the base system — no point in replacing those. So I went onto Newegg and did some research on the latest specifications. I generally don’t like buying the very latest stuff because the price/performance ratio is too high. The economical sweet spot on the curve is usually a generation or two back.
I went for the AMD Athlon 64-bit X2 dual-core processor. I got a compatible ASUS mini-ATX motherboard, 2G RAM, and a new mini-tower case. One thing I knew I needed out of the motherboard was two IDE (PATA) buses: one for the optical drive, another for the two legacy disks. The newer I/O bus is called SATA. The prices on drives are so good that I bought a 320G SATA disk too. The new PC will have well over half a terabyte of storage over the 3 disks.
Putting it all together went okay. I scraped my fingers to bleeding twice.
The case seemed roomy — I chose it because having four hard drive bays was fairly rare for inexpensive cases — until I started putting the components inside. Before installing the disks, I booted an Ubuntu live CD to check that all the other components would work.
It turns out that three disks in a case this size is not ideal, even though it physically would hold four. After a few days of use, the disks were running hot. Really hot — the SMART temperature sensors reported 54°C (129°F)! Manufacturers are not always very precise about max operating temperatures. According to some numbers I was finding, this was on the high end, but not out of range. Also, the lifetime of the drive seems to depend more on the ambient case temperature, and ACPI was reporting 40°C on the motherboard.
Ultimately, I found a small fan in an older, unused system that I’ve been too lazy to take for recycling. I managed to secure it in between two of the drives and bring some air across them from the vents in the front of the case. Now the drive temperatures are in the range 45-47°C and reach 50°C only during heavy use (such as backups with rsync). I guess I’m satisfied with that, for now. But next time I think I will choose a case with more drive space, and with better front cooling facilities.
This is the first time I’ve built a PC from the motherboard up, and overall, it has been a good experience.
Tuesday 20 February 2007 @8:07
I’m giving a talk on Thursday for our CS club (a student ACM chapter). Our compilers course is hardly ever offered, because it ends up being a fairly arcane topic considering the career goals of the majority of our students. There are ways to make it more relevant of course, but I don’t want to argue either way on that today.
Instead, I decided to put together a fun little talk for the club on some of the ‘big ideas’ in the area. Here’s the abstract:
One of the more profound concepts in computer science is compiler bootstrapping: very often, the compiler for a programming language is written in that language itself. This raises an almost mystical question: what compiles the compiler? (And what compiled that compiler, and so on…) The first part of this talk is an adaptation of the famous Turing Award speech “Reflections on Trusting Trust” by Ken Thompson, co-inventor of the UNIX operating system. We explore the bootstrapping concept, and how to exploit it to devious ends. The second part is a very brief introduction to program analysis and compiler optimization, using static single assignment form.
I’ve been experimenting with some code to show that technique of teaching the compiler once, and then removing it from the source. Ken Thompson used the example of control codes, with a fragment of code that essentially said case '\n': return '\n'; but I found an enlightening post on the topic that cites this trick, from a Pascal compiler:
insertSymbolConstantBinding("INTEGER_MAX", INTEGER_MAX)
The value of INTEGER_MAX was ‘taught’ to the compiler at some point in its evolution, but since the compiler compiles itself, the explicit value no longer needs to appear in the source. Pray you don’t lose all the binaries!
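Here is a rough sketch of the same idea using the control-code example. It is only an analogy: Python is not self-hosting, so the role of "the existing binary" is played by the Python interpreter itself, and the function names decode_escape_v1 and decode_escape_v2 are mine, not Thompson's.

```python
def decode_escape_v1(ch):
    """First generation: the numeric code is spelled out by hand."""
    if ch == "n":
        return chr(10)  # here the translator is explicitly taught: \n is code 10
    raise ValueError("unknown escape: \\" + ch)


def decode_escape_v2(ch):
    """Later generation: the source merely restates the escape."""
    if ch == "n":
        # No number appears here any more. The meaning of "\n" is supplied
        # by whatever already-built tool processes this very source file.
        return "\n"
    raise ValueError("unknown escape: \\" + ch)


if __name__ == "__main__":
    # Both generations agree, but only the first one records the knowledge.
    print(decode_escape_v1("n") == decode_escape_v2("n"))
```

Once the second version is the only one left in the source tree, the fact that a newline is code 10 lives solely in the binaries.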
I’d like to turn the talk into a screen-cast, just because I’d like to experiment with that as a pedagogical format, and (I think) I have the tools. We’re starting to see lots of ‘Web 2.0’ tool-builders publish video tutorials. I doubt I’ll record any audio/video directly during my talk, but rather use that as a trial run of the script. Watch this space!
Thursday 15 February 2007 @18:40
Follow-up about BitTorrent: at some point, I read the FAQ and found that for BT to work correctly, one needs to forward ports to overcome the Network Address Translation done by the router. (Would have been easy except that I forgot my router config password and had to reset it.) But after doing that, the number of peers increased significantly, and with it both upload and download rates. That’s more like it!
In my email the other day, I received a message with an image claiming that:
Wow, sounds great, I thought. Where was this when I was doing my thesis research? So is it based on the lambda calculus? I read on:
Ah, that kind of ‘type system’. Come to think of it, the message was from myfonts.com…
Wednesday 14 February 2007 @10:02
I’m really not into file sharing. Really! We currently have a 40G music archive and the vast, vast majority of it is legit; I have a closet full of CDs to prove it. The few tracks that are less than legit are more likely to be ripped from a borrowed CD or copied from a friend’s hard drive than downloaded from an anonymous peer-to-peer network.
The first time I heard about BitTorrent, it sounded extremely cool. You get pieces of the file from various peers, and make the pieces you’ve got available to other peers. They call the whole thing a swarm. Until today, I have used it only to download images of Linux Live CDs — perfectly legit.
But even for less legit targets, I’m not sure it’s doing me much good, perhaps because my tastes are too obscure? If there’s just one peer, in rural Spain, with a 2kb/s throttle, then it’s going to take two days to download a 1-hour video? And if nobody wants the seeds I have, am I destined to remain a leech?
This doesn’t really seem worthwhile!
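A quick back-of-the-envelope check of that two-day estimate. The assumptions are mine: "2kb/s" is read as 2 kilobytes (2000 bytes) per second, and the 350 MB file size is a hypothetical figure for an hour of video.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def transferable_bytes(rate_bytes_per_s, days):
    """How many bytes arrive in the given number of days at a steady rate."""
    return rate_bytes_per_s * days * SECONDS_PER_DAY


def download_days(size_bytes, rate_bytes_per_s):
    """How many days a file of the given size takes at a steady rate."""
    return size_bytes / rate_bytes_per_s / SECONDS_PER_DAY


if __name__ == "__main__":
    rate = 2000  # assumed: 2 kilobytes per second
    print(transferable_bytes(rate, 2) / 1e6, "MB in two days")
    print(round(download_days(350_000_000, rate), 2), "days for 350 MB")
```

Under those assumptions, two days at that rate moves roughly 346 MB, which is in the right neighborhood for an hour of compressed video.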
I blurred the torrent filename, but for the curious, I’ve been a Showtime subscriber for the past 6 years, and for now I have no intention of canceling. I guess we subscribed for Queer as Folk, but since that ended, various other original series have kept us hooked: Huff, Sleeper Cell, L Word, Weeds, Penn & Teller Bullshit; and I’m looking forward to The Tudors and This American Life. But the other day, my DVR messed up and missed one episode in an ongoing series. I already pay for the production of these shows, so I firmly believe using an ‘alternative’ distribution channel to access them is legitimate.
Tuesday 16 January 2007 @18:36
So here I am installing a VNC client on my Powerbook so I can connect to my desktop Linux at work and control a VMware installation running Windows XP. And on that virtual XP? I’m running a GeekOS kernel on Bochs.

Just thought I’d share. Although I’d never personally choose XP for anything, I must admit that it has been convenient to be able to run it in VMware, just so I can see what kind of environment the majority of my students are using, and what problems they may run into. All the software I require for my courses is cross-platform, because I don’t want to be tied to anything. I can even cross-compile GeekOS on my PPC Mac and run it on Bochs there.
I managed to get a virtual XP running on my Linux desktop at work, but so far it doesn’t work at home… and that Linux machine at home is so underpowered at this point, I’m not sure I’d want it on there anyway. So using VNC to connect to it from elsewhere made sense.
More on GeekOS later, but so far hacking it is definitely fun. Learned more about segment registers on Intel this week than I ever needed to know.
Tuesday 26 December 2006 @22:07
For the past several weeks, I had been getting up to 30 spam comments per day in the moderation queue. None of them appeared on the site, but receiving the email notifications and having to clear out the queue periodically was a pain. Besides, when I set up WordPress, I took care to implement my own custom “Turing test,” where would-be respondents must answer simple questions like “What is Prof. League’s first name?” Were the spam-bots lucky or clever enough to be answering these questions correctly? Or were they somehow bypassing the test?
This morning, I finally had a chance to investigate what was going on. I added some tracing statements to the commenting functions, so that when they were invoked I would receive an email with some information about variables and control flow. Some tracing emails started showing up within 15 minutes, and I learned two things: the spam-bots were not providing correct answers to my Turing questions (that’s good), and the IP addresses in the traces and the ones getting spam into the moderation queue were disjoint (that’s bad). Well, good and bad. It means that the spam-prevention measures in the regular comment code are working, but also that there must be a back door.
By grepping the server logs for yesterday’s spam-submitting IP addresses — don’t know why I didn’t think of that first thing — I discovered the back door: trackbacks. This is a facility for one blog post to link to another as a comment. This is an interesting idea, but since it’s some other blogging software that does the posting, I can’t really implement extra spam prevention measures here. So I decided just to disable trackbacks and pingbacks completely. That should do the trick!
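The grep step could be sketched roughly like this. It assumes the server writes Common Log Format lines; the helper name paths_hit_by and the sample addresses (taken from the ranges reserved for documentation) are hypothetical, not my real logs.

```python
from collections import Counter


def paths_hit_by(log_lines, ip_addresses):
    """Count which request paths the given IP addresses asked for."""
    wanted = set(ip_addresses)
    hits = Counter()
    for line in log_lines:
        fields = line.split()
        # Common Log Format: host ident user [time zone] "METHOD path proto" ...
        if len(fields) < 7 or fields[0] not in wanted:
            continue
        path = fields[6].split("?", 1)[0]  # drop any query string
        hits[path] += 1
    return hits


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    trace_ips = {"198.51.100.7"}  # seen by the tracing statements
    spam_ips = {"192.0.2.10"}     # seen in the moderation queue
    log = [
        '192.0.2.10 - - [26/Dec/2006:09:15:02 -0500] '
        '"POST /wp-trackback.php?p=12 HTTP/1.1" 200 78',
        '198.51.100.7 - - [26/Dec/2006:09:16:40 -0500] '
        '"POST /wp-comments-post.php HTTP/1.1" 302 0',
    ]
    print(trace_ips.isdisjoint(spam_ips))
    print(paths_hit_by(log, spam_ips))
```

If the two sets of addresses are disjoint and the spam addresses only ever touch the trackback script, the back door is found.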
Wednesday 13 December 2006 @22:04
Next semester, I’ll be teaching an intermediate programming course on OOP and design patterns in C++. Additionally, I may do a series of projects based on GeekOS in my operating systems course. (I taught CMSC 412 at UMCP before the advent of GeekOS, but I see in it some influences from the more ad hoc projects we did back then.)
Recently I have been thinking about online tutorial, submission, and assessment systems. Since both of next semester’s courses will involve exchanging a good bit of code, I hit on the idea of using Subversion both to distribute project code to my students, and for them to submit their code for assessment. This has been done before; I found a SIGCSE paper on using CVS for this purpose [Reid & Wilson, 2005].
In the old days (the early 1990s), most CS students did their major programming assignments on a semi-centralized (UNIX) system, and most departments maintained some setuid script for managing submissions. At UMCP, my friend Gabe automated the assessment of assignments to amazing levels, with the help of Perl and shell scripts.
An area that I think is under-explored still is using some kind of automated tutor to help students in a CS0 or CS1 course comprehend and practice the very fundamentals of programming: conditionals, loops, arrays, etc. There was a special issue of JERIC recently (Journal on Educational Resources in Computing) on automated assessment, but the aim of many of the articles was to save time and give individualized feedback to classes with 400 students. That seems a little dated now, with CS enrollments down as they are, but I guess it may still occur at a few large schools.
I’m interested in automated assessment not for the time-saving or scalability, so much as for a mechanism that can encourage students to practice on their own time, outside of class, and in addition to assigned work. The system should be able to generate a variety of unique problems to solve, offer hints and help, and assess the student’s progress.
Anyway, I did figure out today how to set up Subversion as a submission tool. It requires the (slightly) more sophisticated access control that you get running it from Apache 2 and the authz module. I set up the top-level of the repository with a public/ folder, and folders for each student: alice/, bob/, carol/, etc. The instructor and TAs should be able to read and write anywhere, but students can read from public and read/write their own folders only. Here’s my authz file that seems to do the right thing:
[cs150s07:/]
league = rw
* =
[cs150s07:/public]
league = rw
* = r
[cs150s07:/alice]
alice = rw
[cs150s07:/bob]
bob = rw
Then, files provided for the assignments are committed to public/a1/, public/a2/, etc. and copied into the student folders with svn copy.
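With more than a couple of students, typing those per-student sections by hand gets tedious. Here is a small sketch that generates the same layout as the authz file above; the function name make_authz and its parameters are my own invention, and it assumes instructors and TAs are simply listed as staff.

```python
def make_authz(repo, staff, students):
    """Return authz text: staff rw everywhere; students r public, rw own folder."""
    lines = ["[%s:/]" % repo]
    lines += ["%s = rw" % name for name in staff]
    lines.append("* =")  # everyone else: no access at the top level
    lines.append("[%s:/public]" % repo)
    lines += ["%s = rw" % name for name in staff]
    lines.append("* = r")  # everyone else: read-only on public/
    for student in students:
        lines.append("[%s:/%s]" % (repo, student))
        lines.append("%s = rw" % student)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(make_authz("cs150s07", ["league"], ["alice", "bob", "carol"]), end="")
```

Called with the repository name, one instructor, and a class roster, it prints text that can be saved as the authz file.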
Two tips from the CVS paper that I think are good ideas: first, when students have problems and seek assistance, insist that they commit what they have to the repository, so you can update and help them out without the awkward emailing of files back and forth. When helping a student through a problem face-to-face, check out a fresh copy, show him how to fix the problem, and then wipe the fresh copy so he still has to fix it again on his own.
Second, if we can encourage students to commit often, we may get a better glimpse of their working habits (such as when they start on assignments) and confront them about problems early on. This buys back a little of the surveillance power we had when everyone did their work on the same machine: you know there’s a problem when Johnny hasn’t even logged in and the assignment is due in 5 hours.
As this semester is winding down, I feel the need to debrief myself a bit about how it went. But I’m going to try to hold off on that (at least publicly) until all the grades are in!
Tuesday 7 November 2006 @9:03
Computers suck. Just when information is most critical, it’s unavailable:

I’m responsible enough to know already where my polling place is (unless I arrive there and it’s mysteriously closed — that remains to be seen) but is that true of everyone in the state of New York?
Does the ‘plsql’ in the URL mean they’re using Oracle?
Update: contrary to my expectations, it was up again when I checked 15 minutes later. Just bad timing maybe.
Up-update: we should all re-familiarize ourselves with the definition of confirmation bias. I never would have posted (or even noticed) if the site had worked right away. 