November 2002

London calling

We spent Sunday in London. First to see Marsyas, Anish Kapoor’s giant red sculpture in the Tate Modern (this year’s Unilever Series installation). As Pete points out, it is very big, designed to look like it had to be positively rammed in to the Turbine Hall and almost didn’t fit. The title suggests that it is (part of) the flayed skin of a mythological figure, represented in Jack Kirby style as a titanic giant.

We only stayed a moment, because we were really there to see the Turner Prize exhibits at the Tate Britain. The place has been extensively remodelled since we last visited—even the route from the tube stop to the entrance was different—using similar white creamy stone to the British Museum refit. Jeremy and I liked the Turner Prize stuff.

I really liked Keith Tyson’s collection of poster-sized sketches of crazy ideas and images. (One of the complaints people were making in the comments room was that modern artists cannot draw. This is patently not the case with Keith Tyson’s idea-posters.) His three-dimensional works start in poster form—in this case he had the diagram for The Thinker (After Rodin) and the sculpture itself in the same room. His Thinker is sort of the reverse of Rodin’s: instead of representing the appearance of a person thinking, it does something that represents the thinking itself: there is a computer system inside the column running a virtual world simulation. If you put your ear to the column you can hear its ‘thoughts’.

To round off the cultural evening we watched the low-budget British post-apocalyptic movie 28 Days Later. Afterwards as we walked home we passed some of the landmarks from the deserted London of the film...

[Written 2002-11-06]

Update (2003-03-08): Corrected the spelling of Anish Kapoor’s name and ‘Marsyas’

Picky, Skin, SaxLifter

Found some time to continue work on the Picky Picky Game. I have something which, given a graphics file, writes it in to the correct place in the directory structure. Tonight’s task was a routine for generating the index page, based on the pictures stored so far. In the eventual web application, this routine will be invoked in CGI scripts whenever a new picture is added or vote recorded. For the present I can just run the Python script (one of the ways in which creating web apps in Python is less hassle than, say, ASP .Net or webclasses).

The index page format is mainly controlled through a ‘skin’ file. This has most of the HTML, with special XML tags for interpolating the dynamic content. This way hopefully Jeremy will be able to hack the HTML without touching any of the application code. (The immediate inspiration for the term skin comes from the Helma Object Publisher system, which does something similar, but using JavaScript.)
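A minimal sketch of the skin idea, assuming placeholder tags of the form <ppg:NAME/>. The tag syntax, the ppg prefix, and the render_skin function are all invented for illustration; the real skin format is not shown here.

```python
import re

# Hypothetical skin: ordinary HTML plus <ppg:NAME/> placeholders.
SKIN = """<html><body>
<h1><ppg:title/></h1>
<ul><ppg:pictures/></ul>
</body></html>"""


def render_skin(skin, values):
    """Replace each <ppg:NAME/> tag with values[NAME].

    Tags with no matching value are left in place, so a typo in the
    skin shows up in the output rather than vanishing silently.
    """
    def substitute(match):
        return values.get(match.group(1), match.group(0))

    return re.sub(r"<ppg:(\w+)\s*/>", substitute, skin)


page = render_skin(SKIN, {
    "title": "Picky Picky Game",
    "pictures": "<li>panel 1</li>",
})
```

The point of the split is that everything inside SKIN can be edited as plain HTML, while the dictionary of values is the only thing the application code has to produce.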

The picture metadata is written in XML which is straightforward enough except that Python’s native SAX support is broken: it does not support XML namespaces! I have fixed this with my own SAX filter dubbed SaxLifter: it processes startElement events by scanning the attributes for namespace prefixes, maintaining a stack of namespace mappings, and generating startElementNS events. Presumably if I were using the XML-SIG or 4Thought enhancements to Python things would work better. Sigh.
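The filter described above might look roughly like the following. This is a sketch of the technique written against the current standard-library xml.sax, not the actual SaxLifter code; the class names NamespaceLifter and Recorder and the example namespace URI are assumptions.

```python
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import AttributesNSImpl


class NamespaceLifter(ContentHandler):
    """Receive plain startElement events and forward startElementNS ones."""

    def __init__(self, downstream):
        ContentHandler.__init__(self)
        self.downstream = downstream
        # Stack of prefix -> URI maps; the top entry is the current scope.
        self.scopes = [{"xml": "http://www.w3.org/XML/1998/namespace"}]

    def _resolve(self, qname, is_attribute=False):
        """Map a qualified name to a (uri, localname) pair."""
        if ":" in qname:
            prefix, local = qname.split(":", 1)
            return (self.scopes[-1][prefix], local)  # KeyError if undeclared
        if is_attribute:
            return (None, qname)  # unprefixed attributes have no namespace
        return (self.scopes[-1].get("") or None, qname)

    def startElement(self, name, attrs):
        # Collect this element's xmlns declarations into a new scope.
        scope = dict(self.scopes[-1])
        for qname in attrs.getNames():
            if qname == "xmlns":
                scope[""] = attrs.getValue(qname)
            elif qname.startswith("xmlns:"):
                scope[qname[6:]] = attrs.getValue(qname)
        self.scopes.append(scope)
        # Re-key the remaining attributes by (uri, localname).
        values, qnames = {}, {}
        for qname in attrs.getNames():
            if qname == "xmlns" or qname.startswith("xmlns:"):
                continue
            key = self._resolve(qname, is_attribute=True)
            values[key] = attrs.getValue(qname)
            qnames[key] = qname
        self.downstream.startElementNS(
            self._resolve(name), name, AttributesNSImpl(values, qnames))

    def endElement(self, name):
        self.downstream.endElementNS(self._resolve(name), name)
        self.scopes.pop()

    def characters(self, content):
        self.downstream.characters(content)


class Recorder(ContentHandler):
    """Downstream handler that just records the namespace-aware events."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.events = []

    def startElementNS(self, name, qname, attrs):
        self.events.append(("start", name, dict(attrs.items())))

    def endElementNS(self, name, qname):
        self.events.append(("end", name))


DOC = (b'<p:picture xmlns:p="http://example.org/ppg" p:panel="3">'
       b'<title>Hi</title></p:picture>')
recorder = Recorder()
xml.sax.parseString(DOC, NamespaceLifter(recorder))
```

Each element pushes a copy of the enclosing scope, so declarations made on an inner element disappear again when that element ends.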

The overall strategy is to generate as much static HTML as possible—that is, instead of creating the HTML for the list of pictures afresh each time someone visits the site (which is what PHP and ASP, etc., do), I intend to generate it only when a new picture is added to the list. Since adding pictures will happen much more rarely than viewing the list, this reduces the overall load on the web server. The aim is to use CGI only in the pages that make a change (adding a picture or voting).
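As a rough illustration of that strategy, the sketch below rebuilds the index page only inside the function that adds a picture. The function names, the file layout, and the HTML are assumptions made for the example, not the game's real code.

```python
import html
import os
import tempfile


def write_index(pictures, path):
    """Regenerate the static index page from the current picture list."""
    items = "\n".join(
        '<li><img src="%s" alt=""></li>' % html.escape(name, quote=True)
        for name in pictures)
    page = "<html><body><ul>\n%s\n</ul></body></html>\n" % items
    with open(path, "w") as f:
        f.write(page)


def add_picture(pictures, name, index_path):
    """Runs only when a picture is added; viewing the page never calls this."""
    pictures.append(name)
    write_index(pictures, index_path)


# Demo in a scratch directory standing in for the web root.
site = tempfile.mkdtemp()
index_path = os.path.join(site, "index.html")
pictures = []
add_picture(pictures, "panel1-a.png", index_path)
add_picture(pictures, "panel1-b.png", index_path)
```

After this runs, the web server can hand out index.html as an ordinary file with no script involved.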

Things your mother didn’t tell you

Opera 5 omits the boundary parameter when uploading files. Lynx 2.8.2 does not support uploading files at all (but, oddly, does generate multipart/form-data forms properly—it even gives the charset parameter to the content-type of its form items). Python’s multifile module raises an exception on all of the above, for some inputs.

I guess that if I want to handle uploaded images, I get to write my own multipart/form-data parsers from scratch. I have already done this in C++ for work; I guess I can do it again in Python. Sigh.

Picky Picky Game: Upload pictures, EAGAIN

I have now written a parser for HTML-4.0 file uploads (forms with enctype multipart/form-data). It will need some finessing to get character encodings to work right, but for the simple cases I tried it uploaded files flawlessly, and moreover, plugged in to the back-end script I mentioned in an earlier installment.
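For readers curious what such a parser involves, here is a deliberately small sketch. It is not the parser described above: it assumes the whole body is already in memory, that the boundary is known, that every part has at least one header, and that headers are not folded over several lines. The name parse_multipart is invented for the example.

```python
def parse_multipart(body, boundary):
    """Split a multipart/form-data body into (headers, content) pairs.

    body and boundary are bytes; headers come back as a dict keyed by
    lower-cased header name, content as bytes.
    """
    delimiter = b"\r\n--" + boundary
    # Prefix CRLF so the first delimiter matches like all the others.
    chunks = (b"\r\n" + body).split(delimiter)
    parts = []
    for chunk in chunks[1:]:  # chunks[0] is the (usually empty) preamble
        if chunk.startswith(b"--"):
            break  # "--boundary--" marks the end of the message
        # Drop the CRLF that ends the delimiter line, then split the
        # headers from the content at the first blank line.
        header_block, _, content = chunk[2:].partition(b"\r\n\r\n")
        headers = {}
        for line in header_block.decode("latin-1").split("\r\n"):
            name, _, value = line.partition(":")
            headers[name.strip().lower()] = value.strip()
        parts.append((headers, content))
    return parts


BODY = (b'--XyZ\r\n'
        b'Content-Disposition: form-data; name="caption"\r\n'
        b'\r\n'
        b'hello\r\n'
        b'--XyZ\r\n'
        b'Content-Disposition: form-data; name="pic"; filename="a.png"\r\n'
        b'Content-Type: image/png\r\n'
        b'\r\n'
        b'\x89PNG\r\n\x1a\n\r\n'
        b'--XyZ--\r\n')
parts = parse_multipart(BODY, b"XyZ")
```

Splitting on CRLF plus the delimiter, rather than on the bare boundary string, keeps binary content that happens to contain CRLF intact.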

Alas! When I tried uploading from Jeremy’s NT box, my Python program crashed with an IOError exception with errno=EAGAIN. I guess I need to do some sort of loop to fill my buffer. Ho hum.

More on Opera’s boundaries

It occurs to me I may be being harsh on Opera; I notice that elsewhere they show a preference for splitting MIME parameters over multiple physical lines. For example, they use

Content-disposition: form-data;
    name="fred"

as opposed to

Content-disposition: form-data; name="fred"

It is just about possible that this confuses thttpd so that it clips everything after the first CRLF when passing the headers to my script via CGI...?
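If folded headers are the culprit, whatever reads them needs to join the continuation lines back together before looking for parameters. A minimal sketch of that unfolding step, assuming the raw header text is available as a string (unfold_headers is a made-up name):

```python
import re


def unfold_headers(raw):
    """Join continuation lines onto the line before them.

    A line break followed by spaces or tabs continues the previous
    header, so the break and the indentation collapse to one space.
    """
    return re.sub(r"\r?\n[ \t]+", " ", raw)


folded = 'Content-disposition: form-data;\r\n    name="fred"\r\n'
unfolded = unfold_headers(folded)
```

A server that instead stops at the first CRLF would pass on only the part before the fold, which is the clipping suspected above.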

CGI upload woes

On Monday I was troubled by EAGAIN interruptions when reading in a CGI script’s data. It turns out Python has a cgi module already. But when I tried creating a script that used that, it failed to work with Opera’s boundary-less multipart (the built-in cgi module uses the multifile module, which I tried and rejected earlier).

I have tried looping until EAGAIN does not happen—but I put a limit of 10 iterations so as not to chew up the CPU. No dice. I have also tried using the fcntl module to remove the O_NONBLOCK flag from stdin. The result is that instead of crashing with EAGAIN it waits indefinitely (and gets interrupted by thttpd’s watchdog timer).
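One middle path between a fixed number of retries and blocking forever is to wait on the descriptor with select and a timeout. The sketch below is written for a current Python on a Unix-like system and is not the code tried above; read_exactly and the 30-second default are assumptions. It also cannot help if the data never arrives at all.

```python
import errno
import os
import select


def read_exactly(fd, length, timeout=30.0):
    """Read up to `length` bytes from a possibly non-blocking descriptor.

    Returns fewer bytes only if end-of-file is reached first.
    """
    chunks = []
    remaining = length
    while remaining > 0:
        # Sleep until data is ready instead of spinning on EAGAIN.
        readable, _, _ = select.select([fd], [], [], timeout)
        if not readable:
            raise TimeoutError("no data within %s seconds" % timeout)
        try:
            chunk = os.read(fd, remaining)
        except OSError as e:
            if e.errno in (errno.EAGAIN, errno.EINTR):
                continue  # spurious wake-up: go back and wait again
            raise
        if not chunk:
            break  # end of file before `length` bytes arrived
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

In a CGI script the descriptor would be standard input (file descriptor 0) and the length would come from the CONTENT_LENGTH environment variable.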

The upshot of this is that I have the beginnings of a CGI script that works if I connect to it from the same machine the server is running on, but not if I connect to it from a different machine (an NT box) on the same network. The thing is, I know that people have successfully written CGI programs in Python, and none of the examples I find on-line have any mention of these phenomena.

LiveJournal syndication

Jo Charman has created a LiveJournal ‘syndication account’ for me. As a result you can see my RSS feed, converted in to a LiveJournal journal. She says that if you have a paid-for LiveJournal account, you can add pdc to your friends roster. And people can comment on the LiveJournal pointers to my posts. Woohoo.

Updated (4 March 2007). Updated URL. Corrected the spelling of Jo’s first name.

EAGAIN, again

I thought I’d try out a different CGI framework, such as jonpy, and this requires Python 2.2. So I have now installed Python 2.2 (carefully installing GNU db, expat, etc. first so these modules will be built). During its self-tests, I noticed that one of them takes 2½ minutes (to do something that should take approximately no time). Come to think of it, initiating connections to my Linux box from Jeremy’s NT box also takes an inordinate amount of time. That might be why initiating HTTP connections to my thttpd instance also takes an inordinate amount of time, so long that thttpd kills the CGI rather than waste any more time. In other words, my CGI problems may mostly stem from a broken network stack. Terrific. This is a variation on Joel Spolsky’s law of leaky abstractions: I would like to be able to believe in POSIX’s abstraction of sockets as being a lot like a file, but sadly it is all frinked up. Another reason to spend a week or two installing Debian some time.

I think the way forward for now is probably to ignore the network problems and cross my fingers when I install it on the actual server. Given that a fairly thorough search of the WWW and Netnews reveals no discussion of the sort of problems I’ve been having, I am fairly sure it is some freakish glitch in my computer...

Council’s anti-bike vendetta continues

Judging from their behaviour, it has long seemed that the powers that be in Oxfordshire hold cyclists in contempt but don’t feel able to actually come out and admit it. One of their techniques for discouraging cycle commuters is to make it as difficult to park a bicycle in town as it is to park a car. Case in point: up until yesterday, I had the perfect place to park my bicycle each work day—it had a roof overhead and plenty of railings to lock my machine to, and was not in anybody’s way. The County Council (in whose car park this spot was) has now fenced off this area with a big steel fence, with a notice to the effect that it was reserved for Environmental Services’ motorbikes and bicycles. When I tried locking my bike to the outside of the new cage, notices were put up ordering me not to (the implication being that they would not mind damaging my bike in the process of removing it).

The upshot of this is that there is nowhere to park near my office. All the road signs are already occupied by the time I get in. The foyer already has three bikes in it, but I would not have used it anyway, having already had a bike stolen from such a situation. In the end I locked it to the back of a street sign in the next street along. Psychologically it feels exposed out there in the middle of the footpath. I much preferred keeping it out of people’s way, on the grounds they will then feel less need to vandalize it…

Picky Picky Game: ZEO + CGI

Even in a toy web application like the Picky Picky Game, it is possible (but unlikely) that two people will want to upload a picture at (nearly) the exact same moment. If two processes try to write the same file at the same time, the results could be a mess. It follows that we need to include something to co-ordinate the changes.

Using ZEO to coordinate CGI scripts

ZEO just works

Converting my non-concurrent code to instead use a persistent store coordinated through ZEO was pretty easy once I had grokked the documentation. In fact most of the work consisted of deleting some of the routines for just-in-time reading back of the metadata, since that is now taken care of for me by ZODB.

Picky Picky Game: The Joy of PIL

The final piece in the puzzle of my PPG platform is the Python Imaging Library (PIL) from Secret Labs AB (PythonWare). This makes it easy to check that the uploaded images are the right dimensions, for example:

im = Image.open(uploadedFile)  # right-hand side was lost; uploadedFile is an assumed name
width, height = im.size
if width > game.maxWidth or height > game.maxHeight:
    log('Image is too large: the maximum is %d × %d.' \
            % (game.maxWidth, game.maxHeight), STOP)
    ok = 0

I don’t even need to know whether the image is a PNG, JPEG, or GIF.

Picky Picky Game: minimal voting

(Sunday night.) Still nothing up for you to see yet, I’m afraid. (Apart from anything else, I need to ask my host to install a few Python packages...) But I do now have the start of the second CGI script, the one that accepts readers’ votes for the current round of pictures. These votes are later used to decide which picture to use for that panel of the comic strip.

At present the script accepts your vote but does not display the votes in any way. If you vote again, your previous ballot is silently overwritten. I plan to support Approval Voting in future by having a page where you have a checkbox for each candidate picture and can select as many as you like.

The word ‘your’ is a little misleading; we use people’s IP addresses as their identifiers, which sort of works most of the time, but means that people sharing a proxy server will end up sharing a vote. The alternative (requiring users to register in order to vote) is not likely to work because no one will want to register.
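One simple way to turn an IP address into an integer voter identifier is sketched below. How the game actually derives its identifiers is not stated, so uid_from_ip is purely illustrative; in a CGI script the address would typically come from the REMOTE_ADDR environment variable.

```python
import ipaddress


def uid_from_ip(remote_addr):
    """Return the numeric value of an IPv4 or IPv6 address string."""
    return int(ipaddress.ip_address(remote_addr))


uid = uid_from_ip("192.168.0.1")
```

Everyone behind the same proxy presents the same address and so gets the same identifier, which is exactly the shared-vote limitation described above.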

Update (Monday night): The voting form now shows you the pictures with checkboxes. When you first visit the page, the picture you clicked on is ticked, but then you can tick as many more as you like. Because of the way HTML forms are processed, each form parameter is potentially a sequence anyway, so the code for each time around the voting form can be exactly the same. The code that adjusts the totals is very simple:

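def vote(self, uid, pns):
    """Register a vote from the user identified by uid.

    uid is an integer, uniquely identifying a voter.
    pns is a list of picture numbers
    """
    oldPns = self.userVotes.get(uid, [])
    if pns == oldPns:
        return  # ballot unchanged (assumed; the original body was lost)
    # self.pictures is an assumed name; the original container name was lost.
    for pn in oldPns:
        self.pictures[pn].nVotes += -1
    for pn in pns:
        self.pictures[pn].nVotes += 1
    self.userVotes[uid] = pns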

The first line retrieves that user’s old ballot, if any. The first for statement reverses the effect (if any) of their former vote, the second counts the new vote. Finally the ‘ballot’ is saved for later. Behind the scenes, ZODB takes care of reading the old data in off disc and (when the transaction is committed) saving the updated data.

My paid job involves writing a web application as well, except this one uses Microsoft ASP .Net linked via ADO .Net to Microsoft SQL Server® 2000. To do a similar job to the above snippet, I would be writing two SQL stored procedures (one to retrieve the existing ballot, one to alter the ballot). Invoking a stored procedure is several more lines of code in the C♯ or VB .Net layer as you create a Command object, add parameters to it, execute it, and dispose of the remains. (Or you can create DataSet objects which are even worse, but have specialized wizards to help you draft the code.) The actual algorithm (the encoding of the business logic) would be buried in dozens of lines of boilerplate. By comparison, the Python+ZODB implementation is a miracle of concision and clarity. The ZOPE people deserve much kudos.

Colour graphics the hard way

On my badly broken Linux desktop, the Gimp is missing its file-saving plug-ins, so it cannot save files except in a format I cannot use. XPaint does not exist, for some reason. The venerable bitmap program does work, but can only produce X11 bitmap files (which are black and white only). How then to produce colour icons for my Picky Picky Game mock-ups?

Using PBMPlus to colourize monochrome bitmaps