May 2003
I’ve been amusing myself by concocting an
RSS
reader using XSLT to do the processing.
XSLT can even handle the downloading of the RSS
files, but this does not allow for caching or
aggregating—so I thought I would knock
something together in Python.
Installing pyXML
As mentioned earlier, I am working on a version of the Picky Picky Game that does not need to
write to files (since some web servers will be set up that way
for security reasons). In some ways this simplifies things,
because it means there is only one URL hierarchy to worry about.
URL design
I have converted the Picky Picky Game so that it can run as a
BaseHTTPServer server, as an alternative to running as a
CGI script, in order to avoid the performance penalty caused by
starting up a fresh Python process for each request. But I discovered
that the server would lock up from time to time.
Closing your connections
Had a bug in the Picky Picky Game where
uploaded pictures might have backslashes left in their names
(the picture name being derived from the file name supplied by
the client computer). Technically it is OK to have backslashes
in a URL, and they should be treated like any other character.
Some web browsers second-guess you, however, and replace
backslashes with slashes (http://foo/bar\baz is treated as
http://foo/bar/baz), with the result that these pictures failed
to appear.
The solution is, of course, to (a) change the code for
translating file names in to picture names so that it removes
backslashes, and (b) fix the existing databases.
ZODB makes the second part pretty easy; having acquired a
Game instance from the database, you just run a
script like
for rn, r in game.rounds.items():
    for pic in r.pictures:
        s = r.sanitizedName(pic.name, pic)
        # Only touch pictures whose stored name would change.
        if s != pic.name:
            pic.name = s
            pic.dataUri = s + picky.mediaTypeSuffix(pic.mediaType)
The function sanitizedName is the one that has to
be fixed for part (a).
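The real sanitizedName is not shown here, so the following is only a minimal sketch of the idea behind part (a). The function name sanitized_name, its single argument, and the exact set of permitted characters are all assumptions made for illustration.

```python
import re


def sanitized_name(file_name):
    """Derive a URL-safe picture name from a client-supplied file name.

    Hypothetical helper: the rules here are assumptions, not the
    game's actual behaviour.
    """
    # Some browsers send a full path such as C:\pics\panel.png, so keep
    # only the part after the last slash or backslash.
    base = re.split(r'[\\/]', file_name)[-1]
    # Replace every run of characters outside a conservative set with a
    # hyphen, which guarantees that no backslash survives.
    return re.sub(r'[^A-Za-z0-9._-]+', '-', base).strip('-')


example = sanitized_name(r'C:\pics\my panel.png')
```

With rules like these a browser has nothing left to second-guess, because the resulting name contains neither slashes nor backslashes.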
Luckily, Apache honours the Status pseudo-header,
and so my Picky Picky Game CGI script can issue response code 304
and it works. Yay. My legions of testers report that the
downloading goes more smoothly now.
The finished panels now also have text below them saying how
many candidate panels were uploaded, and how many votes were
recorded. The tricky bit was making sure it says
‘votes’ when there are 0 or more than 1 votes, and
‘vote’ when there is only one.
The dynamic HTML is generated using a template file that I have
whimsically named a skin. This is a more-or-less XML file (meaning
that I intend it to be well-formed XML, but actually
process it as plain text). Mostly it is XHTML, but it can
include elements in special namespaces like <p:version/>.
These are replaced with text generated by Python functions or
strings (e.g., <p:version/> is replaced by something like
0.9 <pdc 2003-05-19>). The comics panel at the left
of the index page is produced with the following incantation:
<p:panel subtract="3" skin="index-panel"/>
This says to render the panel whose number is 3 less than the
current panel (so if the current panel is #20, this shows #17),
using the skin contained in the file index-panel.skin. This
allows the hypothetical graphic designer of the Picky Picky Game
considerable latitude in how panels are displayed.
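As a rough illustration of processing a skin as plain text, here is a minimal sketch that swaps each empty p: element for the output of a Python function or a string. The names render_skin, handlers and render_panel are hypothetical, and the regular expressions only cope with simple double-quoted attributes; the game's real skin code is not shown here.

```python
import re

# Matches empty elements in the p: namespace, such as <p:version/> or
# <p:panel subtract="3" skin="index-panel"/>.
ELEMENT = re.compile(r'<p:(\w+)((?:\s+\w+="[^"]*")*)\s*/>')
ATTRIBUTE = re.compile(r'(\w+)="([^"]*)"')


def render_skin(text, handlers):
    """Replace each <p:name .../> in text using handlers[name]."""
    def replace(match):
        name, raw_attrs = match.group(1), match.group(2)
        attrs = dict(ATTRIBUTE.findall(raw_attrs))
        handler = handlers.get(name)
        if handler is None:
            return match.group(0)  # leave unknown elements untouched
        # A handler is either a function taking the attributes as
        # keyword arguments, or a plain replacement string.
        return handler(**attrs) if callable(handler) else handler
    return ELEMENT.sub(replace, text)


def render_panel(subtract='0', skin='panel'):
    # Stand-in for the real panel renderer.
    return '[panel -%s via %s.skin]' % (subtract, skin)


handlers = {'version': '0.9', 'panel': render_panel}
page = render_skin(
    '<p>v<p:version/></p> <p:panel subtract="3" skin="index-panel"/>',
    handlers)
```

Treating the skin as text keeps the processing simple, at the cost of not noticing when a skin is not well-formed XML.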
While rendering index-panel.skin, it comes across
this fragment:
<p class="detail">
[<p:pictureCount singular="candidate"/>,
<p:voteCount singular="vote"/>.]
</p>
The p:pictureCount element sort of inherits the
subtract attribute of the original p:panel, because it gives
the number of pictures in panel #17 (as opposed to the current
panel). The singular and plural attributes (if
provided) specify text to follow the number (in the absence of
an explicit plural attribute, it adds s to the singular).
Simplicity itself.
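The singular and plural rule described above can be sketched in a few lines. The function name count_label is hypothetical; only the rule itself (use the singular for exactly one, otherwise the plural, defaulting to the singular plus s) comes from the description.

```python
def count_label(n, singular, plural=None):
    """Return text such as '1 vote', '0 votes' or '3 entries'."""
    if plural is None:
        # No explicit plural given: add s to the singular.
        plural = singular + 's'
    return '%d %s' % (n, singular if n == 1 else plural)


labels = [count_label(n, 'vote') for n in (0, 1, 2)]
```

The list labels then reads 0 votes, 1 vote, 2 votes, which covers the tricky case of zero taking the plural.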
Update (2003-05-20):
I have corrected a typo in the above URL. Sorry
for any inconvenience.
I demonstrated the
Picky Picky Game prototype
to the
CAPTION committee. The
main trouble was picture resources not being
downloaded, or, oddly, vanishing when one refreshed the
page. It worked better on
the dial-up than the broadband connection (though Jo
blames that on
IE 6
being set up to cache nothing).
I resolved to make an effort to sort out
caching—or at least, the things my server needs to
do to enable caching to work smoothly.
So, in my development system at home,
I added If-Modified-Since and If-None-Match (etag) support
to the routine that fetches picture data out of the database.
I also added an Expires header set, as RFC 2616
demands, approximately one year in to the future.
Result: none of the pictures appear.
The problem is that the web server I am using at home always
returns status code 200 for CGI scripts (it ignores the
Status pseudo-header). As a result, my clever
304 (‘Not modified’) responses result in
apparently zero-length data. Argh!
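To make the mechanism concrete, here is a minimal sketch of how a CGI script might decide between a full response and a 304. The function name cgi_response_headers and its arguments are hypothetical, weak entity tags are not handled, and the game's actual routine is not shown here. Note that the body is omitted on a 304, which is why a server that ignores Status and sends 200 anyway delivers what looks like an empty picture.

```python
from datetime import timezone
from email.utils import formatdate, parsedate_to_datetime


def cgi_response_headers(environ, etag, mtime):
    """Return (headers, send_body) for a resource with this etag and mtime.

    environ is a CGI-style mapping; mtime is seconds since the epoch.
    """
    headers = [
        'ETag: ' + etag,
        'Last-Modified: ' + formatdate(mtime, usegmt=True),
    ]
    not_modified = False
    if_none_match = environ.get('HTTP_IF_NONE_MATCH')
    if_modified_since = environ.get('HTTP_IF_MODIFIED_SINCE')
    if if_none_match is not None:
        # When both are sent, the entity tag takes precedence over the date.
        candidates = [tag.strip() for tag in if_none_match.split(',')]
        not_modified = '*' in candidates or etag in candidates
    elif if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since)
            if since.tzinfo is None:
                since = since.replace(tzinfo=timezone.utc)
            not_modified = int(mtime) <= since.timestamp()
        except (TypeError, ValueError):
            pass  # an unparseable date is simply ignored
    if not_modified:
        # Status is the CGI pseudo-header; it only works if the web
        # server honours it.
        headers.insert(0, 'Status: 304 Not Modified')
    return headers, not not_modified


headers, send_body = cgi_response_headers(
    {'HTTP_IF_NONE_MATCH': '"abc"'}, '"abc"', 1053302400)
```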
When I worked this out, I thought I would
demonstrate to Jeremy that it worked in the stand-alone server (which
does not use CGI). But Lo! all the pictures failed to
appear once more. So did the page itself. What gives?
This time the trouble was its logging
function—it tried to resolve the client IP
address. Now, I thought that the address used by my
PowerBook did have reverse look-up in my local
DNS, but
in any case, the server should not be indulging in DNS
look-ups given that on my system that is a blocking
operation that tends to mean the program locks up for 75
seconds. Luckily BaseHTTPServer makes it
easy to override the function that indulges in DNS
queries and it all now runs smoothly.
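A minimal sketch of that kind of override, written against the Python 3 successor to BaseHTTPServer (the http.server module). The class name NoLookupHandler is hypothetical. As far as I know, recent Python 3 releases already log the bare IP address, so there the override only makes the intent explicit.

```python
from http.server import BaseHTTPRequestHandler


class NoLookupHandler(BaseHTTPRequestHandler):
    """Request handler whose log lines never trigger a DNS query."""

    def address_string(self):
        # Log the client's IP address as-is instead of resolving it to
        # a host name, so a slow or missing reverse look-up cannot
        # block the server.
        return self.client_address[0]
```

Request handling itself (do_GET and friends) would be added to the same class as usual.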
On the positive side, I have made one cache-enhancing
change that has worked, albeit only for the old panels
(which saved their images as separate disc files rather
than in ZODB).
Simply put, there is another base URL
used (in addition to the base of the web application and
the base URL for static files), and this one is for
picture files. This means that these old pictures are
now, once again, served as static files, with etags
and caching the responsibility of my host
HTTP server,
not me.
I have now rejigged my log-rendering scripts (used to create
this very web site) so that entries are archived on daily pages
rather than monthly ones.
This was necessary because I’m writing more entries and
longer ones. In recent months I had started splitting each
entry in to a leader paragraph that linked to the full article,
but this was kind of clunky, and I never automated it. With
one day’s posts per archive page this is less of an issue.
It would have made the topic pages
awful long, however, so I have switched the non-archive pages to
only show the first article on the page in full, and show the
rest as links to the archive page. This is much tidier, and
makes the site more like a collection of short essays than a
weblog, which it really isn’t.
There are still a few details to sort out. The year indexes are
confused by the change in format (at the time I write this, the
yearly archives omit May), and the newly
introduced monthly indexes need to be generated. More on this
as I get around to it.
Update (2003-05-21):
I have created the new per-month index page.
The Archives listing at the top right of the page now has a
‘by topics’ link.
I have taken the liberty of bumping the Picky
Picky Game forward to the next panel. I have also cranked
the speed up a notch: each ‘week’ will be 3 days for
the next little while.
The idea is to let us test the various behind-the-scenes
mechanisms of the game so that we can risk mentioning it to
people at COMICS
2003 in Bristol this bank-holiday weekend.
The CSS has been tweaked to not use auto margins, because
they are not supported in MSIE.
To make it easier to create pictures for the Picky
Picky Game, I have added code to automatically convert
Microsoft Windows Bitmap (.bmp) files into PNGs.
This is straightforward because I already was using PIL to check the
dimensions of the images. Converting to PNG (if the format is
one PIL knows) is pretty simple:
permittedTypes = ['image/png', 'image/jpeg', 'image/pjpeg', 'image/gif']
...
if imt not in permittedTypes:
    # Re-encode anything else PIL can read (such as BMP) as PNG, in memory.
    buf = StringIO.StringIO()  # Python 2 idiom; io.BytesIO in Python 3
    im.save(buf, 'PNG')
    data = buf.getvalue()
    logger.log(slog.WARNING, 'Converted your image from %s to image/png' % imt,
               'This may have led to a slight loss of image quality.')
    imt = 'image/png'
    buf.close()
The above goes in the sequence of checks on uploaded images
(after the check for width × height, but before the check
for number of bytes). I think I spent longer creating
a BMP image to test it on than I did writing the new code!
The advantage of BMP support is that, if you have Microsoft
Windows, then you definitely have Microsoft Paint installed.
So long as you know about Start menu → Programs →
Accessories → Paint, and the Image → Attributes menu item,
you can create panels for Picky Picky Game.
Before going in to work today I have managed to fix one of the
JavaScript problems (it causes MSIE to report ‘one error
on page’), but only half-fixed the other (which causes
artists’ names with links to vanish when you cycle through
the panels). In the latter case, the name no longer vanishes,
but, alas! the link does.
I think I need a JavaScript
debugger—in other words, to install Mozilla on my
PowerBook (the only computer I own with enough welly for
Mozilla). Ho hum.
I have returned the Picky Picky Game to its
weekly schedule. The reason for going at double speed was to
fill up the home page with non-gash pictures, and this has been
achieved. So you have until Thursday night to upload a
candidate for panel 19.
We just got back from COMICS 2003, which was fun. We mentioned
Picky Picky to as many people as we could, so either the server
will be overwhelmed with activity, or no-one will bother
clicking through and we will be miserably ignored. Who knows
what the future holds?
The journey
back was a disaster—approximately 4½ hours (mostly
spent waiting in cold drafty stations). A nasty combination of
reduced Oxford–Bristol service, engineering works and
football game crowds made the train journeys particularly
unpleasant; at least when there was a substitute bus we got to
sit down...
I have written a short note on how
to use MSPAINT to draw a Picky Picky Game panel.
The advantage of MSPAINT is that it is available on all
Microsoft Windows computers, even ones not set up for image
editing.
Much to my surprise, there is no longer a drawing program bundled
with all Apple Macs—my PowerBook 12″ is without a drawing
program.
I have added JavaScript to the Picky Picky Game upload form on
caption.org to optionally remember your details for next
time (using a cookie). This way you don’t have to enter
your URL each time you upload a new panel.
Debugging JavaScript without a JavaScript debugger is a real
pain in the arse, and illustrates how subtle aspects of language
design affect the experience of working in that language. There
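is one crucial difference between Python and JavaScript as I
experience them. In Python, a variable is implicitly created the
first time you assign to it, and referring to a name that was
never assigned stops the program with an error message; in
JavaScript running in a browser with no debugger or error
console, the same mistake produces no visible complaint at all.
This means that the following fragment is accepted as
JavaScript: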
var cookieHeader = this.$document.cookie;
var m = myRegexp.exec(cookiesHeader);
if (m) {
    ... use the match info to process the cookies ...
}
The equivalent Python looks like this:
cookieHeader = self._document.cookie
m = myRegexp.search(cookiesHeader)
if m:
    ... use the match info to process the cookies ...
In the JavaScript version, the regexp (used to extract one
cookie from the Cookies header) will mysteriously never match
and you will spend ages scrutinizing the regexp and flipping
through the documentation on what is and is not valid regexp
syntax in JavaScript. In Python you will get an error message
telling you that the variable cookiesHeader
is
referred to before it is assigned to—and immediately
realise its name is misspelled in the second line.
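The Python half of this comparison is easy to try. The following sketch uses hypothetical names (my_regexp, read_details, and a made-up cookie called details) and deliberately repeats the misspelling, catching the resulting NameError so that its message can be inspected.

```python
import re

my_regexp = re.compile(r'details=([^;]*)')


def read_details(document_cookie):
    cookie_header = document_cookie
    # Deliberate misspelling: cookies_header was never assigned.
    m = my_regexp.search(cookies_header)
    return m.group(1) if m else None


try:
    read_details('details=pdc; other=1')
except NameError as err:
    message = str(err)
```

The message names the offending variable, which points straight at the misspelled line rather than at the regexp.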
The tedious thing about testing the ‘remember me’
option is that it involves repeatedly doing the very thing it is
supposed to be saving me from: entering my URL and details
on the picture-upload form. Luckily I was testing on Safari,
which has a form auto-completion feature that makes repeatedly
filling in the form less annoying—but which also makes the
‘Remember me’ feature almost entirely redundant
;-)
I have added a simple comment system to the Picky Picky Game. Go me!
There are all sorts of design considerations when it comes to
on-line comments. I was aiming at simplicity so it has no
branching (threading), no HTML ... no nothing, basically.
URLs in your posts get magically turned in to links, and blank
lines become paragraph breaks, but that is about all.
There is
one discussion page per panel.
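The comment formatting described above (no HTML, URLs turned into links, blank lines becoming paragraph breaks) could be sketched like this. The function name format_comment and the URL pattern are assumptions for illustration; the game's own comment code is not shown here.

```python
import html
import re

# A deliberately simple pattern: http or https URLs, not ending in
# punctuation that probably belongs to the surrounding sentence.
URL = re.compile(r'(https?://[^\s<>"]*[^\s<>".,;:!?)])')


def link_and_escape(block):
    """Escape HTML in block, turning any URLs into links."""
    out = []
    # With one capturing group, re.split puts the URLs at odd indexes.
    for i, part in enumerate(URL.split(block)):
        escaped = html.escape(part)
        if i % 2:
            out.append('<a href="%s">%s</a>' % (escaped, escaped))
        else:
            out.append(escaped)
    return ''.join(out)


def format_comment(text):
    """Render a plain-text comment as HTML paragraphs."""
    # One or more blank lines separate paragraphs.
    blocks = re.split(r'\n\s*\n', text.strip())
    return '\n'.join('<p>%s</p>' % link_and_escape(b) for b in blocks)


rendered = format_comment(
    'Nice <b>panel</b>!\n\nSee http://example.org/a?x=1&y=2.')
```

Escaping everything that is not a URL is what makes the no-HTML rule hold: any markup a visitor types is shown literally rather than interpreted.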