10 entries tagged cgi

Things your mother didn’t tell you

Opera 5 omits the boundary parameter when uploading files. Lynx 2.8.2 does not support uploading files at all (but, oddly, does generate multipart/form-data forms properly—it even gives the charset parameter to the content-type of its form items). Python’s multifile module raises an exception on all of the above, for some inputs.

I guess that if I want to handle uploaded images, I get to write my own multipart/form-data parsers from scratch. I have already done this in C++ for work; I guess I can do it again in Python. Sigh.

Picky Picky Game: Upload pictures, EAGAIN

I have now written a parser for HTML-4.0 file uploads (forms with enctype multipart/form-data). It will need some finessing to get character encodings to work right, but for the simple cases I tried it uploaded files flawlessly, and moreover, plugged in to the back-end script I mentioned in an earlier installment.
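For the curious, the core of such a parser can be sketched in a few lines of present-day Python 3. This is an illustration under stated assumptions, not the code described above: the name `parse_multipart` is made up, the boundary is assumed to have been extracted from the Content-Type header already, and character encodings, folded header lines and browser quirks are all ignored.

```python
def parse_multipart(body, boundary):
    """Split a multipart/form-data body into (headers, content) pairs.

    Hypothetical helper for illustration only. Header names are
    lower-cased strings; content is returned as raw bytes.
    """
    # A delimiter is CRLF + "--" + boundary. Prefixing the body with CRLF
    # lets the very first delimiter match the same pattern as the rest.
    delimiter = b"\r\n--" + boundary
    chunks = (b"\r\n" + body).split(delimiter)
    parts = []
    for chunk in chunks[1:]:          # chunks[0] is the (ignored) preamble
        if chunk.startswith(b"--"):   # "--boundary--" closes the body
            break
        # A blank line separates the part's headers from its content.
        raw_headers, _, content = chunk.partition(b"\r\n\r\n")
        headers = {}
        for line in raw_headers.split(b"\r\n"):
            if not line.strip():
                continue              # the CRLF that ended the delimiter line
            name, _, value = line.partition(b":")
            key = name.strip().lower().decode("latin-1")
            headers[key] = value.strip().decode("latin-1")
        parts.append((headers, content))
    return parts
```

Splitting on CRLF plus the boundary, rather than the bare boundary, means a picture whose bytes happen to contain `--` is left alone.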

Alas! When I tried uploading from Jeremy’s NT box, my Python program crashed with an IOError exception with errno=EAGAIN. I guess I need to do some sort of loop to fill my buffer. Ho hum.

More on Opera’s boundaries

It occurs to me I may be being harsh on Opera; I notice that elsewhere they show a preference for splitting MIME parameters over multiple physical lines. For example, they use

Content-disposition: form-data;
			name="fred"

as opposed to

Content-disposition: form-data; name="fred"

It is just about possible that this confuses thttpd so that it clips everything after the first CRLF when passing the headers to my script via CGI...?
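Folded headers like Opera's are legal, and a receiver that wants to cope can unfold them before doing anything else. A minimal sketch in Python 3 (the function name `unfold_headers` is hypothetical, and this says nothing about what thttpd actually does):

```python
import re

def unfold_headers(raw):
    """Join header continuation lines onto the line they continue."""
    # A CRLF followed by spaces or tabs is a fold, not the end of a header,
    # so replace the whole run with a single space.
    return re.sub(r"\r\n[ \t]+", " ", raw)
```

With this, the two spellings of the Content-disposition header shown above come out identical.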

CGI upload woes

On Monday I was troubled by EAGAIN interruptions when reading in a CGI script’s data. It turns out Python has a cgi module already. But when I tried creating a script that used that, it failed to work with Opera’s boundary-less multipart (the built-in cgi module uses the multifile module, which I tried and rejected earlier).

I have tried looping until EAGAIN does not happen—but I put a limit of 10 iterations so as not to chew up the CPU. No dice. I have also tried using the fcntl module to remove the O_NONBLOCK flag from stdin. The result is that instead of crashing with EAGAIN it waits indefinitely (and gets interrupted by thttpd’s watchdog timer).
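For reference, here is roughly what a buffer-filling loop looks like if it waits with select instead of spinning a fixed number of times. It is a sketch in present-day Python 3, with a hypothetical name (`read_exactly`) and an arbitrary timeout, and I make no claim that it would have cured the problem described here.

```python
import os
import select

def read_exactly(fd, length, timeout=30.0):
    """Read up to `length` bytes from a possibly non-blocking descriptor."""
    chunks = []
    remaining = length
    while remaining > 0:
        # Sleep until the descriptor is readable rather than burning CPU.
        readable, _, _ = select.select([fd], [], [], timeout)
        if not readable:
            raise TimeoutError("no data arrived within the timeout")
        try:
            chunk = os.read(fd, min(remaining, 65536))
        except BlockingIOError:
            continue  # EAGAIN after all: go back to waiting
        if not chunk:
            break     # end of file before the promised length
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

In a CGI script the call would be something like `read_exactly(0, int(os.environ["CONTENT_LENGTH"]))`, descriptor 0 being stdin.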

The upshot of this is that I have the beginnings of a CGI script that works if I connect to it from the same machine the server is running on, but not if I connect to it from a different machine (an NT box) on the same network. The thing is, I know that people have successfully written CGI programs in Python, and none of the examples I find on-line have any mention of these phenomena.

EAGAIN, again

I thought I’d try out a different CGI framework, such as jonpy, and this requires Python 2.2. So I have now installed Python 2.2 (carefully installing GNU db, expat, etc. first so these modules will be built). During its self-tests, I noticed that test_socket.py takes 2½ minutes (to do something that should take approximately no time). Come to think of it, initiating connections to my Linux box from Jeremy’s NT box also takes an inordinate amount of time. That might be why initiating HTTP connections to my thttpd instance also takes an inordinate amount of time, so long that thttpd kills the CGI rather than waste any more time. In other words, my CGI problems may mostly stem from a broken network stack. Terrific. This is a variation on Joel Spolsky’s law of leaky abstractions: I would like to be able to believe in POSIX’s abstraction of sockets as being a lot like a file, but sadly it is all frinked up. Another reason to spend a week or two installing Debian some time.

I think the way forward for now is probably to ignore the network problems and cross my fingers when I install it on the actual server. Given that a fairly thorough search of the WWW and Netnews reveals no discussion of the sort of problems I’ve been having, I am fairly sure it is some freakish glitch in my computer...

Picky Picky Game: ZEO + CGI

Even in a toy web application like the Picky Picky Game, it is possible (but unlikely) that two people will want to upload a picture at (nearly) the exact same moment. If two processes try to write the same file at the same time, the results could be a mess. It follows that we need to include something to co-ordinate the changes.

Using ZEO to coordinate CGI scripts

Serving my own damn files

I am rethinking my original plan for the Picky Picky Game, which was to store resources in files as often as possible. For example, index.html is a static file (not dynamically generated every time someone visits it). This requires that when something happens that means index.html should change, this file has to be updated.

Pros and cons

Picky Picky Game: No writing files

As mentioned earlier, I am working on a version of the Picky Picky Game that does not need to write to files (since some web servers will be set up that way for security reasons). In some ways this simplifies things, because it means there is only one URL hierarchy to worry about.

URL design

CGI vs. If-Modified-Since and other stories

I demonstrated the Picky Picky Game prototype to the CAPTION committee. The main trouble was picture resources not being downloaded, or, oddly, vanishing when one refreshed the page. It worked better on the dial-up than the broadband connection (though Jo blames that on IE 6 being set up to cache nothing). I resolved to make an effort to sort out caching—or at least, the things my server needs to do to enable caching to work smoothly.

So, in my development system at home, I added If-Modified-Since and If-None-Match (etag) support to the routine that fetches picture data out of the database. I also added an Expires header set, as RFC 2616 demands, approximately one year into the future. Result: none of the pictures appear.

The problem is that the web server I am using at home always returns status code 200 for CGI scripts (it ignores the Status pseudo-header). As a result, my clever 304 (‘Not modified’) responses result in apparently zero-length data. Argh!
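To make the failure concrete, here is a sketch of how a CGI script might answer a conditional request. It is written in present-day Python 3 and is not the game's actual code: the function name and its arguments are hypothetical, only strong etags are compared, and If-Modified-Since is left out for brevity. `HTTP_IF_NONE_MATCH` is the usual CGI environment name for the If-None-Match request header.

```python
def cgi_picture_response(environ, data, etag, content_type="image/png"):
    """Build the bytes a CGI script would write for one picture request."""
    quoted = '"%s"' % etag
    headers = ["ETag: " + quoted]
    # The client may send several comma-separated etags, or "*".
    sent = environ.get("HTTP_IF_NONE_MATCH", "")
    candidates = [tag.strip() for tag in sent.split(",")]
    if quoted in candidates or "*" in candidates:
        # The client's copy is current: status only, no body.
        headers.insert(0, "Status: 304 Not Modified")
        body = b""
    else:
        headers.insert(0, "Status: 200 OK")
        headers.append("Content-Type: " + content_type)
        headers.append("Content-Length: %d" % len(data))
        body = data
    return ("\r\n".join(headers) + "\r\n\r\n").encode("ascii") + body
```

A server that discards the Status line sends the first branch out as a 200 with an empty body, which is exactly the zero-length picture described above.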

When I worked this out, I thought I would demonstrate to Jeremy that it worked in the stand-alone server (which does not use CGI). But Lo! all the pictures failed to appear once more. So did the page itself. What gives?

This time the trouble was its logging function—it tried to resolve the client IP address. Now, I thought that the address used by my PowerBook did have reverse look-up in my local DNS, but in any case, the server should not be indulging in DNS look-ups given that on my system that is a blocking operation that tends to mean the program locks up for 75 seconds. Luckily BaseHTTPServer makes it easy to override the function that indulges in DNS queries and it all now runs smoothly.
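The override amounts to something like the following sketch. It uses the Python 3 module name, http.server, where BaseHTTPServer now lives; the class name `NoLookupHandler` is made up for illustration. Recent Python 3 releases already return the bare IP address here, so the override matters mainly on older versions.

```python
from http.server import BaseHTTPRequestHandler

class NoLookupHandler(BaseHTTPRequestHandler):
    """Request handler that logs the numeric client address only."""

    def address_string(self):
        # client_address is a (host, port) pair; return the IP untouched
        # instead of asking the resolver for a host name.
        return self.client_address[0]
```

The logging methods call `address_string()` for every request, so this one method is the only place that needs changing.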

On the positive side, I have made one cache-enhancing change that has worked, albeit only for the old panels (which saved their images as separate disc files rather than in ZODB). Simply put, there is another base URL used (in addition to the base of the web application and the base URL for static files), and this one is for picture files. This means that these old pictures are now, once again, served as static files, with etags and caching the responsibility of my host HTTP server, not me.