Programmers learn through programming, and my research into OpenID has got as far as a web application that lets you do nothing at all except log in with OpenID. Along the way I have played a bit with Canonical’s own object-relational mapper (ORM) called Storm.
In Part 1, back in February, I created a home page and log-in page using WSGI and some higher-order functions to make writing directly to the WSGI conventions more convenient. These include the following function decorators for WSGI app functions:
@form_data_filter: decodes form parameters, and places them in the
@query_string_filter: decodes query-string parameters, and places them in the
@cookie_parsing_filter: decodes the cookies supplied with the request, and places them in the
@genshi_html(template_name): the app function must return a dictionary rather than the usual iterator, and the output will be generated from the named template with this dictionary supplying the template arguments; and
@log_in_filter(cookie_name): the app function will have the side effect of discarding any log-in cookie.
These filters, and the functions used to implement them, live in Python modules in a package alleged.http. They are intended to be reusable, although without more doco they are not really ready for other people to use.
The upshot of which is that the shortest of the application’s controllers can be written as follows:
def logout_POST(environ, start_response):
    # WSGI requires a list of headers as the second argument, even if empty.
    start_response('200 OK', [])
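For readers unfamiliar with the pattern, a filter like these can be written as an ordinary decorator around a WSGI function. What follows is a minimal sketch and not the alleged.http code itself: the environ key example.query is a name made up for illustration, and hello_GET is a hypothetical app function. It uses the Python 3 standard library.

```python
from functools import wraps
from urllib.parse import parse_qs


def query_string_filter(app):
    """Wrap a WSGI app so it finds the decoded query string in environ."""
    @wraps(app)
    def wrapped(environ, start_response):
        # 'example.query' is a hypothetical key chosen for this sketch.
        environ['example.query'] = parse_qs(environ.get('QUERY_STRING', ''))
        return app(environ, start_response)
    return wrapped


@query_string_filter
def hello_GET(environ, start_response):
    # parse_qs maps each parameter name to a list of values.
    names = environ['example.query'].get('name', ['world'])
    start_response('200 OK', [('Content-Type', 'text/plain; charset=utf-8')])
    return [('Hello, %s!' % names[0]).encode('utf-8')]
```

Because the wrapped function is still a plain WSGI function, filters of this kind can be stacked in any order that makes sense.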
The application as a whole is knitted together from a bunch of WSGI
functions using the Selector class, and this is wrapped in
log_in_filter (which provides a generic mechanism for creating and
tracking the log-in cookie) and
genshi_config_filter (which adds an
environ entry for the benefit of
Having created a bare-bones web-application framework, I took the OpenID-processing code from my previous effort and beat it in to the new conventions. This results in two factory functions that take configuration information (such as the cookie name) and return WSGI apps for the log-in URL. These in turn are plugged in to the log-in filter framework and lo! we have an application that allows the user to log in with OpenID, and remembers who they are logged in as (with a cookie).
Actually, Part 1 ended just before this point, because I had somehow broken PySqlite support when I upgraded to Python 2.5 (or something). Although I said I would try to finish it ‘next weekend’, I ended up leaving it for a good few months. Once SQLite support was restored, the OpenID log-in process worked fine.
I moved the code in to its own Python module alleged.http.openidfactory. The function login_GET_factory takes some configuration parameters and returns a WSGI function suitable for use as the handler of GET requests for the login page. A similar login_POST_factory is used for the intermediate step.
Adding User Settings
The next step was adding a way to store user settings (without which there is not much point in getting them to log in, of course). There are lots of ways to handle storage in this context, but I went for the conventional solution in Python-land, which is an ORM: essentially a library that hides most of the details of talking to the database and exposes a set of objects that correspond to the entities in the database, such as users.
ORMs are popular in Python because Python’s introspection and flexible class structures make them easy to implement, and because they let one avoid writing (much) SQL. The downside is that access to the RDBMS may be inefficient, because the ORM is trying to impose an object-oriented view of the world on the relational database (the object–relational impedance mismatch). Microsoft’s frameworks, such as VB6 and .NET, take the other approach, with the relational database’s view of the world imposed on the object-oriented program.
Of the various options, I am trying out Storm, one of the outputs of Canonical’s Launchpad project. Storm code is short and involves slightly less magic (or less of the appearance of magic) than some other ORMs I have tried. Here’s our bare-bones User class:
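from storm.locals import Int, Unicode

class User(object):
    __storm_table__ = 'Users'
    id = Int(primary=True)  # database identifier, managed by Storm
    nick = Unicode()        # display name
    uri = Unicode()         # home page

    def __init__(self, uri, nick):
        self.uri = uri
        self.nick = nick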
For the first iteration, there are only two settings: the user’s display name (nick) and the URI of their home page. When and if this application gets extended in to something more than an OpenID testbed, there will be more fields here to record application-specific info. The third field mentioned in the class definition is id, the database’s identifier (dbid) for users; these are an artefact of the database that Storm takes care of for us.
There is no field for login name, because this is a pure-OpenID system. Nor is there a field for OpenID URI, because a given user account might have more than one OpenID: in the relational world, this means OpenIDs need their own database table. We define this table in Python as follows:
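from storm.locals import Int, Reference, Unicode

class OpenID(object):
    __storm_table__ = 'OpenIDs'
    uri = Unicode(primary=True)         # the OpenID itself is the key
    user_id = Int()                     # dbid of the owning user
    user = Reference(user_id, User.id)  # the owning User object

    def __init__(self, uri, user):
        self.uri = uri
        self.user = user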
Storm is happy for me to use the URIs as the primary key of the table, rather than insisting on having an id column as well. The field user_id is again an artefact of the database representation: it holds the dbid of the user that owns this OpenID. In Python code we will instead use the field user, which contains a reference to the User object itself. Storm takes care of keeping the two of them in sync with each other.
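As far as I know, Storm maps classes on to tables that already exist rather than creating them, so the two tables have to be set up separately. The sketch below shows one schema that would match the two classes, using the standard-library sqlite3 module directly. The column types and the REFERENCES constraint are assumptions, not something taken from the application.

```python
import sqlite3

# Column types are guesses chosen to match the Int and Unicode properties.
SCHEMA = """
CREATE TABLE Users (
    id INTEGER PRIMARY KEY,
    nick TEXT,
    uri TEXT
);
CREATE TABLE OpenIDs (
    uri TEXT PRIMARY KEY,
    user_id INTEGER REFERENCES Users (id)
);
"""


def create_tables(conn):
    """Create both tables on an open sqlite3 connection."""
    conn.executescript(SCHEMA)


conn = sqlite3.connect(':memory:')
create_tables(conn)
```

Putting user_id on the OpenIDs table, rather than an OpenID column on Users, is what allows one account to own several OpenIDs.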
Given a Storm store called
store, we could retrieve a user by using
openID = store.get(OpenID, openID_uri) and then use
openID.user to get the
User object. Another option is to use a SQL-like declarative query:
users = store.find(User, OpenID.user_id == User.id, OpenID.uri == uri)
The arguments (apart from the first) are using operator overloads
defined on the classes
Unicode, et al. to create objects
representing expressions. These are used by Storm to generate the SQL
incantation that will be used to retrieve the results.
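The operator-overloading trick is easier to see in miniature. The following toy is not Storm's implementation, and every class and function name in it is invented for this sketch. It only shows how an overloaded == can return an expression object that is later rendered as SQL with bound parameters.

```python
class Expr(object):
    """Base class for pieces of a query that can render themselves as SQL."""


class Column(Expr):
    def __init__(self, table, name):
        self.table, self.name = table, name

    def __eq__(self, other):
        # Instead of answering True or False, build an expression object.
        return Eq(self, other)

    def compile(self, params):
        return '%s.%s' % (self.table, self.name)


class Eq(Expr):
    def __init__(self, left, right):
        self.left, self.right = left, right

    def compile(self, params):
        return '%s = %s' % (_compile(self.left, params),
                            _compile(self.right, params))


def _compile(value, params):
    if isinstance(value, Expr):
        return value.compile(params)
    params.append(value)  # plain Python values become bound parameters
    return '?'


def where_clause(*conditions):
    """Join the conditions with AND; return (sql, params)."""
    params = []
    sql = ' AND '.join(c.compile(params) for c in conditions)
    return sql, params


openid_uri = Column('OpenIDs', 'uri')
openid_user_id = Column('OpenIDs', 'user_id')
user_id = Column('Users', 'id')
sql, params = where_clause(openid_user_id == user_id,
                           openid_uri == 'http://alice.example.com/')
```

Here sql is a WHERE clause with a ? placeholder and params holds the URI, so the value is never pasted in to the SQL text.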
One thing I like about Storm is that the store is not hidden: this means you add a user to the store explicitly (store.add(user)) or implicitly, when linking it to another object in the store. I prefer this slightly to having the database details stored in some global
Registration = Settings
Users do not need to register before they can log in, since their OpenID is set up outside this application. Instead, the first time they sign in, we redirect to the settings page, inviting them to enter their nickname (and any other settings that this application needs). When they click the submit button, an entry is created for them in the Users table.
This sounds complicated but the code in the GET function for the
settings page starts like:
openID_uri = environ.get('REMOTE_USER')
user = model.get_user_from_OpenID(openID_uri)
is_new = openID_uri and not user
If user is not None, the form is filled in with data from it; otherwise we fill in the form with blanks or guesses. When the user presses the submit button, the POST function extracts the arguments and does something like the following:
user = model.create_user(openID_uri, nick)
user.nick = nick
user.uri = home_page
There is very little difference left between changing your settings and registering for the first time: the template has a message, shown only when is_new is true, that contains extra instructions, but that is about it.
My first version of this placed the check for whether a visitor was new,
and so needed to be redirected to the settings page, in the welcome
page. After some thought I factored it out as (yet another) filter. Like
log_in_filter, it converts the wrapped app in to one that checks
whether we have a
User object for this OpenID, and redirects to the
settings page if not, adding a
next query-string parameter so that
the settings page can include a link back to the page originally asked
for. Because it has to know about the application-specific database,
this new filter
need_settings_filter is not part of my generic
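A filter along these lines might look like the sketch below. It is a guess at the shape, not the application's code: passing the user lookup in as the get_user parameter, and the settings_url default of /settings, are assumptions made so that the example is self-contained.

```python
from urllib.parse import urlencode


def need_settings_filter(get_user, settings_url='/settings'):
    """Return a decorator that sends users without settings to settings_url.

    get_user takes an OpenID URI and returns a user object or None.
    Both parameters are inventions of this sketch.
    """
    def decorator(app):
        def wrapped(environ, start_response):
            openID_uri = environ.get('REMOTE_USER')
            if openID_uri and get_user(openID_uri) is None:
                # Remember where the visitor was heading, so the settings
                # page can link back to it.
                next_url = (environ.get('SCRIPT_NAME', '')
                            + environ.get('PATH_INFO', ''))
                location = settings_url + '?' + urlencode({'next': next_url})
                start_response('302 Found', [('Location', location)])
                return []
            # Anonymous visitors and known users pass straight through.
            return app(environ, start_response)
        return wrapped
    return decorator
```

In use, something like need_settings_filter(model.get_user_from_OpenID) would be applied to the welcome page's app function.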
Attaching more OpenIDs
There are various reasons why someone might want to use multiple OpenIDs with an account. There will be several points where we might attach an OpenID to an existing account:
Sign in with the OpenID used before. On the settings page, there is an Attach another OpenID link. From a programming point of view, this is the easy case.
Sign in with a new OpenID. When redirected to the Settings page, the ‘Hello, you are new’ message should have a link to a page for attaching this new OpenID to your existing account, by signing in as that OpenID as well.
Sign in with a new OpenID. Create a new account. Belatedly remember you have an existing account. From the Settings page, click on Attach new OpenID. When signed in as the old OpenID, it offers to merge the two accounts.
To the user these may seem like different operations. The underlying mechanism for each of them is similar (collect two OpenIDs, then do what is needed to join them), but the wording of the links and form labels may need to vary to keep the operations clear to the user.
The main complication was that my mechanism for verifying an OpenID was designed for logging in and setting the log-in cookie—but if it did that when attaching an OpenID, it would overwrite the original OpenID.
After pondering various schemes, like copying the login cookie before
creating the new one, and so on, I worked out that I could use the
login_GET_factory with one simple change: where it calls
log_in_and_redirect, it now calls a function passed as a parameter
(with log_in_and_redirect as the default value). In a way, this is
vindication of my splitting the maintenance of the login cookie in to a
In the application, I define this function to present yet another form, showing the old and new OpenIDs, and asking the user to confirm they want these to be attached to the same user account. This form is needed first so that merging accounts requires a POST, not a GET request, and second so that if there are indeed two accounts involved, we can insert a warning message.
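The change to the factory can be pictured as follows. This is only a sketch under stated assumptions: the real login_GET_factory takes other configuration, the verify parameter stands in for the whole OpenID verification step, and on_success, confirm_attach and the bodies of both callbacks are invented for illustration.

```python
def log_in_and_redirect(environ, start_response, openID_uri):
    """Stand-in for the default action."""
    # The real function would also set the log-in cookie here.
    start_response('302 Found', [('Location', '/')])
    return []


def login_GET_factory(verify, on_success=log_in_and_redirect):
    """Build a WSGI function; verify(environ) returns an OpenID URI or None."""
    def login_GET(environ, start_response):
        openID_uri = verify(environ)
        if openID_uri is None:
            start_response('403 Forbidden', [('Content-Type', 'text/plain')])
            return [b'OpenID verification failed']
        # What happens next is now up to the caller of the factory.
        return on_success(environ, start_response, openID_uri)
    return login_GET


def confirm_attach(environ, start_response, new_openID_uri):
    """Ask for confirmation instead of replacing the log-in cookie."""
    old_openID_uri = environ.get('REMOTE_USER')
    start_response('200 OK', [('Content-Type', 'text/plain; charset=utf-8')])
    message = 'Attach %s to %s?' % (new_openID_uri, old_openID_uri)
    return [message.encode('utf-8')]
```

The log-in page keeps the default, while the attach page passes on_success=confirm_attach, so the existing cookie is never touched during an attach.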
We have to be careful here to avoid creating a spoofable form. The form must contain the new OpenID as a hidden form item. Without added safeguards, it would be possible to create a similar form on another site that POSTs to my application with the attacker’s OpenID as the hidden parameter. This would merge your account with the attacker’s, allowing the attacker to access whatever resources are owned by your account.
My fix for this is to include a second hidden parameter called check that is an HMAC of the two OpenIDs. You don’t get the check field except by demonstrating ownership of both OpenIDs, and outsiders cannot create a valid check field (because they lack the secret key).
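Computing and verifying such a field takes only a few lines with the standard library. The sketch below makes its own choices, which may differ from the application's: SHA-256 as the digest, a length-prefixed encoding of the two URIs, and the names make_check, check_is_valid and SECRET_KEY are all assumptions.

```python
import hashlib
import hmac

SECRET_KEY = b'replace-with-a-real-secret'  # placeholder; keep the real one private


def make_check(old_openID_uri, new_openID_uri, key=SECRET_KEY):
    """Return a hex HMAC binding the two OpenIDs together."""
    parts = [u.encode('utf-8') for u in (old_openID_uri, new_openID_uri)]
    # Length-prefix each URI so two different pairs cannot yield one message.
    message = b''.join(str(len(p)).encode('ascii') + b':' + p for p in parts)
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def check_is_valid(old_openID_uri, new_openID_uri, check, key=SECRET_KEY):
    """Compare in constant time to avoid leaking how much of it matched."""
    expected = make_check(old_openID_uri, new_openID_uri, key)
    return hmac.compare_digest(expected.encode('ascii'), check.encode('utf-8'))
```

The form handler would recompute the value from the signed-in OpenID and the hidden new OpenID, and refuse the merge unless check_is_valid returns true.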
I now have a web application that runs and allows people to log in with OpenID, and to record user settings thereby. (You can also visit some pages without logging in.) What it lacks is anything other than OpenID support! I really ought to consider writing some amusing web game based on this framework.
Some of the new code is still mixed in with the application code. I
could spin some of it out in to reusable modules. The application's
controllers live in the
run_server.py script. They should have their
own Python module.
Because WSGI functions are just functions, it should be possible to use unit testing to do more of the development than I have so far: I have been testing new WSGI code by running it in the web application, rather than writing tests. This is a little careless, and if this were a production project I would be feeling a bit guilty for not having tests with better coverage.
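To make the point concrete, a WSGI function can be called directly from a test with a hand-built environ and a stub start_response. This is a minimal sketch using only the standard library. The helper call_app and the stand-in hello_app are hypothetical names, not part of the application.

```python
import unittest
from wsgiref.util import setup_testing_defaults


def call_app(app, **environ_overrides):
    """Call a WSGI app directly; return (status, headers, body bytes)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the keys WSGI requires
    environ.update(environ_overrides)
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured['status'] = status
        captured['headers'] = headers

    body = b''.join(app(environ, start_response))
    return captured['status'], captured['headers'], body


def hello_app(environ, start_response):
    # Stand-in for a real controller.
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'hello ' + environ['PATH_INFO'].encode('utf-8')]


class HelloAppTests(unittest.TestCase):
    def test_returns_path(self):
        status, headers, body = call_app(hello_app, PATH_INFO='/settings')
        self.assertEqual(status, '200 OK')
        self.assertEqual(body, b'hello /settings')
```

No server or browser is involved, so tests like this run quickly enough to use while writing each new filter.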
At some point I might want (for academic interest only) to investigate how well my OpenID testbed would fit in with deployment systems like Python Paste. I believe that the factory functions I have would fit in with Paste’s conventions, but I have not actually tried this out.