Mark Pilgrim reports that he’s been misusing the HTML cite
element all these years because the HTML 5 definition contradicts his use of cite
to wrap authors’ names. Just when I was about to crow excitedly that I’d always said he was wrong, I checked the old specs and discovered we both were—or actually, that HTML 4 was wrong.
For what it’s worth, the element cite
originally got its name and meaning from the
@cite
request in Texinfo (documented at least as far back as 1988), and the
description there refers simply to book titles:
Use the `@cite' command for the name of a book that lacks a companion Info file. The command produces italics in the printed manual, and quotation marks in the Info file.
The HTML 2 spec (1995) also mentions book titles only:
The CITE element is used to indicate the title of a book or other citation. It is typically rendered as italics. For example:
He just couldn't get enough of <cite>The Grapes of Wrath</cite>.
Both mention that titles are usually shown italicized.
The HTML 4.01 definition (1999) is ‘contains a citation or a reference to other sources’ which would be vaguely consistent with HTML 2, except the examples they supply show it used to surround the name of a speaker and a coded reference, not book titles:
As <CITE>Harry S. Truman</CITE> said,
<Q lang="en-us">The buck stops here.</Q>
More information can be found in <CITE>[ISO-0000]</CITE>.
Please refer to the following reference number in future
correspondence: <STRONG>1-234-55</STRONG>
Both of their examples would not normally be italicized. Obviously the authors of the
HTML 4 recommendation were guided not by existing definition or usage in existing
documents, but by the name of the element. To my mind this constitutes poor scholarship,
the same sloppiness that has caused the dfn
and var
elements to lose their original
meanings.
The result of this is that HTML 4 changed the meaning of one of the OH SO IMPORTANT
semantic elements, which in turn does rather indicate that the semantic content was
perhaps not so important as all that, given that no-one working on HTML 4 knew or cared
what the established meaning of the cite
element was!
And before you accuse me of being some proponent of wysiwyg point-and-drool word-processing, check out the articles on this very web site. There are documents going back over ten years, and for the first few years I consistently used cite
for book titles, which was correct according to HTML 2, but incorrect according to HTML 4. Since I switched to using Markdown to write my articles in, I have lost the facility to easily distinguish italicized book titles from italicized anything else—but after years of careful, pedantic mark-up with no reward I find I don’t care any more.