I have decided to change the way my home site works. Up until now I
have been writing entries by creating a quasi-XML file containing the
HTML text; I am changing the format to be quasi-RFC-822: a text file
with a short header section at the top. The text is translated to
HTML via the usual hacked-together nest of regexps.
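As a rough illustration of what reading such a file might look like, here is a minimal sketch that leans on Python's standard `email` parser to split the header section from the body. The field names `Title` and `Date` and the function name `parse_entry` are assumptions made for the example; the actual header fields the site uses are not given here.

```python
from email import message_from_string


def parse_entry(text):
    """Split an entry into a dict of header fields and the body text."""
    msg = message_from_string(text)
    # Header names are case-insensitive in RFC 822, so normalise them.
    headers = {name.lower(): value for name, value in msg.items()}
    return headers, msg.get_payload()


# Hypothetical entry: a short header section, a blank line, then the text.
sample = """Title: Changing formats
Date: 2004-05-01

I have decided to change the way my home site works.
"""

headers, body = parse_entry(sample)
```

The body returned here is still plain text; turning it into HTML is a separate step.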
The motivation for the change in format is the same as that described on John Gruber's Markdown page: while I can happily type HTML until the cows come home, for writing prose it is nice to be able to concentrate on the words without being distracted by the requirement to get all the tags just right. Unlike HTML source, it is possible to read Markdown text fluently.
I intend to implement something close to a proper subset of the Markdown+Smartypants conventions, simply because I don't want to add one more to the existing proliferation of different Wiki-like text formats. By subset I mean I do not intend immediately to implement all of Markdown; it is actually a fairly complex syntax.
Update (2004-05-08). I have switched to doing my Markdown clone
in Python rather than Tcl. This is partly because nowadays I am more
fluent in Python (this web site is the only Tcl project I still have).
In order to accommodate the interactions between the `\`-escapes and
`` ` ``-quoting I have replaced a lot of the regexp substitutions with
a single-pass finite-state machine, and I find this easier in Python
than Tcl.
It's still early days; in writing the previous paragraph I discovered a bug to fix, and I haven't got around to processing apostrophes yet.
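To make the escape/quoting interaction concrete, here is a minimal sketch of a single-pass state machine of the kind described above. It is not the site's actual code: the state names, the function name `convert`, and the decision to close an unterminated code span at end of input are all assumptions made for illustration.

```python
import html

# Hypothetical state names for this sketch.
TEXT, ESCAPE, CODE = range(3)


def convert(source):
    """Translate backslash escapes and backtick code spans in one pass."""
    out = []
    state = TEXT
    for ch in source:
        if state == TEXT:
            if ch == "\\":
                state = ESCAPE          # next character is taken literally
            elif ch == "`":
                out.append("<code>")
                state = CODE
            else:
                out.append(ch)          # ordinary text passes through
        elif state == ESCAPE:
            out.append(html.escape(ch))
            state = TEXT
        else:  # state == CODE
            if ch == "`":
                out.append("</code>")
                state = TEXT
            else:
                # Backslash has no special meaning inside a code span.
                out.append(html.escape(ch))
    if state == ESCAPE:
        out.append("\\")                # lone trailing backslash
    elif state == CODE:
        out.append("</code>")           # assumption: close an open span
    return "".join(out)
```

Because each character is examined exactly once and its meaning depends only on the current state, an escaped backtick can never be mistaken for the start of a code span, which is awkward to guarantee with a sequence of independent regexp substitutions.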
Update (2004-05-10). I have apostrophe (`'`) automatically
translated to ‘ and ’ (the Unicode characters for
inverted comma and apostrophe) according to context. Similarly `"`
becomes “ and ”. (Except when they are inside
`` `...` `` or part of HTML tags.)
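A minimal sketch of how such context-dependent translation could work follows. The rule used to decide "according to context" is an assumption: a quote counts as opening when it follows whitespace, an opening bracket, or the start of the text, and as closing otherwise. The names `smarten`, `QUOTES`, and the state constants are hypothetical.

```python
# Hypothetical state names for this sketch.
TEXT, CODE, TAG = range(3)

# Straight quote -> (opening form, closing form).
QUOTES = {
    "'": ("\u2018", "\u2019"),
    '"': ("\u201c", "\u201d"),
}


def smarten(source):
    """Curl quotes in running text, leaving code spans and tags alone."""
    out = []
    state = TEXT
    prev = " "  # treat the start of the input like whitespace
    for ch in source:
        if state == CODE:
            out.append(ch)
            if ch == "`":
                state = TEXT
            prev = ch
        elif state == TAG:
            out.append(ch)
            if ch == ">":
                state = TEXT
            # prev is left alone so a tag is invisible to the quote rule
        elif ch == "`":
            out.append(ch)
            state = CODE
            prev = ch
        elif ch == "<":
            out.append(ch)
            state = TAG
        else:
            if ch in QUOTES:
                opening = prev.isspace() or prev in "([{"
                out.append(QUOTES[ch][0 if opening else 1])
            else:
                out.append(ch)
            prev = ch
    return "".join(out)
```

One limitation of this sketch is that any `<` is assumed to start a tag, so a bare less-than sign in prose would suppress quote translation until the next `>`.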