The Friday Fillip

A day or so ago, Elizabeth Ellis remarked that someone might help her write accented letters in her posts, so she could use French. It’s fairly easy to tell Elizabeth, and you, how to do that, and to put all sorts of other lovely characters into your posts as well. It’s a good deal harder to explain the whys and wherefores of UTF-8 encoding, so, because this is a fillip, I’m not going to try. The curious among you (yes, I’m looking at you) can work at this explanation.

How do you do it? You have to have access to a chart that sets out various code equivalents of the characters you might want to use. A handy one is available on Webmonkey, and lists what are called ISO entities. (Sounds like X-Files stuff, no?) In order for your browser to know that there’s an entity coming up, you have to start each one with an ampersand and end it with a semi-colon. All of this is when you’re working in HTML, of course, as you are when you enter your posts into Slaw’s “write post” area.

So, for example, an e with an accent grave gets coded as &amp;egrave; and an e with an accent acute as &amp;eacute;.

The more popular entities, like these, can be created with a bit of text between the & and the semi-colon: mdash, ndash, and so forth. More esoteric entities (indeed, any entity) can be formed with a # and a number between the markers, as you’ll see when you look at the chart.
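For the curious, here is a small sketch of the named and numbered forms side by side. It is written in Python only because its standard `html` module decodes entities much as a browser would. The variable names are mine, and the numbers are the decimal code points for each character.

```python
import html

# Named entities: a bit of text between the & and the semi-colon.
named = html.unescape("&egrave; &eacute; &mdash; &ndash;")

# Numeric entities: a # and the character's decimal code point instead.
numeric = html.unescape("&#232; &#233; &#8212; &#8211;")

# Both spellings decode to the same four characters.
print(named)
print(numeric)
print(named == numeric)
```

Either form should work in a post; the named ones are simply easier to remember.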

Herewith a few of my favourites:

§ – (&amp;sect;) the section sign, a very American legal thing
» – (&amp;raquo;) the right-facing double quote mark, very French, and of course its companion left-facing version « (&amp;laquo;)
Þ – (&amp;THORN;) the Icelandic thorn, uppercase
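If you would rather look up a numbered form than hunt through a chart, the following sketch prints both spellings for the favourites above. It assumes Python's standard `html.entities` table, which maps entity names to code points; the `favourites` dictionary is my own name for illustration.

```python
from html.entities import name2codepoint

# Look up the decimal code point behind each named entity.
favourites = {name: name2codepoint[name]
              for name in ("sect", "raquo", "laquo", "THORN")}

for name, codepoint in favourites.items():
    # Show the named form, the numeric form, and the character itself.
    print(f"&{name};  &#{codepoint};  {chr(codepoint)}")
```

Note that entity names are case-sensitive: THORN in capitals gives the uppercase letter, while thorn in lowercase gives þ.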

Try ’em. You’ll like ’em.

Oh, and for a brain teaser, think about how you’d do it so that the entity got printed on the web page not as the character but as the code for it. That is, how did I encode &amp;para; such that it didn’t simply get turned into ¶?


  1. Cheeky. You typed “& para;”.

    Another way to input accents and the like is to use ASCII character codes (although the method Simon mentioned is my preferred one). I have noticed that browsers are becoming less proficient at parsing ASCII codes. Thus an é could also be entered by holding down the Alt key and typing 130: é

  2. Well now, not quite & para; because there’s no space in my examples…