ISO-8859-1 HTML Reference

ISO-8859-1 is a character encoding standard. It was first published by the International Organization for Standardization (ISO) in 1987 as an 8-bit extension to ASCII, and it was last revised in 1998.

About Character Encoding

To understand ISO-8859-1, you have to understand what character encoding is and how it works.

Everything in a computer is stored as bits, binary digits of information represented by 1s and 0s. Any human-readable character, such as a number or a letter, has to be represented by a string of bits.

For example, in most computer systems today, the capital letter A is represented with 01000001.

A set of human-meaningful characters along with binary codes for each is called a "character encoding". Character encodings are arbitrary. There is nothing intrinsic to the string 01000001 that "means" the letter A.
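As a small illustration of the mapping between a character and its bits, here is a minimal Python sketch. The variable names are only for illustration; `ord`, `format`, `chr`, and `int` are Python built-ins.

```python
# Look up the numeric code of a character and show it as eight bits.
letter = "A"
code = ord(letter)            # numeric code of the character: 65
bits = format(code, "08b")    # zero-padded 8-bit binary string
print(letter, code, bits)     # A 65 01000001

# Going the other way: turn the bit string back into a character.
decoded = chr(int(bits, 2))
print(decoded)                # A
```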

There have been a number of character encoding schemes in the history of computing. Each one encodes a particular set of characters and assigns each one a string of bits.

There's no reason 01000001 couldn't mean "A" in one encoding scheme, "B" in another, and "1" in another. Or it could be a smiley face, a Chinese character, or even a ping sound effect.

Everyone who works with or views a document needs to use the same character encoding, which is why encodings have to be standardized.
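To see why agreement matters, the sketch below decodes one and the same byte with two different encodings. ISO-8859-15 is not discussed in this article; it is used here only as a convenient second encoding that assigns a different character to the same byte.

```python
# One byte, two interpretations.
raw = bytes([0xA4])  # the bit pattern 10100100

as_latin1 = raw.decode("iso-8859-1")    # generic currency sign
as_latin9 = raw.decode("iso-8859-15")   # euro sign

print(as_latin1, as_latin9)  # ¤ €
```

A reader who assumes the wrong encoding sees the wrong character, even though the stored byte never changed.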

ASCII and ISO-8859-1

The most famous character encoding standard is ASCII. ASCII used 7 bits of an eight-bit byte in order to encode the most basic 128 characters used for writing English. A number of system-specific uses were developed for the eighth (high-order) bit.

For example, one system used it to toggle between roman and italic printing styles. Other systems used it to encode additional characters. By using all eight bits, 256 characters can be encoded.

Since the original ASCII set didn't include a number of characters needed to write in common non-English languages (such as letters with diacritical marks), extending the character set to 256 greatly increased its capabilities.

ISO-8859-1 is one of those extensions. It was intended to be an international, cross-platform standard. Since it is a superset of standard 7-bit ASCII, it is backwards-compatible: a document encoded in ASCII can be decoded using ISO-8859-1 without any changes.
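The arithmetic and the backwards compatibility can be checked with a short Python sketch. The sample string is arbitrary; the codec names are the ones Python's standard library accepts.

```python
# 7 bits give 128 possible codes, 8 bits give 256.
print(2 ** 7, 2 ** 8)  # 128 256

# ASCII text only uses byte values below 128 ...
ascii_bytes = "Hello, world!".encode("ascii")
assert all(b < 128 for b in ascii_bytes)

# ... so decoding those bytes as ISO-8859-1 gives the same text back.
same_text = ascii_bytes.decode("iso-8859-1")
print(same_text)  # Hello, world!

# Byte values from 128 upward are where the extension lives.
extended = bytes([0xE9]).decode("iso-8859-1")
print(extended)  # é
```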

ISO-8859-1 and HTML

According to the standard, ISO-8859-1 was the default character encoding in HTML 4. However, most browsers supported a superset of ISO-8859-1 called Windows-1252, which is commonly (if loosely) referred to as ANSI.

ANSI assigns printable characters to most of the 32 code positions (128-159) that have no printable characters in ISO-8859-1. (Most of the time, when you see a list of ISO-8859-1 characters, it's actually the full ANSI list.)
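The difference in that range can be explored with the sketch below. It assumes that Python's `cp1252` codec is an acceptable stand-in for what this article calls ANSI; the helper name `decodes_in_cp1252` is made up for the example.

```python
def decodes_in_cp1252(byte_value: int) -> bool:
    """Return True if cp1252 assigns a character to this byte value."""
    try:
        bytes([byte_value]).decode("cp1252")
        return True
    except UnicodeDecodeError:
        return False

# Check the 32 positions from 128 (0x80) to 159 (0x9F).
assigned = [b for b in range(0x80, 0xA0) if decodes_in_cp1252(b)]
print(len(assigned))  # 27

# The same byte is a printable character in one encoding
# and an invisible control code in the other.
euro_cp1252 = bytes([0x80]).decode("cp1252")
ctrl_latin1 = bytes([0x80]).decode("iso-8859-1")
print(repr(euro_cp1252), repr(ctrl_latin1))  # '€' '\x80'
```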

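Today, the HTML5 standard uses UTF-8, an encoding of the much larger Unicode character set, which includes every character found in ASCII, ISO-8859-1, and ANSI. UTF-8 stores ASCII characters with exactly the same bytes as ASCII does, but it encodes all other characters differently from ISO-8859-1 and ANSI.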

However, most English-language HTML documents, even those explicitly declaring ISO-8859-1 or UTF-8 as their character set, actually use the smaller ASCII character set. There are two reasons for this:

  • ASCII can be typed on a standard QWERTY keyboard.

  • Many of the technologies used to generate HTML only support ASCII.

Since ISO-8859-1 and UTF-8 are both ASCII-compatible, this doesn't usually cause any problems.
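A quick way to convince yourself of this is to encode the same text under each scheme, as in this sketch (the sample strings are arbitrary):

```python
# ASCII-only text produces identical bytes in all three encodings.
text = "Hello"
assert text.encode("ascii") == text.encode("iso-8859-1") == text.encode("utf-8")

# A non-ASCII character is where the encodings diverge.
sign = "\u00a9"  # the copyright sign
latin1_bytes = sign.encode("iso-8859-1")
utf8_bytes = sign.encode("utf-8")
print(latin1_bytes, utf8_bytes)  # b'\xa9' b'\xc2\xa9'
```

Problems only appear once a document contains non-ASCII bytes and is read with a different encoding than the one it was written in.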

ISO-8859-1 and Character Entities

The extended set of characters available in ISO-8859-1 can be produced in an ASCII-only document by using HTML character entities. These are strings that begin with the ampersand ("&") and terminate with a semicolon (";").

For example, the copyright symbol (the circle with a "C" in it) can be encoded directly using ISO-8859-1 or UTF-8. But since there is no "©" key on most keyboards, many people find it easier to type &copy;.

This is stored in the file as six ASCII characters: &, c, o, p, y, and ;. Web browsers then display the appropriate ISO-8859-1 character to the user.
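The browser's side of this can be imitated in Python with the standard library's `html.unescape`, which resolves character entities in a string. This is only a sketch of the idea, not how any particular browser is implemented.

```python
import html

entity = "&copy;"

# The entity is plain ASCII, so it survives in an ASCII-only file.
stored = entity.encode("ascii")
print(len(stored))  # 6

# Resolving the entity yields the actual character.
shown = html.unescape(entity)
print(shown)  # ©
```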

Most of the non-ASCII ISO-8859-1 characters have named HTML character entities. Those that do not can be typed with their numerical code. The numerical code is actually the decimal (base 10) version of the binary encoding.

For example, the copyright symbol is encoded as 10101001 in binary, which is 169 in base 10. So you could type &copy; or &#169;.
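The conversion from binary to the numeric entity can be sketched as follows. `codepoint2name` is the standard library's table of HTML 4 entity names; the other variable names are illustrative.

```python
import html
from html.entities import codepoint2name

code = int("10101001", 2)              # binary to decimal
numeric = f"&#{code};"                 # numeric character reference
named = f"&{codepoint2name[code]};"    # named entity for the same code
print(code, numeric, named)            # 169 &#169; &copy;

# Both forms resolve to the same character.
assert html.unescape(numeric) == html.unescape(named) == chr(code)
```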

Non-ASCII Characters in ISO-8859-1 and ANSI

Characters 128-159 on this chart are ANSI characters not included in ISO-8859-1. The first 128 codes (0-127) in ISO-8859-1/ANSI are not included here, as they are identical to ASCII.

Character  HTML Name  HTML Number  Description
€          &euro;     &#128;       euro sign
‚          &sbquo;    &#130;       single low-9 quotation mark
ƒ          &fnof;     &#131;       lowercase letter f with hook
„          &bdquo;    &#132;       double low-9 quotation mark
…          &hellip;   &#133;       horizontal ellipsis
†          &dagger;   &#134;       dagger
‡          &Dagger;   &#135;       double dagger
ˆ          &circ;     &#136;       modifier letter circumflex accent
‰          &permil;   &#137;       per mille sign
Š          &Scaron;   &#138;       capital letter S with caron
‹          &lsaquo;   &#139;       single left-pointing angle quotation mark
Œ          &OElig;    &#140;       capital ligature OE
Ž                     &#142;       capital letter Z with caron
‘          &lsquo;    &#145;       left single quotation mark
’          &rsquo;    &#146;       right single quotation mark
“          &ldquo;    &#147;       left double quotation mark
”          &rdquo;    &#148;       right double quotation mark
•          &bull;     &#149;       bullet
–          &ndash;    &#150;       en dash
—          &mdash;    &#151;       em dash
˜          &tilde;    &#152;       tilde
™          &trade;    &#153;       trade mark sign
š          &scaron;   &#154;       lowercase letter s with caron
›          &rsaquo;   &#155;       single right-pointing angle quotation mark
œ          &oelig;    &#156;       lowercase ligature oe
ž                     &#158;       lowercase letter z with caron
Ÿ          &Yuml;     &#159;       capital letter Y with diaeresis
           &nbsp;     &#160;       non-breaking space
¡          &iexcl;    &#161;       inverted exclamation mark
¢          &cent;     &#162;       cent sign
£          &pound;    &#163;       pound sign (currency)
¤          &curren;   &#164;       currency sign
¥          &yen;      &#165;       yen/yuan sign
¦          &brvbar;   &#166;       broken vertical bar
§          &sect;     &#167;       section sign
¨          &uml;      &#168;       diaeresis
©          &copy;     &#169;       copyright sign
ª          &ordf;     &#170;       feminine ordinal indicator
«          &laquo;    &#171;       left double angle quotation mark (guillemet)
¬          &not;      &#172;       not sign (logic)
­          &shy;      &#173;       soft/discretionary hyphen
®          &reg;      &#174;       registered trade mark sign
¯          &macr;     &#175;       spacing macron / overline
°          &deg;      &#176;       degree sign
±          &plusmn;   &#177;       plus/minus sign
²          &sup2;     &#178;       superscript two (squared)
³          &sup3;     &#179;       superscript three (cubed)
´          &acute;    &#180;       acute accent
µ          &micro;    &#181;       micro sign
¶          &para;     &#182;       paragraph sign (pilcrow)
·          &middot;   &#183;       middle dot
¸          &cedil;    &#184;       cedilla
¹          &sup1;     &#185;       superscript one
º          &ordm;     &#186;       masculine ordinal indicator
»          &raquo;    &#187;       right double angle quotation mark (guillemet)
¼          &frac14;   &#188;       one quarter fraction (1 over 4)
½          &frac12;   &#189;       one half fraction (1 over 2)
¾          &frac34;   &#190;       three quarters fraction (3 over 4)
¿          &iquest;   &#191;       inverted question mark
À          &Agrave;   &#192;       capital letter A with grave accent
Á          &Aacute;   &#193;       capital letter A with acute accent
Â          &Acirc;    &#194;       capital letter A with circumflex
Ã          &Atilde;   &#195;       capital letter A with tilde
Ä          &Auml;     &#196;       capital letter A with diaeresis
Å          &Aring;    &#197;       capital letter A with ring above
Æ          &AElig;    &#198;       capital AE ligature
Ç          &Ccedil;   &#199;       capital letter C with cedilla
È          &Egrave;   &#200;       capital letter E with grave accent
É          &Eacute;   &#201;       capital letter E with acute accent
Ê          &Ecirc;    &#202;       capital letter E with circumflex
Ë          &Euml;     &#203;       capital letter E with diaeresis
Ì          &Igrave;   &#204;       capital letter I with grave accent
Í          &Iacute;   &#205;       capital letter I with acute accent
Î          &Icirc;    &#206;       capital letter I with circumflex
Ï          &Iuml;     &#207;       capital letter I with diaeresis
Ð          &ETH;      &#208;       capital letter ETH (Dogecoin symbol)
Ñ          &Ntilde;   &#209;       capital letter N with tilde
Ò          &Ograve;   &#210;       capital letter O with grave accent
Ó          &Oacute;   &#211;       capital letter O with acute accent
Ô          &Ocirc;    &#212;       capital letter O with circumflex
Õ          &Otilde;   &#213;       capital letter O with tilde
Ö          &Ouml;     &#214;       capital letter O with diaeresis
×          &times;    &#215;       multiplication sign
Ø          &Oslash;   &#216;       capital letter O with slash
Ù          &Ugrave;   &#217;       capital letter U with grave accent
Ú          &Uacute;   &#218;       capital letter U with acute accent
Û          &Ucirc;    &#219;       capital letter U with circumflex
Ü          &Uuml;     &#220;       capital letter U with diaeresis
Ý          &Yacute;   &#221;       capital letter Y with acute accent
Þ          &THORN;    &#222;       capital letter THORN
ß          &szlig;    &#223;       lowercase letter sharp s (Eszett / scharfes S)
à          &agrave;   &#224;       lowercase letter a with grave accent
á          &aacute;   &#225;       lowercase letter a with acute accent
â          &acirc;    &#226;       lowercase letter a with circumflex
ã          &atilde;   &#227;       lowercase letter a with tilde
ä          &auml;     &#228;       lowercase letter a with diaeresis
å          &aring;    &#229;       lowercase letter a with ring above
æ          &aelig;    &#230;       lowercase ae ligature
ç          &ccedil;   &#231;       lowercase letter c with cedilla (cé cédille)
è          &egrave;   &#232;       lowercase letter e with grave accent
é          &eacute;   &#233;       lowercase letter e with acute accent
ê          &ecirc;    &#234;       lowercase letter e with circumflex
ë          &euml;     &#235;       lowercase letter e with diaeresis
ì          &igrave;   &#236;       lowercase letter i with grave accent
í          &iacute;   &#237;       lowercase letter i with acute accent
î          &icirc;    &#238;       lowercase letter i with circumflex
ï          &iuml;     &#239;       lowercase letter i with diaeresis
ð          &eth;      &#240;       lowercase letter eth
ñ          &ntilde;   &#241;       lowercase letter n with tilde
ò          &ograve;   &#242;       lowercase letter o with grave accent
ó          &oacute;   &#243;       lowercase letter o with acute accent
ô          &ocirc;    &#244;       lowercase letter o with circumflex
õ          &otilde;   &#245;       lowercase letter o with tilde
ö          &ouml;     &#246;       lowercase letter o with diaeresis
÷          &divide;   &#247;       division sign
ø          &oslash;   &#248;       lowercase letter o with slash
ù          &ugrave;   &#249;       lowercase letter u with grave accent
ú          &uacute;   &#250;       lowercase letter u with acute accent
û          &ucirc;    &#251;       lowercase letter u with circumflex
ü          &uuml;     &#252;       lowercase letter u with diaeresis
ý          &yacute;   &#253;       lowercase letter y with acute accent
þ          &thorn;    &#254;       lowercase letter thorn
ÿ          &yuml;     &#255;       lowercase letter y with diaeresis
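
The rows for codes 160-255 follow a regular pattern, so a list like this one can be generated rather than typed by hand. The sketch below is one possible way to do that in Python; the `table_row` helper and the column widths are assumptions for illustration, and the descriptions come from the Unicode character names, so they are worded differently from the ones in the chart above.

```python
import unicodedata
from html.entities import codepoint2name

def table_row(code: int) -> str:
    """Build one chart row: character, entity name, entity number, description."""
    char = chr(code)
    # Not every character has an HTML 4 entity name, so fall back to blank.
    name = f"&{codepoint2name[code]};" if code in codepoint2name else ""
    description = unicodedata.name(char).lower()
    return f"{char}  {name:<8}  &#{code};  {description}"

# Codes 160-255 are the printable non-ASCII characters of ISO-8859-1.
rows = [table_row(code) for code in range(160, 256)]
print(len(rows))   # 96
print(rows[9])     # the row for code 169, the copyright sign
```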
