UTF-8

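An 8-bit, variable-length encoding of Unicode characters. ASCII characters (U+0000 to U+007F) are represented by one byte. Most other European characters (U+0080 to U+07FF) are represented in 2 bytes. Most Asian characters, along with the rest of the Basic Multilingual Plane (U+0800 to U+FFFF), are represented in 3 bytes. Characters outside the Basic Multilingual Plane (U+10000 to U+10FFFF), which UTF-16 represents with surrogate pairs, are encoded directly using 4 bytes; the surrogate code points themselves are never encoded in UTF-8.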
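
The byte counts above can be checked with a short sketch. This is only an illustration, not part of any particular library: the helper name `utf8_length` and the four sample characters are assumptions chosen for the example, and the only real API used is Python's built-in `str.encode`.

```python
def utf8_length(ch: str) -> int:
    """Return the number of bytes used to encode one character in UTF-8."""
    return len(ch.encode("utf-8"))


# Hypothetical sample characters, one per encoded length:
samples = [
    "A",           # U+0041, ASCII
    "\u00e9",      # U+00E9, Latin small letter e with acute
    "\u4e2d",      # U+4E2D, a CJK ideograph
    "\U0001F600",  # U+1F600, outside the Basic Multilingual Plane
]

lengths = [utf8_length(ch) for ch in samples]

for ch, n in zip(samples, lengths):
    print(f"U+{ord(ch):04X} -> {n} byte(s)")
```

The last sample is a character that UTF-16 would store as a surrogate pair; in UTF-8 it is encoded as a single 4-byte sequence.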