The encoding of an Erlang source file is selected by a comment in one of the first two lines of the file. The first string that matches the regular expression coding\s*[:=]\s*([-a-zA-Z0-9])+ selects the encoding. If the matching string is not a valid encoding, it is ignored. The valid encodings are Latin-1 and UTF-8; the encoding name is case-insensitive.
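The rule above can be tried out with epp:read_encoding_from_binary/1, which applies the same scan to a binary instead of a file. The following is a minimal sketch rather than a reference implementation: the module name encoding_sketch and the helpers detect/1 and examples/0 are made up for illustration, and only the epp call comes from OTP.

```erlang
%% -*- coding: utf-8 -*-
%% The comment above is itself an encoding declaration for this file.
-module(encoding_sketch).
-export([detect/1, examples/0]).

%% Hypothetical helper: return the encoding declared in the first two
%% lines of Source (a binary holding the file contents), or the atom
%% none when no valid declaration is found there.
detect(Source) when is_binary(Source) ->
    epp:read_encoding_from_binary(Source).

%% A few sample headers, in order:
%%   1. Emacs-style declaration on line 1
%%   2. mixed-case Latin-1 declaration
%%   3. declaration on line 3, which is too late to be considered
%%   4. a name that is not one of the valid encodings
examples() ->
    [detect(<<"%% -*- coding: utf-8 -*-\n-module(a).\n">>),
     detect(<<"%% coding: Latin-1\n-module(a).\n">>),
     detect(<<"-module(a).\n-export([f/0]).\n%% coding: utf-8\n">>),
     detect(<<"%% coding: ebcdic\n-module(a).\n">>)].
```

Calling encoding_sketch:examples() should return [utf8, latin1, none, none]: the first two headers are recognized regardless of case, while a late or unknown declaration yields none.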
As of Erlang/OTP R16, Erlang source files can be written in either UTF-8 or bytewise encoding (also known as latin1 encoding). How to state the encoding of an Erlang source file is described in epp(3). Strings and comments can be written using Unicode, but function names and atoms are still restricted to the ISO Latin-1 character set. These language restrictions are independent of the encoding of the source file. Erlang/OTP R18 is expected to handle functions named in Unicode as well as Unicode atoms. Source: http://www.erlang.org/doc/apps/stdlib/unicode_usage.html
|Code point range (hex)|UTF-8 byte sequence (binary)|
|000000 - 00007F|0xxxxxxx|
|000080 - 0007FF|110xxxxx 10xxxxxx|
|000800 - 00FFFF|1110xxxx 10xxxxxx 10xxxxxx|
|010000 - 10FFFF|11110xxx 10xxxxxx 10xxxxxx 10xxxxxx|
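The bit patterns in the table above map directly onto Erlang's bit syntax. The sketch below encodes a single code point by hand, with one function clause per table row. The module name utf8_sketch and the function encode/1 are hypothetical and exist only for illustration. In real code the built-in utf8 segment type (<<CP/utf8>>) or unicode:characters_to_binary/1 is the better choice.

```erlang
-module(utf8_sketch).
-export([encode/1]).

%% Hand-written UTF-8 encoder for one code point (illustration only).
%% Each clause corresponds to one row of the table: the fixed prefix
%% bits are written as literals and the x bits are filled from CP.

%% 000000 - 00007F: 0xxxxxxx
encode(CP) when is_integer(CP), CP >= 0, CP =< 16#7F ->
    <<CP>>;
%% 000080 - 0007FF: 110xxxxx 10xxxxxx
encode(CP) when is_integer(CP), CP >= 16#80, CP =< 16#7FF ->
    <<2#110:3, (CP bsr 6):5,
      2#10:2,  (CP band 16#3F):6>>;
%% 000800 - 00FFFF: 1110xxxx 10xxxxxx 10xxxxxx
encode(CP) when is_integer(CP), CP >= 16#800, CP =< 16#FFFF ->
    <<2#1110:4, (CP bsr 12):4,
      2#10:2,   ((CP bsr 6) band 16#3F):6,
      2#10:2,   (CP band 16#3F):6>>;
%% 010000 - 10FFFF: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
encode(CP) when is_integer(CP), CP >= 16#10000, CP =< 16#10FFFF ->
    <<2#11110:5, (CP bsr 18):3,
      2#10:2,    ((CP bsr 12) band 16#3F):6,
      2#10:2,    ((CP bsr 6) band 16#3F):6,
      2#10:2,    (CP band 16#3F):6>>.
```

For example, utf8_sketch:encode(16#20AC) (the euro sign) returns <<226,130,172>>, the same bytes as <<16#20AC/utf8>>. Like the table, this sketch makes no exception for the surrogate range D800 - DFFF, whereas the built-in utf8 segment type rejects those values.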