name | labels |
---|---|
UTF-8 | “unicode-1-1-utf-8”; “utf-8”; “utf8” |
IBM866 | “866”; “ibm866” |
ISO-8859-2 | “latin2”; “iso88592”; “iso-8859-2” |
ISO-8859-3 | “latin3”; “iso-8859-3”; “iso88593” |
ISO-8859-4 | “latin4”; “iso-8859-4”; “iso88594” |
ISO-8859-5 | “cyrillic”; “iso-8859-5”; “iso88595” |
ISO-8859-6 | “arabic” |
ISO-8859-7 | “sun_eu_greek” |
ISO-8859-8 | “visual” |
ISO-8859-8-I | “logical” |
ISO-8859-10 | “latin6” |
ISO-8859-13 | |
ISO-8859-14 | |
ISO-8859-15 | “csisolatin9” |
ISO-8859-16 | |
KOI8-R | |
KOI8-U | |
macintosh | |
windows-874 | |
windows-1250 | |
windows-1251 | |
windows-1252 | |
windows-1253 | |
windows-1254 | |
windows-1255 | |
windows-1256 | |
windows-1257 | |
windows-1258 | |
x-mac-cyrillic | |
GBK | “chinese”; “gbk” |
gb18030 | “gb18030” |
Big5 | “big5” (Traditional Chinese) |
EUC-JP | “euc-jp” |
ISO-2022-JP | |
Shift_JIS | |
EUC-KR | “korean” |
replacement | |
UTF-16BE | “utf-16be” |
UTF-16LE | “utf-16”; “utf-16le” |
x-user-defined | “x-user-defined” |
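
The table maps labels to encoding names. A minimal sketch of that lookup, assuming the spec's matching rule is to trim leading and trailing ASCII whitespace and compare ASCII case-insensitively. The names `NAME_TO_LABELS`, `ascii_lower`, and `get_encoding` are hypothetical, and only a few rows of the table are copied in:

```python
from typing import Optional

# Name -> labels, copied from a few rows of the table above (not the full list).
NAME_TO_LABELS = {
    "UTF-8": ["unicode-1-1-utf-8", "utf-8", "utf8"],
    "IBM866": ["866", "ibm866"],
    "ISO-8859-8": ["visual"],
    "ISO-8859-8-I": ["logical"],
    "GBK": ["chinese", "gbk"],
    "Big5": ["big5"],
    "EUC-KR": ["korean"],
    "UTF-16BE": ["utf-16be"],
    "UTF-16LE": ["utf-16", "utf-16le"],
    "x-user-defined": ["x-user-defined"],
}

# Invert to label -> name for direct lookup.
LABEL_TO_NAME = {
    label: name for name, labels in NAME_TO_LABELS.items() for label in labels
}

# Assumed set of ASCII whitespace: tab, LF, form feed, CR, space.
ASCII_WHITESPACE = "\t\n\x0c\r "


def ascii_lower(s: str) -> str:
    """Lowercase only ASCII letters, leaving other characters untouched."""
    return "".join(c.lower() if c.isascii() else c for c in s)


def get_encoding(label: str) -> Optional[str]:
    """Return the encoding name for a label, or None if it is not in the table."""
    return LABEL_TO_NAME.get(ascii_lower(label.strip(ASCII_WHITESPACE)))
```

For example, `get_encoding(" UTF8 ")` returns `"UTF-8"`, and `get_encoding("utf-16")` returns `"UTF-16LE"`, matching the UTF-16LE row.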

x-user-defined
decoder
1. If byte is end-of-stream, return finished.
2. If byte is an ASCII byte, return a code point whose value is byte.
3. Return a code point whose value is 0xF780 + byte − 0x80. (This equals 0xF700 + byte, so masking the code point with & 0xFF recovers the original byte.)
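
The decoder steps above can be sketched as follows. This is an illustration and not a spec-complete implementation: it works on a whole `bytes` object instead of a stream, and the function names `decode_x_user_defined` and `recover_bytes` are hypothetical.

```python
def decode_x_user_defined(data: bytes) -> str:
    """Map each byte to a code point following the decoder steps."""
    out = []
    for byte in data:
        if byte <= 0x7F:
            # ASCII byte: the code point equals the byte.
            out.append(chr(byte))
        else:
            # 0x80..0xFF maps to U+F780..U+F7FF.
            out.append(chr(0xF780 + byte - 0x80))
    return "".join(out)


def recover_bytes(text: str) -> bytes:
    """Undo the mapping by keeping only the low 8 bits of each code point."""
    return bytes(ord(ch) & 0xFF for ch in text)
```

Decoding `b"A\x80\xff"` gives `"A\uf780\uf7ff"`, and `recover_bytes` on that string returns the original three bytes, because 0xF780 & 0xFF is 0x80 and 0xF7FF & 0xFF is 0xFF.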

encoder
1. If code point is end-of-stream, return finished.
2. If code point is an ASCII code point, return a byte whose value is code point.
3. If code point is in the range U+F780 to U+F7FF, inclusive, return a byte whose value is code point − 0xF780 + 0x80.
4. Return error with code point.
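
The encoder steps can be sketched the same way. In the spec the encoder returns an error with the code point and the caller decides how to handle it. Raising `ValueError` here is an assumption made for this sketch, and the name `encode_x_user_defined` is hypothetical.

```python
def encode_x_user_defined(text: str) -> bytes:
    """Map each code point to a byte following the encoder steps."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp <= 0x7F:
            # ASCII code point: the byte equals the code point.
            out.append(cp)
        elif 0xF780 <= cp <= 0xF7FF:
            # U+F780..U+F7FF maps back to 0x80..0xFF.
            out.append(cp - 0xF780 + 0x80)
        else:
            # "Return error with code point"; surfaced as an exception here.
            raise ValueError(f"cannot encode U+{cp:04X} as x-user-defined")
    return bytes(out)
```

Encoding `"A\uf780\uf7ff"` yields `b"A\x80\xff"`, while a code point outside both ranges, such as U+00E9, raises `ValueError`.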
References:
https://encoding.spec.whatwg.org/#encoding
https://encoding.spec.whatwg.org/#x-user-defined