Character Encoding in Java

denlee

Category: Java  Tags: java transformation byte virtual machine character string

Article link: https://blog.csdn.net/denlee/article/details/4343435

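Character Encoding
    Many constructors and methods in the java.lang and java.io packages that convert between 8-bit bytes and 16-bit Unicode characters take a string argument naming the character encoding to use. An encoding name may contain only the following characters: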
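Uppercase letters 'A' to 'Z' ('\u0041' to '\u005a'),
Lowercase letters 'a' to 'z' ('\u0061' to '\u007a'),
Digits '0' to '9' ('\u0030' to '\u0039'),
Hyphen '-' ('\u002d', HYPHEN-MINUS),
Period '.' ('\u002e', FULL STOP),
Colon ':' ('\u003a', COLON),
Underscore '_' ('\u005f', LOW LINE).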
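    An encoding name must begin with a letter or a digit; the empty string is not a legal encoding name. An encoding may have several names, one of which is its canonical name. The canonical name is the one returned by the getEncoding method of the InputStreamReader and OutputStreamWriter classes. For more information on character encodings, see RFC 2278: IANA Charset Registration Procedures.
    Every implementation of the Java platform is required to support the character encodings listed below. Consult the release documentation of your implementation to see which additional encodings it supports: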
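The naming rules above can be sketched as a simple check. This is only an illustration: the class EncodingNameCheck and the method isLegalName are hypothetical names, not part of the JDK, and the pattern also accepts the period '.', which the Java platform documentation permits in encoding names. A legal name is not necessarily a supported one.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not a JDK class: checks only the *syntax* of an encoding name.
class EncodingNameCheck {
    // First character: a letter or digit.
    // Remaining characters: letters, digits, '.', ':', '_' or '-'.
    private static final Pattern LEGAL_NAME =
            Pattern.compile("[A-Za-z0-9][A-Za-z0-9.:_-]*");

    static boolean isLegalName(String name) {
        // The empty string never matches, because one leading character is required.
        return name != null && LEGAL_NAME.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(isLegalName("UTF-8"));      // true
        System.out.println(isLegalName("ISO-8859-1")); // true
        System.out.println(isLegalName(""));           // false: empty string
        System.out.println(isLegalName("-UTF8"));      // false: starts with a hyphen
    }
}
```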

US-ASCII     Seven-bit ASCII, a.k.a. ISO646-US, a.k.a. the Basic Latin block of the Unicode character set
ISO-8859-1   ISO Latin Alphabet No. 1, a.k.a. ISO-LATIN-1
UTF-8        Eight-bit Unicode Transformation Format
UTF-16BE     Sixteen-bit Unicode Transformation Format, big-endian byte order
UTF-16LE     Sixteen-bit Unicode Transformation Format, little-endian byte order
UTF-16       Sixteen-bit Unicode Transformation Format, byte order specified by a mandatory initial byte-order mark (either order accepted on input, big-endian used on output)
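To see how the six required encodings differ in practice, the following sketch encodes the same two-character string with each of them and prints the resulting bytes in hexadecimal. The class RequiredEncodingsDemo and its helper encodeHex are hypothetical names chosen for this illustration; the sample text is 'A' followed by 'é' (written as the escape \u00e9 so the result does not depend on the source file's encoding).

```java
import java.io.UnsupportedEncodingException;

// Hypothetical demo class: prints the bytes each required encoding produces.
class RequiredEncodingsDemo {
    // Encodes text with the named encoding and returns the bytes as hex, e.g. "41 C3 A9".
    static String encodeHex(String text, String enc) {
        byte[] bytes;
        try {
            bytes = text.getBytes(enc);
        } catch (UnsupportedEncodingException e) {
            // Should not happen for the six encodings every Java platform must support.
            throw new IllegalStateException("Encoding not supported: " + enc, e);
        }
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "A\u00e9";
        String[] names = {"US-ASCII", "ISO-8859-1", "UTF-8", "UTF-16BE", "UTF-16LE", "UTF-16"};
        for (String name : names) {
            System.out.println(name + " -> " + encodeHex(text, name));
        }
        // US-ASCII   -> 41 3F              ('é' is not representable and becomes '?')
        // ISO-8859-1 -> 41 E9
        // UTF-8      -> 41 C3 A9
        // UTF-16BE   -> 00 41 00 E9
        // UTF-16LE   -> 41 00 E9 00
        // UTF-16     -> FE FF 00 41 00 E9  (byte-order mark, then big-endian)
    }
}
```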

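Every Java virtual machine has a default character encoding. The default is determined during virtual-machine startup and typically depends on the character encoding of the underlying operating system.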
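One way to inspect the default encoding of a running virtual machine is sketched below. It goes slightly beyond the java.lang and java.io APIs discussed here by using java.nio.charset.Charset (available since Java 5); the class DefaultEncodingDemo and the method defaultEncodingName are hypothetical names. The printed values depend on the platform and JDK version, so no expected output is shown.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;

// Hypothetical demo class: reports the JVM's default character encoding.
class DefaultEncodingDemo {
    // Canonical name of the default charset chosen at JVM startup.
    static String defaultEncodingName() {
        return Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        System.out.println("Default charset:  " + defaultEncodingName());

        // The system property is commonly set, but may be absent on some runtimes.
        System.out.println("file.encoding:    " + System.getProperty("file.encoding"));

        // A reader created without an explicit encoding uses the default one.
        InputStreamReader reader =
                new InputStreamReader(new ByteArrayInputStream(new byte[0]));
        System.out.println("Reader encoding:  " + reader.getEncoding());
    }
}
```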

Calling a method with the name of an unsupported character encoding throws UnsupportedEncodingException. One example is the constructor public String(byte[] bytes, String enc) throws UnsupportedEncodingException
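Because UnsupportedEncodingException is a checked exception, callers have to catch or declare it. The sketch below wraps the String constructor in a hypothetical helper, decodeOrNull, that returns null when the encoding is not supported; the class name SafeDecodeDemo and the encoding name "NO-SUCH-ENCODING" are made up for this illustration.

```java
import java.io.UnsupportedEncodingException;

// Hypothetical demo class: handles an unsupported encoding name gracefully.
class SafeDecodeDemo {
    // Decodes bytes with the named encoding, or returns null if it is unsupported.
    static String decodeOrNull(byte[] bytes, String enc) {
        try {
            return new String(bytes, enc);
        } catch (UnsupportedEncodingException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        byte[] data = {0x48, 0x69}; // the ASCII bytes for "Hi"

        System.out.println(decodeOrNull(data, "UTF-8"));            // Hi
        System.out.println(decodeOrNull(data, "NO-SUCH-ENCODING")); // null
    }
}
```

Returning null keeps the example short; real code might prefer to rethrow or fall back to a known encoding such as UTF-8.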