// requires: import java.nio.charset.Charset;
System.out.println("Default file encoding: " + System.getProperty("file.encoding"));
System.out.println("Default language: " + System.getProperty("user.language"));
System.out.println("Default charset: " + Charset.defaultCharset());
Output:
Default file encoding: UTF-8
Default language: zh
Default charset: UTF-8
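The char type
A char is essentially an unsigned integer that always occupies two bytes. That integer corresponds to a Unicode number and represents the character assigned to it.
Because it is fixed at two bytes, a char can only represent characters whose Unicode number is below 65536 (U+0000 to U+FFFF); it cannot represent characters outside that range.
Characters outside that range are normally held in a String. For example, the Chinese character '𠮷' has the code point 0x20BB7, which clearly exceeds 65535, so a single char cannot hold it. In a String it is stored as the surrogate pair \uD842\uDFB7, and some editors convert a pasted '𠮷' to that escape sequence automatically.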
Example:
char c = '程';
System.out.println(c);
String s = "\uD842\uDFB7";
System.out.println(s);
Output:
程
𠮷
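To make the surrogate-pair point concrete, here is a minimal sketch that uses the standard code-point methods on String and Character. The class name SurrogateDemo and the helper fromCodePoint are illustrative names, not part of the original text.

```java
class SurrogateDemo {
    // Builds a String from one Unicode code point; code points above
    // U+FFFF need two char values (a surrogate pair).
    static String fromCodePoint(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        String s = fromCodePoint(0x20BB7); // the code point discussed above
        // Number of char (UTF-16) units in the string
        System.out.println(s.length());
        // Number of Unicode characters in the string
        System.out.println(s.codePointCount(0, s.length()));
        // The full code point, in hexadecimal
        System.out.println(Integer.toHexString(s.codePointAt(0)));
        // The two char values are a high and a low surrogate
        System.out.println(Character.isHighSurrogate(s.charAt(0)));
        System.out.println(Character.isLowSurrogate(s.charAt(1)));
    }
}
```

The string has a length of 2 but contains only one character, which is why code that iterates char by char can split such characters in half.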
Charsets supported by the JDK
// requires: import java.nio.charset.Charset; import java.util.Map;
Map<String, Charset> map = Charset.availableCharsets();
System.out.println("the available Charsets supported by jdk:" + map.size());
for (Map.Entry<String, Charset> entry : map.entrySet()) {
System.out.println(entry.getKey());
}
Output:
the available Charsets supported by jdk:170
Big5
Big5-HKSCS
CESU-8
…
Encoding methods
getBytes()
Encodes the string using the platform's default charset.
public byte[] getBytes()
Example:
import java.util.Arrays;

public class GetByteEncode {
    public static void main(String[] args) {
        // Read the default encoding
        String encode = System.getProperty("file.encoding");
        System.out.println("encode = " + encode);
        String s = "生活";
        // Encode with the platform's default charset
        byte[] bytes = s.getBytes();
        System.out.println(Arrays.toString(bytes));
    }
}
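When Java prints a byte, it treats it as a signed 8-bit value: if the highest bit is 1 the number is negative, and if the highest bit is 0 it is non-negative.
Output in IDEA: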
encode = UTF-8
[-25, -108, -97, -26, -76, -69]
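The negative numbers above are the signed view of the bytes. As a small sketch, masking each byte with 0xFF gives the unsigned value, which is easier to compare with encoding tables. The class ByteHexDemo and its helper toHex are illustrative names; the string is written with Unicode escapes so the example does not depend on the encoding of the source file.

```java
import java.nio.charset.StandardCharsets;

class ByteHexDemo {
    // Formats each byte as two uppercase hex digits, e.g. -25 becomes "E7".
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            // b & 0xFF converts the signed byte to an int in the range 0..255
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+751F U+6D3B: the two-character sample string used above
        byte[] bytes = "\u751F\u6D3B".getBytes(StandardCharsets.UTF_8);
        System.out.println(toHex(bytes)); // prints "E7 94 9F E6 B4 BB"
    }
}
```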
Output in CMD:
java GetByteEncode
encode = GBK
[-25, -108, -97, -26, -76, -69]
Output in CMD with the encoding specified:
java -Dfile.encoding=UTF-8 GetByteEncode
encode = UTF-8
[-23, -112, -94, -25, -122, -72, -26, -92, -65]
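The two CMD results differ from the IDEA result in a way that is consistent with one particular cause. This is offered as an assumption, not something the original states: the source file is stored as UTF-8, but javac ran under the GBK default without an -encoding option, so the string literal was read as three different characters. The sketch below simulates that misreading in memory; the class MisdecodedLiteralDemo and the helper misread are illustrative names.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class MisdecodedLiteralDemo {
    static final Charset GBK = Charset.forName("GBK");

    // Simulates a compiler that reads the UTF-8 bytes of a literal as GBK.
    static String misread(String original) {
        return new String(original.getBytes(StandardCharsets.UTF_8), GBK);
    }

    public static void main(String[] args) {
        // U+751F U+6D3B: the two-character sample string used above
        String literal = misread("\u751F\u6D3B");
        // Default charset GBK: encoding the misread text restores the UTF-8 bytes
        System.out.println(Arrays.toString(literal.getBytes(GBK)));
        // Default charset UTF-8: three characters, three bytes each
        System.out.println(Arrays.toString(literal.getBytes(StandardCharsets.UTF_8)));
    }
}
```

Under that assumption, the first printed line matches the plain CMD run and the second matches the run with -Dfile.encoding=UTF-8. Compiling with javac -encoding UTF-8 would avoid the misreading.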
getBytes(String charsetName)
Encodes the string using the charset with the given name.
public byte[] getBytes(String charsetName) throws UnsupportedEncodingException
Example:
String s = "生活";
byte[] bytes1 = s.getBytes("GBK");
System.out.println(Arrays.toString(bytes1));
byte[] bytes2 = s.getBytes("UTF-8");
System.out.println(Arrays.toString(bytes2));
Output:
[-55, -6, -69, -18]
[-25, -108, -97, -26, -76, -69]
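As a side note, String also has a getBytes(Charset) overload that takes a Charset object instead of a name and declares no checked exception. A minimal sketch follows; the class CharsetOverloadDemo and its two helpers are illustrative names, and GBK is assumed to be available in the running JDK.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class CharsetOverloadDemo {
    // UTF-8 is always available through the StandardCharsets constants.
    static byte[] encodeUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // GBK has no constant; Charset.forName looks it up by name and throws
    // an unchecked exception if the JDK does not provide it.
    static byte[] encodeGbk(String s) {
        return s.getBytes(Charset.forName("GBK"));
    }

    public static void main(String[] args) {
        // U+751F U+6D3B: the two-character sample string used above
        String s = "\u751F\u6D3B";
        System.out.println(Arrays.toString(encodeGbk(s)));
        System.out.println(Arrays.toString(encodeUtf8(s)));
    }
}
```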
Decoding methods
public String(byte bytes[])
public String(byte bytes[], String charsetName) throws UnsupportedEncodingException
Example:
String s = "生活";
byte[] bytes = s.getBytes();
System.out.println(Arrays.toString(bytes));
String s1 = new String(bytes);
System.out.println(s1);
String s2 = new String(bytes, "GBK");
System.out.println(s2);
Output:
[-25, -108, -97, -26, -76, -69]
生活
鐢熸椿
When garbled text cannot be recovered
Example:
String s = "生活";
byte[] bytes = s.getBytes("ISO-8859-1");
String s1 = new String(bytes, "ISO-8859-1");
System.out.println(s1);
String s2 = new String(bytes, "UTF-8");
System.out.println(s2);
Output:
??
??
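The loss above happens at encoding time: ISO-8859-1 has no mapping for these characters, so getBytes substitutes the byte for '?', and no later decoding can bring the original back. Decoding with the wrong charset is a different case. ISO-8859-1 maps every byte value to a character, so text garbled that way can still be repaired. The sketch below contrasts the two; the class ReversibilityDemo and its method names are illustrative.

```java
import java.nio.charset.StandardCharsets;

class ReversibilityDemo {
    // Lossy: characters that ISO-8859-1 cannot encode become '?' bytes.
    static String lossyRoundTrip(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Recoverable: the UTF-8 bytes survive a wrong ISO-8859-1 decode,
    // because every byte maps to exactly one char and back again.
    static String recoverableRoundTrip(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        String garbled = new String(utf8, StandardCharsets.ISO_8859_1);
        byte[] restored = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(restored, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // U+751F U+6D3B: the two-character sample string used above
        String s = "\u751F\u6D3B";
        System.out.println(lossyRoundTrip(s));                 // prints "??"
        System.out.println(recoverableRoundTrip(s).equals(s)); // prints "true"
    }
}
```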
InputStreamReader
Example:
InputStreamReader reader = new InputStreamReader(new FileInputStream("src/main/resources/life.txt"), "utf-8");
int cn;
while ((cn = reader.read()) != -1) {
System.out.print((char) cn);
}
reader.close();
Output:
生活
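Execution process: FileInputStream reads the raw bytes of the file, and InputStreamReader decodes them into characters using the specified charset (UTF-8 here).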
OutputStreamWriter
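Example:
OutputStreamWriter writer = new OutputStreamWriter(new FileOutputStream("life.txt"), "utf-8");
writer.write("生活");
writer.flush();
writer.close();
Output:
A life.txt file is created in the project root directory; when opened, its content is: 生活
Execution process: OutputStreamWriter encodes the characters into bytes using the specified charset (UTF-8 here), and FileOutputStream writes those bytes to the file.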
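The write-then-read flow can be sketched end to end as follows. This is one possible arrangement, not the original author's code: the class StreamRoundTripDemo and the helper roundTrip are illustrative names, a temporary file stands in for life.txt, and try-with-resources closes the streams even if an exception is thrown.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class StreamRoundTripDemo {
    // Writes text to a temporary file as UTF-8, reads it back as UTF-8,
    // and returns what was read.
    static String roundTrip(String text) {
        try {
            Path tmp = Files.createTempFile("life", ".txt");
            try {
                try (Writer w = new OutputStreamWriter(
                        new FileOutputStream(tmp.toFile()), StandardCharsets.UTF_8)) {
                    w.write(text); // chars are encoded to bytes here
                }
                StringBuilder sb = new StringBuilder();
                try (Reader r = new InputStreamReader(
                        new FileInputStream(tmp.toFile()), StandardCharsets.UTF_8)) {
                    int c;
                    while ((c = r.read()) != -1) { // bytes are decoded to chars here
                        sb.append((char) c);
                    }
                }
                return sb.toString();
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // U+751F U+6D3B: the two-character sample string used above
        String s = "\u751F\u6D3B";
        System.out.println(roundTrip(s).equals(s)); // prints "true"
    }
}
```

The text survives only because the reader and the writer agree on the charset; using different charsets on the two sides produces the garbling shown in the decoding examples.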
Copying an image
Copying through character streams with UTF-8 encoding
Example:
InputStreamReader isr = new InputStreamReader(new FileInputStream("a.png"), "utf-8");
OutputStreamWriter osw = new OutputStreamWriter(new FileOutputStream("b.png"), "utf-8");
int cn;
while ((cn = isr.read()) != -1) {
osw.write(cn);
}
isr.close();
osw.close();
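Output:
After the copy, the new image has a different file size and can no longer be opened, because image bytes that are not valid UTF-8 are replaced during decoding.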
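The size change can be reproduced in memory with just the first four bytes of the PNG signature. The sketch below pushes them through a UTF-8 decode and encode, which is what the reader and writer pair does to the file; the class BinaryThroughUtf8Demo and the helper throughUtf8 are illustrative names.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class BinaryThroughUtf8Demo {
    // Decodes the bytes as UTF-8 and encodes the result back to UTF-8,
    // mirroring a copy through InputStreamReader and OutputStreamWriter.
    static byte[] throughUtf8(byte[] data) {
        String decoded = new String(data, StandardCharsets.UTF_8);
        return decoded.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // First four bytes of the PNG signature: 0x89 'P' 'N' 'G'
        byte[] header = {(byte) 0x89, 0x50, 0x4E, 0x47};
        byte[] copied = throughUtf8(header);
        System.out.println(Arrays.toString(header)); // [-119, 80, 78, 71]
        // 0x89 is not valid UTF-8 on its own, so it is decoded as U+FFFD,
        // which encodes to the three bytes EF BF BD
        System.out.println(Arrays.toString(copied)); // [-17, -65, -67, 80, 78, 71]
    }
}
```

Four bytes become six and the original first byte is gone, so the signature no longer identifies a PNG. Binary files should be copied with byte streams such as FileInputStream and FileOutputStream, which pass the bytes through unchanged.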