Unicode
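Unicode is a character encoding standard used on computers. It assigns a single, unique code point to every character in every language, so that text can be converted and processed across languages and platforms. The Unicode Character Table lists the characters of common languages together with printable symbols; for each character it gives the HTML code, the name or description, and the printed glyph.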
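Using Unicode
One way to think about it:
Character | Hexadecimal | Decimal |
---|---|---|
'A' | 41 | 65 |
Note: each of the following evaluates to 'A'
'\u0041'
(char)65
(char)0x41
Also, for strings:
"\u0041".equals("A")
"\u5F20\u4E09".equals("张三")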
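The equivalences above can be checked with a small program. This is a minimal sketch, not part of the original post: the class name `UnicodeEscapeDemo` and the helper `spellingsOfA` are made up for illustration. The source is kept ASCII-only, so the name 张三 appears only in its escaped form.

```java
public class UnicodeEscapeDemo {
    // Hypothetical helper: the three spellings of 'A' from the text.
    static char[] spellingsOfA() {
        return new char[] { '\u0041', (char) 65, (char) 0x41 };
    }

    public static void main(String[] args) {
        // Each spelling is the same char value, 65.
        for (char c : spellingsOfA()) {
            System.out.println(c + " -> " + (int) c);
        }

        // An escaped string literal is the same string as the plain one.
        System.out.println("\u0041".equals("A"));

        // The two escapes from the text make a two-character string.
        String name = "\u5F20\u4E09";
        System.out.println(name.length());
        System.out.println(Integer.toHexString(name.charAt(0)));
    }
}
```

The string comparison uses `equals` rather than `==`, because `==` on strings compares references and only happens to work for compile-time constants.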
The relationship between char and byte
char
- The char data type is a single 16-bit Unicode character (one UTF-16 code unit)
- Minimum value is '\u0000' (or 0)
- Maximum value is '\uffff' (or 65,535 inclusive)
- A char can store any character in that range; characters above U+FFFF do not fit in one char and need a pair of chars
- Example: char letterA = 'A';
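To make the 16-bit limit concrete, here is a rough sketch using the standard `java.lang.Character` helpers. The class name `CharRangeDemo` and the helper `charsNeeded` are hypothetical, and the code point U+1F600 is just an arbitrary example of a value above '\uffff' that I picked for illustration.

```java
public class CharRangeDemo {
    // Hypothetical helper: how many 16-bit chars one code point occupies.
    static int charsNeeded(int codePoint) {
        return Character.charCount(codePoint);
    }

    public static void main(String[] args) {
        // The bounds listed above, printed as integers.
        System.out.println((int) Character.MIN_VALUE); // 0
        System.out.println((int) Character.MAX_VALUE); // 65535

        // A code point inside the char range fits in one char.
        System.out.println(charsNeeded(0x5373));  // 1

        // A code point above 0xFFFF needs two chars (a surrogate pair).
        System.out.println(charsNeeded(0x1F600)); // 2

        // As a String it has length 2 but counts as one code point.
        String s = new String(Character.toChars(0x1F600));
        System.out.println(s.length());
        System.out.println(s.codePointCount(0, s.length()));
    }
}
```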
byte
A byte is an 8-bit signed integer with values from -128 to 127. Literals for byte, int, long, and short can be written in decimal (base 10), hexadecimal (base 16), or octal (base 8).
The prefix 0 indicates octal and the prefix 0x indicates hexadecimal. For example:
int decimal = 100;
int octal = 0144;
int hexa = 0x64;
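The three literals above all denote the same number. A quick sketch to confirm it (the class name `LiteralPrefixDemo` and the constant names are only for illustration):

```java
public class LiteralPrefixDemo {
    static final int DECIMAL = 100;  // base 10
    static final int OCTAL = 0144;   // base 8: 1*64 + 4*8 + 4
    static final int HEXA = 0x64;    // base 16: 6*16 + 4

    public static void main(String[] args) {
        // All three literals are the same int value.
        System.out.println(DECIMAL == OCTAL && OCTAL == HEXA); // true

        // Converting back shows the digits used in the literals.
        System.out.println(Integer.toOctalString(DECIMAL)); // 144
        System.out.println(Integer.toHexString(DECIMAL));   // 64
    }
}
```

One practical consequence: a leading zero is not just padding in Java, so writing 0144 where 144 was meant silently changes the value.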
byte vs. char
byte a = 65;
byte b = 0x41;
char c = 'A';
// byte bad = 21363; // Type mismatch: cannot convert from int to byte
int i = 21363;
char hanz = (char) i; // 21363 (0x5373) is the Chinese character "即"
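The difference becomes visible when the same int is narrowed to each type. The following is a minimal sketch; the class `ByteVsCharDemo` and the helpers `toByte` and `toChar` are hypothetical names used only here. It shows that a char keeps all 16 bits of 21363, while an explicit cast to byte keeps only the low 8 bits.

```java
public class ByteVsCharDemo {
    // Hypothetical helper: an explicit cast to byte keeps the low 8 bits.
    static byte toByte(int value) {
        return (byte) value;
    }

    // Hypothetical helper: an explicit cast to char keeps the low 16 bits.
    static char toChar(int value) {
        return (char) value;
    }

    public static void main(String[] args) {
        int i = 21363; // 0x5373

        // The char still holds the full value.
        System.out.println((int) toChar(i)); // 21363

        // The byte holds only 0x73, so the character is lost.
        System.out.println(toByte(i)); // 115

        // For small values such as 65, both types agree.
        System.out.println(toChar(65));        // A
        System.out.println((char) toByte(65)); // A
    }
}
```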
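Summary
- In Java source, a char can be written as a Unicode escape: \u followed by four hexadecimal digits.
- The same hexadecimal value, converted to decimal, identifies the same character.
- Decimal values are not limited to the byte type; a byte is only 8 bits, so it cannot hold every character.
- The char data type is a single 16-bit Unicode character
- Minimum value is '\u0000' (or 0)
- Maximum value is '\uffff' (or 65,535 inclusive)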