I will review it later.
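// Requires: import java.io.UnsupportedEncodingException;
// Note: a Java char holds a UTF-16 code unit, so the bits printed below are the
// Unicode value of each character, not its GBK bytes. The three-byte layout used
// here is only valid for characters in the range U+0800..U+FFFF.
public static void gbk2Utf() throws UnsupportedEncodingException {
    String gbk = "编码了";
    System.out.println("original info: " + gbk);
    char[] c = gbk.toCharArray();
    // Every character in this sample needs exactly 3 bytes in UTF-8
    byte[] fullByte = new byte[3 * c.length];
    for (int i = 0; i < c.length; i++) {
        String binary = Integer.toBinaryString(c[i]);
        StringBuilder sb = new StringBuilder();
        int len = 16 - binary.length();
        // Pad with leading zeros to a full 16 bits
        for (int j = 0; j < len; j++) {
            sb.append("0");
        }
        sb.append(binary);
        System.out
                .println("(before insert)the character with gbk encoding of index "
                        + i + " : " + sb.toString());
        // Print the 16 bits as two rows of 8
        for (int k = 0; k < 16; k++) {
            System.out.print(sb.charAt(k));
            if ((k + 1) % 8 == 0) {
                System.out.println();
            }
        }
        // Insert the UTF-8 marker bits (1110xxxx 10xxxxxx 10xxxxxx),
        // growing the value to 24 bits, i.e. 3 bytes
        sb.insert(0, "1110");
        sb.insert(8, "10");
        sb.insert(16, "10");
        System.out
                .println("(after insert)the character with gbk encoding of index "
                        + i + " : " + sb.toString());
        // Print the 24 bits as three rows of 8
        for (int k = 0; k < 24; k++) {
            System.out.print(sb.charAt(k));
            if ((k + 1) % 8 == 0) {
                System.out.println();
            }
        }
        // Parse each 8-bit binary string into an integer, then narrow it to a byte
        fullByte[i * 3] = Integer.valueOf(sb.substring(0, 8), 2)
                .byteValue();
        fullByte[i * 3 + 1] = Integer.valueOf(sb.substring(8, 16), 2)
                .byteValue();
        fullByte[i * 3 + 2] = Integer.valueOf(sb.substring(16, 24), 2)
                .byteValue();
    }
    // Simulate how a website encoded in UTF-8 would display the bytes
    System.out.println("new result: " + new String(fullByte, "UTF-8"));
}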
Running Result:
original info: 编码了
(before insert)the character with gbk encoding of index 0 : 0111111100010110
01111111
00010110
(after insert)the character with gbk encoding of index 0 : 111001111011110010010110
11100111
10111100
10010110
(before insert)the character with gbk encoding of index 1 : 0111100000000001
01111000
00000001
(after insert)the character with gbk encoding of index 1 : 111001111010000010000001
11100111
10100000
10000001
(before insert)the character with gbk encoding of index 2 : 0100111010000110
01001110
10000110
(after insert)the character with gbk encoding of index 2 : 111001001011101010000110
11100100
10111010
10000110
new result: 编码了
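
The string insertion above builds the three-byte UTF-8 pattern 1110xxxx 10xxxxxx 10xxxxxx one character at a time. As a rough sketch, the same layout can be produced with shifts and masks and then checked against the JDK's own encoder. The class name Utf8BitSketch and its two methods are made up for this illustration, and the code assumes every character lies in U+0800..U+FFFF, which holds for the sample string 编码了 (written with \u escapes below).

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper class, not part of the original post.
public class Utf8BitSketch {

    // Encode one UTF-16 code unit in U+0800..U+FFFF as three UTF-8 bytes.
    public static byte[] encodeThreeByte(char c) {
        if (c < 0x0800 || Character.isSurrogate(c)) {
            throw new IllegalArgumentException("not a three-byte UTF-8 character");
        }
        return new byte[] {
            (byte) (0xE0 | (c >> 12)),         // 1110xxxx: top 4 bits
            (byte) (0x80 | ((c >> 6) & 0x3F)), // 10xxxxxx: middle 6 bits
            (byte) (0x80 | (c & 0x3F))         // 10xxxxxx: low 6 bits
        };
    }

    // Encode a whole string under the same three-byte assumption.
    public static byte[] encode(String s) {
        byte[] out = new byte[3 * s.length()];
        for (int i = 0; i < s.length(); i++) {
            System.arraycopy(encodeThreeByte(s.charAt(i)), 0, out, 3 * i, 3);
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "\u7f16\u7801\u4e86"; // the sample string from the post
        byte[] manual = encode(text);
        byte[] library = text.getBytes(StandardCharsets.UTF_8);
        System.out.println(Arrays.equals(manual, library)); // prints true
    }
}
```

Comparing against String.getBytes(StandardCharsets.UTF_8) is a cheap sanity check: if the manual bit layout were wrong for any character, the two arrays would differ.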
Reference:
http://www.iteye.com/topic/1097560