Encoding issues, part one

 I will review it later.

 

public static void gbk2Utf() throws UnsupportedEncodingException {
  String gbk = "编码了";
  System.out.println("original info: " + gbk);
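  // Note: a Java String holds UTF-16 code units internally, so these chars
  // are Unicode values, not GBK bytes, despite the variable name.
  char[] c = gbk.toCharArray();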
  byte[] fullByte = new byte[3 * c.length];
  for (int i = 0; i < c.length; i++) {
   String binary = Integer.toBinaryString(c[i]);
   StringBuffer sb = new StringBuffer();
   int len = 16 - binary.length();
   // Left-pad with zeros to a full 16 bits
   for (int j = 0; j < len; j++) {
    sb.append("0");
   }

   sb.append(binary);

   System.out
     .println("(before insert)the character with gbk encoding of index  "
       + i + " : " + sb.toString());

   for (int k = 0; k < 16; k++) {
    System.out.print(sb.charAt(k));
    if ((k + 1) % 8 == 0) {
     System.out.println();
    }
   }
   // Insert the UTF-8 marker bits (1110xxxx 10xxxxxx 10xxxxxx) to reach 24 bits, i.e. 3 bytes
   sb.insert(0, "1110");
   sb.insert(8, "10");
   sb.insert(16, "10");

   System.out
     .println("(after insert)the character with gbk encoding of index  "
       + i + " : " + sb.toString());

   for (int k = 0; k < 24; k++) {
    System.out.print(sb.charAt(k));
    if ((k + 1) % 8 == 0) {
     System.out.println();
    }
   }

   fullByte[i * 3] = Integer.valueOf(sb.substring(0, 8), 2)
     .byteValue(); // parse the binary string as an integer, then narrow it to a byte
   fullByte[i * 3 + 1] = Integer.valueOf(sb.substring(8, 16), 2)
     .byteValue();
   fullByte[i * 3 + 2] = Integer.valueOf(sb.substring(16, 24), 2)
     .byteValue();
  }
  // Simulate how a page decoded as UTF-8 would display these bytes
  System.out.println("new result: " + new String(fullByte, "UTF-8"));
 }

 

Running Result:

original info: 编码了
(before insert)the character with gbk encoding of index  0 : 0111111100010110
01111111
00010110
(after insert)the character with gbk encoding of index  0 : 111001111011110010010110
11100111
10111100
10010110
(before insert)the character with gbk encoding of index  1 : 0111100000000001
01111000
00000001
(after insert)the character with gbk encoding of index  1 : 111001111010000010000001
11100111
10100000
10000001
(before insert)the character with gbk encoding of index  2 : 0100111010000110
01001110
10000110
(after insert)the character with gbk encoding of index  2 : 111001001011101010000110
11100100
10111010
10000110
new result: 编码了
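
The same three-byte layout can also be produced with shifts and masks instead of editing a binary string. The following is only a minimal sketch: the class name Utf8Sketch and the helpers encodeThreeByte and encode are made up for illustration. It assumes every character lies in U+0800..U+FFFF and is not a surrogate, which holds for the three characters used here but not for text in general. The result is compared with the standard library's own UTF-8 encoder.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Utf8Sketch {
    // Hypothetical helper: encodes one UTF-16 code unit in U+0800..U+FFFF
    // (surrogates excluded) as three UTF-8 bytes: 1110xxxx 10xxxxxx 10xxxxxx.
    static byte[] encodeThreeByte(char c) {
        if (c < 0x0800 || Character.isSurrogate(c)) {
            throw new IllegalArgumentException(
                "not a three-byte BMP character: " + (int) c);
        }
        return new byte[] {
            (byte) (0xE0 | (c >> 12)),         // top 4 bits after the 1110 prefix
            (byte) (0x80 | ((c >> 6) & 0x3F)), // middle 6 bits after the 10 prefix
            (byte) (0x80 | (c & 0x3F))         // low 6 bits after the 10 prefix
        };
    }

    // Hypothetical helper: encodes a whole string under the same assumption.
    static byte[] encode(String s) {
        byte[] out = new byte[3 * s.length()];
        for (int i = 0; i < s.length(); i++) {
            System.arraycopy(encodeThreeByte(s.charAt(i)), 0, out, 3 * i, 3);
        }
        return out;
    }

    public static void main(String[] args) {
        // Unicode escapes for the three characters in the example above,
        // so the result does not depend on the source file's encoding.
        String text = "\u7F16\u7801\u4E86";
        byte[] manual = encode(text);
        byte[] library = text.getBytes(StandardCharsets.UTF_8);
        // Compare the hand-built bytes with the library encoder's bytes.
        System.out.println(Arrays.equals(manual, library));
    }
}
```

Characters below U+0800 need one or two bytes, and characters outside the BMP need four, so for real text String.getBytes(StandardCharsets.UTF_8) is the safer choice.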

 

Reference:

http://www.iteye.com/topic/1097560


 
