Java character sets: Unicode, GBK, UTF-8, UTF-16

魅离儿

Category: Java study notes (key points). Tags: character sets, java

Article link: https://blog.csdn.net/kakaxiaoxing/article/details/46432493

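I had not paid attention to character sets for a long time, but the encoding of natural languages is actually quite interesting. I was excited to find one article on the topic that is especially engaging:
http://blog.csdn.net/fanwenbo/article/details/2298800
Another article is also very well written:
http://blog.csdn.net/tianjf0514/article/details/7854624

After reading them, I ran a test of my own.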

import java.io.UnsupportedEncodingException;

public class Test {
    public static void main(String[] args) {
        String tmp = "中国";
        byte[] b1 = null;
        byte[] b2 = null;
        byte[] b3 = null;
        byte[] b4 = null;
        byte[] b5 = null;

        try {
            b1 = tmp.getBytes("unicode"); // yields the same bytes as "utf-16" in the output
            b2 = tmp.getBytes("utf-8");
            b3 = tmp.getBytes("utf-16");
            b4 = tmp.getBytes("gbk");
            b5 = tmp.getBytes();          // no argument: platform default charset
        } catch (UnsupportedEncodingException e) {
            // Thrown only if the JVM does not support one of the charset names above.
            e.printStackTrace();
        }
        // Print each encoding's bytes as signed decimal values, one encoding per line.
        byte[][] b = { b1, b2, b3, b4, b5 };
        for (int j = 0; j < b.length; j++) {
            for (int i = 0; i < b[j].length; i++) {
                System.out.print(b[j][i] + " ");
            }
            System.out.println();
        }
    }
}

The output is:

-2 -1 78 45 86 -3  
-28 -72 -83 -27 -101 -67 
-2 -1 78 45 86 -3 
-42 -48 -71 -6 
-42 -48 -71 -6
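
Java's byte type is signed, so any byte above 0x7F prints as a negative number, which makes the output hard to compare with encoding tables. The following is a minimal sketch that prints the same encodings as unsigned hex. The class name HexDump and the helper toHex are made up for this illustration, and the string is written with Unicode escapes so the result does not depend on the source file's encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class HexDump {
    // Hypothetical helper: formats bytes as unsigned two-digit hex, space separated.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            // b & 0xFF converts the signed byte to its unsigned value 0..255.
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String tmp = "\u4E2D\u56FD"; // "中国"
        System.out.println("UTF-16: " + toHex(tmp.getBytes(StandardCharsets.UTF_16)));
        System.out.println("UTF-8:  " + toHex(tmp.getBytes(StandardCharsets.UTF_8)));
        // GBK is not in StandardCharsets, so it is looked up by name
        // (assumes the JDK in use ships the GBK charset).
        System.out.println("GBK:    " + toHex(tmp.getBytes(Charset.forName("GBK"))));
    }
}
```

Read this way, the UTF-16 line is FE FF 4E 2D 56 FD: a two-byte prefix followed by the code points U+4E2D and U+56FD of the two characters.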

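The project's default encoding is GBK, so the last two lines are identical.
The leading -2 -1 on the first and third lines is the UTF-16 byte order mark (bytes 0xFE 0xFF printed as signed values); the second article linked above explains it in more detail.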
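
To see where the -2 -1 prefix comes from, it may help to compare UTF-16 with its explicit-endianness variants. This is only a sketch: the class name BomDemo and the helper startsWithBigEndianBom are invented for this example. It relies on the documented behavior of Java's UTF-16 charset, which encodes in big-endian order and writes a byte order mark, while UTF-16BE and UTF-16LE write none.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class BomDemo {
    // Hypothetical helper: true if the array starts with the big-endian
    // UTF-16 byte order mark 0xFE 0xFF (printed as -2 -1 when signed).
    static boolean startsWithBigEndianBom(byte[] bytes) {
        return bytes.length >= 2
                && bytes[0] == (byte) 0xFE
                && bytes[1] == (byte) 0xFF;
    }

    public static void main(String[] args) {
        String tmp = "\u4E2D\u56FD"; // "中国"

        byte[] withBom = tmp.getBytes(StandardCharsets.UTF_16);
        byte[] bigEndian = tmp.getBytes(StandardCharsets.UTF_16BE);
        byte[] littleEndian = tmp.getBytes(StandardCharsets.UTF_16LE);

        System.out.println(Arrays.toString(withBom));      // [-2, -1, 78, 45, 86, -3]
        System.out.println(Arrays.toString(bigEndian));    // [78, 45, 86, -3]
        System.out.println(Arrays.toString(littleEndian)); // [45, 78, -3, 86]
        System.out.println(startsWithBigEndianBom(withBom)); // true

        // getBytes() with no argument uses this charset; it varies by platform.
        System.out.println(Charset.defaultCharset());
    }
}
```

Passing an explicit Charset, as done here, also avoids the platform dependence of the no-argument getBytes() call, and the Charset overload does not throw UnsupportedEncodingException.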
