2017.4.17 - CSDN Blog

Article link: https://blog.csdn.net/qq_30788949/article/details/70216910

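import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;

public class Test1 {
    public static void main(String[] args) throws Exception {
        // try-with-resources closes both streams when the copy finishes
        try (InputStream is = new FileInputStream("d:/test.txt");
             OutputStream os = new FileOutputStream(new File("d:/test2.txt"))) {
            byte[] b = new byte[10];
            // int s = is.read(b);

            int s = readArray(is, b);
            while (s != -1) {
                os.write(b);

                s = readArray(is, b);
            }
        }
    }

    // Tries to fill b completely before returning.
    public static int readArray(InputStream is, byte[] b) throws Exception {
        int read = 0;
        int len = b.length;
        int i = 0;
        while (read < len) {
            i = is.read(b, read, len - read);
            // Bug: the last three bytes cannot fill b, so the next read() hits end-of-stream
            // and this method returns -1 even though three bytes were already read.
            // The loop in main() then stops without writing those last three bytes.
            if (i == -1) return -1;
            read += i;
        }
        return read;
    }
}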

Contents of test.txt:
测试，编程，1
测试，编程，2
测试，编程，3 // 43 bytes in total, including the CR/LF line breaks

Contents of test2.txt:
测试，编程，1
测试，编程，2
测试，编程
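
The missing bytes at the end of test2.txt come from treating a partially filled buffer as end-of-file. Below is a minimal sketch of one possible fix, assuming the goal is a byte-for-byte copy: readArray reports how many bytes it actually read and returns -1 only when nothing is left, and the caller writes exactly that many bytes. The class name CopyFix and the helper copy are hypothetical, and in-memory streams stand in for the d:/ files so the example runs anywhere.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class CopyFix {
    // Fills b as far as possible. Returns the number of bytes read,
    // or -1 only if the stream was already at end-of-file.
    public static int readArray(InputStream is, byte[] b) throws IOException {
        int read = 0;
        while (read < b.length) {
            int i = is.read(b, read, b.length - read);
            if (i == -1) {
                break; // end of stream: keep whatever was read so far
            }
            read += i;
        }
        return read == 0 ? -1 : read;
    }

    // Copies the whole stream using a 10-byte buffer, as in the original program.
    public static void copy(InputStream is, OutputStream os) throws IOException {
        byte[] b = new byte[10];
        int n;
        while ((n = readArray(is, b)) != -1) {
            os.write(b, 0, n); // write only the bytes that were actually read
        }
    }

    public static void main(String[] args) throws IOException {
        // 43 bytes, the same length as test.txt: four full buffers plus 3 leftover bytes
        byte[] data = new byte[43];
        for (int k = 0; k < data.length; k++) {
            data[k] = (byte) (k + 1);
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        copy(new ByteArrayInputStream(data), out);
        System.out.println(out.size()); // prints 43
    }
}
```

On Java 9 or later, InputStream.readNBytes(b, 0, b.length) offers the same fill-the-buffer behavior (it returns 0 rather than -1 at end-of-file), and InputStream.transferTo(os) copies an entire stream in one call.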

// ============= Updated 17.8.17 ====================

// input is an InputStream opened elsewhere
StringBuffer buffer = new StringBuffer();
byte[] bytes = new byte[1024];
try {
    for (int n; (n = input.read(bytes)) != -1; ) {
        // Each chunk is decoded on its own, so a character split across two chunks is corrupted
        System.out.println("n = " + n + new String(bytes, 0, n, "UTF-8"));
        buffer.append(new String(bytes, 0, n, "UTF-8"));
    }
} catch (IOException e) {
    e.printStackTrace();
}

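Reading file contents through a byte stream this way produces garbled characters. The root cause is that each call reads up to 1024 bytes. Suppose a Chinese character takes 3 bytes in UTF-8 and happens to occupy bytes 1023-1025: the first read of 1024 bytes then picks up only the first two bytes of that character. Calling new String() on the chunk right away decodes the byte array as UTF-8 while the character's 3 bytes are split across two chunks, and the result is garbled text.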
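
One common way to avoid the split is to let a Reader do the decoding, since the decoder carries incomplete byte sequences over to the next read. The following is a minimal sketch under that assumption, not the only possible fix. The class name Utf8ReadFix and the method readAll are hypothetical, and an in-memory stream stands in for the file.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

public class Utf8ReadFix {
    // Decodes the whole stream as UTF-8. InputStreamReader keeps any incomplete
    // multi-byte sequence until the remaining bytes arrive, so no character is split.
    public static String readAll(InputStream input) throws IOException {
        StringBuilder buffer = new StringBuilder();
        char[] chars = new char[1024];
        try (Reader reader = new InputStreamReader(input, StandardCharsets.UTF_8)) {
            for (int n; (n = reader.read(chars)) != -1; ) {
                buffer.append(chars, 0, n);
            }
        }
        return buffer.toString();
    }

    public static void main(String[] args) throws IOException {
        // 1023 ASCII bytes followed by one 3-byte character (U+4E2D), so that the
        // character occupies bytes 1023-1025, the case described above.
        StringBuilder sb = new StringBuilder();
        for (int k = 0; k < 1023; k++) {
            sb.append('a');
        }
        sb.append('\u4e2d');
        String text = sb.toString();
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);

        String decoded = readAll(new ByteArrayInputStream(bytes));
        System.out.println(decoded.equals(text)); // prints true
    }
}
```

An alternative that keeps the byte-oriented loop is to collect every chunk in a ByteArrayOutputStream and call new String(...) once on the complete byte array after the loop ends.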