java-01 Basics 03: Java IO Streams, the "Learn a Little More" Series

土司先生

Last modified: 2024-04-06 01:54:48

Category: java SE 17 源码篇 (java SE 17 source code series). Tags: java, programming language

First published: 2024-04-06 01:51:03

Article link: https://blog.csdn.net/qq_45281820/article/details/137216387

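Summary: This article looks at why Java's read() method returns an int for a single byte, introduces the automatic resource closing added in JDK 1.7, and discusses how to improve IO efficiency through sensible use of read() and write(), buffer sizing, and NIO. It also covers stream reuse and how streams that do not support reset can be given that ability through an intermediary.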


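1. read() reads a single byte, so why does it return int instead of byte?

A byte input stream can read any kind of file, such as images or audio, all of which are stored as raw binary. If read() returned a byte, a data byte of 11111111 in the middle of a file would be indistinguishable from the end-of-stream marker, because eight 1 bits are -1 as a signed byte; a loop that stops at -1 would quit early and never see the rest of the data. So read() returns an int: a data byte of 11111111 is zero-extended with 24 leading 0 bits to fill four bytes, which turns the byte value -1 into the int value 255. Every data byte therefore comes back in the range 0 to 255, and the int value -1 is left free to mean "end of stream".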
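The point can be checked with a small in-memory experiment. This is only a minimal sketch: the class name ReadReturnsIntDemo and the helper readAll are made up for illustration, and a ByteArrayInputStream stands in for a real file.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

class ReadReturnsIntDemo {
    // Reads every byte via read() and records the int values that come back.
    static List<Integer> readAll(InputStream in) throws IOException {
        List<Integer> values = new ArrayList<>();
        int b;
        while ((b = in.read()) != -1) { // -1 is only ever the end-of-stream marker
            values.add(b);              // data bytes arrive as 0..255
        }
        return values;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {0x41, (byte) 0xFF, 0x42}; // 0xFF is -1 when viewed as a byte
        System.out.println(readAll(new ByteArrayInputStream(data))); // [65, 255, 66]
        System.out.println((byte) 0xFF);                             // -1
    }
}
```

The 0xFF byte in the middle comes back as 255, so the loop does not stop early and the trailing 0x42 is still read.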

2. Automatic closing since JDK 1.7

Before JDK 1.7, streams had to be closed by hand, usually in a finally block:

// imports needed: java.io.File, java.io.FileInputStream, java.io.FileOutputStream, java.io.IOException
public static void main(String[] args) {
    String source = "src/main/resources/test/source.txt";
    String target = "src/main/resources/test/target.txt";
    FileInputStream inputStream = null;
    FileOutputStream outputStream = null;
    try {
        inputStream = new FileInputStream(new File(source));
        outputStream = new FileOutputStream(new File(target));
        byte[] buffer = new byte[8192];
        int len;
        // read(buffer) returns how many bytes were actually read, or -1 at end of stream
        while ((len = inputStream.read(buffer)) != -1) {
            // write only the bytes just read, not the whole buffer
            outputStream.write(buffer, 0, len);
        }
    } catch (IOException e) {
        e.printStackTrace();
    } finally {
        // close each stream in its own try so a failure on one does not skip the other
        try {
            if (inputStream != null) inputStream.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
        try {
            if (outputStream != null) outputStream.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

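Since JDK 1.7, streams can be declared inside the parentheses of try () (the try-with-resources statement), and closing them no longer needs any manual code. The mechanism relies on the AutoCloseable interface: any type that implements it can be closed automatically.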

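// imports needed: java.io.File, java.io.FileInputStream, java.io.FileOutputStream, java.io.IOException
public static void main(String[] args) {
    String source = "src/main/resources/test/source.txt";
    String target = "src/main/resources/test/target.txt";

    // resources declared here are closed automatically when the try block exits
    try (
            FileInputStream inputStream = new FileInputStream(new File(source));
            FileOutputStream outputStream = new FileOutputStream(new File(target));
    ) {
        byte[] buffer = new byte[8192];
        int len;
        // read a chunk of the source file into buffer; len is the number of bytes actually read
        while ((len = inputStream.read(buffer)) != -1) {
            // write only those len bytes to the target file
            outputStream.write(buffer, 0, len);
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}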

package java.lang;

/**
 * @author Josh Bloch
 * @since 1.7
 */
public interface AutoCloseable {

    void close() throws Exception;
}

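From the class hierarchy, both byte streams and character streams implement AutoCloseable (through java.io.Closeable, which extends it), so all of them can be closed automatically.

If we define our own DefineStream class and declare it inside the parentheses of try () {}, the IDE flags a compile error: the required type is AutoCloseable, while the provided type is DefineStream. In other words, implementing AutoCloseable should make the error go away. Let's try it.

So the parentheses of try () accept any type that implements AutoCloseable, and close() is called for us automatically when the try block exits. The console output confirms this, since the code never calls close() explicitly.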

package org.toast.io.auto;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

/**
 * @author toast
 * @time 2024/4/1
 * @remark
 */
public class AutoCloseableDemo {
    public static void main(String[] args) {
        String source = "src/main/resources/test/source.txt";
        String target = "src/main/resources/test/target.txt";
        try (
                FileInputStream inputStream = new FileInputStream(new File(source));
                FileOutputStream outputStream = new FileOutputStream(new File(target));
                DefineStream defineStream = new DefineStream();
        ) {
            byte[] buffer = new byte[8192];
            int len;
            // read a chunk of the source file into buffer; len is the number of bytes actually read
            while ((len = inputStream.read(buffer)) != -1) {
                // write only those len bytes to the target file
                outputStream.write(buffer, 0, len);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

class DefineStream implements AutoCloseable {
    @Override
    public void close() throws IOException {
        System.out.println("custom stream: running close logic");
    }
}

The resources declared in the parentheses of try () are closed before the finally block runs:

package org.toast.io.auto;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

/**
 * @author toast
 * @time 2024/4/1
 * @remark
 */
public class AutoCloseableDemo {
    public static void main(String[] args) {
        test();
        System.out.println("main finished");
    }

    private static void test() {
        String source = "src/main/resources/test/source.txt";
        String target = "src/main/resources/test/target.txt";
        try (
                FileInputStream inputStream = new FileInputStream(new File(source));
                FileOutputStream outputStream = new FileOutputStream(new File(target));
                DefineStream defineStream = new DefineStream();
        ) {
            byte[] buffer = new byte[8192];
            int len;
            // read a chunk of the source file into buffer; len is the number of bytes actually read
            while ((len = inputStream.read(buffer)) != -1) {
                // write only those len bytes to the target file
                outputStream.write(buffer, 0, len);
            }
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            // runs after every resource above has already been closed
            System.out.println("test finally block finished");
        }
    }
}

class DefineStream implements AutoCloseable {
    @Override
    public void close() throws IOException {
        System.out.println("custom stream: running close logic");
    }
}
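The ordering can be made visible without touching any files. The sketch below is illustrative only: CloseOrderDemo and its Res class are hypothetical names, and the log list exists purely to record the order of events. It shows that resources are closed in reverse declaration order, and that this happens before both the catch and the finally block.

```java
import java.util.ArrayList;
import java.util.List;

class CloseOrderDemo {
    // Hypothetical resource that records when it is closed.
    static final class Res implements AutoCloseable {
        private final String name;
        private final List<String> log;

        Res(String name, List<String> log) {
            this.name = name;
            this.log = log;
        }

        @Override
        public void close() { // declares no checked exception, so callers need no extra catch
            log.add("close " + name);
        }
    }

    static List<String> run() {
        List<String> log = new ArrayList<>();
        try (Res a = new Res("a", log); Res b = new Res("b", log)) {
            log.add("body");
            throw new IllegalStateException("boom");
        } catch (IllegalStateException e) {
            log.add("catch");   // both resources are already closed here
        } finally {
            log.add("finally");
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run()); // [body, close b, close a, catch, finally]
    }
}
```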

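3. How can IO read efficiency be improved?

3.0 The basic IO workflow

First, some background on how the operating system manages processes. To protect low-level resources, the operating system runs code in one of two modes: user mode and kernel mode. A process running in user mode has no permission to access disk resources directly, let alone execute privileged CPU instructions; only code running in kernel mode may operate the hardware. User mode, as the name suggests, is where user-facing programs run: QQ, WeChat, software written in Java, and so on.

So when a user-mode process performs IO (for example, reading a file from disk), it has to switch from user mode to kernel mode and delegate the work to the operating system. While the operating system handles the IO, the calling thread blocks and waits; once the IO completes, it receives the data and resumes normal execution.

Data also has to cross the boundary between user mode and kernel mode. For a read, the operating system typically brings the data into a kernel buffer first and then copies it into the application's buffer; a write goes the other way. So a single IO operation involves two data copies, plus the mode switches and the blocking wait, which is why IO is expensive.

Optimization therefore targets exactly these costs: the number of switches between user mode and kernel mode, the number of data copies, and whether the call is blocking or non-blocking, synchronous or asynchronous.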

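3.1 Use the read() and write() methods sensibly

InputStream provides the following read methods:

read(): reads one byte at a time

read(byte[] b): reads a block of bytes at a time into b

read(byte[] b, int off, int len): reads up to len bytes into b, starting at index off

readAllBytes(): reads all remaining data in one go. Watch out for running out of memory with this method. If the file is a 4 GB movie, reading it all at once cannot work: the result has to fit in a single byte array (at most about 2 GB) and in the available heap, so the call fails with an OutOfMemoryError.

read(byte[]) is much faster than reading byte by byte with read(), because each call moves a whole block instead of a single byte.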

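OutputStream provides the matching write methods:

write(int b): writes one byte at a time

write(byte[] b): writes the whole array b

write(byte[] b, int off, int len): writes len bytes of b, starting at index off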
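These methods are usually combined into a block copy loop. The sketch below (BlockCopyDemo, copy, and naiveCopy are names invented for this example) contrasts the correct pairing of read(byte[]) with write(byte[], off, len) against the common mistake of writing the whole buffer with write(byte[]), which pads the output with stale bytes whenever a read does not fill the buffer.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class BlockCopyDemo {
    // Correct: write exactly the number of bytes that the last read() delivered.
    static long copy(InputStream in, OutputStream out, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int len;
        while ((len = in.read(buffer)) != -1) {
            out.write(buffer, 0, len);
            total += len;
        }
        return total;
    }

    // Pitfall: write(buffer) always writes buffer.length bytes, even after a short read.
    static void naiveCopy(InputStream in, OutputStream out, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        while (in.read(buffer) != -1) {
            out.write(buffer);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[10_000];

        ByteArrayOutputStream good = new ByteArrayOutputStream();
        copy(new ByteArrayInputStream(data), good, 8192);
        System.out.println(good.size()); // 10000

        ByteArrayOutputStream bad = new ByteArrayOutputStream();
        naiveCopy(new ByteArrayInputStream(data), bad, 8192);
        System.out.println(bad.size());  // 16384, two full buffers
    }
}
```

On Java 9 and later, InputStream.transferTo(OutputStream) performs the same kind of block copy in a single call.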

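3.2 Choose a sensible buffer size to reduce IO operations

The java.io package provides two buffering classes, BufferedInputStream and BufferedOutputStream. Using a buffer of a sensible size reduces the number of disk accesses and speeds up IO. A buffer that is too large wastes memory, while one that is too small cannot make full use of the disk bandwidth.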
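As a rough sketch of how the two buffering classes are used, the example below wraps file streams with an explicit buffer size. The class name BufferedCopyDemo, the copy helper, and the 64 KB figure are illustrative assumptions, not recommendations from the article; the right size depends on the workload and should be measured.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;

class BufferedCopyDemo {
    // Copies source to target; bufferSize is the size of each internal buffer in bytes.
    static void copy(File source, File target, int bufferSize) throws IOException {
        try (InputStream in = new BufferedInputStream(new FileInputStream(source), bufferSize);
             OutputStream out = new BufferedOutputStream(new FileOutputStream(target), bufferSize)) {
            int b;
            // Single-byte calls are served from the in-memory buffers,
            // so the underlying file is only touched once per bufferSize bytes.
            while ((b = in.read()) != -1) {
                out.write(b);
            }
        } // closing the BufferedOutputStream flushes whatever is still buffered
    }

    public static void main(String[] args) throws IOException {
        File source = File.createTempFile("buffered-src", ".bin");
        File target = File.createTempFile("buffered-dst", ".bin");
        source.deleteOnExit();
        target.deleteOnExit();
        Files.write(source.toPath(), new byte[100_000]);

        copy(source, target, 64 * 1024);
        System.out.println(target.length()); // 100000
    }
}
```

When no size is given, both classes fall back to a default buffer of 8192 bytes.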

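3.3 Use NIO instead of traditional IO

The IO in the java.io package follows the BIO (blocking IO) model: while an InputStream/OutputStream is reading or writing, the calling thread is blocked. If that blocked state never ends, the thread can never finish.

For local file IO, the performance may not be great but is usually acceptable. For network IO, however, a client that sends or reads data slowly can drag down the whole application server.

So, can multithreading solve this?

Multithreading raises the concurrency a server application can handle, but each thread's IO is still just as slow, so the underlying problem remains. Once concurrency gets high, the server's IO resources are exhausted as well (the connections are used up quickly and everything else waits its turn). The result is slow responses, or even a crashed server that takes the whole application down with it.

Then, if high concurrency on one machine cannot solve it, what about a cluster?

That does work. A cluster can compensate for slow IO: if one machine's IO performance is not enough, several machines share the load. Money makes up for technique. But it does not push a single server's performance to its limit.

With NIO (the java.nio "New IO" API, which supports non-blocking IO), the application does not sit waiting for the operating system to finish the IO and return data. If data is available it is received; if not, the call returns immediately and the program carries on with its own work, coming back to pick up the data once the operating system has it ready. Neither side holds the other up.

Handling IO this way greatly reduces the performance lost to blocking.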
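The non-blocking behaviour described above can be observed with a java.nio.channels.Pipe, which avoids the need for a real network connection. This is only a sketch under that assumption: NonBlockingReadDemo is a made-up name, a server would normally use a Selector rather than the polling loop shown here, and in java.nio non-blocking mode is available on selectable channels (sockets, pipes), not on FileChannel.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.charset.StandardCharsets;

class NonBlockingReadDemo {
    static String run() throws IOException {
        Pipe pipe = Pipe.open();
        try (Pipe.SourceChannel source = pipe.source();
             Pipe.SinkChannel sink = pipe.sink()) {
            source.configureBlocking(false); // reads now return immediately

            ByteBuffer buf = ByteBuffer.allocate(64);
            // Nothing has been written yet: a blocking read would wait here,
            // the non-blocking read returns 0 straight away.
            int first = source.read(buf);

            byte[] msg = "hello".getBytes(StandardCharsets.UTF_8);
            sink.write(ByteBuffer.wrap(msg));

            // Poll until the message has arrived (bounded to two seconds);
            // between polls the thread is free to do other work.
            long deadline = System.nanoTime() + 2_000_000_000L;
            while (buf.position() < msg.length && System.nanoTime() < deadline) {
                source.read(buf);
            }
            buf.flip();
            return "first=" + first + ", message=" + StandardCharsets.UTF_8.decode(buf);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(run()); // first=0, message=hello
    }
}
```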

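4. Reusing streams

A stream carries data from one storage medium to another, and once the data has been transferred the stream cannot be used again. You can picture the data inside a stream as a byte[] array: when the transfer finishes, the array has been traversed from index 0 to the end, and there is nothing left to read a second time.

For this reason InputStream provides mark() and reset(), which move the read position back so that reading can start over. Not every input stream supports this; call markSupported() to find out whether a particular stream does. The readlimit argument of mark(readlimit) is the maximum number of bytes that may be read after the mark while still keeping it valid.

[Stream reuse, case 1: an IO stream that does not support reuse]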

package org.toast.io.reset;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;

/**
 * @author toast
 * @time 2024/4/4
 * @remark
 */
public class TestResetDemo {
    public static void main(String[] args) {
        String source = "src/main/resources/test/source.txt";
        String target = "src/main/resources/test/reset-target.txt";
        String target2 = "src/main/resources/test/reset-target2.txt";

        try (
                FileInputStream inputStream = new FileInputStream(new File(source));
                FileOutputStream targetStream = new FileOutputStream(new File(target));
                FileOutputStream target2OutStream = new FileOutputStream(new File(target2));
        ) {
            byte[] buffer = new byte[8192];
            int len;
            System.out.println("first-start-write");
            while ((len = inputStream.read(buffer)) != -1) { // first pass
                System.err.println("first-start-writing......");
                targetStream.write(buffer, 0, len); // write only the bytes just read
            }
            System.out.println("second-start-write");
            // second pass: the stream is already at its end, so read() returns -1 immediately
            while ((len = inputStream.read(buffer)) != -1) {
                System.err.println("second-start-writing......");
                target2OutStream.write(buffer, 0, len);
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}

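[Program output]: the second loop never prints "second-start-writing......", which means no data was written to reset-target2.txt.

The file contents confirm it: reset-target2.txt is empty. Stepping through with a debugger shows why:

When inputStream.read(buffer) is called the second time, the stream's internal read position is already at the end, so there is no more data to read.
Only if that position could be moved back to the start would the content be traversable again, and read() would then return something other than -1.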

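[Stream reuse, case 2: an IO stream that supports reuse]

With reuse in mind, InputStream was designed with the following methods:

reset(), which moves the read position back; markSupported(), which reports whether the current stream subclass supports mark and reset; and mark(readlimit), which sets the position that reset() returns to.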

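Method	Description
public boolean markSupported()	Whether this stream subclass supports mark() and reset(); the InputStream default is false
public synchronized void reset() throws IOException	Moves the read position back to where mark() was last called, provided no more than readlimit bytes have been read since; the InputStream default simply throws IOException
public synchronized void mark(int readlimit)	readlimit is the maximum number of bytes that can be read before the mark becomes invalid. If more than readlimit bytes are read before reset() is called, the mark may be treated as invalid and the stream can no longer be repositioned to it.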
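The effect of readlimit is easiest to see with a BufferedInputStream whose internal buffer is deliberately tiny. The sketch below is illustrative only: MarkLimitDemo and resetAfterReading are hypothetical names, and the 4-byte buffer is chosen just to trigger the behaviour quickly. Note that the contract only says the mark may become invalid once readlimit is exceeded; exactly when it does depends on the implementation and its buffer size.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;

class MarkLimitDemo {
    // Marks the start, reads bytesToRead single bytes, then tries reset().
    // Returns true if reset() succeeded, false if the mark had been invalidated.
    static boolean resetAfterReading(int readlimit, int bytesToRead) throws IOException {
        byte[] data = new byte[16];
        // A 4-byte internal buffer, so the buffer has to be refilled after 4 bytes.
        try (BufferedInputStream in = new BufferedInputStream(new ByteArrayInputStream(data), 4)) {
            in.mark(readlimit);
            for (int i = 0; i < bytesToRead; i++) {
                in.read();
            }
            try {
                in.reset();
                return true;
            } catch (IOException invalidMark) {
                return false;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(resetAfterReading(8, 8)); // true: stayed within readlimit
        System.out.println(resetAfterReading(2, 8)); // false: mark dropped when the buffer was refilled
    }
}
```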

In JDK 17's java.io package, markSupported() is overridden by BufferedInputStream, ByteArrayInputStream, FilterInputStream, and PushbackInputStream. BufferedInputStream and ByteArrayInputStream support mark and reset; PushbackInputStream does not; FilterInputStream depends on the stream it wraps: it supports them if the wrapped stream does, and not otherwise.
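A quick way to confirm this for the streams at hand is to ask each one directly. In the sketch below, MarkSupportedDemo and probe are names made up for this example, and the anonymous InputStream subclass stands in for any stream that keeps the base class default.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.util.LinkedHashMap;
import java.util.Map;

class MarkSupportedDemo {
    static Map<String, Boolean> probe() {
        byte[] data = {1, 2, 3};
        // Minimal stream that inherits markSupported() from InputStream unchanged.
        InputStream plain = new InputStream() {
            @Override
            public int read() {
                return -1;
            }
        };

        Map<String, Boolean> result = new LinkedHashMap<>();
        result.put("ByteArrayInputStream", new ByteArrayInputStream(data).markSupported());
        result.put("BufferedInputStream", new BufferedInputStream(plain).markSupported());
        result.put("PushbackInputStream",
                new PushbackInputStream(new ByteArrayInputStream(data)).markSupported());
        result.put("InputStream default", plain.markSupported());
        return result;
    }

    public static void main(String[] args) {
        // {ByteArrayInputStream=true, BufferedInputStream=true,
        //  PushbackInputStream=false, InputStream default=false}
        System.out.println(probe());
    }
}
```

BufferedInputStream reports true even though the stream it wraps does not support mark, because it keeps the marked bytes in its own buffer. PushbackInputStream reports false even when the wrapped stream supports it.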

Since ByteArrayInputStream supports reuse, it serves as the example here:

package org.toast.io.reset;

import java.io.*;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.TimeUnit;

/**
 * @author toast
 * @time 2024/4/4
 * @remark
 */
public class TestResetDemo2 {
    public static void main(String[] args) {
        String source =
                "Nobody cares whether you had a hard time, because what we need is results. Without results, " +
                "you can talk about feelings and difficulties and we will sympathize, but if the problem " +
                "still is not solved, sorry! Your product is not right for us";
        String target = "src/main/resources/test/reset-byte-target.txt";
        String target2 = "src/main/resources/test/reset-byte-target2.txt";
        // put the source string into a ByteArrayInputStream and write it to both target and target2
        try (
                ByteArrayInputStream inputStream =
                        new ByteArrayInputStream(source.getBytes(StandardCharsets.UTF_8));
                FileOutputStream targetStream = new FileOutputStream(new File(target));
                FileOutputStream target2OutStream = new FileOutputStream(new File(target2));
        ) {
            byte[] buffer = new byte[8192];
            int len;
            System.out.println("first-start-write");
            TimeUnit.SECONDS.sleep(1); // slow down so stdout and stderr print in a readable order
            while ((len = inputStream.read(buffer)) != -1) { // first pass
                System.err.println("first-start-writing......");
                targetStream.write(buffer, 0, len); // write only the bytes just read
            }
            System.out.println("markSupported: " + inputStream.markSupported());
            if (inputStream.markSupported()) inputStream.reset(); // back to the start
            System.out.println("second-start-write");
            TimeUnit.SECONDS.sleep(1);
            while ((len = inputStream.read(buffer)) != -1) { // second pass
                System.err.println("second-start-writing......");
                target2OutStream.write(buffer, 0, len);
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}

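Resetting the input stream once moves the read position back to the start, so the data can be read all over again.

This is how a stream can be read repeatedly.

[Stream reuse: streams that do not support it, implemented through an in-memory intermediary]

Take case 1 as the example: it uses FileInputStream, a file stream that does not support reset, so an intermediary is needed.

The idea is to read the whole file into a data[] array, so that data[] holds the file's entire content, then write reset(), mark(), and related methods around that array and extract them into a standalone utility class, giving the index into data[] the ability to be reset.

The Java designers provide a ready-made option for this: BufferedInputStream. Alternatively, you can write your own utility class that implements the reset logic around data[] (reset, reset to a given position, and so on; the outputStream then writes from this data[]). With BufferedInputStream, the readlimit passed to mark() must cover all the data that will be re-read, otherwise the mark can become invalid and reset() throws an IOException.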

package org.toast.io.reset;

import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.util.concurrent.TimeUnit;

/**
 * @author toast
 * @time 2024/4/4
 * @remark
 */
public class TestResetDemo3 {
    public static void main(String[] args) {
        String source = "src/main/resources/test/source.txt";
        String target = "src/main/resources/test/reset-target.txt";
        String target2 = "src/main/resources/test/reset-target2.txt";
        File sourceFile = new File(source);

        try (
                BufferedInputStream inputStream = new BufferedInputStream(new FileInputStream(sourceFile));
                FileOutputStream targetStream = new FileOutputStream(new File(target));
                FileOutputStream target2OutStream = new FileOutputStream(new File(target2));
        ) {
            // The mark has to stay valid for the whole file, so readlimit must exceed its length;
            // mark(0) only works while the file fits in the default 8192-byte buffer.
            // Assumes the file is smaller than 2 GB.
            inputStream.mark((int) sourceFile.length() + 1);
            byte[] buffer = new byte[1024];
            int len;
            System.out.println("first-start-write");
            TimeUnit.SECONDS.sleep(1);
            while ((len = inputStream.read(buffer)) != -1) { // first pass
                System.err.println("first-start-writing......");
                targetStream.write(buffer, 0, len); // write only the bytes just read
            }
            System.out.println("markSupported: " + inputStream.markSupported());
            if (inputStream.markSupported()) inputStream.reset(); // back to the marked position

            System.out.println("second-start-write");
            TimeUnit.SECONDS.sleep(1);
            while ((len = inputStream.read(buffer)) != -1) { // second pass
                System.err.println("second-start-writing......");
                target2OutStream.write(buffer, 0, len);
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
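The other route mentioned above, loading everything into a data[] array and replaying it from memory, does not need a hand-written utility class for simple cases. The sketch below is one possible shape, assuming Java 9 or later for readAllBytes() and transferTo(); ReplayDemo, toReplayable, and copyTwice are hypothetical names. It is only suitable when the whole content comfortably fits in memory.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

class ReplayDemo {
    // Drains a one-shot stream into memory and returns a resettable view of the data.
    static ByteArrayInputStream toReplayable(InputStream oneShot) throws IOException {
        return new ByteArrayInputStream(oneShot.readAllBytes());
    }

    // Writes the same content to two outputs, reading the original stream only once.
    static void copyTwice(InputStream oneShot, OutputStream first, OutputStream second)
            throws IOException {
        ByteArrayInputStream replay = toReplayable(oneShot);
        replay.transferTo(first);
        replay.reset();            // back to index 0 of the in-memory data
        replay.transferTo(second);
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream("abc".getBytes(StandardCharsets.UTF_8));
        ByteArrayOutputStream first = new ByteArrayOutputStream();
        ByteArrayOutputStream second = new ByteArrayOutputStream();
        copyTwice(in, first, second);
        System.out.println(first + " " + second); // abc abc
    }
}
```

In real use the one-shot stream would be something like the FileInputStream from case 1; an in-memory stream is used in main only to keep the example self-contained.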

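Friendly tip

The readAheadLimit argument of ByteArrayInputStream.mark() has no effect: calling mark() simply stores the current pos as the mark. If the ByteArrayInputStream was created with the single-argument constructor, the mark starts at 0 by default: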

    public ByteArrayInputStream(byte buf[]) {
        this.buf = buf;
        this.pos = 0;
        this.count = buf.length;
    }

If it was created with the other constructor, the mark starts at the given offset:

    public ByteArrayInputStream(byte buf[], int offset, int length) {
        this.buf = buf;
        this.pos = offset;
        this.count = Math.min(offset + length, buf.length);
        this.mark = offset;
    }
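Both points in this tip can be checked in a few lines. The sketch below uses made-up names (ByteArrayMarkDemo, run) and a five-element array chosen so that each value identifies its index.

```java
import java.io.ByteArrayInputStream;
import java.util.Arrays;

class ByteArrayMarkDemo {
    static int[] run() {
        byte[] buf = {10, 20, 30, 40, 50};
        // Offset constructor: pos and mark both start at index 2.
        ByteArrayInputStream in = new ByteArrayInputStream(buf, 2, 3);

        int a = in.read(); // 30
        in.reset();        // back to offset 2, not to index 0
        int b = in.read(); // 30 again

        in.mark(0);        // readAheadLimit is ignored; the mark is now index 3
        in.read();         // 40
        in.read();         // 50
        in.reset();        // still valid although more than 0 bytes were read
        int c = in.read(); // 40

        return new int[] {a, b, c};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run())); // [30, 30, 40]
    }
}
```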
