Java: Reviewing the Basic I/O Interfaces and Serialization

Latest recommended article published on 2022-03-27 17:50:46

Emily Yu


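First of all, input and output are defined relative to memory. A stream that reads data into memory is an input stream (this includes byte streams received from the network). A stream that moves data out of memory is an output stream, for example one that pushes bytes to the network or writes them to a file.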

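The four most basic I/O stream types are InputStream / OutputStream (byte-oriented) and Reader / Writer (character-oriented). Strictly speaking, all four are abstract classes rather than interfaces. A typical character file is a .java source file; a typical byte file is a compiled .class binary.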

Byte-Oriented I/O

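Take InputStream as the example: it is the top-level abstraction for byte input streams. Some commonly used InputStream subclasses:

    InputStream
    ├ FileInputStream         -> reads a file
    ├ FilterInputStream       -> decorates another InputStream
    │ ├ BufferedInputStream   -> an InputStream with an internal buffer
    │ └ DataInputStream       -> reads Java primitive values directly from binary data
    ├ ByteArrayInputStream    -> applies stream-style I/O to an in-memory byte[] array
    └ ObjectInputStream       -> reads binary data and restores an object's state (deserialization)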

Each XXXInputStream above has a corresponding XXXOutputStream, and the two hierarchies mirror each other.

File I/O Stream

The simplest example: use FileInputStream and FileOutputStream to read an image into memory and then write it to a different location on disk. The whole process amounts to copying a file:

    // Represent the source as a java.io.File
    File file = new File("src/com/i/io/test.png");
    // Wrap it in a FileInputStream
    FileInputStream fileInputStream = new FileInputStream(file);
    // Alternatively: FileInputStream fileInputStream = new FileInputStream("src/com/i/io/test.png");
    // Allocate a byte array as the container. Sizing it to the whole file is a flaw that is discussed later.
    byte[] bs = new byte[(int) file.length()];
    // read has a return value and a side effect: it stores the bytes it reads into bs.
    // It returns the number of bytes read, or -1 once the stream has reached end of file (EOF).
    fileInputStream.read(bs);
    // Create the output stream
    File file2 = new File("src/com/i/io/output.png");
    OutputStream outputStream = new FileOutputStream(file2);
    // Write the bytes that were read to the other file
    outputStream.write(bs);
    // close() calls omitted ...

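The read method has both a return value and a side effect: it stores what it reads in the bs array and returns the number of bytes read. Once fileInputStream reaches end of file (EOF), read returns -1, which is the usual condition for detecting that a stream has been fully consumed. If the program runs successfully, the copied image can be opened from the project directory.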
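
The return value matters for another reason: a single read call is allowed to deliver fewer bytes than the array can hold. The following is a minimal sketch of a loop that keeps reading until the array is full. The class name ReadFullyDemo and the helper readFully are hypothetical, and an in-memory ByteArrayInputStream stands in for the file so that the example is self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

class ReadFullyDemo {
    /** Reads exactly buf.length bytes, looping because one read() call may return fewer. */
    static void readFully(InputStream in, byte[] buf) throws IOException {
        int off = 0;
        while (off < buf.length) {
            int n = in.read(buf, off, buf.length - off);
            if (n == -1) {
                throw new EOFException("stream ended after " + off + " bytes");
            }
            off += n; // advance by the number of bytes actually read
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] source = {1, 2, 3, 4, 5};
        byte[] target = new byte[source.length];
        InputStream in = new ByteArrayInputStream(source); // stand-in for a FileInputStream
        readFully(in, target);
        System.out.println(Arrays.toString(target));
        System.out.println(in.read()); // the stream is exhausted, so this prints -1
    }
}
```

Since Java 9, InputStream also offers readNBytes(byte[], int, int), which performs a similar loop internally.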

"Small Chunks, Many Times": Chunked I/O

The example above runs without trouble because the file being copied is only a few KB in size. The weak point is this line:

    byte[] bs = new byte[(int) file.length()];

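This line allocates an array as large as the file on disk and assigns it to bs. For large files the result is disastrous: to read a 10 GB Blu-ray movie, the JVM would in theory have to create a 10 GB array object. In practice the (int) cast cannot even represent that length, because a Java array holds at most Integer.MAX_VALUE elements.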

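Once the requested length of bs is greater than or equal to Integer.MAX_VALUE - 1, the JVM reports the following error (the exact limit depends on the VM):

    Exception in thread "main" java.lang.OutOfMemoryError: Requested array size exceeds VM limit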

Even if the file is slightly smaller, say Integer.MAX_VALUE - 2 bytes, an array of nearly 2 GB is still very likely to exhaust the heap:

    Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
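
The arithmetic behind the cast can be checked directly. This is a small sketch, not part of the original code: the class ArraySizeDemo and the helper checkedArraySize are hypothetical names. It shows that casting a 10 GB length to int silently wraps around, and that Math.toIntExact fails loudly instead.

```java
class ArraySizeDemo {
    /** Converts a file length to an array size, throwing instead of silently wrapping. */
    static int checkedArraySize(long fileLength) {
        return Math.toIntExact(fileLength); // throws ArithmeticException on overflow
    }

    public static void main(String[] args) {
        long tenGb = 10L * 1024 * 1024 * 1024;
        System.out.println((int) tenGb);            // a plain cast wraps around: -2147483648
        System.out.println(checkedArraySize(4096)); // small lengths pass through: 4096
        try {
            checkedArraySize(tenGb);
        } catch (ArithmeticException e) {
            System.out.println("too large for a single array");
        }
    }
}
```

With the plain cast, new byte[(int) file.length()] for such a file would fail with a NegativeArraySizeException rather than an OutOfMemoryError.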

For comparison, here are two functions: cpWithoutBuffer, which reads everything once and writes everything once, and cpWithBuffer, which caps the amount read per call and writes while it reads.

    public static void cpWithoutBuffer(String targetFileName, String destFileName) throws IOException {
        File file = new File(targetFileName);
        File dest = new File(destFileName);
        FileInputStream fileInputStream = new FileInputStream(file);
        FileOutputStream fileOutputStream = new FileOutputStream(dest);
        // available() estimates how many bytes can still be read; for a regular file it
        // starts out at the file size and shrinks as reading progresses.
        byte[] bytes = new byte[fileInputStream.available()];
        fileInputStream.read(bytes);
        fileOutputStream.write(bytes);
        // close() calls omitted ...
    }

    public static void cpWithBuffer(String targetFileName, String destFileName, int bufferSize) throws IOException {
        File file = new File(targetFileName);
        File dest = new File(destFileName);
        FileInputStream fileInputStream = new FileInputStream(file);
        FileOutputStream fileOutputStream = new FileOutputStream(dest);
        // bufferSize is a bounded array length chosen by the caller
        byte[] bytes = new byte[bufferSize];
        // Read and write in alternation. Only the n bytes actually read are written,
        // because the final chunk is usually shorter than the buffer.
        int n;
        while ((n = fileInputStream.read(bytes)) != -1) {
            fileOutputStream.write(bytes, 0, n);
        }
        // close() calls omitted ...
    }
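
The chunked copy can also be written so that the streams are always closed and only the bytes actually read are written out. The following is a minimal sketch under those assumptions. The class name ChunkedCopy and the temporary files are illustrative and not part of the original code.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

class ChunkedCopy {
    /** Copies source to dest in chunks of bufferSize bytes and returns the number of bytes copied. */
    static long copy(String source, String dest, int bufferSize) throws IOException {
        long total = 0;
        // try-with-resources closes both streams, even if an exception is thrown
        try (InputStream in = new FileInputStream(source);
             OutputStream out = new FileOutputStream(dest)) {
            byte[] buffer = new byte[bufferSize];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n); // write only the n bytes that were read
                total += n;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        Path src = Files.createTempFile("copy-src", ".bin");
        Path dst = Files.createTempFile("copy-dst", ".bin");
        Files.write(src, new byte[2500]); // deliberately not a multiple of the buffer size
        System.out.println(copy(src.toString(), dst.toString(), 1024)); // 2500
        System.out.println(Files.size(dst));                            // 2500
    }
}
```

Writing the whole buffer on every iteration, as in write(buffer), would produce a 3072-byte file here, because the last chunk holds only 452 fresh bytes.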

Pass the same 7 GB file to both methods from main. cpWithBuffer runs normally, whereas cpWithoutBuffer throws an OutOfMemoryError before it reads anything, because an array of the required length cannot be created.

    private static final int Kb = 1024;

    public static void main(String[] args) throws Exception {
        String fn = "E:/Avenger_Endgame.mp4";
        String dfn = "src/com/i/io/dest.mp4";
        int bufferSize = 1 * Kb;

        long l1 = System.currentTimeMillis();
        cpWithBuffer(fn, dfn, bufferSize);
        long l2 = System.currentTimeMillis();
        System.out.println(l2 - l1 + "ms");
        //----------------------------------------------
        long l3 = System.currentTimeMillis();
        cpWithoutBuffer(fn, dfn);
        long l4 = System.currentTimeMillis();
        System.out.println(l4 - l3 + "ms");
        //----------------------------------------------
    }

Copying the 7 GB file with cpWithBuffer took close to 70 s. The function uses the File I/O Stream classes, and the amount read per call, bufferSize, was set to 1024 bytes (1 KB).

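At the very least, this solves the out-of-memory problem for large files. Another interesting question is what value bufferSize should take. Follow-up experiments show that when bufferSize is within a reasonable range, cpWithBuffer is also faster than cpWithoutBuffer. In other words, reading and writing in small batches beats doing it all at once.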

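The table below shows the running time of cpWithBuffer for different bufferSize values when copying a 150 MB file (this comparison cannot use a very large file). The two I/O variants run serially on the main thread, here and below.

| bufferSize | 1 KB | 4 KB | 8 KB | 32 KB | 4 MB | 16 MB | 32 MB |
| --- | --- | --- | --- | --- | --- | --- | --- |
| cpWithBuffer | 1402 ms | 613 ms | 379 ms | 170 ms | 230 ms | 252 ms | 341 ms |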

cpWithoutBuffer, which reads everything in one go, averages a fairly stable 350 ms.

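The right bufferSize depends on how large the file is. If each read is too small, the sheer number of I/O calls ends up costing a lot of time. If each read is too large, even approaching the file size, I/O efficiency also drops and a large amount of memory is occupied.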

Overall, though, bufferSize never needs to be very large. A value on the order of kilobytes is enough.

The Decorator Pattern in I/O

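Java I/O is a textbook application of the decorator pattern. The decorated component here is the File I/O Stream, and the decorators include Buffered I/O Stream, Data I/O Stream, Object I/O Stream and others. Each of them enhances the read/write methods of the underlying File I/O Stream or adds extra methods for particular needs.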

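Only the decorated File I/O Stream operates on the file directly, which is why it is called a node stream. The decorating streams are collectively called processing streams. In Java the pattern shows up directly: pass the decorated stream to the decorator's constructor, then call the decorator to get the enhanced original methods or the newly added ones.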

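When closing, it is enough to close the outermost processing stream; the node stream is closed along with it. If streams depend on one another and are closed individually, close the dependent (outer) stream first and the stream it depends on afterwards.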
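
The closing behavior can be observed with a small experiment. In this sketch, the class DecoratorCloseDemo and the nested TrackingStream are hypothetical helpers, and an in-memory stream plays the role of the node stream so that no file is needed.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

class DecoratorCloseDemo {
    /** Hypothetical node stream that records whether close() reached it. */
    static class TrackingStream extends ByteArrayInputStream {
        boolean closed = false;

        TrackingStream(byte[] data) {
            super(data);
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    static boolean closesInnerStream() throws IOException {
        TrackingStream node = new TrackingStream(new byte[] {0, 0, 0, 42});
        // Decorators are stacked by passing each stream to the next constructor.
        try (DataInputStream in = new DataInputStream(new BufferedInputStream(node))) {
            in.readInt(); // reads 42 through both decorators
        }
        // Only the outermost stream was closed explicitly (by try-with-resources).
        return node.closed;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(closesInnerStream()); // true
    }
}
```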

Buffered I/O Stream

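When each read or write call transfers only a small amount of data, a Buffered I/O Stream is more efficient than a bare File I/O Stream. It has a built-in buffer array buf[] whose default size is 8 KB, and the size can be changed through the constructor.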

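Take BufferedInputStream: when read(bytes) is called, it first serves the data from its internal buffer buf[]. Only when that buffer has been used up does BufferedInputStream make another system call to load the next batch for the caller's bytes array to read from.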

In short, BufferedInputStream tries to make as few system calls as possible, and the same holds for BufferedOutputStream. Here is a file copy method built on Buffered I/O Stream:

    public static void cpWithBis(String targetFileName, String destFileName, int bufferSize) throws IOException {
        File file = new File(targetFileName);
        File dest = new File(destFileName);
        BufferedInputStream bis = new BufferedInputStream(new FileInputStream(file));
        BufferedOutputStream bos = new BufferedOutputStream(new FileOutputStream(dest));
        byte[] bytes = new byte[bufferSize];
        // Write only the n bytes actually read in each round.
        int n;
        while ((n = bis.read(bytes)) != -1) {
            bos.write(bytes, 0, n);
        }
        // close() calls omitted ... (closing bos also flushes its buffer)
    }

With bufferSize again set to 1 KB, copying the same 7 GB file with this function took 50 s instead of 70 s. The effect of different bufferSize values on Buffered I/O Stream was measured in the same way.

Conclusions from the Copy Experiments

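The table below shows the average running time of the two I/O approaches for different bufferSize values when copying a 7 GB file:

| bufferSize | 2 KB | 4 KB | 8 KB | 16 KB | 4 MB |
| --- | --- | --- | --- | --- | --- |
| File I/O Stream | 40s / 50s | 40s | 30s | 28s | 27s |
| Buffered I/O Stream | 35s | 35s | 30s | 28s | 27s |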

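The table below shows the average running time of the two I/O approaches for different bufferSize values when copying a 150 MB file:

| bufferSize | 1 KB | 2 KB | 8 KB | 16 KB | 32 KB |
| --- | --- | --- | --- | --- | --- |
| File I/O Stream | 1402 ms | 971 ms | 325 ms | 210 ms | 183 ms |
| Buffered I/O Stream | 391 ms | 341 ms | 212 ms | 237 ms | 173 ms |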

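The table below shows the average running time of the two I/O approaches for different bufferSize values when copying a 10 MB file:

| bufferSize | 1 KB | 2 KB | 4 KB | 8 KB | 16 KB |
| --- | --- | --- | --- | --- | --- |
| File I/O Stream | 100 ms | 67 ms | 50 ms | 31 ms | 16 ms |
| Buffered I/O Stream | 31 ms | 27 ms | 27 ms | 28 ms | 15 ms |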

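In all of these tests the buf[] of Buffered I/O Stream was kept at 8 KB. The observations are:

1. As bufferSize grows moderately, both kinds of I/O become faster.
2. When bufferSize rises from 1 KB to 8 KB, File I/O Stream improves considerably, but Buffered I/O Stream is still better overall.
3. As bufferSize approaches the size of the buf[] buffer, the two converge.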

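As for the second and third observations, the explanation found online is this: in simple read/write scenarios, if the bufferSize of a single read or write is greater than or equal to the internal buffer buf[], Buffered I/O Stream calls the read / write method of the underlying File I/O Stream directly.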

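It follows that for small files, where bufferSize is usually set very small (about 1 to 2 KB is enough), Buffered I/O Stream is generally the better value compared with File I/O Stream. For larger files, bufferSize should generally be somewhat larger as well. Once it exceeds the internal buf[] of Buffered I/O Stream (whose size is itself configurable), the two choices differ little as far as plain file reading and writing are concerned.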
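
One way to see the effect of the internal buffer without timing anything is to count how many read calls reach the underlying stream. The sketch below does that with a hypothetical CountingStream wrapper around an in-memory source. The class names and the 64 KB data size are assumptions made for illustration, and the exact counts depend on the JDK implementation.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

class BufferedReadCount {
    /** Hypothetical wrapper that counts bulk read calls reaching the underlying stream. */
    static class CountingStream extends FilterInputStream {
        int calls = 0;

        CountingStream(InputStream in) {
            super(in);
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return super.read(b, off, len);
        }
    }

    /** Returns {caller read calls, underlying read calls} for one full pass over the data. */
    static int[] measure(int dataSize, int chunkSize) throws IOException {
        CountingStream source = new CountingStream(new ByteArrayInputStream(new byte[dataSize]));
        int callerCalls = 0;
        // 8 KB internal buffer, matching the default mentioned in the text
        try (BufferedInputStream bis = new BufferedInputStream(source, 8 * 1024)) {
            byte[] chunk = new byte[chunkSize];
            while (bis.read(chunk) != -1) {
                callerCalls++;
            }
        }
        return new int[] {callerCalls, source.calls};
    }

    public static void main(String[] args) throws IOException {
        int[] small = measure(64 * 1024, 1024);
        int[] large = measure(64 * 1024, 16 * 1024);
        System.out.println("1 KB chunks:  caller=" + small[0] + " underlying=" + small[1]);
        System.out.println("16 KB chunks: caller=" + large[0] + " underlying=" + large[1]);
    }
}
```

With 1 KB chunks, far fewer calls reach the underlying stream than the caller makes. With 16 KB chunks the buffer no longer reduces the number of underlying calls, which is consistent with the convergence seen in the tables.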

Data I/O Stream

The next example is DataInputStream / DataOutputStream. They convert Java primitive values to bytes and write them to a file, and later read those values back into memory.

    File file = new File("src/com/i/io/target.txt");
    // Create the File I/O Streams. The output stream is opened first so that the file
    // exists by the time the input stream opens it.
    FileOutputStream fileOutputStream = new FileOutputStream(file);
    FileInputStream inputStream = new FileInputStream(file);
    // Wrap them in Data I/O Streams
    DataInputStream dataInputStream = new DataInputStream(inputStream);
    DataOutputStream dataOutputStream = new DataOutputStream(fileOutputStream);

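Because a DataInputStream running on a Java virtual machine on any platform can restore the data, data stored this way can be regarded as independent of the operating system.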

The DataOutputStream / DataInputStream decorators provide writeXXX() and readXXX() methods (XXX stands for one of the eight primitive types) for writing and reading Java primitives conveniently, so the raw read/write methods are not needed here. To store a string, call writeUTF() and readUTF(); the string is written and read in Java's modified UTF-8 encoding.

    dataOutputStream.writeInt(1);
    dataOutputStream.writeBoolean(true);
    dataOutputStream.writeDouble(1.00d);
    dataOutputStream.writeUTF("你好");
    // Read in exactly the order in which the values were written.
    System.out.println(dataInputStream.readInt());
    System.out.println(dataInputStream.readBoolean());
    System.out.println(dataInputStream.readDouble());
    System.out.println(dataInputStream.readUTF());
    // close() calls omitted ...

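Note also that the data DataOutputStream writes to the file is packed tightly, with no extra decorating bytes or characters, and certainly no notion of a line break. To restore the Java values from the file with DataInputStream, read them in the same order in which they were written; only then is the data recovered correctly.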
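
The tightly packed layout can be verified by writing the same four values to memory and counting the bytes. This is a sketch under the assumption that a ByteArrayOutputStream is an acceptable stand-in for the file. The class name DataStreamLayout is hypothetical, and the string is written with Unicode escapes only to keep the source file ASCII.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class DataStreamLayout {
    static byte[] encode() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(1);               // 4 bytes
            out.writeBoolean(true);        // 1 byte
            out.writeDouble(1.00d);        // 8 bytes
            out.writeUTF("\u4f60\u597d");  // 2-byte length prefix + 3 bytes per character
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = encode();
        System.out.println(data.length); // 4 + 1 + 8 + 2 + 6 = 21
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            // Values must be read back in the order they were written.
            System.out.println(in.readInt());
            System.out.println(in.readBoolean());
            System.out.println(in.readDouble());
            System.out.println(in.readUTF());
        }
    }
}
```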

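Extending the idea from persisting Java primitives to persisting an object of some class, the tool is the Serializable interface, which allows the object to be serialized.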

Serialization and Object I/O Stream

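An object that is to be serialized must first meet a few requirements, illustrated by this example:

    package com.i.io;

    import java.io.Serializable;
    import java.util.Date;

    // Serializable is a marker interface with no methods; the Java runtime carries out
    // serialization and deserialization.
    public class Protocol implements Serializable {
        /*
         * Declaring a serialVersionUID explicitly is strongly recommended (it acts as a
         * version number for the serialized form of this class).
         * If it is not declared, a default value is computed from the structure of the
         * class, which makes the class very sensitive to change: modifying the class,
         * or sometimes just using a different compiler, can yield a different UID,
         * and deserialization then fails the UID comparison and throws an exception.
         */
        private static final long serialVersionUID = 10L;

        // Mark data that should not be persisted as transient; serialization then skips the field.
        private transient String info;
        private Date date;

        // constructor and getters/setters omitted ...
    }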

With Data I/O Stream understood, working with Object I/O Stream is not hard either.

    File file = new File("src/com/i/io/save.txt");
    ObjectOutputStream oos = new ObjectOutputStream(new FileOutputStream(file));
    Protocol tar = new Protocol("hello", new Date());
    oos.writeObject(tar);
    // Close the output stream so that everything is written before reading back.
    oos.close();

    ObjectInputStream ois = new ObjectInputStream(new FileInputStream(file));
    Protocol protocol = (Protocol) ois.readObject();
    System.out.println(protocol.getDate());
    System.out.println(protocol.getInfo());

However, printing getInfo() yields null, because the info field of the Protocol class is marked transient.
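
The effect of transient can be reproduced entirely in memory. The sketch below uses a hypothetical Message class as a stand-in for Protocol (same two fields) and a byte array instead of a file. Neither name comes from the original code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Date;

class TransientDemo {
    /** Hypothetical stand-in for the Protocol class in the text. */
    static class Message implements Serializable {
        private static final long serialVersionUID = 1L;
        transient String info; // skipped by serialization
        Date date;             // serialized normally

        Message(String info, Date date) {
            this.info = info;
            this.date = date;
        }
    }

    /** Serializes the message to a byte array and deserializes a copy from it. */
    static Message roundTrip(Message original) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(original);
        }
        try (ObjectInputStream ois =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Message) ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Message copy = roundTrip(new Message("hello", new Date()));
        System.out.println(copy.date); // restored
        System.out.println(copy.info); // null, because the field is transient
    }
}
```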

ByteArray I/O Stream

ByteArray I/O Stream is a little different, because the source it reads from (or the destination it writes to) is not a file on disk but a byte[] array in heap memory.

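Take ByteArrayInputStream: its constructor no longer accepts an InputStream but a byte[] array that is already in memory. ByteArrayOutputStream, for its part, provides a toByteArray method that turns the bytes written so far into an ordinary byte[] array.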

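The following function reads from a file through a BufferedInputStream but writes the result into a byte[] array:

    private static final int Kb = 1024;

    public static byte[] read2Byte(String path, int bufferSize) throws IOException {
        // bufferSize = bufferSize ...  (the rest of this statement is missing in the source)
        BufferedInputStream bis = new BufferedInputStream(new FileInputStream(path));
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        byte[] bf = new byte[bufferSize];
        // available() returns how many bytes the stream can currently deliver; here it can be
        // read as the file size, because the stream reads from a file.
        if (bis.available() >= 1024 * 200 * 1024) throw new IOException("File size must not exceed 200 MB");
        // Write only the n bytes actually read in each round.
        int n;
        while ((n = bis.read(bf)) != -1) {
            baos.write(bf, 0, n);
        }
        // close() calls omitted ...
        return baos.toByteArray();
    }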

Note that the output accumulates in memory, so the size of the external file is limited here to prevent an OutOfMemoryError. In the same way, the earlier cpWithBuffer and similar functions can be adapted to take a byte sequence from memory and write it out to a file:

    public static void cpWithBuff(byte[] targetBytes, String destFileName, int buffedSize) throws IOException {
        ByteArrayInputStream byteArrayInputStream = new ByteArrayInputStream(targetBytes);
        File dest = new File(destFileName);
        FileOutputStream fileOutputStream = new FileOutputStream(dest);
        byte[] bytes = new byte[buffedSize];
        // Read and write in alternation, writing only the n bytes actually read.
        int n;
        while ((n = byteArrayInputStream.read(bytes)) != -1) {
            fileOutputStream.write(bytes, 0, n);
        }
        // close() calls omitted ...
    }
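
Both ByteArray streams can also be combined so that no file is involved at all. The following is a minimal sketch, assuming ASCII input. The class ByteArrayStreamDemo and the method upperCase are hypothetical names chosen for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

class ByteArrayStreamDemo {
    /** Streams an ASCII byte[] through a small chunk buffer and returns an upper-cased copy. */
    static byte[] upperCase(byte[] asciiText) throws IOException {
        ByteArrayInputStream in = new ByteArrayInputStream(asciiText);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[4];
        int n;
        while ((n = in.read(chunk)) != -1) {
            for (int i = 0; i < n; i++) {
                chunk[i] = (byte) Character.toUpperCase((char) chunk[i]);
            }
            out.write(chunk, 0, n); // the last chunk may be only partially filled
        }
        return out.toByteArray(); // snapshot of everything written so far
    }

    public static void main(String[] args) throws IOException {
        byte[] input = "hello world".getBytes(StandardCharsets.US_ASCII);
        System.out.println(new String(upperCase(input), StandardCharsets.US_ASCII)); // HELLO WORLD
    }
}
```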

Text Operations: Reader / Writer

InputStream deals only with raw bytes, such as the binary content of image, audio and video files.

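Apart from binary media files, text-type files (configuration files, notes, logs and so on) are usually not handled with InputStream and OutputStream directly, because a raw byte stream does not convey the information in a readable way. Opening a binary file in an ordinary text editor only shows garbled characters.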

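The Reader and Writer classes that Java provides make reading and writing text convenient. Some commonly used Reader subclasses:

    Reader
    ├ InputStreamReader   -> a bridge stream: reads bytes and presents them as characters
    │ └ FileReader        -> a convenience class for reading text files
    ├ BufferedReader      -> buffered reading, with support for reading text line by line
    ├ StringReader        -> similar to ByteArrayInputStream; treats a String as the text source
    └ CharArrayReader     -> similar to StringReader, but operates on a char[]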

Likewise, each Reader above has a corresponding Writer (for InputStreamReader the counterpart is OutputStreamWriter).

I/O Stream Reader/Writer

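A character file is, in the end, a byte file in a particular encoding, so it can still be opened with FileInputStream. Here, however, the bridge stream InputStreamReader is added: given the encoding, it delivers the content as characters.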

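With raw byte streams, a byte[] receives what is read; here a char[] does. The same applies to Writer.

    File file = new File("src/com/i/io/plain.txt");
    FileInputStream fileInputStream = new FileInputStream(file);
    InputStreamReader isr = new InputStreamReader(fileInputStream, "UTF-8");
    char[] chars = new char[10];
    // This is only the most basic way of calling it ...
    // The output may differ a little from what you expect: every chunk of up to 10 chars is
    // printed on its own line, and the last chunk may still contain stale characters.
    // BufferedReader is used later to address this.
    while ((isr.read(chars)) != -1) {
        System.out.println(chars);
    }
    // close() omitted ...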
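
When the text has to be collected rather than printed chunk by chunk, the return value of read tells how many characters of the buffer are valid. The following is a minimal sketch, assuming UTF-8 input. The class ReaderChunkDemo and the method readAll are hypothetical names, and an in-memory stream replaces the file.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

class ReaderChunkDemo {
    /** Decodes the whole stream as UTF-8, appending only the chars each read() delivered. */
    static String readAll(InputStream in) throws IOException {
        StringBuilder text = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            char[] chars = new char[10];
            int n;
            while ((n = reader.read(chars)) != -1) {
                text.append(chars, 0, n); // ignore stale chars beyond index n
            }
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] utf8 = "hello\nworld, \u4f60\u597d".getBytes(StandardCharsets.UTF_8);
        System.out.println(readAll(new ByteArrayInputStream(utf8)));
    }
}
```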

Writing works similarly, and the encoding can be supplied at construction time:

    File file = new File("src/com/i/io/plain.txt");
    FileOutputStream outputStream = new FileOutputStream(file);
    OutputStreamWriter osw = new OutputStreamWriter(outputStream, "UTF-8");
    osw.write("hello\nworld");
    osw.close();

A few more details about OutputStreamWriter are worth spelling out:

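First, osw only pushes its content through to the file when flush() or close() is called (close() itself flushes before closing) or when its internal buffer fills up. If neither method is called, a short text may never reach the file. This differs from a bare FileOutputStream, which passes every write straight to the file; BufferedOutputStream, by contrast, holds data back in a similar way.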
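
The buffering can be made visible by writing into memory and checking how many bytes have arrived before and after flush(). This is a sketch only: the class FlushDemo is a hypothetical name, and the number of bytes visible before the flush depends on the writer's internal buffer, so it is printed rather than assumed.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;

class FlushDemo {
    /** Returns {bytes visible before flush(), bytes visible after flush()}. */
    static int[] sizes() throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        OutputStreamWriter osw = new OutputStreamWriter(sink, StandardCharsets.UTF_8);
        osw.write("hello");
        int before = sink.size(); // the text may still sit in the writer's buffer
        osw.flush();
        int after = sink.size();  // all 5 encoded bytes have reached the sink
        osw.close();
        return new int[] {before, after};
    }

    public static void main(String[] args) throws IOException {
        int[] s = sizes();
        System.out.println("before flush: " + s[0] + " bytes, after flush: " + s[1] + " bytes");
    }
}
```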

Second, the FileOutputStream constructor offers a boolean append parameter that decides between appending and overwriting; so far the default, false, has been used implicitly. With the switch turned on, write() appends instead of overwriting.

    File file = new File("src/com/i/io/plain.txt");
    FileOutputStream outputStream = new FileOutputStream(file, true);
    OutputStreamWriter osw = new OutputStreamWriter(outputStream, "UTF-8");
    // The file is appended to, not overwritten.
    osw.write("abcd");
    osw.close();

File Reader / Writer

File Reader / Writer open a file directly as text.

    File file = new File("src/com/i/io/plain.txt");
    FileReader fileReader = new FileReader(file);
    // Note: opening a FileWriter without the append flag truncates the file.
    FileWriter fileWriter = new FileWriter(file);

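However, unlike the I/O Stream Reader/Writer, the constructors used above do not take an encoding. They follow the default encoding of the runtime environment, which can be looked up through a system property. To use a specific charset, fall back to InputStreamReader / OutputStreamWriter with an explicit charset, or, since Java 11, use the FileReader / FileWriter constructors that accept a Charset.

    System.getProperty("file.encoding");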
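
For Java 11 and later, the charset can be passed straight to FileReader and FileWriter. The following is a minimal sketch under that assumption. The class CharsetFileDemo and the temporary file are illustrative names, not part of the original code.

```java
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class CharsetFileDemo {
    /** Writes text as UTF-8 and reads it back as UTF-8, independent of the platform default. */
    static String roundTrip(Path path, String text) throws IOException {
        // FileWriter(File, Charset) and FileReader(File, Charset) exist since Java 11.
        try (FileWriter writer = new FileWriter(path.toFile(), StandardCharsets.UTF_8)) {
            writer.write(text);
        }
        StringBuilder result = new StringBuilder();
        try (FileReader reader = new FileReader(path.toFile(), StandardCharsets.UTF_8)) {
            char[] chars = new char[16];
            int n;
            while ((n = reader.read(chars)) != -1) {
                result.append(chars, 0, n);
            }
        }
        return result.toString();
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("charset-demo", ".txt");
        System.out.println(roundTrip(path, "hello \u4f60\u597d"));
    }
}
```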

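The FileWriter constructor can also take a boolean append parameter to choose between appending and overwriting:

    FileWriter fileWriter = new FileWriter(file, true);
    // "hello" is appended to the end of the file instead of overwriting it.
    fileWriter.write("hello");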

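FileWriter also offers a method named append, inherited from Writer. Under the hood it still calls write; it merely avoids a NullPointerException when the character sequence is null by writing the text "null" instead.

    public Writer append(CharSequence csq) throws IOException {
        if (csq == null)
            write("null");
        else
            write(csq.toString());
        return this;
    }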

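In other words, whether a file is overwritten or appended to does not depend on whether append or write is called. It depends on whether the append argument of the constructor is true or false.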

Buffered Reader/Writer

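Buffered Reader/Writer can decorate both the I/O Stream Reader/Writer and the File Reader/Writer. The biggest difference between text and a raw byte stream is that text has the notion of a line break and a byte stream does not. Writing is manageable: at worst, append \n explicitly after each string. Reading is more troublesome: how long should the pre-allocated char[] buffer be? And each character would have to be checked by hand to see whether it is CR or LF.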

Earlier, InputStreamReader was used to read text, with less than ideal results. BufferedReader removes this pain point: it provides a readLine method that works on a text file line by line.

    File file = new File("src/com/i/io/plain.txt");
    FileReader fileReader = new FileReader(file);
    BufferedReader bufferedReader = new BufferedReader(fileReader);
    String s;
    while ((s = bufferedReader.readLine()) != null) {
        System.out.println(s);
    }

BufferedReader can also parse the text line by line and return it as a Stream. Here it is combined with flatMap to count the words in the text, using a space as the separator:

    Stream<String> lines = bufferedReader.lines();
    long wordCount = lines.flatMap(p -> Arrays.stream(p.split(" "))).count();
    System.out.println(wordCount);
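
Splitting on a single space counts an empty line as one word and counts extra entries for repeated spaces. The variant below is a sketch that splits on runs of whitespace and drops empty tokens. The class WordCountDemo and the method countWords are hypothetical names, and a StringReader supplies the text so that no file is needed.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Arrays;

class WordCountDemo {
    /** Counts whitespace-separated words, ignoring blank lines and repeated spaces. */
    static long countWords(Reader source) throws IOException {
        try (BufferedReader reader = new BufferedReader(source)) {
            // The stream must be consumed before the reader is closed.
            return reader.lines()
                    .flatMap(line -> Arrays.stream(line.trim().split("\\s+")))
                    .filter(word -> !word.isEmpty())
                    .count();
        }
    }

    public static void main(String[] args) throws IOException {
        String text = "hello world\n\nfoo  bar baz\n";
        System.out.println(countWords(new StringReader(text))); // 5
    }
}
```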

BufferedWriter, in turn, provides newLine() (return type void) to write a line break.

    File file = new File("src/com/i/io/plain.txt");
    FileWriter fileWriter = new FileWriter(file);
    BufferedWriter writer = new BufferedWriter(fileWriter);
    writer.write("hello");
    writer.newLine();
    writer.write("world");
    writer.close();

String (CharArray) Reader/Writer

String Reader/Writer and CharArray Reader/Writer are analogous to the earlier ByteArray I/O Stream: they treat a long string in memory (or a char[] array) as text, on which read() or write() can be called. They can also be decorated by Buffered Reader/Writer.

    StringReader stringReader = new StringReader("hello\nhalo\n:)\n");
    BufferedReader bufferedReader = new BufferedReader(stringReader);
    bufferedReader.lines().forEach(System.out::println);
