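A Brief Look at the Efficiency of BufferedInputStream
BufferedInputStream is a regular in I/O code, and it is usually compared with FileInputStream, with the claim that BufferedInputStream is the more efficient of the two. The tests I ran today suggest that this is not always true.
## I. FileInputStream
FileInputStream is a concrete implementation of InputStream. It provides three read methods: read(), read(byte b[]), and read(byte b[], int off, int len).
### 1. The read() method
Reads one byte of data and returns it as an int in the range 0 to 255 (the raw byte value, not specifically an ASCII code), or -1 at end of file. Internally it calls a native method directly to do the read.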
/**
* Reads a byte of data from this input stream. This method blocks
* if no input is yet available.
*
* @return the next byte of data, or <code>-1</code> if the end of the
* file is reached.
* @exception IOException if an I/O error occurs.
*/
public int read() throws IOException {
return read0();
}
private native int read0() throws IOException;
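To make the return value of read() concrete, here is a minimal sketch. SingleByteReadDemo and readAll are names I made up for illustration; the sketch writes two bytes to a temporary file and reads them back one call at a time, including one extra call past the end of the file.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical demo class, not part of the JDK.
final class SingleByteReadDemo {

    /** Writes content to a temp file, then calls read() content.length + 1 times. */
    static int[] readAll(byte[] content) {
        Path tmp = null;
        try {
            tmp = Files.createTempFile("single-byte-read", ".bin");
            Files.write(tmp, content);
            int[] values = new int[content.length + 1];
            try (FileInputStream in = new FileInputStream(tmp.toFile())) {
                for (int i = 0; i < values.length; i++) {
                    // 0..255 for a data byte, -1 once the file is exhausted
                    values[i] = in.read();
                }
            }
            return values;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            if (tmp != null) {
                tmp.toFile().delete();
            }
        }
    }

    public static void main(String[] args) {
        // 0x41 is 'A'; 0xFF is not an ASCII character at all
        System.out.println(Arrays.toString(readAll(new byte[] {0x41, (byte) 0xFF})));
        // prints [65, 255, -1]
    }
}
```

The byte 0xFF comes back as 255 rather than -1, which is why -1 can safely signal end of file.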
### 2. read(byte b[], int off, int len)
Both read(byte b[]) and read(byte b[], int off, int len) are implemented by the native method readBytes(byte b[], int off, int len), which reads up to len bytes and stores them in array b starting at index off.
private native int readBytes(byte b[], int off, int len) throws IOException;
public int read(byte b[]) throws IOException {
return readBytes(b, 0, b.length);
}
public int read(byte b[], int off, int len) throws IOException {
return readBytes(b, off, len);
}
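read(byte[] b) and read(byte b[], int off, int len) each make a single native call that reads up to len bytes (b.length for the one-argument form), with array b acting as a buffer that temporarily holds the data. That is more efficient than read(), which fetches only one byte per native call.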
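The roles of off and len are easy to mix up, so here is a small sketch of them. OffsetReadDemo and readInto are hypothetical names of mine; the sketch assumes a small local temp file, for which a single call normally returns all len requested bytes.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical demo class, not part of the JDK.
final class OffsetReadDemo {

    /** Reads up to len bytes of "ABCDEFGHIJ" into a 10-byte array of dots, starting at off. */
    static String readInto(int off, int len) {
        Path tmp = null;
        try {
            tmp = Files.createTempFile("offset-read", ".txt");
            Files.write(tmp, "ABCDEFGHIJ".getBytes(StandardCharsets.US_ASCII));
            byte[] b = new byte[10];
            Arrays.fill(b, (byte) '.');
            int n;
            try (FileInputStream in = new FileInputStream(tmp.toFile())) {
                // off is where writing starts in b; len is the maximum number of bytes to read
                n = in.read(b, off, len);
            }
            return n + ":" + new String(b, StandardCharsets.US_ASCII);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            if (tmp != null) {
                tmp.toFile().delete();
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(readInto(4, 3)); // on a small local file: 3:....ABC...
    }
}
```

Three bytes are read (len), and they land at indexes 4 to 6 (starting at off); the count is not len - off.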
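## II. BufferedInputStream
BufferedInputStream is an indirect subclass of InputStream (it extends FilterInputStream). Its design follows the decorator pattern: it wraps another InputStream and adds functionality on top of it, so creating a BufferedInputStream requires passing in an InputStream.
BufferedInputStream also offers read(), read(byte b[]), and read(byte b[], int off, int len).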
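As one example of functionality that the wrapper adds, the sketch below compares mark/reset support on a raw FileInputStream and on a BufferedInputStream wrapped around it. MarkResetDemo is a hypothetical class name of mine, and the file is a throwaway temp file.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class, not part of the JDK.
final class MarkResetDemo {

    /** Returns "rawSupportsMark,bufferedSupportsMark,firstChar,charAfterReset". */
    static String demo() {
        Path tmp = null;
        try {
            tmp = Files.createTempFile("mark-reset", ".txt");
            Files.write(tmp, "HELLO".getBytes(StandardCharsets.US_ASCII));
            try (FileInputStream raw = new FileInputStream(tmp.toFile());
                 BufferedInputStream buffered = new BufferedInputStream(raw)) {
                boolean rawSupportsMark = raw.markSupported();
                boolean bufferedSupportsMark = buffered.markSupported();
                buffered.mark(16);                 // remember the current position
                char first = (char) buffered.read();
                buffered.read();                   // consume one more byte
                buffered.reset();                  // jump back to the marked position
                char again = (char) buffered.read();
                return rawSupportsMark + "," + bufferedSupportsMark + "," + first + "," + again;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            if (tmp != null) {
                tmp.toFile().delete();
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints false,true,H,H
    }
}
```

The wrapped stream is untouched; the ability to rewind comes entirely from the decorator's internal buffer.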
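### 1. read()
FileInputStream's read() calls a native method for every single byte, so let us look at BufferedInputStream's read() next. With FileInputStream, read(byte b[]) and read(byte b[], int off, int len) gain their efficiency from a caller-supplied array that acts as a buffer, so that many bytes are read per call; BufferedInputStream already carries such an array internally.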
class BufferedInputStream extends FilterInputStream {
// Default length of the buffer array
private static int DEFAULT_BUFFER_SIZE = 8192;
private static int MAX_BUFFER_SIZE = Integer.MAX_VALUE - 8;
// The buffer array
protected volatile byte buf[];
// Constructor using the default buffer size
public BufferedInputStream(InputStream in) {
this(in, DEFAULT_BUFFER_SIZE);
}
// Constructor with an explicit buffer size
public BufferedInputStream(InputStream in, int size) {
super(in);
if (size <= 0) {
throw new IllegalArgumentException("Buffer size <= 0");
}
buf = new byte[size];
}
private byte[] getBufIfOpen() throws IOException {
byte[] buffer = buf;
if (buffer == null)
throw new IOException("Stream closed");
return buffer;
}
// Refills the buffer
private void fill() throws IOException {
byte[] buffer = getBufIfOpen();
// ... (omitted for length; this part is the core of BufferedInputStream)
int n = getInIfOpen().read(buffer, pos, buffer.length - pos);
if (n > 0)
count = n + pos;
}
public synchronized int read() throws IOException {
if (pos >= count) {
fill(); // refill the buffer
if (pos >= count)
return -1;
}
return getBufIfOpen()[pos++] & 0xff;
}
}
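BufferedInputStream's read() is served from the buffer array, which holds 8192 bytes by default, so in theory it could issue up to 8192 times fewer native reads than FileInputStream's read(). In my test it was indeed faster, but nowhere near 8192 times.
fill() calls read(byte[] b, int off, int len) on the wrapped stream. The base implementation in InputStream does loop over read() (see the source; it is not reproduced here), but the stream wrapped in this test is a FileInputStream, which overrides that method with a single native readBytes call, so each fill() still fetches up to 8192 bytes at once. The gain shrinks for other reasons: every buffered read() is a synchronized call with its own bookkeeping, and the test below still writes each byte individually and calls fout.flush() after every byte, so the write side makes one call into the underlying file stream per byte in both versions.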
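To check which read method BufferedInputStream uses when it refills, one option is to put a counting decorator between it and the data source. The sketch below assumes a hypothetical CountingInputStream (my own helper, not a JDK class) and uses an in-memory ByteArrayInputStream instead of a file so that it runs anywhere.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical demo class, not part of the JDK.
final class FillCallCountDemo {

    /** Hypothetical decorator that counts how the wrapper above it reads from the source. */
    static final class CountingInputStream extends FilterInputStream {
        int singleReads; // calls to read()
        int bulkReads;   // calls to read(byte[], int, int)

        CountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            singleReads++;
            return super.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            bulkReads++;
            return super.read(b, off, len);
        }
    }

    /** Reads totalBytes one byte at a time through a default-sized BufferedInputStream. */
    static int[] countReads(int totalBytes) {
        CountingInputStream source =
                new CountingInputStream(new ByteArrayInputStream(new byte[totalBytes]));
        int consumed = 0;
        try (BufferedInputStream in = new BufferedInputStream(source)) {
            while (in.read() != -1) {
                consumed++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new int[] {consumed, source.singleReads, source.bulkReads};
    }

    public static void main(String[] args) {
        // {bytes consumed, read() calls on the source, bulk read calls on the source}
        System.out.println(Arrays.toString(countReads(20000))); // prints [20000, 0, 4]
    }
}
```

With 20000 bytes and the default 8192-byte buffer there are three refills that return data plus one that reports end of stream, all through the three-argument read. Which code runs inside that call is decided by the class of the wrapped object, so a FileInputStream answers each refill with its native readBytes.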
public static void main(String[] args) throws IOException {
long start = System.currentTimeMillis();
read1();
long end = System.currentTimeMillis();
System.out.println("Direct: " + (end - start));
start = System.currentTimeMillis();
read2();
end = System.currentTimeMillis();
System.out.println("Buffered: " + (end - start));
}
// Copies the file one byte at a time through the raw file streams.
public static void read1() throws IOException {
FileInputStream in = new FileInputStream("E:/cash.txt");
FileOutputStream out = new FileOutputStream("e:/cash1.txt");
int len = 0;
while((len = in.read()) != -1) {
out.write(len);
}
in.close();
out.close();
}
// Copies the file one byte at a time through the buffered wrappers.
public static void read2() throws IOException {
FileInputStream in = new FileInputStream("E:/cash.txt");
BufferedInputStream fin = new BufferedInputStream(in);
FileOutputStream out = new FileOutputStream("e:/cash2.txt");
BufferedOutputStream fout = new BufferedOutputStream(out);
int len = 0;
while((len = fin.read()) != -1) {
fout.write(len);
// Flushing after every byte pushes each byte to the file stream at once,
// so the output buffer gives no benefit in this test.
fout.flush();
}
// Closing the wrappers also closes the streams they wrap.
fin.close();
fout.close();
}
Result (milliseconds):
Direct: 3870
Buffered: 2401
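The buffered copy above flushes after every byte, which hides most of what the output buffer could do. For comparison, here is a sketch of the same byte-by-byte copy that leaves flushing to close(). BufferedCopyDemo, copy, and roundTrip are hypothetical names of mine, and the temp files stand in for the files on my E: drive; I make no timing claim for it.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical demo class, not part of the JDK.
final class BufferedCopyDemo {

    /** Copies source to target one byte at a time, buffered on both sides. */
    static void copy(Path source, Path target) throws IOException {
        try (InputStream in = new BufferedInputStream(new FileInputStream(source.toFile()));
             OutputStream out = new BufferedOutputStream(new FileOutputStream(target.toFile()))) {
            int b;
            while ((b = in.read()) != -1) {
                out.write(b); // stays in the output buffer until it fills up
            }
        } // try-with-resources closes both streams; close() flushes what is left
    }

    /** Copies a generated file of the given size and checks that the copy is identical. */
    static boolean roundTrip(int size) {
        Path source = null;
        Path target = null;
        try {
            source = Files.createTempFile("copy-src", ".bin");
            target = Files.createTempFile("copy-dst", ".bin");
            byte[] content = new byte[size];
            for (int i = 0; i < size; i++) {
                content[i] = (byte) i;
            }
            Files.write(source, content);
            copy(source, target);
            return Arrays.equals(content, Files.readAllBytes(target));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            if (source != null) {
                source.toFile().delete();
            }
            if (target != null) {
                target.toFile().delete();
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(20000)); // prints true
    }
}
```

Timing copy() against read1() on the same input would be a fairer comparison of the two input streams, since the output side then no longer pays for one write per byte.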
### 2. read(byte[] b)
Now change the read() calls above to read(byte[] b):
public static void readArr1() throws IOException {
FileInputStream in = new FileInputStream("E:/jdk1.7.0_25_64.tar.gz");
FileOutputStream out = new FileOutputStream("e:/jdk1.71.tar.gz");
// Array used as the read buffer
byte[] data = new byte[100];
int len = 0;
while((len = in.read(data)) != -1) {
out.write(data, 0, len);
}
in.close();
out.close();
}
public static void readArr2() throws IOException {
FileInputStream in = new FileInputStream("E:/jdk1.7.0_25_64.tar.gz");
BufferedInputStream bf = new BufferedInputStream(in);
FileOutputStream out = new FileOutputStream("e:/jdk1.72.tar.gz");
BufferedOutputStream fout = new BufferedOutputStream(out);
// Array used as the read buffer
byte[] data = new byte[100];
int len = 0;
while((len = bf.read(data)) != -1) {
fout.write(data, 0, len);
fout.flush();
}
// Closing the wrappers also closes the streams they wrap.
bf.close();
fout.close();
}
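One thing here puzzles me. With the array size at 100, the results below (in milliseconds) show BufferedInputStream clearly outperforming FileInputStream, which matches the usual claim.
Direct: 7882
Buffered: 3314
But as the array grows, BufferedInputStream's advantage gradually shrinks, and it eventually becomes slower than using FileInputStream directly.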
// Array size: 500
Direct: 1781
Buffered: 1133
*************************
// Array size: 1024
Direct: 1029
Buffered: 943
*************************
// Array size: 1024*5
Direct: 300
Buffered: 1165
*************************
// Array size: 1024*10
Direct: 197
Buffered: 761
*************************
// Array size: 1024*50
Direct: 109
Buffered: 746
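Looking at BufferedInputStream, read(byte[] b) turns out to be inherited from its parent class FilterInputStream.
That method calls read(byte b[], int off, int len). FilterInputStream's own three-argument version just forwards to in.read(b, off, len), where in is a reference of type InputStream, but BufferedInputStream overrides the three-argument method, so BufferedInputStream's read(byte[] b) actually ends up in BufferedInputStream's read(byte b[], int off, int len).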
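One way to see what the array size changes is to record how many bytes BufferedInputStream asks the wrapped stream for on each call. The sketch below assumes a hypothetical RecordingInputStream decorator (my own helper, not a JDK class) over 40000 in-memory bytes. In the JDK sources I have looked at, a request at least as large as the internal buffer is passed straight through when the buffer is empty and no mark is set; treat that as my reading of the source rather than a documented guarantee.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical demo class, not part of the JDK.
final class LargeArrayBypassDemo {

    /** Hypothetical decorator that records every len passed to read(byte[], int, int). */
    static final class RecordingInputStream extends FilterInputStream {
        final Set<Integer> requestedLengths = new TreeSet<>();

        RecordingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            requestedLengths.add(len);
            return super.read(b, off, len);
        }
    }

    /** Drains 40000 bytes via read(byte[]) with the given array size; returns the lengths seen. */
    static String requestedLengths(int arraySize) {
        RecordingInputStream source =
                new RecordingInputStream(new ByteArrayInputStream(new byte[40000]));
        byte[] data = new byte[arraySize];
        try (BufferedInputStream in = new BufferedInputStream(source)) {
            while (in.read(data) != -1) {
                // discard the data; only the request sizes matter here
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return source.requestedLengths.toString();
    }

    public static void main(String[] args) {
        System.out.println(requestedLengths(100));   // small array: refills of the 8192-byte buffer
        System.out.println(requestedLengths(16384)); // large array: requests go straight through
    }
}
```

When the caller's array is at least as large as the internal buffer, the wrapped stream receives the same requests it would get without BufferedInputStream, so on the read side the wrapper has nothing left to save and only adds its own overhead.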