[Java IO Source Code] How Buffered Streams Are Implemented

woorh

Article link: https://blog.csdn.net/woorh/article/details/8525133

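The classes in the java.io package fall roughly into four families: InputStream, OutputStream, Reader, and Writer. InputStream/Reader can be read as "input from a data source", and OutputStream/Writer as "output to a data destination". InputStream and OutputStream work on bytes, while Reader and Writer work on characters. The data source may be a file on disk, a variable in memory, data arriving over the network, and so on.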

(For a classification diagram, see http://tutorials.jenkov.com/java-io/overview.html)
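The byte versus character distinction is easy to see with a non-ASCII string. The following is a minimal sketch, assuming UTF-8 as the encoding; the class name `BytesVsChars` and its methods are made up for illustration. It counts how many units each kind of stream hands out for the same text.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of java.io.
class BytesVsChars {
    // Counts the bytes an InputStream returns for the UTF-8 form of the text.
    static int countBytes(String text) {
        byte[] data = text.getBytes(StandardCharsets.UTF_8);
        int n = 0;
        try (InputStream in = new ByteArrayInputStream(data)) {
            while (in.read() != -1) {
                n++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return n;
    }

    // Counts the chars a Reader returns when decoding the same bytes.
    static int countChars(String text) {
        byte[] data = text.getBytes(StandardCharsets.UTF_8);
        int n = 0;
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(data), StandardCharsets.UTF_8)) {
            while (reader.read() != -1) {
                n++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return n;
    }

    public static void main(String[] args) {
        String text = "I/O\u6d41"; // three ASCII characters plus one CJK character
        System.out.println(countBytes(text) + " bytes, " + countChars(text) + " chars");
    }
}
```

The CJK character takes three bytes in UTF-8, so the byte stream reports more units than the character stream for the same text.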

Why are there so many classes under the byte input stream InputStream?

The data sources differ: FileInputStream reads from a file on disk, ByteArrayInputStream reads from a byte[] array variable, and PipedInputStream reads from a PipedOutputStream written by another thread.

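DataInputStream and ObjectInputStream, which can be used for network transfer, add the ability to read typed data. For example, readInt returns the int value that the corresponding output stream wrote. ObjectInputStream can additionally be used to transfer Java objects over the network.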
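As a small illustration of typed reads, the sketch below writes an int and a string with DataOutputStream and reads them back with DataInputStream. An in-memory byte array stands in for the network connection, which is an assumption made to keep the example self-contained; `DataStreamRoundTrip` is a hypothetical name.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical demo class; the byte array plays the role of the wire.
class DataStreamRoundTrip {
    static String roundTrip(int number, String text) {
        try {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(sink)) {
                out.writeInt(number);   // four bytes, big-endian
                out.writeUTF(text);     // length-prefixed modified UTF-8
            }
            try (DataInputStream in = new DataInputStream(
                    new ByteArrayInputStream(sink.toByteArray()))) {
                int n = in.readInt();   // must mirror writeInt
                String s = in.readUTF(); // must mirror writeUTF
                return n + ":" + s;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(42, "hello"));
    }
}
```

The reads have to occur in the same order and with the same types as the writes, since the stream itself carries no type information.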

The remaining classes wrap the raw stream operations and add extra features.

So what do the raw stream operations look like, and what are the extra features?

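An "I/O stream" really does behave like something flowing: after read() returns one byte, the position moves forward, and the next read() returns the next byte in the stream. Once data has gone by, it is gone, like a river that runs to the sea and never comes back.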

So what if you want to read data from the stream that has already been read once?

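Buffering! BufferedInputStream and PushbackInputStream provide a buffer: data from the stream is first read into an array, so it can be handed out again later. BufferedInputStream records a mark and lets the next read start again from the marked position, while PushbackInputStream lets you push bytes back.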
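A minimal sketch of re-reading with mark and reset follows. It assumes an in-memory source and a readlimit of 16, both chosen only for the demo; `MarkResetDemo` is a hypothetical name.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class showing mark/reset on a BufferedInputStream.
class MarkResetDemo {
    static String readWithReset() {
        byte[] data = "abcdef".getBytes(StandardCharsets.US_ASCII);
        StringBuilder seen = new StringBuilder();
        try (InputStream in = new BufferedInputStream(new ByteArrayInputStream(data))) {
            seen.append((char) in.read()); // 'a'
            in.mark(16);                   // remember the position just before 'b'
            seen.append((char) in.read()); // 'b'
            seen.append((char) in.read()); // 'c'
            in.reset();                    // jump back to the mark
            seen.append((char) in.read()); // 'b' again
            seen.append((char) in.read()); // 'c' again
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return seen.toString();
    }

    public static void main(String[] args) {
        System.out.println(readWithReset());
    }
}
```

Only two bytes are read between mark and reset here, well inside the readlimit, so the mark is guaranteed to still be valid.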

The java.io classes are designed around the decorator pattern; see http://hzxdark.iteye.com/blog/40133

BufferedInputStream bis = new BufferedInputStream(new FileInputStream("xxx.txt"));

For the BufferedInputStream source code, see http://icanfly.iteye.com/blog/1207397

How the buffering is implemented deserves a closer look:

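The buffer is a byte array, buffer. At the start, up to buffer's length in bytes is read into it. This makes it possible to keep a mark position in the variable markpos, jump back to that mark later, and read the same data again. Data is read out of buffer, with pos pointing at the current position and moving steadily forward. When pos reaches the end of buffer, the buffer has to be tidied up and refilled: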

1. If no markpos is set at this point, the buffer is discarded and refilled, and pos points at the start of buffer.

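2. If markpos is set, the data from markpos to the end of buffer is moved to the front of buffer, and markpos now points at the start of buffer. However much space is left over is then filled with new data read from the stream.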

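3. When the mark is set, a marklimit is given at the same time, limiting how much data after markpos may be buffered. Suppose markpos stays at the start of buffer while reads keep consuming buffer. Eventually pos runs past the end of buffer again with markpos still pointing at the start. At this point marklimit is checked: if marklimit is larger than the length of buffer, buffer is grown to twice its size. If that is still smaller than marklimit, never mind: the larger buffer is used as it is and grown again the next time it runs out. buffer never grows beyond marklimit. After the larger array is allocated, the existing data is copied into it.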

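4. If markpos has still not moved, buffer has already grown to marklimit, and pos runs past the end of buffer yet again, there is no space left. The mark is then simply invalidated and the buffer is cleared.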

	public synchronized int read() throws IOException {
		if (pos >= count) {
			fill();
			if (pos >= count)
				return -1;
		}
		return getBufIfOpen()[pos++] & 0xff;
	}

	private void fill() throws IOException {
		byte[] buffer = getBufIfOpen();
		if (markpos < 0)
			pos = 0; /* no mark: throw away the buffer */
		else if (pos >= buffer.length) /* no room left in buffer */
			if (markpos > 0) { /* can throw away early part of the buffer */
				int sz = pos - markpos;
				System.arraycopy(buffer, markpos, buffer, 0, sz);
				pos = sz;
				markpos = 0;
			} else if (buffer.length >= marklimit) {
				markpos = -1; /* buffer got too big, invalidate mark */
				pos = 0; /* drop buffer contents */
			} else { /* grow buffer */
				int nsz = pos * 2;
				if (nsz > marklimit)
					nsz = marklimit;
				byte nbuf[] = new byte[nsz];
				System.arraycopy(buffer, 0, nbuf, 0, pos);
				if (!bufUpdater.compareAndSet(this, buffer, nbuf)) {
					// Can't replace buf if there was an async close.
					// Note: This would need to be changed if fill()
					// is ever made accessible to multiple threads.
					// But for now, the only way CAS can fail is via close.
					// assert buf == null;
					throw new IOException("Stream closed");
				}
				buffer = nbuf;
			}
		count = pos;
		int n = getInIfOpen().read(buffer, pos, buffer.length - pos);
		if (n > 0)
			count = n + pos;
	}

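In read(), pos is the current position in the buffer array, and count is the index just past the last valid byte in the buffer, so 0 <= pos <= count and the bytes from pos to count - 1 are still unread. When pos >= count, the buffered data is used up and more must be loaded by calling fill(). If mark was called earlier to record markpos and you now want to go back and read again from there, just call reset, which sets the current position pos to the mark position.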

	public synchronized void reset() throws IOException {
		getBufIfOpen(); // Cause exception if closed
		if (markpos < 0)
			throw new IOException("Resetting to invalid mark");
		pos = markpos;
	}
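To watch the four cases described earlier actually happen, the fill logic can be restated as a small self-contained class. This is a simplified sketch and not the JDK implementation: it has no synchronization, no close handling, and no bulk read, and the names `MiniBufferedInput`, `bufferLength`, and `demo` are invented for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical, simplified re-statement of the buffering logic for study purposes.
class MiniBufferedInput {
    private final InputStream in;
    private byte[] buf;
    private int pos;           // index of the next byte to hand out
    private int count;         // index just past the last valid byte
    private int markpos = -1;  // -1 means "no mark"
    private int marklimit;

    MiniBufferedInput(InputStream in, int size) {
        this.in = in;
        this.buf = new byte[size];
    }

    void mark(int readlimit) { marklimit = readlimit; markpos = pos; }

    // Returns false instead of throwing when the mark has been invalidated.
    boolean reset() {
        if (markpos < 0) return false;
        pos = markpos;
        return true;
    }

    int bufferLength() { return buf.length; }

    int read() throws IOException {
        if (pos >= count) {
            fill();
            if (pos >= count) return -1;
        }
        return buf[pos++] & 0xff;
    }

    private void fill() throws IOException {
        if (markpos < 0) {
            pos = 0;                               // case 1: no mark, discard everything
        } else if (pos >= buf.length) {
            if (markpos > 0) {                     // case 2: slide marked data to the front
                int kept = pos - markpos;
                System.arraycopy(buf, markpos, buf, 0, kept);
                pos = kept;
                markpos = 0;
            } else if (buf.length >= marklimit) {  // case 4: limit reached, drop the mark
                markpos = -1;
                pos = 0;
            } else {                               // case 3: grow, capped at marklimit
                buf = Arrays.copyOf(buf, Math.min(pos * 2, marklimit));
            }
        }
        count = pos;
        int n = in.read(buf, pos, buf.length - pos);
        if (n > 0) count = pos + n;
    }

    // Walks through growth (case 3) and invalidation (case 4) with a 2-byte buffer.
    static String demo() {
        try {
            MiniBufferedInput m = new MiniBufferedInput(
                    new ByteArrayInputStream(new byte[] {1, 2, 3, 4, 5, 6, 7, 8}), 2);
            m.mark(4);
            m.read(); m.read(); m.read();   // the third read forces the buffer to grow
            int grown = m.bufferLength();
            boolean resetOk = m.reset();
            int first = m.read();           // the first byte again
            m.read(); m.read(); m.read();   // consume the rest of the buffered data
            m.read();                       // this read goes past marklimit
            boolean stillValid = m.reset();
            return grown + "," + resetOk + "," + first + "," + stillValid;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The tiny 2-byte buffer and marklimit of 4 are chosen only so that the growth and invalidation paths are reached after a handful of reads.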

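PushbackInputStream also uses a buffer to process the stream and offers unread, a method for pushing data back. It serves a similar purpose to the above, and its implementation is comparatively simple: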

public void unread(byte[] b, int off, int len) throws IOException {
    ensureOpen();
	if (len > pos) {
	    throw new IOException("Push back buffer is full");
	}
	pos -= len;
	System.arraycopy(b, off, buf, pos, len);
}
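A short usage sketch for unread follows, assuming an in-memory source and small pushback buffer sizes picked for the demo; `PushbackDemo` is a hypothetical name. The first method peeks at a byte and pushes it back; the second shows the "Push back buffer is full" check from the source above being triggered.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class for PushbackInputStream.unread.
class PushbackDemo {
    // Peeks at the first byte, pushes it back, then reads the whole stream.
    static String peekThenRead() {
        byte[] data = "xyz".getBytes(StandardCharsets.US_ASCII);
        try (PushbackInputStream in =
                     new PushbackInputStream(new ByteArrayInputStream(data), 4)) {
            int first = in.read();
            in.unread(first); // the byte will be returned again by the next read
            StringBuilder sb = new StringBuilder();
            sb.append((char) first).append('|');
            int b;
            while ((b = in.read()) != -1) {
                sb.append((char) b);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Tries to push back two bytes into a one-byte pushback buffer.
    static boolean overflowRejected() {
        try (PushbackInputStream in =
                     new PushbackInputStream(new ByteArrayInputStream(new byte[0]), 1)) {
            in.unread(new byte[] {1, 2}, 0, 2);
            return false;
        } catch (IOException e) {
            return true; // len > pos, so unread refuses
        }
    }

    public static void main(String[] args) {
        System.out.println(peekThenRead());
        System.out.println(overflowRejected());
    }
}
```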

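Another benefit of buffering is that I/O operations no longer have to happen so often; most of the time, these operations are really just operations on the in-memory buffer.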
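One way to see the reduction is to count how often the underlying stream is asked for data. The sketch below is an illustration under assumptions: `CountingInputStream` and `ReadCallCount` are hypothetical helpers, and an in-memory array stands in for a real device, so it measures call counts only, not actual I/O cost.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical wrapper that counts read calls reaching the underlying stream.
class CountingInputStream extends FilterInputStream {
    int calls;

    CountingInputStream(InputStream in) { super(in); }

    @Override
    public int read() throws IOException {
        calls++;
        return super.read();
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        calls++;
        return super.read(b, off, len);
    }
}

class ReadCallCount {
    // Reads byte by byte straight from the counted stream.
    static int callsWithoutBuffer(int size) {
        CountingInputStream source =
                new CountingInputStream(new ByteArrayInputStream(new byte[size]));
        try {
            while (source.read() != -1) { /* consume */ }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return source.calls;
    }

    // Reads byte by byte through a BufferedInputStream placed in front.
    static int callsWithBuffer(int size, int bufferSize) {
        CountingInputStream source =
                new CountingInputStream(new ByteArrayInputStream(new byte[size]));
        try (InputStream in = new BufferedInputStream(source, bufferSize)) {
            while (in.read() != -1) { /* consume */ }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return source.calls;
    }

    public static void main(String[] args) {
        System.out.println("unbuffered: " + callsWithoutBuffer(1000));
        System.out.println("buffered:   " + callsWithBuffer(1000, 100));
    }
}
```

With the buffer in place, the underlying stream is called roughly once per refill rather than once per byte.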

The Reader/Writer classes parallel the InputStream/OutputStream classes, except that they handle characters.

One addition is StringReader, whose data source is a String object.

LineNumberReader extends BufferedReader and keeps track of line numbers. BufferedReader, besides offering mark and reset like BufferedInputStream, has a readLine() method that reads data one whole line at a time and returns a String.

	String readLine(boolean ignoreLF) throws IOException {
		StringBuffer s = null;
		int startChar;
		boolean omitLF = ignoreLF || skipLF;

		synchronized (lock) {
			ensureOpen();

			bufferLoop: for (;;) {

				if (nextChar >= nChars)
					fill();
				if (nextChar >= nChars) { /* EOF */
					if (s != null && s.length() > 0)
						return s.toString();
					else
						return null;
				}
				boolean eol = false;
				char c = 0;
				int i;

				/* Skip a leftover '\n', if necessary */
				if (omitLF && (cb[nextChar] == '\n'))
					nextChar++;
				skipLF = false;
				omitLF = false;

				charLoop: for (i = nextChar; i < nChars; i++) {
					c = cb[i];
					if ((c == '\n') || (c == '\r')) {
						eol = true;
						break charLoop;
					}
				}

				startChar = nextChar;
				nextChar = i;

				if (eol) {
					String str;
					if (s == null) {
						str = new String(cb, startChar, i - startChar);
					} else {
						s.append(cb, startChar, i - startChar);
						str = s.toString();
					}
					nextChar++;
					if (c == '\r') {
						skipLF = true;
					}
					return str;
				}

				if (s == null)
					s = new StringBuffer(defaultExpectedLineLength);
				s.append(cb, startChar, i - startChar);
			}
		}
	}

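The fill() method called here is implemented much like the one in BufferedInputStream. The method also uses a StringBuffer as an intermediate accumulator for lines that span more than one buffer fill. Appending to a StringBuffer is faster than repeatedly concatenating String objects, because StringBuffer wraps a char array that acts as a buffer, and append copies into that array with System.arraycopy.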
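The line handling in readLine, including the skipLF logic for "\r\n", can be observed from the outside with a StringReader as the source. This is a minimal sketch; `LineReadDemo` and `numberedLines` are hypothetical names, and the input deliberately mixes all three line terminators.

```java
import java.io.IOException;
import java.io.LineNumberReader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical demo class for readLine and line counting.
class LineReadDemo {
    // Returns each line prefixed with the line number reported after reading it.
    static String numberedLines(String text) {
        StringBuilder sb = new StringBuilder();
        try (LineNumberReader reader = new LineNumberReader(new StringReader(text))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (sb.length() > 0) {
                    sb.append('|');
                }
                sb.append(reader.getLineNumber()).append('=').append(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "\r\n", "\n", and a lone "\r" each end exactly one line.
        System.out.println(numberedLines("a\r\nb\nc\rd"));
    }
}
```

The "\r\n" pair counts as a single terminator because the '\n' left over after a '\r' is skipped on the next call, which is what the skipLF flag in the source is for.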
