BufferedReader: Source Code Analysis and Usage

ac_dao_di

Article link: https://blog.csdn.net/ac_dao_di/article/details/78011901

1. Design approach

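    BufferedReader adds buffering to any Reader, which avoids repeated reads against the underlying I/O source (such as a FileReader) and improves efficiency. The default buffer size is 8192 chars, so each refill requests up to 8192 chars from the underlying reader, which suits most cases. BufferedReader uses the decorator pattern: it extends Reader, and its constructor takes the Reader to wrap. It also provides line-by-line reading through readLine, a method that Reader itself does not have; readLine strips the line terminator (\r, \n or \r\n).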
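The effect of the buffer can be made visible with a small sketch. It is not part of the original analysis: CountingReader is a hypothetical helper that counts block reads against the wrapped reader, and the input size of 20000 chars is arbitrary. The caller reads one char at a time, yet the wrapped reader should only be asked for data a handful of times.

```java
import java.io.BufferedReader;
import java.io.CharArrayReader;
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Arrays;

class BufferingDemo {
    // Hypothetical helper (not from the JDK): counts block reads on the wrapped reader.
    static final class CountingReader extends FilterReader {
        int blockReads = 0;

        CountingReader(Reader in) {
            super(in);
        }

        @Override
        public int read(char[] cbuf, int off, int len) throws IOException {
            blockReads++;
            return super.read(cbuf, off, len);
        }
    }

    // Reads `length` chars one at a time through a BufferedReader.
    // Returns {chars read by the caller, block reads seen by the wrapped reader}.
    static int[] readOneByOne(int length) {
        char[] data = new char[length];
        Arrays.fill(data, 'x');
        CountingReader counting = new CountingReader(new CharArrayReader(data));
        int chars = 0;
        try (BufferedReader reader = new BufferedReader(counting)) {
            while (reader.read() != -1) {
                chars++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new int[] {chars, counting.blockReads};
    }

    public static void main(String[] args) {
        int[] result = readOneByOne(20000);
        System.out.println("chars read: " + result[0] + ", underlying block reads: " + result[1]);
    }
}
```

Because the decorator only needs a Reader, the same wrapping works unchanged for a FileReader or any other subclass.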
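    The design works as follows: if the buffer holds no unread data, the underlying reader is called to fill it; otherwise data is served directly from the buffer. BufferedReader also supports mark and reset, which remember a read position so that reading can later rewind to it and start again. To make this work, a refill must not overwrite the marked part of the buffer; that data is preserved, and how much is preserved is set by the readAheadLimit argument of mark. reset only works after mark has been called, and only if no more than readAheadLimit characters were read after the mark; otherwise the mark can be invalidated, and reset then throws an exception.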
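A minimal sketch of mark and reset against the standard java.io.BufferedReader follows. The class and method names are made up for illustration; the only assumption is that `count` does not exceed the length of `text`.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

class MarkResetDemo {
    // Marks the start, reads `count` chars, rewinds with reset(), and reads them again.
    // Returns both reads joined by '|'.
    static String readTwice(String text, int count) {
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            reader.mark(count);                       // keep up to `count` chars replayable
            char[] first = new char[count];
            int n1 = reader.read(first, 0, count);
            reader.reset();                           // back to the marked position
            char[] second = new char[count];
            int n2 = reader.read(second, 0, count);
            return new String(first, 0, n1) + "|" + new String(second, 0, n2);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readTwice("0123456789", 4)); // prints 0123|0123
    }
}
```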
    Reading the source turns up several points worth learning from:

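Invalid arguments trigger an immediate IllegalArgumentException, an unchecked exception that needs no throws declaration.
The buffer size can be set through the constructor, and there is also a default of 8192 chars that suits most cases.
The mark states are defined as named constants: UNMARKED (no mark set) and INVALIDATED (mark no longer valid).
The decorator adds buffering to reads without modifying the wrapped class.
Arrays are copied with the low-level System.arraycopy, which is efficient.
Checked exceptions that the class cannot handle (IOException) are declared on the method rather than caught, leaving them to the caller.
Fields that depend on constructor arguments (in, cb, nextChar, nChars) are initialized in the constructor rather than at their declaration.
After close is called, no further reads are possible; any read throws an exception.
Every public read operation runs in a synchronized block so that concurrent reads cannot corrupt the state. The lock is the lock field inherited from Reader; because the constructor calls super(in), it is the wrapped reader rather than this.
When reading line by line, an empty line is returned as a string with line.length() == 0.
To keep the line terminators, read with read(buf) instead.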
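The last few points can be checked with a short sketch against java.io.BufferedReader. The class and method names are hypothetical, and the 16-char scratch buffer is an arbitrary choice.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class LineHandlingDemo {
    // Returns the length of every line reported by readLine(); empty lines yield 0.
    static List<Integer> lineLengths(String text) {
        List<Integer> lengths = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lengths.add(line.length());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lengths;
    }

    // Reads with read(char[]) so that line terminators are kept in the result.
    static String readRaw(String text) {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[16];
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    // Returns true if a read after close() fails with an IOException.
    static boolean readAfterCloseFails() {
        BufferedReader reader = new BufferedReader(new StringReader("abc"));
        try {
            reader.close();
            reader.read();
            return false;
        } catch (IOException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(lineLengths("a\n\nbc\r\n"));
        System.out.println(readRaw("a\r\nb").length());
        System.out.println(readAfterCloseFails());
    }
}
```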

2. Annotated source and analysis

package com.jessin.decorator;

import java.io.IOException;
import java.io.Reader;

/**
 * A copy of the JDK BufferedReader source, renamed so that it can be annotated.
 * Created by zexin.guo on 17-9-17.
 */
public class MyBufferedReader extends Reader {
    private Reader in;
    // cb is the character buffer
    private char cb[];
    // nChars is the number of valid chars in the buffer; nextChar is the index of the next char to read
    // (nextCharPos would arguably be a clearer name)
    private int nChars, nextChar;

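    // Mark-related constants: INVALIDATED means the mark is no longer valid
    private static final int INVALIDATED = -2;
    // UNMARKED means no mark has been set; both constants are < 0, and markedChar >= 0 means a valid mark
    private static final int UNMARKED = -1;
    // No mark by default
    private int markedChar = UNMARKED;
    // Number of chars, counted from markedChar inclusive, that may be read while the mark stays valid
    private int readAheadLimit = 0;
    // Whether to skip a '\n' on the next read; set when the previous line ended with '\r'
    private boolean skipLF = false;

    // The value of skipLF saved by mark() so that reset() can restore it
    private boolean markedSkipLF = false;

    // Default buffer size: 8192 chars
    private static int defaultCharBufferSize = 8192;
    // Default initial capacity of the StringBuffer used by readLine
    private static int defaultExpectedLineLength = 80;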

    // Constructor with an explicit buffer size
    public MyBufferedReader(Reader in, int sz) {
        super(in);
        if (sz <= 0)
            throw new IllegalArgumentException("Buffer size <= 0");
        this.in = in;
        cb = new char[sz];
        nextChar = nChars = 0;
    }

    // Constructor using the default buffer size
    public MyBufferedReader(Reader in) {
        this(in, defaultCharBufferSize);
    }

    // A closed reader cannot be reused: every read checks this first and throws if close() has been called
    private void ensureOpen() throws IOException {
        if (in == null)
            throw new IOException("Stream closed");
    }

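    // Fills the buffer from the underlying reader. While markedChar is valid, [markedChar, nextChar) must be kept, not overwritten
    private void fill() throws IOException {
        int dst;
        // Without a mark, new data is read to index 0 and nextChar is set back to 0 below, which is why
        // nextChar can keep incrementing without going out of bounds.
        // Once the buffered data is consumed, the buffer is reused from the start, tracked by nextChar and nChars.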
        if (markedChar <= UNMARKED) {
            /* No mark */
            dst = 0;
        } else {
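            // A mark is set, so markedChar >= 0
            int delta = nextChar - markedChar;
            // delta, the number of chars read since the mark, must stay below readAheadLimit;
            // otherwise the mark is invalidated, its data may be overwritten, and a later reset() throws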
            if (delta >= readAheadLimit) {
                markedChar = INVALIDATED;
                readAheadLimit = 0;
                // Mark invalidated: read to index 0, overwriting the old data
                dst = 0;
            } else {
                // Two cases; both copy [markedChar, nextChar) to [0, delta), so new data goes to [delta, ...)
                if (readAheadLimit <= cb.length) {
                    // [cb + markedChar, cb + markedChar + delta) -> [cb, cb + delta); the underlying read then starts at index delta
                    System.arraycopy(cb, markedChar, cb, 0, delta);
                    // The mark now sits at index 0
                    markedChar = 0;
                    dst = delta;
                } else {
                    // A readAheadLimit larger than the buffer forces a reallocation, so choose readAheadLimit with care
                    char ncb[] = new char[readAheadLimit];
                    // Low-level array copy
                    System.arraycopy(cb, markedChar, ncb, 0, delta);
                    cb = ncb;
                    markedChar = 0;
                    dst = delta;
                }
                // Both the next read position and the buffered char count are now delta
                nextChar = nChars = delta;
            }
        }

        int n;
        do {
            // With dst == 0 this requests up to cb.length chars (8192 by default)
            n = in.read(cb, dst, cb.length - dst);
        } while (n == 0);
        if (n > 0) {
            // nChars is the preserved prefix (dst) plus the n newly read chars
            nChars = dst + n;
            nextChar = dst;
        }
    }

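    // skipLF is false by default and is only set by readLine, so as long as read()/read(buf) and readLine()
    // are not mixed, no '\n' is skipped here.
    // Reads a single char and returns it as an int, or -1 at the end of the stream
    public int read() throws IOException {
        // lock is inherited from Reader; super(in) makes it the wrapped reader, not this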
        synchronized (lock) {
            ensureOpen();
            for (;;) {
                // Buffer exhausted: refill from the underlying reader; if it is still empty afterwards, the end of the stream has been reached
                if (nextChar >= nChars) {
                    fill();
                    if (nextChar >= nChars)
                        return -1;
                }
                // Skip one '\n' left over after a '\r'; any '\n' after that is returned normally
                if (skipLF) {
                    skipLF = false;
                    if (cb[nextChar] == '\n') {
                        nextChar++;
                        continue;
                    }
                }
                return cb[nextChar++];
            }
        }
    }

    // Reads buffered data into the caller's array
    private int read1(char[] cbuf, int off, int len) throws IOException {
        // The buffer is exhausted, so refill it
        if (nextChar >= nChars) {
            /* If the requested length is at least as large as the buffer, and
               if there is no mark/reset activity, and if line feeds are not
               being skipped, do not bother to copy the characters into the
               local buffer.  In this way buffered streams will cascade
               harmlessly. */
            if (len >= cb.length && markedChar <= UNMARKED && !skipLF) {
                return in.read(cbuf, off, len);
            }
            fill();
        }
        if (nextChar >= nChars) return -1;
        if (skipLF) {
            skipLF = false;
            // Skip the leading '\n'
            if (cb[nextChar] == '\n') {
                nextChar++;
                if (nextChar >= nChars)
                    fill();
                if (nextChar >= nChars)
                    return -1;
            }
        }
        // Copy from the buffer into the caller's array, at most as many chars as are buffered
        int n = Math.min(len, nChars - nextChar);
        // [cb + nextChar, ...) -> [cbuf + off, ...), n chars in total
        System.arraycopy(cb, nextChar, cbuf, off, n);
        nextChar += n;
        return n;
    }

    // Reads into the caller's array; invalid bounds throw IndexOutOfBoundsException here because read1 does no bounds checking
    public int read(char cbuf[], int off, int len) throws IOException {
        synchronized (lock) {
            ensureOpen();
            if ((off < 0) || (off > cbuf.length) || (len < 0) ||
                    ((off + len) > cbuf.length) || ((off + len) < 0)) {
                throw new IndexOutOfBoundsException();
            } else if (len == 0) {
                return 0;
            }

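            // Read in several passes until len chars are read, the stream ends, or the underlying reader is not ready; n is the count actually read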
            int n = read1(cbuf, off, len);
            if (n <= 0) return n;
            while ((n < len) && in.ready()) {
                int n1 = read1(cbuf, off + n, len - n);
                if (n1 <= 0) break;
                n += n1;
            }
            return n;
        }
    }


    // Reads one line; the terminators are \r, \n and \r\n
    String readLine(boolean ignoreLF) throws IOException {
        StringBuffer s = null;
        int startChar;

        synchronized (lock) {

            // Make sure the reader is still open
            ensureOpen();
            boolean omitLF = ignoreLF || skipLF;

            bufferLoop:
            for (;;) {

                if (nextChar >= nChars)
                    fill();
                // End of stream: return what has been read so far, or null if nothing was read
                if (nextChar >= nChars) { /* EOF */
                    if (s != null && s.length() > 0)
                        return s.toString();
                    else
                        return null;
                }
                boolean eol = false;
                char c = 0;
                int i;

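                // Skip the '\n' left over from a previous line that ended with '\r'. Together with the scan below,
                // this makes \r, \n and \r\n all line terminators: a '\n' directly after '\r' is skipped
                /* Skip a leftover '\n', if necessary */
                if (omitLF && (cb[nextChar] == '\n'))
                    nextChar++;
                // Only the first char is checked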
                skipLF = false;
                omitLF = false;

                charLoop:
                // Scan the buffer until '\n' or '\r' is found
                for (i = nextChar; i < nChars; i++) {
                    c = cb[i];
                    if ((c == '\n') || (c == '\r')) {
                        eol = true;
                        break charLoop;
                    }
                }

                startChar = nextChar;
                // nextChar now points at the '\n' or '\r', or at the end of the buffer
                nextChar = i;

                if (eol) {
                    String str;
                    if (s == null) {
                        str = new String(cb, startChar, i - startChar);
                    } else {
                        // The line spanned several buffer fills before '\n' or '\r' was found
                        s.append(cb, startChar, i - startChar);
                        str = s.toString();
                    }
                    // Step over the '\n' or '\r'
                    nextChar++;
                    // If the terminator was '\r', set skipLF so that an immediately following '\n' is skipped
                    if (c == '\r') {
                        skipLF = true;
                    }
                    return str;
                }
                // No terminator found in this buffer: save the chars in the StringBuffer
                if (s == null)
                    s = new StringBuffer(defaultExpectedLineLength); // sets the initial capacity only; the content is empty
                s.append(cb, startChar, i - startChar);
            }
        }
    }

    // The added feature: readLine() is not part of Reader
    public String readLine() throws IOException {
        return readLine(false);
    }

    // Skips up to n chars and returns the number of chars actually skipped
    public long skip(long n) throws IOException {
        if (n < 0L) {
            throw new IllegalArgumentException("skip value is negative");
        }
        synchronized (lock) {
            ensureOpen();
            long r = n;
            // r is the number of chars still to be skipped
            while (r > 0) {
                if (nextChar >= nChars)
                    fill();
                if (nextChar >= nChars) /* EOF */
                    break;
                if (skipLF) {
                    skipLF = false;
                    if (cb[nextChar] == '\n') {
                        nextChar++;
                    }
                }
                long d = nChars - nextChar;
                if (r <= d) {
                    nextChar += r;
                    r = 0;
                    break;
                }
                else {
                    r -= d;
                    nextChar = nChars;
                }
            }
            return n - r;
        }
    }

    // Reports whether data can be read without blocking
    public boolean ready() throws IOException {
        synchronized (lock) {
            ensureOpen();

            /*
             * If newline needs to be skipped and the next char to be read
             * is a newline character, then just skip it right away.
             */
            if (skipLF) {
                /* Note that in.ready() will return true if and only if the next
                 * read on the stream will not block.
                 */
                if (nextChar >= nChars && in.ready()) {
                    fill();
                }
                if (nextChar < nChars) {
                    if (cb[nextChar] == '\n')
                        nextChar++;
                    skipLF = false;
                }
            }
            return (nextChar < nChars) || in.ready();
        }
    }

    // Reports whether mark is supported; it is
    public boolean markSupported() {
        return true;
    }

    // Saves the current read position and sets how many chars may be read while the mark is kept
    public void mark(int readAheadLimit) throws IOException {
        if (readAheadLimit < 0) {
            throw new IllegalArgumentException("Read-ahead limit < 0");
        }
        synchronized (lock) {
            ensureOpen();
            this.readAheadLimit = readAheadLimit;
            // Remember the current read position
            markedChar = nextChar;
            // Save the skipLF flag
            markedSkipLF = skipLF;
        }
    }

    // Returns to the most recent mark; throws if no mark was set or if too many chars were read since the mark
    public void reset() throws IOException {
        synchronized (lock) {
            ensureOpen();
            // reset() requires a prior mark(); a mark invalidated by reading past the limit also throws here
            if (markedChar < 0)
                throw new IOException((markedChar == INVALIDATED)
                        ? "Mark invalid"
                        : "Stream not marked");
            // Restore nextChar and the skipLF flag
            nextChar = markedChar;
            skipLF = markedSkipLF;
        }
    }

    // After close() no further reads are possible; they throw an exception
    public void close() throws IOException {
        synchronized (lock) {
            if (in == null)
                return;
            in.close();
            in = null;
            cb = null;
        }
    }
}
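The reallocation branch in fill() means that a mark can outlive the original buffer size. The sketch below illustrates this with java.io.BufferedReader, on the assumption that it behaves like the copy above; the class name, the 4-char buffer, and the limit of 10 are all chosen for illustration only.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class MarkBeyondBufferDemo {
    // Marks the start with the given limit, reads `count` chars, resets, and reads `count` chars again.
    static String readTwice(String text, int bufferSize, int readAheadLimit, int count) {
        try (BufferedReader reader = new BufferedReader(new StringReader(text), bufferSize)) {
            reader.mark(readAheadLimit);
            String first = readChars(reader, count);
            reader.reset();
            String second = readChars(reader, count);
            return first + "|" + second;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads up to `count` chars one at a time.
    private static String readChars(Reader reader, int count) throws IOException {
        StringBuilder sb = new StringBuilder();
        int c;
        while (sb.length() < count && (c = reader.read()) != -1) {
            sb.append((char) c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The buffer holds 4 chars, yet 8 chars are replayed after reset().
        System.out.println(readTwice("0123456789ABCDEF", 4, 10, 8)); // prints 01234567|01234567
    }
}
```

The cost is memory: the buffer has to grow to hold the marked region, which is why a very large readAheadLimit deserves some thought.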

3. Test code

(1) When reading line by line, the terminators are \r, \n and \r\n, and they are stripped:

package com.jessin.decorator;

import java.io.CharArrayReader;
import java.io.IOException;

/**
 * Created by zexin.guo on 17-9-17.
 */
public class BufferTest1 {
    public static void main(String[] args) throws IOException {
        String str = "hello\r\n\n\n";
        MyBufferedReader myBufferedReader = new MyBufferedReader(new CharArrayReader(str.toCharArray()));
        String line;
        while((line = myBufferedReader.readLine()) != null){
            System.out.println("line : " + line + " len : " + line.length());
            /**
             * line : hello len : 5
             * line :  len : 0
             * line :  len : 0
             */
        }
    }
}
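As a complement, here is a hedged sketch that mixes all three terminators in one input, using java.io.BufferedReader rather than the copy above. The class name and the sample string are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class MixedTerminatorsDemo {
    // Collects every line returned by readLine() until it returns null.
    static List<String> readLines(String text) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        // '\r', '\n' and "\r\n" each end exactly one line.
        System.out.println(readLines("a\rb\nc\r\nd")); // prints [a, b, c, d]
    }
}
```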

(2) After mark, reset does not throw as long as at most readAheadLimit - 1 chars have been read.

package com.jessin.decorator;

import java.io.CharArrayReader;
import java.io.IOException;

/**
 * Created by zexin.guo on 17-9-17.
 */
public class BufferTest3 {
    public static void main(String[] args) throws IOException {
        String str = "0123456789";
        MyBufferedReader myBufferedReader = new MyBufferedReader(new CharArrayReader(str.toCharArray()));
        int aChar;
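        // If readAheadLimit <= str.length(), reset() throws
        // Reading at most readAheadLimit - 1 chars is the safest choice
        int readAheadLimit = str.length() + 1;
        myBufferedReader.mark(readAheadLimit);
        // After the last char, read() calls fill() once more; fill() invalidates the mark if
        // nextChar - markedChar >= readAheadLimit, and reset() would then throw
        while ((aChar = myBufferedReader.read()) != -1) {
            System.out.println((char) aChar);
        }
        // With readAheadLimit = str.length() + 1 the mark is still valid, so reset() succeeds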
        myBufferedReader.reset();
        System.out.println((char)myBufferedReader.read()); // 0
    }
}
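The failing case can be shown side by side with the passing one. This sketch uses java.io.BufferedReader and assumes its fill() matches the source above; the Reader contract itself only says that the mark may become invalid. The class and method names are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

class InvalidMarkDemo {
    // Marks the start, reads to the end of the stream, then tries to reset.
    // Returns true if reset() throws an IOException.
    static boolean resetFailsAfterReadingAll(String text, int readAheadLimit) {
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            reader.mark(readAheadLimit);
            while (reader.read() != -1) {
                // consume every char; the final read() triggers one more fill()
            }
            try {
                reader.reset();
                return false;
            } catch (IOException e) {
                return true;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String str = "0123456789";
        System.out.println(resetFailsAfterReadingAll(str, str.length()));     // limit == length
        System.out.println(resetFailsAfterReadingAll(str, str.length() + 1)); // limit == length + 1
    }
}
```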

(3) When readLine and read are mixed, the \n after a \r is skipped automatically:

package com.jessin.decorator;

import java.io.CharArrayReader;
import java.io.IOException;

/**
 * Created by zexin.guo on 17-9-17.
 */
public class BufferTest2 {
    public static void main(String[] args) throws IOException {
        String str = "hello\r\nworld";
        MyBufferedReader myBufferedReader = new MyBufferedReader(new CharArrayReader(str.toCharArray()));
        System.out.println(myBufferedReader.readLine());
        // The '\n' after '\r' is skipped automatically, so \r\n is treated as a single line terminator
        int aChar;
        while ((aChar = myBufferedReader.read()) != -1) {
            System.out.println((char) aChar);
        }
        /**
         * hello
         * w
         * o
         * r
         * l
         * d
         */
    }
}
