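InputStream: a byte input stream. For example, new FileInputStream("filename") opens a file as bytes, and each read() call returns the next byte.
Reader: reads a character stream.
InputStreamReader: the bridge from bytes to characters, e.g. new InputStreamReader(new FileInputStream("filename")).
Calling read() on that InputStreamReader turns the bytes into characters, which can then be printed.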
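To make the chain concrete, here is a minimal sketch of reading bytes as characters. The class name BytesToChars and the helper readAll are made up for illustration, and a ByteArrayInputStream stands in for the file so that the example runs without any file on disk.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the JDK.
final class BytesToChars {
    // Decodes the whole byte stream as UTF-8 and returns the resulting text.
    static String readAll(InputStream in) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            int c;
            while ((c = reader.read()) != -1) { // read() returns one char, or -1 at end of stream
                sb.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "hello " followed by two Chinese characters, written with escapes
        byte[] data = "hello \u4e2d\u6587".getBytes(StandardCharsets.UTF_8);
        // For a real file, new FileInputStream("filename") would replace the byte array source.
        System.out.println(readAll(new ByteArrayInputStream(data)));
    }
}
```

Passing the charset explicitly avoids depending on the platform default, which is what the one-argument constructor would use.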
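java.io.Reader and java.io.InputStream together make up Java's input classes.
Reader reads 16-bit characters, that is, Unicode text held as UTF-16 code units; InputStream reads raw bytes, which covers ASCII text and binary data.
A Reader delivers 16-bit Unicode characters,
while an InputStream delivers 8-bit bytes.
Reader and InputStream are two parallel, independent class hierarchies provided by the I/O library.
1 byte = 8 bits
InputStream and OutputStream handle streams of 8-bit units,
Reader and Writer handle streams of 16-bit units.
In Java the byte type is 8 bits wide and the char type is 16 bits wide. A Chinese character does not fit into a single byte (it typically takes three bytes in UTF-8), so Chinese text should be processed with Reader and Writer.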
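The size difference is easy to check. The sketch below compares the number of chars in a two-character Chinese string with the number of bytes it occupies in UTF-8; the class ByteVsChar and its helper are illustrative names only.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the JDK.
final class ByteVsChar {
    // Number of bytes the string occupies when encoded as UTF-8.
    static int utf8ByteCount(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String s = "\u4e2d\u6587"; // two Chinese characters
        System.out.println(s.length());       // prints 2: two 16-bit chars
        System.out.println(utf8ByteCount(s)); // prints 6: three bytes per character in UTF-8
    }
}
```

Reading such text one byte at a time would split each character into three meaningless pieces, which is why a Reader is the right tool here.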
Note that the two hierarchies are connected by a pair of bridges: InputStreamReader presents an InputStream as a Reader, and OutputStreamWriter presents an OutputStream as a Writer.
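As a rough illustration of both bridges, the following sketch encodes a string through OutputStreamWriter and decodes it again through InputStreamReader. BridgeRoundTrip, encode, and decode are names invented for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the JDK.
final class BridgeRoundTrip {
    // chars -> bytes: the Writer side of the bridge.
    static byte[] encode(String text, Charset cs) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(out, cs)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray(); // the writer was closed above, so everything is flushed
    }

    // bytes -> chars: the Reader side of the bridge.
    static String decode(byte[] bytes, Charset cs) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(bytes), cs)) {
            char[] buf = new char[64];
            int n;
            while ((n = reader.read(buf, 0, buf.length)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] bytes = encode("\u4e2d\u6587", StandardCharsets.UTF_8);
        System.out.println(bytes.length);
        System.out.println(decode(bytes, StandardCharsets.UTF_8));
    }
}
```

The same charset must be used on both sides; decoding with a different one generally produces garbled text.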
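Java provides different Reader types for different data sources:
FileReader reads from a file; CharArrayReader reads from a character array in the program; StringReader reads from a string in the program; PipedReader reads data that another thread writes into the pipe through a PipedWriter.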
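The in-memory sources can be tried without touching the file system. The sketch below drains a StringReader, a CharArrayReader, and a PipedReader fed by a second thread; ReaderSources, drain, and fromPipe are illustrative names, not JDK APIs.

```java
import java.io.CharArrayReader;
import java.io.IOException;
import java.io.PipedReader;
import java.io.PipedWriter;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical demo class, not part of the JDK.
final class ReaderSources {
    // Reads every char from the reader, closes it, and returns the text.
    static String drain(Reader source) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = source) {
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    // Sends text through a pipe: a producer thread writes, the caller reads.
    static String fromPipe(String text) {
        try {
            PipedWriter writer = new PipedWriter();
            PipedReader reader = new PipedReader(writer); // connects the two ends
            Thread producer = new Thread(() -> {
                try (PipedWriter w = writer) { // closing signals end of stream to the reader
                    w.write(text);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            producer.start();
            String result = drain(reader);
            producer.join();
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(drain(new StringReader("from a string")));
        System.out.println(drain(new CharArrayReader("from a char array".toCharArray())));
        System.out.println(fromPipe("from another thread"));
    }
}
```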
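Likewise, there are different InputStream types for different data sources: FileInputStream, ByteArrayInputStream, StringBufferInputStream (deprecated), PipedInputStream.
In addition, two byte sources have no corresponding Reader type: Socket, for sockets, and URLConnection, for URL connections. Neither class is itself an InputStream; each exposes its data through getInputStream().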
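Since a socket only hands out bytes, reading text from it means wrapping getInputStream() in an InputStreamReader yourself. The sketch below does this over a loopback connection inside one process; SocketReaderDemo and sendAndReceive are invented names, and it assumes the environment allows opening a local socket.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the JDK.
final class SocketReaderDemo {
    // Sends the message over a loopback socket and reads it back as text.
    static String sendAndReceive(String message) {
        // Port 0 asks the OS for any free port.
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread sender = new Thread(() -> {
                try (Socket peer = server.accept();
                     OutputStream out = peer.getOutputStream()) {
                    out.write(message.getBytes(StandardCharsets.UTF_8));
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            sender.start();
            try (Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
                 BufferedReader reader = new BufferedReader(
                         new InputStreamReader(client.getInputStream(), StandardCharsets.UTF_8))) {
                String line = reader.readLine(); // returns when the sender closes its side
                sender.join();
                return line;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sendAndReceive("hello over a socket"));
    }
}
```

A URLConnection would be handled the same way: take the stream from getInputStream() and wrap it with the charset the server declares.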
Correspondingly, java.io.Writer and java.io.OutputStream differ from each other in the same way.
===================================================
The source of InputStreamReader is shown below. It makes clear that InputStreamReader is a thin proxy for StreamDecoder: every call on InputStreamReader is actually forwarded to StreamDecoder.
package java.io;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import sun.nio.cs.StreamDecoder;
/**
* An InputStreamReader is a bridge from byte streams to character streams: It
* reads bytes and decodes them into characters using a specified {@link
* java.nio.charset.Charset <code>charset</code>}. The charset that it uses
* may be specified by name or may be given explicitly, or the platform's
* default charset may be accepted.
*
* <p> Each invocation of one of an InputStreamReader's read() methods may
* cause one or more bytes to be read from the underlying byte-input stream.
* To enable the efficient conversion of bytes to characters, more bytes may
* be read ahead from the underlying stream than are necessary to satisfy the
* current read operation.
*
* <p> For top efficiency, consider wrapping an InputStreamReader within a
* BufferedReader. For example:
*
* <pre>
* BufferedReader in
* = new BufferedReader(new InputStreamReader(System.in));
* </pre>
*
* @see BufferedReader
* @see InputStream
* @see java.nio.charset.Charset
*
* @author Mark Reinhold
* @since JDK1.1
*/
public class InputStreamReader extends Reader {
private final StreamDecoder sd;
/**
* Creates an InputStreamReader that uses the default charset.
*
* @param in An InputStream
*/
public InputStreamReader(InputStream in) {
super(in);
try {
sd = StreamDecoder.forInputStreamReader(in, this, (String)null); // ## check lock object
} catch (UnsupportedEncodingException e) {
// The default encoding should always be available
throw new Error(e);
}
}
/**
* Creates an InputStreamReader that uses the named charset.
*
* @param in
* An InputStream
*
* @param charsetName
* The name of a supported
* {@link java.nio.charset.Charset <code>charset</code>}
*
* @exception UnsupportedEncodingException
* If the named charset is not supported
*/
public InputStreamReader(InputStream in, String charsetName)
throws UnsupportedEncodingException
{
super(in);
if (charsetName == null)
throw new NullPointerException("charsetName");
sd = StreamDecoder.forInputStreamReader(in, this, charsetName);
}
/**
* Creates an InputStreamReader that uses the given charset. </p>
*
* @param in An InputStream
* @param cs A charset
*
* @since 1.4
* @spec JSR-51
*/
public InputStreamReader(InputStream in, Charset cs) {
super(in);
if (cs == null)
throw new NullPointerException("charset");
sd = StreamDecoder.forInputStreamReader(in, this, cs);
}
/**
* Creates an InputStreamReader that uses the given charset decoder. </p>
*
* @param in An InputStream
* @param dec A charset decoder
*
* @since 1.4
* @spec JSR-51
*/
public InputStreamReader(InputStream in, CharsetDecoder dec) {
super(in);
if (dec == null)
throw new NullPointerException("charset decoder");
sd = StreamDecoder.forInputStreamReader(in, this, dec);
}
/**
* Returns the name of the character encoding being used by this stream.
*
* <p> If the encoding has an historical name then that name is returned;
* otherwise the encoding's canonical name is returned.
*
* <p> If this instance was created with the {@link
* #InputStreamReader(InputStream, String)} constructor then the returned
* name, being unique for the encoding, may differ from the name passed to
* the constructor. This method will return <code>null</code> if the
* stream has been closed.
* </p>
* @return The historical name of this encoding, or
* <code>null</code> if the stream has been closed
*
* @see java.nio.charset.Charset
*
* @revised 1.4
* @spec JSR-51
*/
public String getEncoding() {
return sd.getEncoding();
}
/**
* Reads a single character.
*
* @return The character read, or -1 if the end of the stream has been
* reached
*
* @exception IOException If an I/O error occurs
*/
public int read() throws IOException {
return sd.read();
}
/**
* Reads characters into a portion of an array.
*
* @param cbuf Destination buffer
* @param offset Offset at which to start storing characters
* @param length Maximum number of characters to read
*
* @return The number of characters read, or -1 if the end of the
* stream has been reached
*
* @exception IOException If an I/O error occurs
*/
public int read(char cbuf[], int offset, int length) throws IOException {
return sd.read(cbuf, offset, length);
}
/**
* Tells whether this stream is ready to be read. An InputStreamReader is
* ready if its input buffer is not empty, or if bytes are available to be
* read from the underlying byte stream.
*
* @exception IOException If an I/O error occurs
*/
public boolean ready() throws IOException {
return sd.ready();
}
public void close() throws IOException {
sd.close();
}
}
This shows that StreamDecoder is the class that actually adapts bytes to characters.
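One visible effect of the delegation is getEncoding(): the name comes from StreamDecoder, which returns null once the stream is closed. The sketch below observes this from the outside; EncodingNameDemo is an invented name. On the JDKs we are aware of, the name reported for UTF-8 is the historical one, "UTF8", but treat the exact string as an assumption rather than a guarantee.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the JDK.
final class EncodingNameDemo {
    // Returns { encoding name while open, encoding name after close }.
    static String[] encodingBeforeAndAfterClose() {
        try {
            InputStreamReader reader = new InputStreamReader(
                    new ByteArrayInputStream(new byte[0]), StandardCharsets.UTF_8);
            String whileOpen = reader.getEncoding();
            reader.close();
            String afterClose = reader.getEncoding();
            return new String[] { whileOpen, afterClose };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String[] names = encodingBeforeAndAfterClose();
        System.out.println("open:   " + names[0]);
        System.out.println("closed: " + names[1]); // null once the stream is closed
    }
}
```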
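We should also look at InputStreamReader's read(char cbuf[], int offset, int length) method, because it is the adapting method: it ultimately calls InputStream's read(byte b[], int off, int len). The only difference between the two signatures is that the adapting method takes a char array as its first parameter, while the adapted method takes a byte array.
We will use the source code to show that the adapting method does end up calling the adapted method.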
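Before reading the source, the call can also be observed at run time. The sketch below wraps the byte stream in a small counting subclass and checks that a char-array read on the InputStreamReader triggers at least one byte-array read underneath. AdapterCallTrace and CountingInputStream are made up for this test; the exact number of calls depends on the JDK's buffering, so only "at least one" is asserted.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the JDK.
final class AdapterCallTrace {
    // Counts calls to the adapted method read(byte[], int, int).
    static final class CountingInputStream extends FilterInputStream {
        int bulkReads;

        CountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            bulkReads++;
            return super.read(b, off, len);
        }
    }

    // Reads the text through the adapting method and reports how often
    // the adapted method was invoked on the underlying stream.
    static int bulkReadsFor(String text) {
        CountingInputStream counting = new CountingInputStream(
                new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8)));
        try (Reader reader = new InputStreamReader(counting, StandardCharsets.UTF_8)) {
            char[] cbuf = new char[16];
            while (reader.read(cbuf, 0, cbuf.length) != -1) {
                // discard the chars; only the call count matters here
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return counting.bulkReads;
    }

    public static void main(String[] args) {
        System.out.println("byte-array reads: " + bulkReadsFor("hello"));
    }
}
```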
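Now let us look at how StreamDecoder is implemented:
1. When the StreamDecoder constructor runs, it allocates a ByteBuffer and assigns it to the field bb; it also creates a CharsetDecoder from the charset argument and assigns it to the field decoder.
2. There are two ways to turn bytes into characters: through a channel obtained from the byte input stream, or by reading the byte input stream directly. The static volatile flag channelsAvailable guards the channel path. Note that in the source shown here the channel path for FileInputStream is disabled (if (false && ...)), so an InputStream is always read directly.
Taking the direct byte input stream as the example:
3. Data is read from the byte input stream into the ByteBuffer, or more precisely into the ByteBuffer's backing byte array.
4. A CharBuffer is created: the char array passed to InputStreamReader's read() method is wrapped in a CharBuffer, so that the array serves as the CharBuffer's backing array.
5. CharsetDecoder's decode(ByteBuffer in, CharBuffer out, boolean endOfInput) method is called; it reads bytes from the ByteBuffer, converts them to characters, and writes them into the CharBuffer.
At this point the adaptation is complete.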
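The five steps can be reproduced with public NIO classes only. What follows is a simplified sketch of the same idea, not the JDK's code: ManualDecode and its buffer sizes are choices made for this example, and it ignores the locking, the leftover-char handling, and the "block at most once" logic that StreamDecoder has.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the JDK.
final class ManualDecode {
    static String decode(InputStream in, Charset cs) {
        try {
            // Step 1: create the decoder and the byte buffer.
            CharsetDecoder decoder = cs.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPLACE)
                    .onUnmappableCharacter(CodingErrorAction.REPLACE);
            ByteBuffer bb = ByteBuffer.allocate(8192);
            bb.flip(); // start empty and in read mode

            char[] cbuf = new char[1024];
            StringBuilder sb = new StringBuilder();
            boolean eof = false;
            while (true) {
                CharBuffer cb = CharBuffer.wrap(cbuf);        // step 4: wrap the char array
                CoderResult cr = decoder.decode(bb, cb, eof); // step 5: bytes -> chars
                sb.append(cbuf, 0, cb.position());
                if (cr.isOverflow()) continue;                // cbuf was full: decode again
                if (cr.isError()) cr.throwException();        // not expected with REPLACE
                if (eof) break;                               // underflow at end of input: done
                // Step 3: underflow means the decoder needs more bytes.
                bb.compact();                                 // keep undecoded bytes, switch to write mode
                int n = in.read(bb.array(), bb.arrayOffset() + bb.position(), bb.remaining());
                if (n < 0) {
                    eof = true;
                } else if (n == 0) {
                    throw new IOException("Underlying input stream returned zero bytes");
                } else {
                    bb.position(bb.position() + n);
                }
                bb.flip();                                    // back to read mode
            }
            CharBuffer tail = CharBuffer.wrap(cbuf);
            decoder.flush(tail);                              // emit any state held by the decoder
            sb.append(cbuf, 0, tail.position());
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = "hello \u4e2d\u6587".getBytes(StandardCharsets.UTF_8);
        System.out.println(decode(new ByteArrayInputStream(data), StandardCharsets.UTF_8));
    }
}
```

Step 2, choosing between a channel and a stream, is left out because this sketch only supports the direct stream path.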
package sun.nio.cs;
import java.io.*;
import java.nio.*;
import java.nio.channels.*;
import java.nio.charset.*;
public class StreamDecoder extends Reader
{
private static final int MIN_BYTE_BUFFER_SIZE = 32; // minimum byte buffer size
private static final int DEFAULT_BYTE_BUFFER_SIZE = 8192; // default byte buffer size
private volatile boolean isOpen = true; // volatile so every thread sees updates; tracks whether the stream is open
private void ensureOpen() throws IOException {
if (!isOpen)
throw new IOException("Stream closed"); // the stream is no longer open, so report it as closed
}
// In order to handle surrogates properly we must never try to produce
// fewer than two characters at a time. If we're only asked to return one
// character then the other is saved here to be returned later.
// Note: the "leftover" char is the second char of a surrogate pair (the low surrogate); it is held back and returned by the next read() call
private boolean haveLeftoverChar = false;
private char leftoverChar;
// Factories for java.io.InputStreamReader
public static StreamDecoder forInputStreamReader(InputStream in,
Object lock,
String charsetName)
throws UnsupportedEncodingException
{
String csn = charsetName; // requested charset name
if (csn == null)
csn = Charset.defaultCharset().name(); // no name given: fall back to the default charset
try {
if (Charset.isSupported(csn)) // check whether the JVM supports this charset
return new StreamDecoder(in, lock, Charset.forName(csn));
} catch (IllegalCharsetNameException x) { }
throw new UnsupportedEncodingException (csn);
}
public static StreamDecoder forInputStreamReader(InputStream in,
Object lock,
Charset cs)
{
return new StreamDecoder(in, lock, cs); // this factory and the ones below simply call a constructor
}
public static StreamDecoder forInputStreamReader(InputStream in,
Object lock,
CharsetDecoder dec)
{
return new StreamDecoder(in, lock, dec);
}
// Factory for java.nio.channels.Channels.newReader
public static StreamDecoder forDecoder(ReadableByteChannel ch,
CharsetDecoder dec,
int minBufferCap)
{
return new StreamDecoder(ch, dec, minBufferCap);
}
// -- Public methods corresponding to those in InputStreamReader --
// All synchronization and state/argument checking is done in these public
// methods; the concrete stream-decoder subclasses defined below need not
// do any such checking.
public String getEncoding() {
if (isOpen())
return encodingName();
return null;
}
public int read() throws IOException {
return read0();
}
private int read0() throws IOException {
synchronized (lock) {
// Return the leftover char, if there is one
if (haveLeftoverChar) { // a char was held back by the previous call: return it and clear the flag
haveLeftoverChar = false;
return leftoverChar;
}
// Convert more bytes
char cb[] = new char[2];
int n = read(cb, 0, 2);
switch (n) {
case -1:
return -1; // end of stream reached
case 2:
leftoverChar = cb[1];
haveLeftoverChar = true;
// FALL THROUGH
case 1:
return cb[0];
default:
assert false : n;
return -1;
}
}
}
public int read(char cbuf[], int offset, int length) throws IOException {
int off = offset;
int len = length;
synchronized (lock) {
ensureOpen();
if ((off < 0) || (off > cbuf.length) || (len < 0) ||
((off + len) > cbuf.length) || ((off + len) < 0)) {
throw new IndexOutOfBoundsException();
}
if (len == 0)
return 0;
int n = 0;
if (haveLeftoverChar) {
// Copy the leftover char into the buffer
cbuf[off] = leftoverChar;
off++; len--;
haveLeftoverChar = false;
n = 1;
if ((len == 0) || !implReady())
// Return now if this is all we can produce w/o blocking
return n;
}
if (len == 1) {
// Treat single-character array reads just like read()
int c = read0();
if (c == -1)
return (n == 0) ? -1 : n;
cbuf[off] = (char)c;
return n + 1;
}
return n + implRead(cbuf, off, off + len);
}
}
public boolean ready() throws IOException {
synchronized (lock) {
ensureOpen();
return haveLeftoverChar || implReady();
}
}
public void close() throws IOException {
synchronized (lock) {
if (!isOpen)
return;
implClose();
isOpen = false;
}
}
private boolean isOpen() {
return isOpen;
}
// -- Charset-based stream decoder impl --
// In the early stages of the build we haven't yet built the NIO native
// code, so guard against that by catching the first UnsatisfiedLinkError
// and setting this flag so that later attempts fail quickly.
//
private static volatile boolean channelsAvailable = true; // whether NIO channels can be used
private static FileChannel getChannel(FileInputStream in) {
if (!channelsAvailable)
return null;
try {
return in.getChannel(); // get the channel of the file input stream
} catch (UnsatisfiedLinkError x) {
channelsAvailable = false;
return null;
}
}
private Charset cs; // charset
private CharsetDecoder decoder; // decoder that converts a byte sequence in this charset into a sequence of 16-bit Unicode chars
private ByteBuffer bb; // byte buffer
// Exactly one of these is non-null
private InputStream in; // the byte input stream being decoded
private ReadableByteChannel ch; // the readable byte channel
StreamDecoder(InputStream in, Object lock, Charset cs) {
this(in, lock,
cs.newDecoder() // create the decoder for this charset
.onMalformedInput(CodingErrorAction.REPLACE) // malformed byte sequences are replaced with the replacement character
.onUnmappableCharacter(CodingErrorAction.REPLACE)); // unmappable characters are replaced as well
}
StreamDecoder(InputStream in, Object lock, CharsetDecoder dec) {
super(lock);
this.cs = dec.charset(); // store the decoder's charset in the field cs
this.decoder = dec; // store the decoder in the field decoder
// This path disabled until direct buffers are faster
if (false && in instanceof FileInputStream) {
ch = getChannel((FileInputStream)in);
if (ch != null)
bb = ByteBuffer.allocateDirect(DEFAULT_BYTE_BUFFER_SIZE); // a channel exists: allocate a direct buffer
}
if (ch == null) {
this.in = in;
this.ch = null;
bb = ByteBuffer.allocate(DEFAULT_BYTE_BUFFER_SIZE); // allocate a heap byte buffer
}
bb.flip(); // So that bb is initially empty
}
StreamDecoder(ReadableByteChannel ch, CharsetDecoder dec, int mbc) {
this.in = null;
this.ch = ch;
this.decoder = dec;
this.cs = dec.charset();
this.bb = ByteBuffer.allocate(mbc < 0
? DEFAULT_BYTE_BUFFER_SIZE
: (mbc < MIN_BYTE_BUFFER_SIZE
? MIN_BYTE_BUFFER_SIZE
: mbc));
bb.flip();
}
private int readBytes() throws IOException {
bb.compact(); // discard the bytes already consumed and switch the ByteBuffer to write mode, ready to receive more data
try {
if (ch != null) {
// Read from the channel
int n = ch.read(bb); // read from the channel, i.e. take data out of the channel and put it into the ByteBuffer
// Do not be misled by the name "read": from the ByteBuffer's point of view this is a write
if (n < 0)
return n;
} else {
// Read from the input stream, and then update the buffer (the ByteBuffer is in write mode here)
int lim = bb.limit(); // in write mode, the upper bound for writing; after compact() it equals the capacity
int pos = bb.position(); // in write mode, the index where the next byte will be written
assert (pos <= lim);
int rem = (pos <= lim ? lim - pos : 0); // number of bytes that can still be written into the ByteBuffer
assert rem > 0;
int n = in.read(bb.array(), bb.arrayOffset() + pos, rem); // key step: calls read on the adapted object, the byte input stream in
// bb.array() returns the ByteBuffer's backing array;
// arrayOffset() + pos is the index in that array where the next byte will be written.
// The formula is described in the API docs: https://docs.oracle.com/javase/8/docs/api/java/nio/ByteBuffer.html#arrayOffset--
// This statement reads up to rem bytes from the input stream into the ByteBuffer's backing array and returns the number of bytes read
if (n < 0)
return n;
if (n == 0)
throw new IOException("Underlying input stream returned zero bytes");
assert (n <= rem) : "n = " + n + ", rem = " + rem;
bb.position(pos + n); // advance the ByteBuffer's position past the bytes just written
}
} finally {
// Flip even when an IOException is thrown,
// otherwise the stream will stutter
bb.flip(); // finally switch the ByteBuffer back to read mode
}
int rem = bb.remaining(); // number of bytes between the ByteBuffer's position and its limit
assert (rem != 0) : rem;
return rem;
}
int implRead(char[] cbuf, int off, int end) throws IOException {
// In order to handle surrogate pairs, this method requires that
// the invoker attempt to read at least two characters. Saving the
// extra character, if any, at a higher level is easier than trying
// to deal with it here.
assert (end - off > 1);
CharBuffer cb = CharBuffer.wrap(cbuf, off, end - off); // wrap the char array in a CharBuffer that uses it as its backing array;
// off becomes the CharBuffer's position, and end - off is the space available for writing
if (cb.position() != 0)
// Ensure that cb[0] == cbuf[off]
cb = cb.slice(); // create a new CharBuffer that shares the same array but whose position is 0
boolean eof = false;
for (;;) {
CoderResult cr = decoder.decode(bb, cb, eof); // read bytes from the ByteBuffer, decode them, and write the chars into the CharBuffer
if (cr.isUnderflow()) {
if (eof)
break;
if (!cb.hasRemaining())
break;
if ((cb.position() > 0) && !inReady())
break; // Block at most once
int n = readBytes();
if (n < 0) {
eof = true;
if ((cb.position() == 0) && (!bb.hasRemaining()))
break;
decoder.reset();
}
continue;
}
if (cr.isOverflow()) {
assert cb.position() > 0;
break;
}
cr.throwException();
}
if (eof) {
// ## Need to flush decoder
decoder.reset();
}
if (cb.position() == 0) {
if (eof)
return -1;
assert false;
}
return cb.position(); // the CharBuffer's position, which equals the number of chars written
}
String encodingName() {
return ((cs instanceof HistoricallyNamedCharset)
? ((HistoricallyNamedCharset)cs).historicalName()
: cs.name());
}
private boolean inReady() {
try {
return (((in != null) && (in.available() > 0))
|| (ch instanceof FileChannel)); // ## RBC.available()?
} catch (IOException x) {
return false;
}
}
boolean implReady() {
return bb.hasRemaining() || inReady();
}
void implClose() throws IOException {
if (ch != null)
ch.close();
else
in.close();
}
}
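Two details of this source can be checked from the outside: the leftover-char logic in read0(), and the REPLACE actions set in the constructor. The sketch below reads one char at a time through InputStreamReader; DecoderEdgeCases and readCharByChar are names invented for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the JDK.
final class DecoderEdgeCases {
    // Decodes the bytes as UTF-8 using only the single-char read() method.
    static String readCharByChar(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(bytes), StandardCharsets.UTF_8)) {
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+1F600 lies outside the BMP: 4 bytes in UTF-8, a surrogate pair in UTF-16.
        byte[] emoji = new String(Character.toChars(0x1F600)).getBytes(StandardCharsets.UTF_8);
        String pair = readCharByChar(emoji);
        // The first read() returns the high surrogate, the second the held-back low surrogate.
        System.out.printf("%d chars: %04X %04X%n",
                pair.length(), (int) pair.charAt(0), (int) pair.charAt(1)); // prints "2 chars: D83D DE00"

        // 0xFF can never appear in valid UTF-8, so it is replaced with U+FFFD.
        byte[] broken = { 'A', (byte) 0xFF, 'B' };
        System.out.println(readCharByChar(broken).equals("A\uFFFDB")); // prints "true"
    }
}
```

The replacement behaviour is convenient for display but hides corrupt input; passing a CharsetDecoder configured with CodingErrorAction.REPORT to the InputStreamReader constructor makes such input fail with an exception instead.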