Java Basics: Learning IO, Byte Streams

嘻嘻兮

Category: Java Basics series. Tags: java, IO streams, source code

Article link: https://blog.csdn.net/Wang_1997/article/details/52356084


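Overview of IO Streams

Concepts

IO streams handle data transfer between devices.

Java operates on data through streams.

The classes Java uses to work with streams live in the java.io package.

By direction, streams fall into two kinds: input streams and output streams.

By the unit they operate on, streams fall into two kinds:

Byte streams: a byte stream can handle any data, because everything in a computer is stored as bytes.

Character streams: a character stream can only handle plain text data, but is more convenient for it.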

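Common IO stream base classes

Abstract base classes of the byte streams:

InputStream
OutputStream

Abstract base classes of the character streams:

Reader
Writer

Writing an IO program

Before use: import the classes from the java.io package.

During use: handle IO exceptions.

After use: release the resources.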

FileInputStream

public int read() throws IOException  reads one byte at a time; returns -1 at end of file

	public static void main(String[] args) throws IOException {
		FileInputStream fis = new FileInputStream("reader.txt");
		int b;
		while((b = fis.read()) != -1) {
			System.out.println(b);
		}
		fis.close();
	}

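Q: read() reads a single byte, so why does it return int rather than byte?

Because a byte input stream can read any kind of file, such as images or audio, and these files are stored as raw binary. If each read returned a byte, the stream might hit the bit pattern 11111111 somewhere in the middle of the file. As a byte, 11111111 is -1, and our loop stops as soon as it sees -1, so the rest of the data would never be read. read() therefore returns an int: when it reads 11111111 it pads 24 zero bits in front to fill four bytes, so the byte value -1 becomes the int value 255. This guarantees that all the data can be read, and the end-of-stream marker -1 is an int value that no data byte can produce.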

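/* Suppose a binary file contains the following bytes:
 * 00010100 00100100 01000001 11111111 00000100
 * If read() returned a byte (reading one byte at a time), then 11111111 would come back as the byte -1
 * (computers use two's complement), and the loop would stop before reading the rest.
 * 10000001    sign-magnitude form of the byte -1
 * 11111110    ones' complement of -1
 * 11111111    two's complement of -1
 * 00000000 00000000 00000000 11111111	zero-extended to an int (255)
 * A byte ranges over [-128,127]; after zero-extension, read() returns a value in [0,255]
 * -128 & 0xff = 128
 * -1 & 0xff = 255
 * How do we get the original byte back?
 * (byte)128 = -128
 * (byte)255 = -1
 * 00000000 00000000 00000000 11111111	->		11111111
 * Summary:
 *  FileInputStream's read() widens the byte to an int by zero-extension (b & 0xff)
 *  FileOutputStream's write(int) narrows the int back to a byte (keeps the low 8 bits)
 */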
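
The conversions listed in the comment above can be checked directly. The following is a minimal sketch; the class name ByteToIntDemo and its two helper methods are made up for illustration and are not part of the JDK.

```java
class ByteToIntDemo {
    /** Widens a byte the way read() does: keep the low 8 bits, zero-fill the rest. */
    static int toUnsigned(byte b) {
        return b & 0xff;
    }

    /** Narrows an int the way write(int) does: keep only the low 8 bits. */
    static byte toByte(int value) {
        return (byte) value;
    }

    public static void main(String[] args) {
        byte raw = (byte) 0b1111_1111;               // bit pattern 11111111
        System.out.println(raw);                      // -1 as a signed byte
        System.out.println(toUnsigned(raw));          // 255
        System.out.println(toUnsigned((byte) -128));  // 128
        System.out.println(toByte(255));              // -1
        System.out.println(toByte(128));              // -128
    }
}
```

Because every data byte maps into [0,255], the int value -1 stays free to signal the end of the stream.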

FileOutputStream

public void write(int b) throws IOException   writes one byte at a time (the low 8 bits of b)

	public static void main(String[] args) throws IOException {
		//If write.txt does not exist it is created; if it exists its contents are cleared
		FileOutputStream fos = new FileOutputStream("write.txt");
		fos.write(97);
		fos.write(98);
		fos.write(99);
		fos.close();
	}

write.txt now contains: abc

Appending data with FileOutputStream

public FileOutputStream(String name,boolean append) throws FileNotFoundException
public FileOutputStream(File file,boolean append) throws FileNotFoundException

	public static void main(String[] args) throws IOException {
		FileOutputStream fos = new FileOutputStream("write.txt",true);
		fos.write(99);
		fos.write(100);
		fos.close();
	}

After appending, write.txt contains: abccd
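
To see the two constructors side by side, here is a small sketch that writes in truncate mode, reopens the same file in append mode, and reads the result back. The class name AppendDemo and the method writeThenAppend are hypothetical; the sketch also uses try-with-resources (Java 1.7 and later) to close the streams.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

class AppendDemo {
    /** Overwrites the file with bytes 97 98 99, appends 99 100, and returns the file content. */
    static String writeThenAppend(String path) throws IOException {
        try (FileOutputStream fos = new FileOutputStream(path)) {        // truncate mode
            fos.write(97);
            fos.write(98);
            fos.write(99);
        }
        try (FileOutputStream fos = new FileOutputStream(path, true)) {  // append mode
            fos.write(99);
            fos.write(100);
        }
        StringBuilder sb = new StringBuilder();
        try (FileInputStream fis = new FileInputStream(path)) {
            int b;
            while ((b = fis.read()) != -1) {
                sb.append((char) b);   // safe here because every byte is ASCII
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(writeThenAppend("write.txt")); // abccd
    }
}
```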

Copying files

Simple copy

	public static void main(String[] args) throws IOException {
		FileInputStream fis = new FileInputStream("desk.jpg");	//create the input stream, bound to desk.jpg
		FileOutputStream fos = new FileOutputStream("desk_copy.jpg");//create the output stream, bound to desk_copy.jpg
		
		int b;
		while((b = fis.read()) != -1) {
			fos.write(b);
		}
		
		fis.close();
		fos.close();
	}

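[Figure: simple schematic of the byte-by-byte copy]

As shown, this approach copies the file by reading and writing one byte at a time. The drawback: it is far too slow. For a large file the copy takes a long time.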

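Copying with a byte array sized by available()

public void write(byte[] b) throws IOException  writes a whole byte array at once
public int read(byte[] b) throws IOException  reads up to b.length bytes into the array and returns the number of bytes read
public int available() throws IOException  returns an estimate of the number of bytes that can be read without blocking (for a freshly opened local file this is typically the file size, but that is not guaranteed)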

	public static void main(String[] args) throws IOException {
		FileInputStream fis = new FileInputStream("desk.jpg");	
		FileOutputStream fos = new FileOutputStream("desk_copy.jpg");
		byte[] arr = new byte[fis.available()];
		fis.read(arr);
		fos.write(arr);
		fis.close();
		fos.close();
	}

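Copying a large file this way is much faster than byte-by-byte copying. But it can run out of memory: if the file is too large, an array of that size cannot be allocated. Note also that a single read(arr) call is not guaranteed to fill the array.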
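
A single read(byte[]) call may return fewer bytes than the array can hold. The sketch below simulates that with a made-up TrickleStream that hands out at most 3 bytes per call, and shows a loop that keeps reading until the array is full. TrickleStream, readFully and ReadFullyDemo are illustrative names, not JDK classes on this API.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

class ReadFullyDemo {
    /** Hypothetical stream that returns at most 3 bytes per read call, to simulate short reads. */
    static class TrickleStream extends FilterInputStream {
        TrickleStream(InputStream in) {
            super(in);
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            return super.read(b, off, Math.min(len, 3));
        }
    }

    /** Calls read repeatedly until dest is full or the stream ends; returns the bytes actually read. */
    static int readFully(InputStream in, byte[] dest) throws IOException {
        int total = 0;
        while (total < dest.length) {
            int n = in.read(dest, total, dest.length - total);
            if (n == -1) {
                break;          // end of stream before the array was filled
            }
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] src = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        int single = new TrickleStream(new ByteArrayInputStream(src)).read(new byte[10]);
        int looped = readFully(new TrickleStream(new ByteArrayInputStream(src)), new byte[10]);
        System.out.println(single + " bytes from one call, " + looped + " bytes from the loop");
    }
}
```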

Copying with a small custom array

public void write(byte[] b, int off, int len) throws IOException	writes len bytes of b starting at offset off, i.e. only the valid bytes

	public static void main(String[] args) throws IOException {
		FileInputStream fis = new FileInputStream("desk.jpg");	
		FileOutputStream fos = new FileOutputStream("desk_copy.jpg");
		int len;
		byte[] arr = new byte[1024*8];
		//read(arr) returns -1 at end of file, otherwise the number of bytes read
		while((len = fis.read(arr)) != -1) {
			fos.write(arr, 0, len);
		}
		fis.close();
		fos.close();
	}
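
The same loop can be packaged as a reusable helper that works on any pair of streams. This is a sketch only: ChunkCopy and copy are hypothetical names, and the main method assumes desk.jpg exists in the working directory.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class ChunkCopy {
    /** Copies everything from in to out through an 8 KB array; returns the number of bytes copied. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] arr = new byte[1024 * 8];
        long total = 0;
        int len;
        while ((len = in.read(arr)) != -1) {
            // Write only the len valid bytes; the rest of arr may hold stale data from an earlier read.
            out.write(arr, 0, len);
            total += len;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        try (InputStream in = new FileInputStream("desk.jpg");
             OutputStream out = new FileOutputStream("desk_copy.jpg")) {
            System.out.println(copy(in, out) + " bytes copied");
        }
    }
}
```

Writing the whole array instead of arr, 0, len would append leftover bytes from the previous round whenever the last read is shorter than the array.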

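Copying with BufferedInputStream and BufferedOutputStream

The idea of buffering

Reading and writing an array at a time is clearly much faster than reading and writing a byte at a time.

That speedup comes from using the array as a buffer. Java's designers had the same idea in mind (applied through the decorator design pattern), so the library provides buffered byte streams.

BufferedInputStream

BufferedInputStream has a built-in buffer (an array).

When the program reads one byte from a BufferedInputStream, the stream reads up to 8192 bytes (the default buffer size) from the file in one go, stores them in the buffer, and returns one of them to the program.

When the program reads again, there is no need to go to the file; the byte comes straight from the buffer.

Only after every byte in the buffer has been consumed does the stream read another 8192 bytes from the file.

BufferedOutputStream

BufferedOutputStream also has a built-in buffer (an array).

When the program writes bytes to the stream, they do not go directly to the file; they go into the buffer first.

Only when the buffer is full does BufferedOutputStream write the buffered data to the file in one go.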

	public static void main(String[] args) throws IOException {
		FileInputStream fis = new FileInputStream("desk.jpg");	
		BufferedInputStream bis = new BufferedInputStream(fis);
		FileOutputStream fos = new FileOutputStream("desk_copy.jpg");
		BufferedOutputStream bos = new BufferedOutputStream(fos);
		int b;
		//read() returns -1 at end of file, otherwise the byte read (0 to 255)
		while((b = bis.read()) != -1) {
			bos.write(b);//still reading and writing one byte at a time
		}
		bis.close();
		bos.close();
	}

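Some readers may be puzzled: this also reads one byte at a time, so how can it be efficient? The answer is that each read() and write() call only touches the in-memory buffer; the file itself is still accessed in blocks of up to 8192 bytes.

[Figure: simple schematic of the buffered copy]

Doesn't the buffered version look a lot like the small custom array copy we wrote earlier? Which of the two is faster?

If the custom array is 8192 bytes and we compare it against the Buffered streams, the custom array wins by a small margin, because reading and writing share a single array, whereas the Buffered streams work with two arrays.

The difference between flush and close

flush(): flushes the buffer; after flushing you can keep writing
close(): closes the stream and releases its resources; on a buffered stream, close() also flushes the buffer before closing; after closing you can no longer write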
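
The effect of flush() and close() can be observed by wrapping an in-memory sink instead of a file. This is a minimal sketch; FlushDemo and sizes are hypothetical names, and the 8-byte buffer is chosen only to keep the example small.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

class FlushDemo {
    /** Returns the sink size before flush, after flush, and after close. */
    static int[] sizes() throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        BufferedOutputStream bos = new BufferedOutputStream(sink, 8);

        bos.write(97);
        bos.write(98);
        bos.write(99);
        int beforeFlush = sink.size();   // the 3 bytes are still held in the buffer

        bos.flush();
        int afterFlush = sink.size();    // the buffer was pushed to the sink

        bos.write(100);                  // writing after flush is still allowed
        bos.close();                     // close flushes the remaining byte first
        int afterClose = sink.size();

        return new int[] {beforeFlush, afterFlush, afterClose};
    }

    public static void main(String[] args) throws IOException {
        int[] s = sizes();
        System.out.println(s[0] + " " + s[1] + " " + s[2]); // 0 3 4
    }
}
```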

A brief look at the BufferedInputStream source (JDK 1.8)

First, two methods:

    public synchronized void mark(int readlimit) {
        marklimit = readlimit;
        markpos = pos;
    }
    public synchronized void reset() throws IOException {
        getBufIfOpen(); // Cause exception if closed
        if (markpos < 0)
            throw new IOException("Resetting to invalid mark");
        pos = markpos;
    }

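As the two methods show, calling mark sets markpos to pos (the current position in the buffer), and calling reset moves pos back to markpos, which makes it possible to re-read the same bytes. So what is readlimit for? It is stored in marklimit, the maximum number of bytes that may be read after mark while the mark stays valid, i.e. the maximum distance from markpos to the current pos. Let's go through the possible cases.

	protected int count;//number of valid bytes in the buffer (roughly, how many bytes were read from the file)
    protected int pos;//current position in the buffer
    protected int markpos = -1;//initial value -1: no mark set

/*
 * When pos is less than count, the buffer still holds unread data, so reads are served straight from the buffer
 * The case pos >= count needs a closer look: the buffered data is used up, and more data must be read from the file into the buffer
 * 1 No mark set: discard the buffer contents and set pos to 0
 * 2 Mark set (the buffer cannot simply be discarded, because reset may be called later)
 * 		2.1  If the buf array is not yet full (pos < buf.length), just append after the existing data
 * 		2.2 If it is full (pos == buf.length), consider the following cases
 * 			2.2.1 markpos is greater than 0: the data from markpos onward must be kept, the data before markpos can be discarded
 * 				Shift the data from markpos onward to the front of the array, so that new data can be appended behind it
 * 			2.2.2 markpos equals 0: shifting gains nothing, because the reset range is the whole buffer
 * 				2.2.2.1  Invalidation: if marklimit is less than or equal to buf.length, the mark becomes invalid and the state is reset as if no mark were set
 * 				2.2.2.2 Otherwise this is the last case: buf has to be enlarged
 */

The annotated source: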

public
class BufferedInputStream extends FilterInputStream {
	private static int DEFAULT_BUFFER_SIZE = 8192; //default buffer size
	protected volatile byte buf[]; //the buffer array
    public BufferedInputStream(InputStream in) {
        this(in, DEFAULT_BUFFER_SIZE);
    }
    public BufferedInputStream(InputStream in, int size) {
        super(in);
        if (size <= 0) {
            throw new IllegalArgumentException("Buffer size <= 0");
        }
        buf = new byte[size];//allocate the buffer
    }
    public synchronized int read() throws IOException {
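        if (pos >= count) {//every buffered byte has been consumed
            fill();//read more data from the underlying stream
            if (pos >= count)//still nothing to read: end of stream reached
                return -1;//return the int -1
        }
        return getBufIfOpen()[pos++] & 0xff;
        //getBufIfOpen() returns buf[]; the byte taken from it is zero-extended to an int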
    }
    private void fill() throws IOException {
        byte[] buffer = getBufIfOpen();
        if (markpos < 0)//case 1: no mark set
            pos = 0;
        else if (pos >= buffer.length)//case 2.2
            if (markpos > 0) {//case 2.2.1
                int sz = pos - markpos;//number of bytes to keep and shift
                System.arraycopy(buffer, markpos, buffer, 0, sz);
                pos = sz;//after the shift markpos is 0 and pos is sz
                markpos = 0;
            } else if (buffer.length >= marklimit) {//case 2.2.2.1: the mark becomes invalid
                markpos = -1;
                pos = 0;
            } else if (buffer.length >= MAX_BUFFER_SIZE) {//check whether the buffer can still grow
                throw new OutOfMemoryError("Required array size too large");
            } else {   //case 2.2.2.2: enlarge buf
                int nsz = (pos <= MAX_BUFFER_SIZE - pos) ?
                        pos * 2 : MAX_BUFFER_SIZE;
                if (nsz > marklimit)
                    nsz = marklimit;
                byte nbuf[] = new byte[nsz];
                System.arraycopy(buffer, 0, nbuf, 0, pos);
                if (!bufUpdater.compareAndSet(this, buffer, nbuf)) {
                    throw new IOException("Stream closed");
                }
                buffer = nbuf;
            }
        count = pos;
        int n = getInIfOpen().read(buffer, pos, buffer.length - pos);
        if (n > 0)//n > 0: more data was read, the end of the stream has not been reached
            count = n + pos;
    }
    public void close() throws IOException {
        byte[] buffer;
        while ( (buffer = buf) != null) {
            if (bufUpdater.compareAndSet(this, buffer, null)) {
                InputStream input = in;
                in = null;
                if (input != null)//close the underlying stream
                    input.close();
                return;
            }
            // Else retry in case a new buf was CASed in fill()
        }
    }
}
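
To see mark and reset from the caller's side, here is a small sketch over an in-memory stream. MarkResetDemo and readTwice are hypothetical names, and the readlimit of 16 is an arbitrary value large enough for this example.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;

class MarkResetDemo {
    /** Reads "a", marks, reads "bc", resets, then reads "bcd" again from the marked position. */
    static String readTwice() throws IOException {
        byte[] data = {'a', 'b', 'c', 'd', 'e'};
        StringBuilder sb = new StringBuilder();
        try (BufferedInputStream bis = new BufferedInputStream(new ByteArrayInputStream(data))) {
            sb.append((char) bis.read());   // a
            bis.mark(16);                   // markpos = pos; the mark stays valid for 16 bytes
            sb.append((char) bis.read());   // b
            sb.append((char) bis.read());   // c
            bis.reset();                    // pos = markpos
            sb.append((char) bis.read());   // b again
            sb.append((char) bis.read());   // c again
            sb.append((char) bis.read());   // d
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readTwice()); // abcbcd
    }
}
```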

A brief look at the BufferedOutputStream source (JDK 1.8)

public
class BufferedOutputStream extends FilterOutputStream {
    protected byte buf[];//the buffer array
    protected int count;//number of valid bytes currently held in buf (the write position, comparable to pos in BufferedInputStream above)
    public BufferedOutputStream(OutputStream out) {
        this(out, 8192);//default size is 8192
    }
    public BufferedOutputStream(OutputStream out, int size) {
        super(out);
        if (size <= 0) {
            throw new IllegalArgumentException("Buffer size <= 0");
        }
        buf = new byte[size];//allocate the buffer
    }
    private void flushBuffer() throws IOException {
        if (count > 0) {//the buffer still holds data: write all of it to the underlying stream
            out.write(buf, 0, count);
            count = 0;
        }
    }
    public synchronized void write(int b) throws IOException {
    	//if the buffer is full, flush it first; then store the byte in the buffer
        if (count >= buf.length) {
            flushBuffer();
        }
        buf[count++] = (byte)b;
    }
    //this method is inherited from FilterOutputStream
    public void close() throws IOException {
    	//try-with-resources (Java 1.7, explained later): flush, then close the stream
        try (OutputStream ostream = out) {
            flush();//calls flushBuffer()
        }
    }
}

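Standard exception handling for streams

IO streams usually sit close to low-level resources, so a failure here should not be swallowed with try...catch (that would hide the problem). The exception should be propagated to the caller instead.

Java 1.6 and earlier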

	public static void main(String[] args) throws IOException {
		FileInputStream fis = null;
		FileOutputStream fos = null;
		try {
			fis = new FileInputStream("read.txt");
			fos = new FileOutputStream("write.txt");
			int b;
			while((b = fis.read()) != -1) {
				fos.write(b);
			}
		} finally {
			try {
				if(fis != null)
					fis.close();
			}finally {
				if(fos != null)
					fos.close();
			}
		}
	}

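First, why check for null before closing a stream? Because constructing a stream can throw an exception, in which case fis (or fos) is still null.

Then why does the close logic itself need try...finally? Because close() can also throw (imagine a tap whose handle snaps off just as you turn it). The principle is to close as many streams as possible: if closing fis fails, the finally block still lets us close fos.

Java 1.7 and later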

	public static void main(String[] args) throws IOException {
		try(
				FileInputStream fis = new FileInputStream("read.txt");
				FileOutputStream fos = new FileOutputStream("write.txt");
			){
				int b;
				while((b = fis.read()) != -1) {
					fos.write(b);
				}
			}
	}

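Compared with 1.6, the finally block that closed the streams is gone, and try is followed by an extra pair of parentheses (). Stream objects created inside them are closed automatically.

The AutoCloseable interface

An object created inside try() must implement the AutoCloseable interface. If it does, its close() method is called automatically once the code in the {} after try (the read/write code) finishes, which closes the stream.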

class MyClose implements AutoCloseable {
	@Override
	public void close() {
		System.out.println("stream closed automatically");
	}
}

Test

	public static void main(String[] args) {
		try(
				MyClose myClose = new MyClose();
			){
				//TODO
			}
	}
	/*
	 * Output: stream closed automatically
	 */
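
When several resources are declared in one try(), they are closed in the reverse order of their declaration. The sketch below records the order in a list; CloseOrderDemo and Res are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

class CloseOrderDemo {
    /** A toy resource that records when it is opened and closed. */
    static class Res implements AutoCloseable {
        private final String name;
        private final List<String> log;

        Res(String name, List<String> log) {
            this.name = name;
            this.log = log;
            log.add("open " + name);
        }

        @Override
        public void close() {
            log.add("close " + name);
        }
    }

    /** Opens a then b, runs the body, and lets try-with-resources close both. */
    static List<String> run() {
        List<String> log = new ArrayList<>();
        try (Res a = new Res("a", log); Res b = new Res("b", log)) {
            log.add("body");
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run()); // [open a, open b, body, close b, close a]
    }
}
```

This matches the manual 1.6-style pattern in spirit: the stream opened last is closed first, and every declared resource gets its close() call.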

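Reading and writing Chinese text with byte streams

A quick word on GBK and UTF-8

GBK covers the commonly used Chinese characters. It is a variable-width encoding: ASCII characters such as English letters take one byte, while Chinese characters take two bytes, and the first byte of a two-byte character has its highest bit set to 1 so that it can be told apart from ASCII.

UTF-8 can encode every Unicode character, covering the scripts of the whole world. It is also variable-width (1 to 4 bytes): English letters take 8 bits (one byte), and common Chinese characters take 24 bits (three bytes).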

	public static void main(String[] args) throws IOException {
		FileInputStream fis = new FileInputStream("read.txt");
//		byte[] arr = new byte[2];
		byte[] arr = new byte[3];
		int len;
		while((len = fis.read(arr)) != -1) {
			System.out.println(new String(arr,0,len));
		}		
		fis.close();
	}

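read.txt contains the text 世界你好.

[Figure: output when reading two bytes at a time]

[Figure: output when reading three bytes at a time]

Reading two bytes at a time produces garbled text, while three bytes at a time works, so the file here must be encoded in UTF-8. 世界你好 is 12 bytes in total, and reading 2 bytes at a time does print exactly 6 lines; they just cannot be decoded, because the bytes have been regrouped across character boundaries.

You might then think: fine, I will always read three bytes at a time. But what about a case like the following?

[Figure: example input and its output]

Text made up purely of Chinese characters is rare, and we often cannot control the source data. In the result, the leading part decodes correctly, but the text after it is garbled again.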
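
The effect can be reproduced without a file. The sketch below decodes UTF-8 bytes chunk by chunk, as the loop above does, and compares that with decoding once after all bytes have been collected. ChunkDecodeDemo and its methods are hypothetical names; the charset is passed explicitly (the original code relies on the platform default), and the mixed input "a" followed by 世界你好 is an assumed example, since the original input is not shown. The string is written with Unicode escapes so the source compiles regardless of file encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class ChunkDecodeDemo {
    /** Decodes every chunk on its own; a character split across two chunks cannot be decoded. */
    static String decodePerChunk(byte[] data, int chunkSize) throws IOException {
        InputStream in = new ByteArrayInputStream(data);
        byte[] arr = new byte[chunkSize];
        StringBuilder sb = new StringBuilder();
        int len;
        while ((len = in.read(arr)) != -1) {
            sb.append(new String(arr, 0, len, StandardCharsets.UTF_8));
        }
        return sb.toString();
    }

    /** Collects all bytes first and decodes once, so chunk boundaries no longer matter. */
    static String decodeOnce(byte[] data, int chunkSize) throws IOException {
        InputStream in = new ByteArrayInputStream(data);
        ByteArrayOutputStream all = new ByteArrayOutputStream();
        byte[] arr = new byte[chunkSize];
        int len;
        while ((len = in.read(arr)) != -1) {
            all.write(arr, 0, len);
        }
        return new String(all.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        String text = "\u4e16\u754c\u4f60\u597d";               // the four characters from read.txt
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);    // 12 bytes, 3 per character
        System.out.println(decodePerChunk(utf8, 3).equals(text)); // true
        System.out.println(decodePerChunk(utf8, 2).equals(text)); // false

        byte[] mixed = ("a" + text).getBytes(StandardCharsets.UTF_8); // assumed mixed input
        System.out.println(decodePerChunk(mixed, 3).equals("a" + text)); // false
        System.out.println(decodeOnce(mixed, 3).equals("a" + text));     // true
    }
}
```

A three-byte chunk only works while every character happens to start on a chunk boundary; a single one-byte character in front shifts all the boundaries.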

That covers reading; now for writing.

	public static void main(String[] args) throws IOException {
		FileInputStream fis = new FileInputStream("read.txt");
		FileOutputStream fos = new FileOutputStream("write.txt");
		byte[] arr = new byte[2];
		int len;
		while((len = fis.read(arr)) != -1) {
			fos.write(arr, 0, len);
		}		
		fis.close();
		fos.close();
	}

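It turns out (no screenshot of the result) that this kind of copy works no matter how large the byte array is, because the bytes are written out unchanged and are only decoded once the whole file has been written, whereas the new String call above decoded each chunk separately and so ran into trouble. When you want to write a string, you must first convert it to a byte array.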

	public static void main(String[] args) throws IOException {
		FileOutputStream fos = new FileOutputStream("write.txt");
		fos.write("我读书少,你不要骗我".getBytes());
		fos.write("\r\n".getBytes());
		fos.close();
	}

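In the result the file shows two lines, because a carriage return and line feed were written at the end.

All in all, handling Chinese text (characters) with byte streams is rather cumbersome, which is why character streams exist: they can write text out directly.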
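
getBytes() without arguments uses the platform default charset, so the bytes written can differ between machines. A hedged sketch of writing text with an explicit charset follows; WriteTextDemo and writeLine are hypothetical names, and UTF-8 is an assumption about the desired file encoding.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

class WriteTextDemo {
    /** Encodes the line as UTF-8 and writes it followed by a carriage return and line feed. */
    static void writeLine(OutputStream out, String line) throws IOException {
        out.write(line.getBytes(StandardCharsets.UTF_8));
        out.write("\r\n".getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        writeLine(sink, "\u4e2d" + "a");    // one Chinese character plus one ASCII letter
        // 3 bytes for the Chinese character, 1 for 'a', 2 for the line ending
        System.out.println(sink.size());    // 6
    }
}
```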

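Summary

The problem with reading Chinese through a byte stream

A byte stream may read only part of a Chinese character, which produces garbled text.

The problem with writing Chinese through a byte stream

A byte stream works directly on bytes, so to write Chinese the string must first be converted to a byte array.

IO exercise

Copy data typed on the keyboard into the file text.txt under the current project; stop when the input is quit.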

	public static void main(String[] args) throws IOException {
		Scanner sc = new Scanner(System.in);
		FileOutputStream fos = null;
		try {
			fos = new FileOutputStream("text.txt");
			System.out.println("Please enter text:");
			while(true) {
				String line = sc.nextLine();
				if("quit".equals(line))
					break;
				fos.write(line.getBytes()); //write the entered line
				fos.write("\r\n".getBytes()); //write a carriage return and line feed
			}
		} finally {
			try{
				if(fos != null) 
					fos.close();
			} finally {
				sc.close();
			}
		}
	}
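
For comparison, here is one possible variant of the exercise that separates the copy loop from the keyboard and the file, which makes it easy to try out with in-memory data. QuitCopyDemo and copyUntilQuit are hypothetical names; the use of hasNextLine() and an explicit UTF-8 charset are choices of this sketch, not part of the exercise above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

class QuitCopyDemo {
    /** Copies lines from sc to out until a line equals "quit" or the input ends; returns the line count. */
    static int copyUntilQuit(Scanner sc, OutputStream out) throws IOException {
        int lines = 0;
        while (sc.hasNextLine()) {           // also stops cleanly if the input ends without "quit"
            String line = sc.nextLine();
            if ("quit".equals(line)) {
                break;
            }
            out.write(line.getBytes(StandardCharsets.UTF_8));
            out.write("\r\n".getBytes(StandardCharsets.UTF_8));
            lines++;
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // A string-backed Scanner stands in for the keyboard; swap in System.in and a FileOutputStream for real use.
        try (Scanner sc = new Scanner("hello\nworld\nquit\nignored\n");
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            int n = copyUntilQuit(sc, out);
            System.out.println(n + " lines, " + out.size() + " bytes"); // 2 lines, 14 bytes
        }
    }
}
```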
