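Yesterday we covered byte streams. Let's start by using a byte stream to do the following:
1. Write digits or English letters to a text file, then read them back into memory and print them to the console.
2. Write Chinese characters to a text file, read them back into memory, and print them to the console.
Does the output display correctly both times?
Text that contains Chinese characters does not display correctly, because Chinese characters differ from English ones:
the numeric value of one Chinese character is usually encoded as several bytes, but a byte stream hands those bytes back one at a time, so a single character gets split up and each byte is displayed on its own.
The root cause: the units of data do not match.
So in some scenarios it is inconvenient to process character data with a byte stream.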
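The splitting problem described above can be reproduced without any file at all. The following is a minimal sketch, assuming UTF-8 as the encoding; the class name ByteByByteDemo and the helper decodeByteByByte are made up for illustration. It decodes every byte separately, which is effectively what happens when a byte stream is printed one byte at a time.

```java
import java.nio.charset.StandardCharsets;

class ByteByByteDemo {
    // Decodes every byte on its own, which is effectively what printing
    // the result of a byte-by-byte read does.
    static String decodeByteByByte(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(new String(new byte[] {b}, StandardCharsets.UTF_8));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] english = "abc".getBytes(StandardCharsets.UTF_8);
        // The two-character Chinese greeting (U+4F60 U+597D), written with
        // Unicode escapes so the example does not depend on the source file encoding.
        String greeting = "\u4f60\u597d";
        byte[] chinese = greeting.getBytes(StandardCharsets.UTF_8);

        System.out.println(english.length + " bytes for 3 English letters");
        System.out.println(chinese.length + " bytes for 2 Chinese characters");
        System.out.println("English survives: " + decodeByteByByte(english).equals("abc"));
        System.out.println("Chinese survives: " + decodeByteByByte(chinese).equals(greeting));
    }
}
```

The English text survives because each letter is exactly one byte in UTF-8; the Chinese text does not, because no single byte of a multi-byte sequence is a valid character by itself.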
····················································································································································································································
About encoding tables (character sets):
··························································································································································································································
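Always remember: encoding and decoding apply only to characters and strings. byte, short, int, and boolean values are never encoded or decoded.
Review:
Encoding: character data -> numeric values (binary bytes)
Decoding: the bytes that represent character data -> character data
Encoding and decoding in Java:
- Encoding: str.getBytes("charsetName") -> byte[]
- Decoding: new String(byte[] bytes, int offset, int len, String charsetName)
- Before we learned about encoding tables (character sets) we used:
- Encoding: str.getBytes() -> byte[]
- Decoding: new String(byte[] bytes, int offset, int len)
Default charset: from the developer's point of view, this is the charset the JDK uses when we encode or decode without naming one.
- a. In IDEA, UTF-8 is the default charset we work with.
- b. Outside the IDE, the JDK takes its default from the operating system; on Chinese-locale Windows that was GBK. (Since JDK 18 the default is UTF-8 on every platform.)
Rules of thumb:
- GBK: one Chinese character is encoded in 2 bytes
- UTF-8: one common Chinese character is encoded in 3 bytes
The default charset is defined as a JVM system property (file.encoding) and can be obtained with:
Charset.defaultCharset();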
private static void gbkCode() throws UnsupportedEncodingException {
    String s = "你好";
    // Encode with GBK
    byte[] gbks = s.getBytes("gbk");
    System.out.println(Arrays.toString(gbks));
    // Decode with GBK
    String gbkStr = new String(gbks, 0, gbks.length, "gbk");
    System.out.println(gbkStr);
}
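To make the byte counts and the effect of a charset mismatch concrete, here is a small sketch. It assumes the running JDK ships the GBK charset (standard OpenJDK builds do, but a trimmed runtime might not); CharsetSizeDemo and transcode are hypothetical names used only for this illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetSizeDemo {
    // Assumption: GBK is available in this JDK; forName throws otherwise.
    static final Charset GBK = Charset.forName("GBK");

    // Encodes with one charset and decodes the resulting bytes with another.
    static String transcode(String s, Charset encodeWith, Charset decodeWith) {
        byte[] bytes = s.getBytes(encodeWith);
        return new String(bytes, 0, bytes.length, decodeWith);
    }

    public static void main(String[] args) {
        // Two Chinese characters (U+4F60 U+597D), written as Unicode escapes.
        String s = "\u4f60\u597d";
        System.out.println("GBK bytes:   " + s.getBytes(GBK).length);
        System.out.println("UTF-8 bytes: " + s.getBytes(StandardCharsets.UTF_8).length);
        System.out.println("GBK -> GBK round trip intact:   "
                + transcode(s, GBK, GBK).equals(s));
        System.out.println("UTF-8 bytes decoded as GBK intact: "
                + transcode(s, StandardCharsets.UTF_8, GBK).equals(s));
    }
}
```

Passing a Charset object instead of a charset name also avoids the checked UnsupportedEncodingException that the String-name overloads declare.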
··························································································································································································································
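In a byte stream, the logical unit of data is a single byte.
In a character stream, the logical unit of data is the group of bytes that encodes a single character.
The character streams covered here are all wrapper streams: each wraps a byte stream, and the actual data transfer is still done by that byte stream. The character stream itself adds only one thing: encoding and decoding.
From what we know about encoding tables we can conclude:
Storing and transmitting characters always comes down to binary data.
A character stream has to add character encoding and decoding, based on a specific encoding table, on top of that binary data.
Hence:
Character stream = byte stream + encoding table (encoding and decoding according to the specified table)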
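The equation above can be checked directly. In this sketch (class and method names are illustrative, not from the notes) a string is written through an OutputStreamWriter into an in-memory byte stream, and the bytes that arrive are compared with what getBytes produces for the same charset.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class WriterIsEncoderDemo {
    // Writes s through a character stream and returns the bytes that
    // reached the wrapped byte stream.
    static byte[] writeThroughCharStream(String s, Charset cs) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(bytes, cs)) {
            writer.write(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        // Two Chinese characters followed by ASCII text.
        String s = "\u4f60\u597d abc";
        byte[] viaStream = writeThroughCharStream(s, StandardCharsets.UTF_8);
        byte[] viaGetBytes = s.getBytes(StandardCharsets.UTF_8);
        System.out.println("same bytes: " + Arrays.equals(viaStream, viaGetBytes));
    }
}
```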
··························································································································································································································
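- Byte streams: input stream InputStream, output stream OutputStream.
- Character streams: input stream Reader, output stream Writer.
- All four are abstract parent classes.
Next, we use a Writer to write a Chinese string to a text file.
Writer is abstract and cannot be instantiated directly,
so we use its subclass OutputStreamWriter.
OutputStreamWriter:
The bridge from character streams to byte streams (encoding): it encodes the characters written to it into bytes using a specified charset.
The charset can be specified by name or passed explicitly; otherwise the platform default charset is used.
OutputStreamWriter constructors:
public OutputStreamWriter(OutputStream out)
public OutputStreamWriter(OutputStream out, String charsetName)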
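Write methods of character streams:
- void write(int c)
  Writes a single character. The character is taken from the low 16 bits of the given int; the high 16 bits are ignored.
- void write(char[] cbuf)
  Writes a character array.
- void write(char[] cbuf, int off, int len)
  Writes part of a character array.
- void write(String str)
  Writes a string.
- void write(String str, int off, int len)
  Writes part of a string.
- void flush()
  Flushes the stream's buffer.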
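For character output streams, note the following:
A character output stream has a small built-in buffer. (The buffer exists to support encoding; it is not there to improve transfer efficiency the way a buffered stream such as BufferedOutputStream is.) Data passed to write may therefore still be sitting in that buffer and not yet be written to the underlying stream. How do we deal with this?
1. flush(): flushes the stream's buffer.
2. close(): closes the stream, flushing it first.
What flush() does: flushes the buffer, writing the buffered character data to the underlying stream.
What close() does: closes the stream and releases system resources (before closing, it calls flush once to empty the buffer).
// A conversion stream is a wrapper stream: closing it also closes the underlying stream,
// so the underlying stream needs no separate close() or try-catch of its own
// Compare BufferedInputStream, which is also a wrapper stream
writer.close();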
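private static void myCode() throws IOException {
    Writer writer = new OutputStreamWriter(new FileOutputStream("F:\\test\\test07.txt"));
    // Write the first 10 characters of the string: "I should t"
    writer.write("I should try my best", 0, 10);
    // Write a single character
    writer.write('你');
    // Push buffered characters to the underlying stream (close() would also do this)
    writer.flush();
    writer.close();
}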
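The effect of flush() can be observed by wrapping an in-memory byte stream and checking its size before and after the call. This is only a sketch with made-up names (FlushDemo, sizesAroundFlush); how much data the writer holds back before a flush is an implementation detail of the JDK, so the "before" value is printed rather than promised.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

class FlushDemo {
    // Returns {bytes visible in the underlying stream before flush(),
    //          bytes visible after flush()}.
    static int[] sizesAroundFlush(String text) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try {
            Writer writer = new OutputStreamWriter(sink, StandardCharsets.UTF_8);
            writer.write(text);
            int before = sink.size(); // data may still sit in the writer's buffer
            writer.flush();
            int after = sink.size();  // everything written so far has been pushed down
            writer.close();
            return new int[] {before, after};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] sizes = sizesAroundFlush("hello");
        System.out.println("before flush: " + sizes[0] + " bytes");
        System.out.println("after flush:  " + sizes[1] + " bytes");
    }
}
```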
··························································································································································································································
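Next, we use a Reader to read the Chinese text we just wrote back into memory and display it. Reader is abstract, so we use its subclass InputStreamReader.
InputStreamReader is the bridge from byte streams to character streams (decoding): it reads bytes and decodes them into characters using a specified charset.
- The charset can be specified by name or passed explicitly; otherwise the platform default charset is used.
- InputStreamReader constructors:
public InputStreamReader(InputStream in)
public InputStreamReader(InputStream in, String charsetName)
read methods:
- public int read()
  Reads a single character.
  Returns the character as an int in the range 0 to 65535 (0x0000-0xffff, two bytes), or -1 if the end of the stream has been reached.
- public int read(char[] cbuf)
  1. Reads characters from the input stream into the character array.
  2. Returns the number of characters read, or -1 if the end of the stream has been reached.
- public int read(char[] cbuf, int off, int len)
  1. Reads characters from the input stream into the array, starting at index off and filling at most len characters.
  2. Returns the number of characters read, or -1 if the end of the stream has been reached.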
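private static void myCode() throws IOException {
    Reader reader = new InputStreamReader(new FileInputStream("F://test//test06.txt"));
    // 1. Read a single character; read() returns -1 at the end of the stream
    int ch = reader.read();
    if (ch != -1) {
        System.out.println((char) ch);
    }
    // 2. Fill a char array; len is the number of characters actually read
    char[] cs = new char[1024];
    int len = reader.read(cs);
    if (len != -1) {
        System.out.println(new String(cs, 0, len));
    }
    // 3. Read into the array starting at index 5, at most cs.length - 5 characters
    int length = reader.read(cs, 5, cs.length - 5);
    if (length != -1) {
        System.out.println(new String(cs, 5, length));
    }
    reader.close();
}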
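In practice read(char[]) is usually called in a loop until it returns -1. The sketch below shows that loop against an in-memory byte stream, so it runs without any file on disk; ReadLoopDemo and readAll are hypothetical names, and the buffer is deliberately tiny so the loop body executes several times.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ReadLoopDemo {
    // Decodes everything in the byte stream with the given charset.
    static String readAll(InputStream in, Charset cs) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, cs)) {
            char[] buf = new char[4]; // tiny on purpose
            int len;
            // len is the number of chars placed in buf, or -1 at end of stream
            while ((len = reader.read(buf)) != -1) {
                sb.append(buf, 0, len);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Two Chinese characters plus ASCII text, encoded as UTF-8 bytes.
        String original = "\u4f60\u597d, world";
        byte[] bytes = original.getBytes(StandardCharsets.UTF_8);
        String decoded = readAll(new ByteArrayInputStream(bytes), StandardCharsets.UTF_8);
        System.out.println("round trip intact: " + decoded.equals(original));
    }
}
```

Only the first len characters of the buffer are valid after each call, which is why append(buf, 0, len) is used instead of appending the whole array.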
··························································································································································································································
Exercise: copying with character streams
Copy the contents of a.txt in the current project directory to d:\b.txt
private static void myCopy(String srcpath, String destpath) throws IOException {
    // try-with-resources closes both streams even if an exception is thrown
    try (Reader reader = new InputStreamReader(new FileInputStream(srcpath));
         Writer writer = new OutputStreamWriter(new FileOutputStream(destpath))) {
        char[] cs = new char[1024];
        int len;
        // read() returns the number of characters read, or -1 at the end of the stream
        while ((len = reader.read(cs)) != -1) {
            writer.write(cs, 0, len);
        }
    }
}
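Summary: character streams are meant for text data only. Everything other than text should be handled with byte streams.
Is it OK to copy images and videos with a character stream? No.
- a. Image and video data have their own encoding formats, which have nothing to do with the charset-based encoding and decoding of characters discussed above.
- b. When a character input stream reads image or video data and tries to decode it with a charset, it finds byte values that do not map to any character in that charset (it meets "characters" it does not recognize).
- c. At that point the character input stream either discards the unrecognized values or, more typically, replaces them with a special replacement character (usually displayed as ?).
In other words, the character input stream has already altered the original image or video data by the time it has read it.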
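The corruption can be demonstrated with just four bytes. The following sketch copies a byte array through a Reader/Writer pair and compares the result with the input; the names are illustrative, UTF-8 is assumed as the charset, and the sample bytes FF D8 FF E0 are the way many JPEG files begin.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class BinaryCorruptionDemo {
    // Copies src by decoding it to chars and encoding the chars again.
    static byte[] copyThroughCharStream(byte[] src, Charset cs) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(src), cs);
             Writer writer = new OutputStreamWriter(out, cs)) {
            char[] buf = new char[1024];
            int len;
            while ((len = reader.read(buf)) != -1) {
                writer.write(buf, 0, len);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] text = "plain text".getBytes(StandardCharsets.UTF_8);
        byte[] binary = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF, (byte) 0xE0};

        System.out.println("text unchanged:   "
                + Arrays.equals(text, copyThroughCharStream(text, StandardCharsets.UTF_8)));
        System.out.println("binary unchanged: "
                + Arrays.equals(binary, copyThroughCharStream(binary, StandardCharsets.UTF_8)));
    }
}
```

None of the four sample bytes forms a valid UTF-8 sequence, so the decoder substitutes replacement characters, and the bytes written out no longer match the bytes read in.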
··························································································································································································································
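Using a conversion stream usually takes two steps:
1. Create the underlying byte stream.
2. Wrap the byte stream, specifying the charset, to create the conversion stream.
This is slightly tedious, and most of the time we rely on the local default charset anyway, so to simplify the code the conversion streams come with convenience subclasses.
This does not mean a character stream no longer needs a byte stream; the shorter form only saves typing, and underneath the implementation is the same as before.
FileWriter:
A convenience class for writing character files. Its constructors assume that the default character encoding and the default byte-buffer size are acceptable. With the constructors shown here we cannot choose the encoding or the buffer size. (Since Java 11 there are additional constructors that accept a Charset.)
FileWriter constructors:
public FileWriter(String fileName) Constructs a FileWriter for the given file name.
public FileWriter(String fileName, boolean append) // supports appending to a character file
Constructs a FileWriter for the given file name, with a boolean indicating whether to append the data written.
append - a boolean; if true, data is written to the end of the file rather than the beginning.
FileReader:
A convenience class for reading character files. Its constructors assume that the default character encoding and the default byte-buffer size are appropriate.
FileReader constructors:
public FileReader(String fileName)
public FileReader(File file)
Note:
Convenience classes vs. conversion streams
1. A convenience class is easy to create, but with the constructors above it cannot specify the charset used for encoding and decoding.
2. A conversion stream takes a little more code to create, but it lets us specify the charset used for encoding and decoding.
Copying a text file with FileWriter and FileReader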
public class simplify {
    public static void main(String[] args) throws IOException {
        // Create the stream objects
        Reader reader = new FileReader("a.txt");
        Writer writer = new FileWriter("c.txt");
        // Copy the text file
        int len;
        char[] charBuffer = new char[1024];
        while ((len = reader.read(charBuffer)) != -1) {
            writer.write(charBuffer, 0, len);
        }
        reader.close();
        writer.close();
    }
}
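The append flag of FileWriter can be tried out safely on a temporary file. This is a sketch under a few assumptions: the class and helper names are invented for the example, only ASCII text is written so the platform default charset does not matter, and the temporary file is created with Files.createTempFile.

```java
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.file.Files;
import java.nio.file.Path;

class FileWriterAppendDemo {
    // Creates an empty temporary file to experiment on.
    static Path newTempFile() {
        try {
            return Files.createTempFile("filewriter-demo", ".txt");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes first (overwriting), appends second, then reads the file back.
    static String writeThenAppend(Path file, String first, String second) {
        try {
            try (Writer writer = new FileWriter(file.toString())) {
                writer.write(first);            // starts at the beginning of the file
            }
            try (Writer writer = new FileWriter(file.toString(), true)) {
                writer.write(second);           // append = true: goes to the end
            }
            StringBuilder sb = new StringBuilder();
            try (Reader reader = new FileReader(file.toString())) {
                char[] buf = new char[1024];
                int len;
                while ((len = reader.read(buf)) != -1) {
                    sb.append(buf, 0, len);
                }
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = newTempFile();
        System.out.println(writeThenAppend(tmp, "abc", "def"));
        Files.deleteIfExists(tmp);
    }
}
```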
··························································································································································································································
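For the same efficiency reasons that led to buffered byte streams, character streams have buffered versions too.
1. They make transferring character data more efficient.
2. They define methods that Reader and Writer do not have.
BufferedWriter constructor:
BufferedWriter(Writer out)
Creates a buffered character output stream with a default-sized output buffer.
BufferedReader constructor:
BufferedReader(Reader in)
Creates a buffered character input stream with a default-sized input buffer.
BufferedWriter
void newLine()
Writes a line separator.
BufferedReader
String readLine()
Reads a line of text. The returned string holds the contents of the line without any line-termination characters such as \r or \n.
Returns null if the end of the stream has been reached.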
public static void main(String[] args) throws IOException {
    // Copy a text file line by line
    // The variables are declared with the subclass types rather than Reader/Writer,
    // because readLine() and newLine() exist only on the buffered subclasses
    BufferedWriter bw = new BufferedWriter(
            new OutputStreamWriter(new FileOutputStream("line.txt")));
    BufferedReader br = new BufferedReader(
            new InputStreamReader(new FileInputStream("a.txt")));
    String lineStr;
    // readLine() returns null at the end of the stream
    while ((lineStr = br.readLine()) != null) {
        bw.write(lineStr);
        // readLine() strips the line terminator, so add one back with newLine()
        bw.newLine();
    }
    bw.close();
    br.close();
}
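The behaviour of readLine() and newLine() can be checked in memory with StringReader and StringWriter, which avoids touching a.txt or line.txt. The class and method names below are made up for the sketch.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class ReadLineDemo {
    // Collects every line; readLine() strips the terminator and
    // returns null once the end of the stream is reached.
    static List<String> readLines(Reader source) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(source)) {
            String line;
            while ((line = br.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    // Writes each line followed by the platform line separator.
    static String joinWithNewLine(List<String> lines) {
        StringWriter sink = new StringWriter();
        try (BufferedWriter bw = new BufferedWriter(sink)) {
            for (String line : lines) {
                bw.write(line);
                bw.newLine();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toString();
    }

    public static void main(String[] args) {
        // Mixed Windows (\r\n) and Unix (\n) terminators, no terminator on the last line.
        List<String> lines = readLines(new StringReader("first\r\nsecond\nthird"));
        System.out.println(lines);
        System.out.print(joinWithNewLine(lines));
    }
}
```

One consequence worth noting: a line-by-line copy normalizes every terminator to the platform separator and adds one after the last line even if the source had none.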
··························································································································································································································
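On our system (Chinese-locale Windows), txt files are saved as GBK by default. If we read such a file and decode it as UTF-8, the result is garbled text.
Standard input and output streams
- Standard input stream
  The in field of the System class.
  Represents the system's standard input; the default input device is the keyboard.
  System.in has type InputStream, a byte stream.
- Standard output stream
  The out field of the System class.
  Represents the system's standard output; the default output device is the display.
  System.out has type PrintStream, a subclass of OutputStream, so it is a byte stream.
Exercise: use System.in to reproduce what Scanner's nextLine() does. Key idea: use BufferedReader's readLine() method.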
public static void main(String[] args) throws IOException {
    // BufferedReader needs a character stream, but System.in is a byte stream;
    // the conversion stream InputStreamReader turns the byte stream into a character stream
    BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
    String s;
    // Note: read on the standard input stream is a blocking method.
    // 1. readLine() on the BufferedReader ultimately gets keyboard input through System.in's read method
    // 2. When no keyboard input is available, read does not return;
    //    it blocks partway through and waits for us to type something
    // How do we end the input loop? Define our own end-of-input rule, for example typing "886"
    while ((s = br.readLine()) != null) {
        System.out.println("Received: " + s);
        if ("886".equals(s)) {
            break;
        }
    }
    System.out.println("finished input");
    br.close();
}
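Because System.in is just an InputStream, the same loop can be written against any InputStream, which makes the end-of-input rule easy to try out without typing. In this sketch the class name, the method readUntil, and the choice to leave the stop word out of the returned list are all assumptions made for illustration.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class StdinLoopDemo {
    // Reads lines until the stop word or the end of the stream, whichever comes first.
    static List<String> readUntil(InputStream in, String stopWord) {
        List<String> received = new ArrayList<>();
        BufferedReader br = new BufferedReader(new InputStreamReader(in));
        try {
            String line;
            while ((line = br.readLine()) != null) {
                if (stopWord.equals(line)) {
                    break;
                }
                received.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return received;
    }

    public static void main(String[] args) {
        // Simulated keyboard input; pass System.in instead to read real input.
        byte[] typed = "hello\nworld\n886\nignored\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(readUntil(new ByteArrayInputStream(typed), "886"));
    }
}
```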
··························································································································································································································
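Homework:
- Read a text file from disk (for example a Java source file) and count the English letters, spaces, and digit characters in it. (Note: "digits" here means digit characters!)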
package com.cskaoyan;
import java.io.*;
/**
 * @author shihao
 * @create 2020-04-29 10:43
 * <p>
 * 1. Read a text file from disk (for example a Java source file) and count
 * the English letters, spaces, and digit characters in it.
 * (Note: "digits" here means digit characters!)
 */
public class Homework01 {
    public static void main(String[] args) throws IOException {
        File file = new File("F:\\test\\test02.txt");
        func(file);
    }
    private static void func(File file) throws IOException {
        Reader reader = new FileReader(file);
        int len;
        char[] cs = new char[1024];
        int countLetter = 0;
        int countDigit = 0;
        int countSpace = 0;
        while ((len = reader.read(cs)) != -1) {
            for (int i = 0; i < len; i++) {
                // The comparisons are inclusive so that 'a', 'z', 'A', 'Z', '0' and '9' are counted
                if ((cs[i] >= 'a' && cs[i] <= 'z') || (cs[i] >= 'A' && cs[i] <= 'Z')) {
                    countLetter++;
                } else if (cs[i] >= '0' && cs[i] <= '9') {
                    countDigit++;
                } else if (cs[i] == ' ') {
                    countSpace++;
                }
            }
        }
        reader.close();
        System.out.println("Number of letters: " + countLetter);
        System.out.println("Number of digits: " + countDigit);
        System.out.println("Number of spaces: " + countSpace);
    }
}
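The counting logic is easier to verify when it takes a Reader instead of a File, because a StringReader can then stand in for the file. The sketch below uses invented names (CharCountDemo, count) and checks the boundary characters a, z, A, Z, 0 and 9 explicitly.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class CharCountDemo {
    // Returns {letters, digits, spaces} for ASCII letters, ASCII digits and ' '.
    static int[] count(Reader source) {
        int letters = 0;
        int digits = 0;
        int spaces = 0;
        try (Reader reader = source) {
            char[] buf = new char[1024];
            int len;
            while ((len = reader.read(buf)) != -1) {
                for (int i = 0; i < len; i++) {
                    char c = buf[i];
                    if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')) {
                        letters++;
                    } else if (c >= '0' && c <= '9') {
                        digits++;
                    } else if (c == ' ') {
                        spaces++;
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new int[] {letters, digits, spaces};
    }

    public static void main(String[] args) {
        int[] result = count(new StringReader("az AZ 09"));
        System.out.println("letters=" + result[0] + " digits=" + result[1] + " spaces=" + result[2]);
    }
}
```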
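- A file on disk holds the 26 lowercase English letters in random order. Read them into memory, sort them, and append the sorted letters to the same file.
About this exercise: on our system txt files are saved as GBK by default, so a txt file created directly in the operating system is GBK-encoded. If we read such a file and IDEA decodes it as UTF-8, the result is garbled text.
So how do we create a UTF-8 encoded file?
Open test09.txt, then choose File -> Save As -> Encoding -> UTF-8.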
package com.cskaoyan;
import java.io.*;
/**
 * @author shihao
 * @create 2020-04-29 10:44
 * <p>
 * 2. A file on disk holds the 26 lowercase English letters in random order.
 * Read them into memory, sort them, and append the sorted letters to the same file.
 */
public class Homework02 {
    public static void main(String[] args) throws IOException {
        File file = new File("F:\\test\\test09.txt");
        func(file);
    }
    private static void func(File file) throws IOException {
        BufferedReader br = new BufferedReader(new InputStreamReader
                (new FileInputStream(file)));
        // The second argument (true) opens the file in append mode
        BufferedWriter bw = new BufferedWriter(new OutputStreamWriter
                (new FileOutputStream(file, true)));
        int len;
        char[] cs = new char[1024];
        // This relies on the whole file fitting into a single read and on the
        // BufferedWriter holding its output until close()
        while ((len = br.read(cs)) != -1) {
            System.out.println(len);
            bw.newLine();
            // Bubble sort the characters that were read
            char[] ret;
            ret = bubbleSort(cs, len);
            bw.write(ret, 0, len);
        }
        br.close();
        bw.close();
    }
    // Sorts the first len characters of cs in place and returns the same array
    private static char[] bubbleSort(char[] cs, int len) {
        char temp;
        boolean flag = false;
        for (int i = len - 1; i >= 1; i--) {
            flag = false;
            for (int j = 1; j <= i; j++) {
                if (cs[j] < cs[j - 1]) {
                    temp = cs[j - 1];
                    cs[j - 1] = cs[j];
                    cs[j] = temp;
                    flag = true;
                }
            }
            // No swap in this pass means the array is already sorted
            if (flag == false)
                break;
        }
        return cs;
    }
}
Teacher's solution:
package com.cskaoyan;
import java.io.*;
import java.util.Arrays;
/**
 * @author shihao
 * @create 2020-04-29 12:19
 */
public class test {
    public static void main(String[] args) {
        Reader reader = null;
        Writer writer = null;
        try {
            reader = new FileReader("F:\\test\\test09.txt");
            // The exercise states that the file holds exactly 26 lowercase letters
            char[] buffer = new char[26];
            reader.read(buffer);
            // Sort the letters in buffer with Arrays.sort
            // Note: alphabetical order matches the order of the character codes ('a' smallest, 'z' largest)
            Arrays.sort(buffer);
            System.out.println(Arrays.toString(buffer));
            // Open the file in append mode
            writer = new FileWriter("F:\\test\\test09.txt", true);
            // Write a newline first
            writer.write('\n');
            // Then write the sorted array in one call
            writer.write(buffer);
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            // Close the input stream (it is still null if opening the file failed)
            try {
                if (reader != null) {
                    reader.close();
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
            // Close the output stream (it is still null if an earlier step failed)
            try {
                if (writer != null) {
                    writer.close();
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
}
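For comparison, the same exercise can be written with try-with-resources and an explicit charset, so the result does not depend on the platform default discussed earlier. This is one possible variant, not the course's reference answer: the class name, the helper tempFileWith, the use of a temporary file instead of F:\test\test09.txt, and the choice of UTF-8 are all assumptions.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

class SortAppendDemo {
    // Creates a temporary UTF-8 file with the given content.
    static Path tempFileWith(String content) {
        try {
            Path p = Files.createTempFile("letters", ".txt");
            Files.write(p, content.getBytes(StandardCharsets.UTF_8));
            return p;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads up to 26 chars, sorts them, appends "\n" + sorted chars,
    // and returns the resulting file content.
    static String sortAndAppend(Path file) {
        char[] buf = new char[26];
        int total = 0;
        try {
            try (Reader reader = new InputStreamReader(
                    new FileInputStream(file.toFile()), StandardCharsets.UTF_8)) {
                int n;
                // read may return fewer chars than requested, so keep reading
                while (total < buf.length
                        && (n = reader.read(buf, total, buf.length - total)) != -1) {
                    total += n;
                }
            } // the reader is closed here, before the file is reopened for appending
            Arrays.sort(buf, 0, total);
            try (Writer writer = new OutputStreamWriter(
                    new FileOutputStream(file.toFile(), true), StandardCharsets.UTF_8)) {
                writer.write('\n');
                writer.write(buf, 0, total);
            }
            return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path p = tempFileWith("qwertyuiopasdfghjklzxcvbnm");
        System.out.println(sortAndAppend(p));
        Files.deleteIfExists(p);
    }
}
```

try-with-resources removes the need for the null checks in the finally block, since a resource is only closed if it was successfully opened.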