Most common
FileInputStream
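// filePath is a String holding the path to the file
// The encoding is given by name as the second argument to InputStreamReader
try (BufferedReader br = new BufferedReader(
        new InputStreamReader(new FileInputStream(filePath), "UTF-8"))) {
    String line;
    while ((line = br.readLine()) != null) {
        // System.out.println(line);
    }
} catch (IOException e) {
    e.printStackTrace();
}
// try-with-resources closes br, which also closes the underlying FileInputStream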
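The next two approaches take a Charset object, rather than a charset name string, to specify the encoding. The Charset documentation is easy to find by searching for it.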
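As a quick illustration of what "a Charset object" means here, the sketch below obtains one in two ways: by name with Charset.forName, and through the StandardCharsets.UTF_8 constant. The class name CharsetLookupDemo and the helper byName are made up for this example; only the JDK calls themselves come from the standard library.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetLookupDemo {
    // Hypothetical helper: looks a charset up by name.
    // Charset.forName throws UnsupportedCharsetException if the JVM does not know the name.
    static Charset byName(String name) {
        return Charset.forName(name);
    }

    public static void main(String[] args) {
        Charset looked = byName("UTF-8");
        Charset constant = StandardCharsets.UTF_8;

        // Both refer to the same charset, so either can be passed to Files.lines
        // or Files.newBufferedReader.
        System.out.println(looked.equals(constant));      // prints true
        System.out.println(Charset.isSupported("UTF-8")); // prints true
    }
}
```

The constant avoids a lookup by name and cannot be misspelled, so it is probably the more convenient choice when the encoding is fixed at compile time.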
Stream
Charset c = Charset.forName("UTF-8");
// System.out.println(Charset.isSupported("UTF-8"));
try (Stream<String> stream = Files.lines(Paths.get(filePath), c)) {
    // Collect straight into a list; there is no need to create an empty ArrayList first
    List<String> raws = stream.collect(Collectors.toList());
    for (String s : raws) {
        // System.out.println(s);
    }
} catch (IOException e) {
    e.printStackTrace();
}
In fact, with a stream you can also do some processing along the way:
// 1. filter out lines that start with "line3"
// 2. convert all content to upper case
// 3. collect the result into a List
List<String> list = stream
        .filter(line -> !line.startsWith("line3"))
        .map(String::toUpperCase)
        .collect(Collectors.toList());
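To make the pipeline above easy to try out, here is a minimal self-contained sketch. It writes four sample lines to a temporary file and runs the same filter / map / collect chain on it. The class name StreamFilterDemo, the method readFiltered, and the sample contents are assumptions for illustration, not part of the original project.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class StreamFilterDemo {
    // Reads the file as UTF-8, drops lines starting with "line3",
    // and upper-cases everything that remains.
    static List<String> readFiltered(Path path) throws IOException {
        try (Stream<String> stream = Files.lines(path, StandardCharsets.UTF_8)) {
            return stream
                    .filter(line -> !line.startsWith("line3"))
                    .map(String::toUpperCase)
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("stream-demo", ".txt");
        try {
            // Sample data, made up for the demo
            Files.write(tmp, Arrays.asList("line1", "line2", "line3", "line4"),
                    StandardCharsets.UTF_8);
            System.out.println(readFiltered(tmp)); // prints [LINE1, LINE2, LINE4]
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Because Files.lines reads lazily, the filter and map steps run as the lines are pulled from the file, and the try-with-resources block makes sure the file handle is released afterwards.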
Traditional approaches
BufferedReader
Charset c = Charset.forName("UTF-8");
// Files.newBufferedReader takes a Charset object here, not a charset name string
try (BufferedReader br = Files.newBufferedReader(Paths.get(filePath), c)) {
    List<String> list = br.lines().collect(Collectors.toList());
    for (String s : list) {
        // System.out.println(s);
    }
} catch (IOException e) {
    e.printStackTrace();
}
Scanner
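Scanner is slower than BufferedReader, because Scanner parses the input with regular expressions while BufferedReader simply reads a sequence of characters. In addition, BufferedReader's default buffer holds 8192 characters, while Scanner's buffer holds 1024.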
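try (Scanner scanner = new Scanner(new File(filePath), "UTF-8")) {
    // Pair hasNextLine() with nextLine(); the line must actually be consumed,
    // otherwise the loop never ends
    while (scanner.hasNextLine()) {
        String line = scanner.nextLine();
        // System.out.println(line);
    }
} catch (IOException e) {
    e.printStackTrace();
}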
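In tests on my current project, reading roughly five thousand lines with line-by-line processing, the approaches got slower in the order they appear in this post. So I still mainly use the first two.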
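For anyone who wants to repeat that kind of comparison, here is a rough timing sketch. It is not the test from my project: the class ReadTimingSketch, the LineReader interface, the method names, and the generated 5000-line file are all assumptions made up for illustration. Each approach is timed once, so JIT warm-up and disk caching can easily distort the numbers; treat the output as a hint, not a benchmark.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ReadTimingSketch {
    // Hypothetical functional interface: reads every line of a file.
    interface LineReader {
        List<String> read(Path path) throws IOException;
    }

    static List<String> viaFileInputStream(Path path) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(new FileInputStream(path.toFile()), "UTF-8"))) {
            String line;
            while ((line = br.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    static List<String> viaStream(Path path) throws IOException {
        try (Stream<String> stream = Files.lines(path, StandardCharsets.UTF_8)) {
            return stream.collect(Collectors.toList());
        }
    }

    static List<String> viaNewBufferedReader(Path path) throws IOException {
        try (BufferedReader br = Files.newBufferedReader(path, StandardCharsets.UTF_8)) {
            return br.lines().collect(Collectors.toList());
        }
    }

    static List<String> viaScanner(Path path) throws IOException {
        List<String> lines = new ArrayList<>();
        try (Scanner scanner = new Scanner(path.toFile(), "UTF-8")) {
            while (scanner.hasNextLine()) {
                lines.add(scanner.nextLine());
            }
        }
        return lines;
    }

    // Single-shot wall-clock timing in milliseconds.
    static long timeMillis(LineReader reader, Path path) throws IOException {
        long start = System.nanoTime();
        reader.read(path);
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("read-timing", ".txt");
        try {
            List<String> data = new ArrayList<>();
            for (int i = 0; i < 5000; i++) {
                data.add("line " + i);
            }
            Files.write(tmp, data, StandardCharsets.UTF_8);

            System.out.println("FileInputStream:   " + timeMillis(ReadTimingSketch::viaFileInputStream, tmp) + " ms");
            System.out.println("Files.lines:       " + timeMillis(ReadTimingSketch::viaStream, tmp) + " ms");
            System.out.println("newBufferedReader: " + timeMillis(ReadTimingSketch::viaNewBufferedReader, tmp) + " ms");
            System.out.println("Scanner:           " + timeMillis(ReadTimingSketch::viaScanner, tmp) + " ms");
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

For numbers you can rely on, it would be better to run each reader many times, discard the first rounds, and average the rest, or to use a proper benchmarking harness.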