Reading Large Files in Java

Article link: https://blog.csdn.net/wawawawawawaa/article/details/120950627

Background: a traditional application typically reads a file like this:

Files.readLines(new File(path), Charsets.UTF_8);      // Guava
FileUtils.readLines(new File(path), Charsets.UTF_8);  // Apache Commons IO

Under the hood, these calls read the file with a BufferedReader or its subclass LineNumberReader.

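What is the problem with reading a file this way? Both calls load the entire file into memory before any processing happens, so a sufficiently large file can make the program fail with an OutOfMemoryError (OOM).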

Solutions:

1. Use a file stream and read the file one line at a time

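import java.io.FileInputStream;
import java.io.IOException;
import java.util.Scanner;

// try-with-resources closes the Scanner and the stream, even on failure
try (FileInputStream inputStream = new FileInputStream(path);
     Scanner sc = new Scanner(inputStream, "UTF-8")) {
    while (sc.hasNextLine()) {
        String line = sc.nextLine();
        // System.out.println(line);
    }
    // Scanner swallows IOExceptions internally; rethrow the last one, if any
    if (sc.ioException() != null) {
        throw sc.ioException();
    }
} catch (IOException e) {
    logger.error("Failed to read " + path, e);
}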

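This approach iterates over every line in the file and lets you process each one without keeping a reference to it. In short, the lines are never all held in memory at the same time.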
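The same line-by-line idea can also be sketched with a plain BufferedReader, the class mentioned earlier, instead of a Scanner. This is only a minimal sketch: the class name LineByLineReader and the method forEachLine are made up for illustration and do not come from the original post.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.function.Consumer;

// Hypothetical helper, not part of the original article
class LineByLineReader {

    /**
     * Streams the file one line at a time, hands each line to the handler,
     * and returns the number of lines read. Only the current line (plus the
     * reader's internal buffer) is held in memory.
     */
    static long forEachLine(Path path, Consumer<String> handler) throws IOException {
        long count = 0;
        try (BufferedReader reader = Files.newBufferedReader(path, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                handler.accept(line);
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Small temporary file so the sketch can be run as-is
        Path tmp = Files.createTempFile("demo", ".txt");
        Files.write(tmp, Arrays.asList("first", "second", "third"), StandardCharsets.UTF_8);
        long n = forEachLine(tmp, line -> System.out.println(line.length()));
        System.out.println("lines read: " + n);
        Files.delete(tmp);
    }
}
```

Unlike Scanner, BufferedReader.readLine() throws IOException directly, so no separate ioException() check is needed.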

2. Apache Commons IO: use the LineIterator that the Commons IO library provides

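import org.apache.commons.io.FileUtils;
import org.apache.commons.io.LineIterator;

LineIterator it = FileUtils.lineIterator(theFile, "UTF-8");
try {
    while (it.hasNext()) {
        String line = it.nextLine();
        // do something with line
    }
} finally {
    // always release the underlying reader
    LineIterator.closeQuietly(it);
}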

Because the whole file is never loaded into memory, this approach also keeps memory consumption quite low.
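For projects that cannot add Commons IO, a similar lazy iteration can be sketched with the JDK's own Files.lines (Java 8 and later). This is a swapped-in JDK technique, not the Commons IO method above, and the class name LazyLineStream and method countNonBlankLines are hypothetical names chosen for the example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.stream.Stream;

// Hypothetical helper, not part of the original article
class LazyLineStream {

    /**
     * Counts the non-blank lines of a file. Files.lines reads lazily, so the
     * file is consumed as the stream is traversed instead of being loaded
     * into a list first.
     */
    static long countNonBlankLines(Path path) throws IOException {
        // The stream holds an open file handle, so close it with try-with-resources
        try (Stream<String> lines = Files.lines(path, StandardCharsets.UTF_8)) {
            return lines.filter(line -> !line.trim().isEmpty()).count();
        }
    }

    public static void main(String[] args) throws IOException {
        // Small temporary file so the sketch can be run as-is
        Path tmp = Files.createTempFile("demo", ".txt");
        Files.write(tmp, Arrays.asList("x", "", "y"), StandardCharsets.UTF_8);
        System.out.println(countNonBlankLines(tmp));
        Files.delete(tmp);
    }
}
```

As with LineIterator, the stream must be closed once iteration is finished, which the try-with-resources block takes care of here.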