How do I read a large text file line by line in Java?

Latest recommended article published on 2024-06-03 10:32:34

p15097962069

Tags: java performance file-io io garbage-collection

Original link: https://de.sofbug.com/question/Ocd7

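I need to read a large text file of around 5-6 GB line by line using Java.

How can I do this quickly?

Answer #1

Here is an example with full error handling that also supports specifying a charset on versions before Java 7. With Java 7 you can use the try-with-resources syntax, which makes the code cleaner.

If you only need the default charset, you can skip the InputStream and use a FileReader.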

InputStream ins = null; // raw byte-stream
Reader r = null; // cooked reader
BufferedReader br = null; // buffered for readLine()
try {
    String s;
    ins = new FileInputStream("textfile.txt");
    r = new InputStreamReader(ins, "UTF-8"); // leave charset out for default
    br = new BufferedReader(r);
    while ((s = br.readLine()) != null) {
        System.out.println(s);
    }
}
catch (Exception e)
{
    System.err.println(e.getMessage()); // handle exception
}
finally {
    if (br != null) { try { br.close(); } catch(Throwable t) { /* ensure close happens */ } }
    if (r != null) { try { r.close(); } catch(Throwable t) { /* ensure close happens */ } }
    if (ins != null) { try { ins.close(); } catch(Throwable t) { /* ensure close happens */ } }
}
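
For comparison, the try-with-resources form mentioned above might look like the following. This is a minimal sketch rather than the answerer's own code: the class name `ReadLinesJava7` and the helper `readLines` are hypothetical, and returning the line count is an addition made only so the behaviour is easy to check.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

class ReadLinesJava7 {

    // Hypothetical helper: prints every line of the file and returns how many lines were read.
    static long readLines(String fileName) throws IOException {
        long count = 0;
        // Resources declared here are closed automatically, in reverse order,
        // whether the body completes normally or throws.
        try (InputStream ins = new FileInputStream(fileName);
             BufferedReader br = new BufferedReader(
                     new InputStreamReader(ins, StandardCharsets.UTF_8))) {
            String s;
            // readLine() returns null once the end of the file is reached.
            while ((s = br.readLine()) != null) {
                System.out.println(s);
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Falls back to the file name used in the example above.
        String fileName = args.length > 0 ? args[0] : "textfile.txt";
        System.out.println(readLines(fileName) + " lines read");
    }
}
```

Only one line is held in memory at a time, so the same loop should work for a multi-gigabyte file; the explicit finally block is no longer needed.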

Here is the Groovy version, with full error handling:

File f = new File("textfile.txt");
f.withReader("UTF-8") { br ->
    br.eachLine { line ->
        println line;
    }
}

Answer #2

Since Java 8 (released March 2014), you can use streams:

try (Stream<String> lines = Files.lines(Paths.get(filename), Charset.defaultCharset())) {
  lines.forEachOrdered(line -> process(line));
}

To print all lines in the file:

try (Stream<String> lines = Files.lines(file, Charset.defaultCharset())) {
  lines.forEachOrdered(System.out::println);
}
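
Because Files.lines reads the file lazily, stream operations can be chained without loading the whole file into memory. The sketch below illustrates this by counting the lines that contain a given substring; the class name `StreamLineFilter`, the method `countMatching`, and the choice of UTF-8 are assumptions made for the example, not part of the answer.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

class StreamLineFilter {

    // Hypothetical helper: counts the lines of the file that contain the given substring.
    static long countMatching(Path file, String needle) throws IOException {
        // The stream holds an open file handle, so it is closed via try-with-resources.
        try (Stream<String> lines = Files.lines(file, StandardCharsets.UTF_8)) {
            // Lines are pulled from the file one at a time as the pipeline consumes them.
            return lines.filter(line -> line.contains(needle)).count();
        }
    }

    public static void main(String[] args) throws IOException {
        // Expects two arguments: the file path and the substring to look for.
        System.out.println(countMatching(Paths.get(args[0]), args[1]));
    }
}
```

Note that an I/O error while the stream is being consumed surfaces as an UncheckedIOException rather than an IOException.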

Answer #3

In Java 8, you can do the following:

try (Stream<String> lines = Files.lines (file, StandardCharsets.UTF_8))
{
    for (String line : (Iterable<String>) lines::iterator)
    {
        // process each line here
    }
}

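Some notes: the stream returned by Files.lines (unlike most streams) needs to be closed. For the reasons mentioned here, I avoid using forEach(). The odd-looking code (Iterable<String>) lines::iterator casts a Stream to an Iterable.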
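
One practical consequence of the Iterable form is that the body is an ordinary for loop, so it can return or break early and can throw checked exceptions. The following is a minimal sketch of that idea; the class name `FirstMatchFinder` and the method `firstLineContaining` are hypothetical names introduced for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

class FirstMatchFinder {

    // Hypothetical helper: returns the 1-based number of the first line
    // containing the substring, or -1 if no line contains it.
    static long firstLineContaining(Path file, String needle) throws IOException {
        try (Stream<String> lines = Files.lines(file, StandardCharsets.UTF_8)) {
            long lineNumber = 0;
            // The method reference lines::iterator satisfies Iterable's single method,
            // which is what allows the stream to be used in a for-each loop.
            for (String line : (Iterable<String>) lines::iterator) {
                lineNumber++;
                if (line.contains(needle)) {
                    // Early exit: try-with-resources still closes the stream.
                    return lineNumber;
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        // Expects two arguments: the file path and the substring to look for.
        System.out.println(firstLineContaining(Paths.get(args[0]), args[1]));
    }
}
```

An Iterable created this way can be iterated only once, because the underlying stream can be consumed only once.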

Answer #4

In Java 7:

String folderPath = "C:/folderOfMyFile";
Path path = Paths.get(folderPath, "myFileName.csv"); //or any text file eg.: txt, bat, etc
Charset charset = Charset.forName("UTF-8");

try (BufferedReader reader = Files.newBufferedReader(path, charset)) {
  String line; // current line; readLine() returns null at end of file
  while ((line = reader.readLine()) != null) {
    //separate all csv fields into string array
    String[] lineVariables = line.split(&