For log processing, my application needs to read text files line by line.
I first used BufferedReader's readLine() method, but I read on the internet that BufferedReader is slow when reading files.
I then tried a FileInputStream together with a FileChannel and a MappedByteBuffer, but in this case there is no method similar to readLine(), so I search the text for line breaks myself and process each line:
try {
    FileInputStream f = new FileInputStream(file);
    FileChannel ch = f.getChannel();
    MappedByteBuffer mb = ch.map(FileChannel.MapMode.READ_ONLY, 0L, ch.size());
    byte[] bytes = new byte[1024];
    int i = 0;
    while (mb.hasRemaining()) {
        byte get = mb.get();
        if (get == '\n') {
            if (ra.run(new String(bytes)))
                cnt++;
            for (int j = 0; j <= i; j++)
                bytes[j] = 0;
            i = 0;
        } else {
            bytes[i++] = get;
        }
    }
} catch (Exception ex) {
    ex.printStackTrace();
}
I know this is probably not a good way to implement it, but reading the file as raw bytes is about 3 times faster than using BufferedReader. However, calling new String(bytes) creates a new String for every line and makes the program even slower than using a BufferedReader.
So my question is: what is the fastest way to read a text file line by line? Some say BufferedReader is the only solution to this problem.
P.S.: ra is an instance of RunAutomaton from the dk.brics.Automaton library.
Solution
I very much doubt that BufferedReader is going to cause a significant overhead. Adding your own code is likely to be at least as inefficient, and quite possibly wrong too.
For example, in the code that you've given you're calling new String(bytes) which is always going to create a string from 1024 bytes, using the platform default encoding... not a good idea. Sure, you clear the array afterwards, but your strings are still going to contain a bunch of '\0' characters - which means a lot of wasted space, apart from anything else. You should at least restrict the portion of the byte array the string is being created from (which also means you don't need to clear the array afterwards).
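To illustrate restricting the portion of the byte array: a minimal sketch of building the line from only the bytes actually read, with an explicit charset (UTF-8 is an assumption here; the question never states the file's encoding). The class and method names are made up for the example:

```java
import java.nio.charset.StandardCharsets;

public class LineSlice {
    // Build a String from only the first `length` bytes of the buffer,
    // with an explicit charset instead of the platform default. With this
    // there is also no need to zero the buffer between lines.
    static String lineFrom(byte[] bytes, int length) {
        return new String(bytes, 0, length, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] buf = new byte[1024];   // mostly '\0', as in the question's code
        byte[] data = "hello".getBytes(StandardCharsets.UTF_8);
        System.arraycopy(data, 0, buf, 0, data.length);

        // new String(buf) would yield a 1024-char string padded with '\0';
        // slicing yields just the 5-character line.
        System.out.println(lineFrom(buf, data.length).length());   // prints 5
    }
}
```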
Have you actually tried using BufferedReader and found it to be too slow? You should usually write the simplest code which will meet your goals first, and then check whether it's fast enough... especially if your only reason for not doing so is an unspecified source you "read on the internet". Do you want me to find hundreds of examples of people spouting incorrect performance suggestions? :)
As an alternative, you might want to look at Guava's overload of Files.readLines() which takes a LineProcessor.
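A sketch of that Guava overload, assuming Guava is on the classpath. The file name and the "ERROR" substring check are placeholders; the substring check stands in for the ra.run(line) automaton match:

```java
import com.google.common.io.Files;
import com.google.common.io.LineProcessor;
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class GuavaCount {
    // Files.readLines streams the file line by line through the callback
    // and returns whatever getResult() produces.
    static int countErrors(File file) throws IOException {
        return Files.readLines(file, StandardCharsets.UTF_8,
                new LineProcessor<Integer>() {
                    private int cnt = 0;

                    @Override
                    public boolean processLine(String line) {
                        if (line.contains("ERROR"))   // stand-in for ra.run(line)
                            cnt++;
                        return true;                  // true = keep reading
                    }

                    @Override
                    public Integer getResult() {
                        return cnt;
                    }
                });
    }

    public static void main(String[] args) throws IOException {
        System.out.println(countErrors(new File("app.log")));   // hypothetical file
    }
}
```

Returning false from processLine() stops reading early, which is handy if you only need the first match.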