How to get the line separator in Java: how do I process a file using a different line separator in Java?

Latest recommended article published on 2022-12-17 14:29:54

weixin_39635432


I have a huge file (more than 3GB) that contains a single long line in the following format.

"1243@818@9287@543"

The data I want to analyze is separated by "@". My idea is to change the default end-of-line character used by Java and set it to "@".

I'm trying the following code, which calls System.setProperty("line.separator", "@");, but it is not working: it prints the complete line, whereas for this test I'd like the following output.

1243

818

9287

543

How can I change the default line separator to "@"?

package test;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;

public class Test {

    public static void main(String[] args) throws FileNotFoundException, IOException {
        System.setProperty("line.separator", "@");
        File testFile = new File("./Mypath/myfile");
        BufferedReader br = new BufferedReader(new FileReader(testFile));
        for (String line; (line = br.readLine()) != null; ) {
            // Process each line.
            System.out.println(line);
        }
    }
}

Thanks in advance for any help.

Solution

Then the data I want to analyze is separated with "@". My idea is to change the default end of line character used by Java and set "@".

I wouldn't do that as it might break God knows what else that is depending on line.separator.

As for why this doesn't work, I'm sorry to say this is a case of RTFM not being done. This is what the Javadoc for BufferedReader.readLine has to say:

public String readLine() throws IOException

Reads a line of text. A line is considered to be terminated by any one of a line feed ('\n'), a carriage return ('\r'), or a carriage return followed immediately by a linefeed.

Returns: A String containing the contents of the line, not including any line-termination characters, or null if the end of the stream has been reached

Throws: IOException - If an I/O error occurs

The API docs for the readLine() method clearly say that it looks for '\n' or '\r'. They do not say that it depends on line.separator.

The line.separator property is only for developing APIs that need a portable, platform-independent mechanism for identifying line separators. That is all. This system property is not for controlling the internal mechanisms of Java's IO classes.
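To make this concrete, here is a minimal sketch that sets the property exactly as the question does and then reads a small in-memory string. The class name ReadLineDemo and the helper readLines are made up for this illustration, and the input string is an invented sample containing both "@" and a real line feed.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of the original question or answer.
final class ReadLineDemo {

    // Collects every "line" that BufferedReader.readLine() reports for the text.
    static List<String> readLines(String text) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(new StringReader(text))) {
            for (String line; (line = br.readLine()) != null; ) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Same call as in the question; readLine() does not consult this property.
        System.setProperty("line.separator", "@");
        // The sample contains two '@' characters and one '\n'.
        System.out.println(readLines("1243@818\n9287@543"));
    }
}
```

The printed list should contain two elements, 1243@818 and 9287@543: the text is split at the line feed only, and the "@" characters pass through untouched.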

I think you are over-complicating things. You could do it the old-fashioned way: read a fixed number of characters (say, 1024 KB) into a buffer and scan it for each '@' delimiter. However, that introduces complications, such as the normal case where the data between two '@' delimiters gets split across buffers.
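For comparison, the chunked approach and its carry-over bookkeeping might look roughly like the sketch below. ChunkedSplitter, split, and the sink callback are names invented for this illustration; the chunk size in main is deliberately tiny so that tokens straddle chunk boundaries, whereas a real run would use something closer to the 1024 KB mentioned above.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper, shown only to illustrate the chunk-boundary problem.
final class ChunkedSplitter {

    // Reads fixed-size chunks and hands each '@'-delimited token to the sink.
    static void split(Reader in, int chunkSize, Consumer<String> sink) throws IOException {
        char[] chunk = new char[chunkSize];
        StringBuilder pending = new StringBuilder(); // token that may span chunks
        int n;
        while ((n = in.read(chunk, 0, chunk.length)) != -1) {
            int start = 0;
            for (int i = 0; i < n; i++) {
                if (chunk[i] == '@') {
                    // Complete the token with the part found in this chunk.
                    pending.append(chunk, start, i - start);
                    sink.accept(pending.toString());
                    pending.setLength(0);
                    start = i + 1;
                }
            }
            // Keep the unfinished tail; the next chunk may complete it.
            pending.append(chunk, start, n - start);
        }
        if (pending.length() > 0) {
            sink.accept(pending.toString()); // last token has no trailing '@'
        }
    }

    public static void main(String[] args) throws IOException {
        List<String> tokens = new ArrayList<>();
        // A chunk size of 4 forces tokens to be split across chunks.
        split(new StringReader("1243@818@9287@543"), 4, tokens::add);
        System.out.println(tokens); // [1243, 818, 9287, 543]
    }
}
```

The pending buffer is the extra state this approach needs: without it, a value such as 9287 that is cut in half by a chunk boundary would be reported as two separate tokens.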

So I would suggest just reading one character at a time off the buffered reader. This is not that bad and does not typically hit I/O excessively, since the buffered reader does... tada... the buffering for you.

Pump each character to a string builder, and every time you find a '@' delimiter, you flush the content of the string builder to standard output or whatever (since that would represent a datum off your '@' file.)

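Get the algorithm to work correctly first. Optimize later. The code below is a sketch of that approach; treat it as a starting point rather than a tuned implementation:

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;

public class Test {

    public static void main(String[] args) throws IOException {
        File testFile = new File("./Mypath/myfile");
        int bufferSize = 1024 * 1024;
        StringBuilder bld = new StringBuilder();
        try (BufferedReader br = new BufferedReader(new FileReader(testFile), bufferSize)) {
            int c;
            // read() returns -1 at end of stream; fetch the next character on every pass.
            while ((c = br.read()) != -1) {
                char z = (char) c;
                if (z == '@') {
                    // Delimiter found: emit the datum collected so far, then reset the builder.
                    System.out.println(bld);
                    bld.setLength(0);
                } else {
                    bld.append(z);
                }
            }
        }
        // Emit the last datum, which has no trailing '@'.
        if (bld.length() > 0) {
            System.out.println(bld);
        }
    }
}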
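As an alternative that the answer itself does not mention, java.util.Scanner accepts a custom delimiter through useDelimiter, which gives the "read up to the next @" behaviour the question was after without touching line.separator. The sketch below is only an illustration under that assumption: ScannerSplitDemo and tokens are made-up names, and the input is the sample string from the question rather than a real file.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical demo class; a swapped-in technique, not the answer's own method.
final class ScannerSplitDemo {

    // Splits the input on '@' using Scanner's configurable delimiter.
    static List<String> tokens(Readable source) {
        List<String> result = new ArrayList<>();
        try (Scanner scanner = new Scanner(source)) {
            scanner.useDelimiter("@");
            while (scanner.hasNext()) {
                result.add(scanner.next());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(tokens(new StringReader("1243@818@9287@543")));
        // [1243, 818, 9287, 543]
    }
}
```

For the real file, the same loop could be driven by new Scanner(testFile) and each token processed as it arrives instead of being collected into a list, since a list of every value from a 3 GB file may not fit in memory. Scanner is generally slower than a plain character loop, so it is worth measuring both before settling on one.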
