I need to process the following file on Unix and Windows:
a;b
c;d;e;f;g
c;d;e;f;g
c;d;e;f;g
a;b
c;d;e;f;g
c;d;e;f;g
c;d;e;f;g
a;b
a;b
c;d;e;f;g
c;d;e;f;g
c;d;e;f;g
I need to process only those a;b lines that have a block of data underneath them.
For example, the third a;b should not be processed.
Currently I split this kind of text with a Java Scanner, using the following regular expression as the delimiter:
Scanner fileScanner = new Scanner(file);
try{
fileScanner.useDelimiter(Pattern.compile("^$", Pattern.MULTILINE));
while(fileScanner.hasNext()){
String line;
while ((line = fileScanner.nextLine()).isEmpty());
InputStream is = new ByteArrayInputStream(fileScanner.next().getBytes("UTF-8"));
...
For the third a;b, this still passes empty input to the ByteArrayInputStream.
How can I check whether the first line of fileScanner.next() is empty and, if so, call nextLine() and then continue?
Solution
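Use the regex pattern
(?m)^(?:.+(?:\r?\n|\Z)){2,}
which matches two or more consecutive non-empty lines. In other words, (?:...){2,} requires two or more lines, each made of one or more characters (.+) followed by either a newline (\r?\n, which covers both Unix and Windows line endings) or the end of the string (\Z); the two alternatives are joined as (?:...|...).
The multiline modifier (?m) means that ^ matches at the beginning of each line, not just at the beginning of the string. In a Java string literal every backslash has to be doubled, as in the demo.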
Demo:
String str = "...";
Pattern p = Pattern.compile("(?m)^(?:.+(?:\\r?\\n|\\Z)){2,}");
Matcher m = p.matcher(str);
while (m.find()) {
String match = m.group();
System.out.println(match);
}
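For completeness, here is a self-contained sketch of the same idea. The class name BlockFinder, the method findBlocks, and the sample text are made up for illustration, and the sample assumes that the groups in the real file are separated by empty lines. It mixes Windows (\r\n) and Unix (\n) line endings to show that both are handled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original question or answer.
final class BlockFinder {

    // Two or more consecutive non-empty lines; each line ends with
    // \n, \r\n, or the end of the input.
    private static final Pattern BLOCK =
            Pattern.compile("(?m)^(?:.+(?:\\r?\\n|\\Z)){2,}");

    /** Returns every run of two or more non-empty lines found in text. */
    static List<String> findBlocks(String text) {
        List<String> blocks = new ArrayList<>();
        Matcher m = BLOCK.matcher(text);
        while (m.find()) {
            blocks.add(m.group());
        }
        return blocks;
    }

    public static void main(String[] args) {
        // Assumed sample: the second "a;b" stands alone and should be skipped.
        String sample = "a;b\r\nc;d;e;f;g\r\n"   // block with Windows line endings
                + "\r\n"
                + "a;b\r\n"                      // lone header, no data underneath
                + "\r\n"
                + "a;b\nc;d;e;f;g\nc;d;e;f;g";   // block with Unix line endings

        List<String> blocks = findBlocks(sample);
        System.out.println("blocks found: " + blocks.size());
        for (String block : blocks) {
            System.out.println("--- block ---");
            System.out.println(block.trim());
        }
    }
}
```

Each returned block can then be wrapped as in the question, for example with new ByteArrayInputStream(block.getBytes(StandardCharsets.UTF_8)). This approach needs the whole file content as one String, which is fine for small files but may not suit very large ones.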