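Unfortunately, the Java regex classes (as of Java 8) do not provide a stream of match results. They only offer a splitAsStream() method, and you do not want to split.
However, you can write a generic helper class for it yourself: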
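import java.util.ArrayList;
import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Stream;

public final class PatternStreamer {
    private final Pattern pattern;

    public PatternStreamer(String regex) {
        this.pattern = Pattern.compile(regex);
    }

    // Collects an immutable snapshot of every match in the input, then streams them.
    public Stream<MatchResult> results(CharSequence input) {
        List<MatchResult> list = new ArrayList<>();
        for (Matcher m = this.pattern.matcher(input); m.find(); )
            list.add(m.toMatchResult());
        return list.stream();
    }
}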
You can then simplify your code with flatMap():
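import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Members of your own class. The pattern captures the text between double quotes.
private static final PatternStreamer quoteRegex = new PatternStreamer("\"([^\"]*)\"");

public static void main(String[] args) throws Exception {
    String inFileName = "c:\\exec.log";
    String outFileName = "c:\\exec_quoted.txt";
    // try-with-resources closes the underlying file when the stream is done
    try (Stream<String> stream = Files.lines(Paths.get(inFileName))) {
        Set<String> dataSet = stream.flatMap(quoteRegex::results) // all matches on each line
                                    .map(r -> r.group(1))         // text inside the quotes
                                    .collect(Collectors.toSet());
        Files.write(Paths.get(outFileName), dataSet);
    }
}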
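Since you only process one line at a time, the temporary List is fine. If the input string is very long and contains many matches, a Spliterator would be a better choice, because it produces matches on demand instead of collecting them all first. See "How do I create a Stream of regex matches?"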
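For reference, here is a minimal sketch of what the Spliterator variant might look like. It is not the code from the linked question. The class name LazyPatternStreamer is made up for illustration, and the sketch assumes a sequential stream is enough. Each match is produced only when the stream asks for it, so no intermediate List is built.

```java
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.Consumer;
import java.util.regex.MatchResult;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

// Hypothetical lazy counterpart of PatternStreamer (name chosen for this sketch).
final class LazyPatternStreamer {
    private final Pattern pattern;

    LazyPatternStreamer(String regex) {
        this.pattern = Pattern.compile(regex);
    }

    Stream<MatchResult> results(CharSequence input) {
        Matcher m = pattern.matcher(input);
        // Size is unknown up front, so report Long.MAX_VALUE as the estimate.
        Spliterator<MatchResult> sp = new Spliterators.AbstractSpliterator<MatchResult>(
                Long.MAX_VALUE, Spliterator.ORDERED | Spliterator.NONNULL) {
            @Override
            public boolean tryAdvance(Consumer<? super MatchResult> action) {
                if (!m.find()) {
                    return false;                 // no more matches: end of stream
                }
                action.accept(m.toMatchResult()); // hand out a snapshot of this match
                return true;
            }
        };
        // false = sequential; a Matcher is stateful and not safe to share across threads
        return StreamSupport.stream(sp, false);
    }
}
```

It can be dropped in wherever PatternStreamer is used above, since results() has the same signature. On Java 9 and later, none of this is needed: Matcher.results() already returns a lazy Stream<MatchResult>, so pattern.matcher(line).results() does the same job.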
2020-11-23