Java regex performance: do Java regular expressions offer any performance benefit?

In Java, suppose we do pattern matching with a regular expression: for example, take an input string, use a regular expression to find out whether it is numeric, and throw an exception if it is not.

In this case, I understand that using a regex makes the code less verbose than taking each character of the string, checking whether it is a digit, and throwing an exception if it is not.

But I was under the assumption that a regex also makes the process more efficient. Is this true? I cannot find any evidence on this point. How does a regex do the match behind the scenes? Is it not also iterating over the string and checking each character one by one?

Solution

Just for fun, I ran this micro-benchmark. The results of the last run (i.e. after JVM warm-up / JIT) are below, in milliseconds per 1,000,000 strings (results are fairly consistent from one run to another anyway):

regex with numbers 123
chars with numbers 33
parseInt with numbers 33
regex with words 123
chars with words 34
parseInt with words 733

In other words, chars is very efficient, and Integer.parseInt is as efficient as chars if the string is a number, but awfully slow if it is not. Regex sits in between: slower than chars in both cases, but much faster than Integer.parseInt on non-numeric input.

Conclusion

If you parse a string into a number and expect the string to be a number most of the time, Integer.parseInt is the best solution (efficient and readable). The penalty you pay when the string is not a number should be low as long as that case is not too frequent.
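As a minimal sketch of that recommendation, the parse can be wrapped in a small helper so callers do not handle the exception themselves. The class name IntParsing and the method tryParseInt are hypothetical names chosen for illustration; only Integer.parseInt and OptionalInt come from the JDK.

```java
import java.util.OptionalInt;

final class IntParsing {

    // Hypothetical helper: returns the parsed value, or an empty OptionalInt
    // when Integer.parseInt rejects the input.
    static OptionalInt tryParseInt(String s) {
        try {
            return OptionalInt.of(Integer.parseInt(s));
        } catch (NumberFormatException e) {
            // Slow path: this is the case the benchmark shows to be expensive.
            return OptionalInt.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryParseInt("123").isPresent());   // numeric input
        System.out.println(tryParseInt("123x").isPresent());  // non-numeric input
    }
}
```

If non-numeric input turns out to be common, the exception path is taken often, and a cheap pre-check (characters or regex) before parsing may be worth measuring.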

PS: my regex may not be optimal; feel free to comment.

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class TestNumber {

    private final static List<String> numbers = new ArrayList<>();
    private final static List<String> words = new ArrayList<>();

    public static void main(String args[]) {
        long start;

        // Build one list of purely numeric strings and one of non-numeric strings.
        for (int i = 0; i < 1000000; i++) {
            numbers.add(String.valueOf(i));
            words.add(String.valueOf(i) + "x");
        }

        // Repeat the measurements so that the last run reflects warmed-up (JIT-compiled) code.
        for (int i = 0; i < 5; i++) {
            start = System.nanoTime();
            regex(numbers);
            System.out.println("regex with numbers " + (System.nanoTime() - start) / 1000000);

            start = System.nanoTime();
            chars(numbers);
            System.out.println("chars with numbers " + (System.nanoTime() - start) / 1000000);

            start = System.nanoTime();
            exception(numbers);
            System.out.println("parseInt with numbers " + (System.nanoTime() - start) / 1000000);

            start = System.nanoTime();
            regex(words);
            System.out.println("regex with words " + (System.nanoTime() - start) / 1000000);

            start = System.nanoTime();
            chars(words);
            System.out.println("chars with words " + (System.nanoTime() - start) / 1000000);

            start = System.nanoTime();
            exception(words);
            System.out.println("parseInt with words " + (System.nanoTime() - start) / 1000000);
        }
    }

    // Counts the strings that consist only of digits, using a precompiled regex.
    private static int regex(List<String> list) {
        int sum = 0;
        Pattern p = Pattern.compile("[0-9]+");
        for (String s : list) {
            sum += (p.matcher(s).matches() ? 1 : 0);
        }
        return sum;
    }

    // Counts the strings that consist only of digits, checking character by character.
    private static int chars(List<String> list) {
        int sum = 0;
        for (String s : list) {
            boolean isNumber = true;
            for (char c : s.toCharArray()) {
                if (c < '0' || c > '9') {
                    isNumber = false;
                    break;
                }
            }
            if (isNumber) {
                sum++;
            }
        }
        return sum;
    }

    // Counts the strings that Integer.parseInt accepts; rejected strings throw.
    private static int exception(List<String> list) {
        int sum = 0;
        for (String s : list) {
            try {
                Integer.parseInt(s);
                sum++;
            } catch (NumberFormatException e) {
                // Not a number: do not count it.
            }
        }
        return sum;
    }
}
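
One caveat when comparing these numbers: the three checks in the benchmark are not strictly equivalent, so they can disagree on edge cases that the test data does not contain. The sketch below restates each check as a standalone predicate (the class name NumericChecks and the method names are hypothetical, chosen for illustration) and runs them on a few such inputs.

```java
import java.util.regex.Pattern;

final class NumericChecks {

    private static final Pattern DIGITS = Pattern.compile("[0-9]+");

    // Same pattern as the benchmark: one or more ASCII digits.
    static boolean byRegex(String s) {
        return DIGITS.matcher(s).matches();
    }

    // Same logic as the benchmark's character loop.
    static boolean byChars(String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < '0' || c > '9') {
                return false;
            }
        }
        return true; // also true for "", because the loop never runs
    }

    // Accepts whatever Integer.parseInt accepts: an optional sign, int range only.
    static boolean byParseInt(String s) {
        try {
            Integer.parseInt(s);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        for (String s : new String[] {"123", "", "-5", "99999999999"}) {
            System.out.printf("\"%s\": regex=%b chars=%b parseInt=%b%n",
                    s, byRegex(s), byChars(s), byParseInt(s));
        }
    }
}
```

The empty string passes only the character loop, "-5" passes only Integer.parseInt, and "99999999999" fails only Integer.parseInt because it overflows an int. Which behavior is wanted depends on what "numeric" should mean for the input at hand.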
