Google Guava Cache with regular expression patterns

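I recently saw a great presentation on Google Guava, and in our project we came to the conclusion that its Cache functionality would be really interesting to use. Let's take a look at the regexp Pattern class and its compile function. In code you can often see that each time a regular expression is used, the programmer calls Pattern.compile() again with the same argument, compiling the same regular expression over and over. What can be done instead is to cache the result of such compilations. Let's take a look at the RegexpUtils utility class: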

RegexpUtils.java

package pl.grzejszczak.marcin.guava.cache.utils;

import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;

import java.util.concurrent.ExecutionException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import static java.lang.String.format;

public final class RegexpUtils {

    private RegexpUtils() {
        throw new UnsupportedOperationException("RegexpUtils is a utility class - don't instantiate it!");
    }

    private static final LoadingCache<String, Pattern> COMPILED_PATTERNS =
            CacheBuilder.newBuilder().build(new CacheLoader<String, Pattern>() {
                @Override
                public Pattern load(String regexp) throws Exception {
                    return Pattern.compile(regexp);
                }
            });

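    public static Pattern getPattern(String regexp) {
        try {
            // Returns the cached Pattern, compiling it through the CacheLoader on the first request.
            return COMPILED_PATTERNS.get(regexp);
        } catch (ExecutionException e) {
            // Only checked loader failures arrive here; an invalid regexp makes Pattern.compile throw
            // an unchecked exception, which Guava rethrows wrapped in UncheckedExecutionException.
            throw new RuntimeException(format("Error when getting a pattern [%s] from cache", regexp), e);
        }
    }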

    public static boolean matches(String stringToCheck, String regexp) {
        return doGetMatcher(stringToCheck, regexp).matches();
    }

    public static Matcher getMatcher(String stringToCheck, String regexp) {
        return doGetMatcher(stringToCheck, regexp);
    }

    private static Matcher doGetMatcher(String stringToCheck, String regexp) {
        Pattern pattern = getPattern(regexp);
        return pattern.matcher(stringToCheck);
    }

}
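
For comparison, the same compile-once idea can be sketched with only the JDK. This is not the author's implementation: the class name PatternCacheSketch and the COMPILATIONS counter are hypothetical, and a ConcurrentHashMap stands in for Guava's LoadingCache, so the sketch offers none of CacheBuilder's eviction or expiry options.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.regex.Pattern;

public final class PatternCacheSketch {

    // Hypothetical counter, present only to show how often compilation really happens.
    public static final AtomicInteger COMPILATIONS = new AtomicInteger();

    // Assumption: a plain map is enough here because entries are never evicted.
    private static final ConcurrentMap<String, Pattern> COMPILED_PATTERNS = new ConcurrentHashMap<>();

    private PatternCacheSketch() {
    }

    public static Pattern getPattern(String regexp) {
        // computeIfAbsent runs the compile step only when the key is missing.
        return COMPILED_PATTERNS.computeIfAbsent(regexp, key -> {
            COMPILATIONS.incrementAndGet();
            return Pattern.compile(key);
        });
    }

    public static boolean matches(String stringToCheck, String regexp) {
        return getPattern(regexp).matcher(stringToCheck).matches();
    }

    public static void main(String[] args) {
        Pattern first = getPattern("something\\d?");
        Pattern second = getPattern("something\\d?");
        System.out.println(first == second);       // true: the same cached instance
        System.out.println(COMPILATIONS.get());    // 1: compiled only once
        System.out.println(matches("something", "something\\d?")); // true
    }
}
```

Asking for the same regexp twice returns the same Pattern instance, and the counter shows that compilation ran only once.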

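As you can see, Guava's LoadingCache, built with CacheBuilder, populates the cache with a newly compiled Pattern whenever none is found for the given regexp. Because compiled patterns are cached, a regexp that has already been compiled is not compiled again (in our case never, since we have not set any expiry). Now a simple test: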

GuavaCache.java

package pl.grzejszczak.marcin.guava.cache;

import com.google.common.base.Stopwatch;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import pl.grzejszczak.marcin.guava.cache.utils.RegexpUtils;

import java.util.regex.Pattern;

import static java.lang.String.format;

public class GuavaCache {
    private static final Logger LOGGER = LoggerFactory.getLogger(GuavaCache.class);
    public static final String STRING_TO_MATCH = "something";

    public static void main(String[] args) {
        runTestForManualCompilationAndOneUsingCache(1);
        runTestForManualCompilationAndOneUsingCache(10);
        runTestForManualCompilationAndOneUsingCache(100);
        runTestForManualCompilationAndOneUsingCache(1000);
        runTestForManualCompilationAndOneUsingCache(10000);
        runTestForManualCompilationAndOneUsingCache(100000);
        runTestForManualCompilationAndOneUsingCache(1000000);
    }

    private static void runTestForManualCompilationAndOneUsingCache(int firstNoOfRepetitions) {
        repeatManualCompilation(firstNoOfRepetitions);
        repeatCompilationWithCache(firstNoOfRepetitions);
    }

    private static void repeatManualCompilation(int noOfRepetitions) {
        Stopwatch stopwatch = new Stopwatch().start();
        compileAndMatchPatternManually(noOfRepetitions);
        LOGGER.debug(format("Time needed to compile and check regexp expression [%d] ms, no of iterations [%d]", stopwatch.elapsedMillis(), noOfRepetitions));
    }

    private static void repeatCompilationWithCache(int noOfRepetitions) {
        Stopwatch stopwatch = new Stopwatch().start();
        compileAndMatchPatternUsingCache(noOfRepetitions);
        LOGGER.debug(format("Time needed to compile and check regexp expression using Cache [%d] ms, no of iterations [%d]", stopwatch.elapsedMillis(), noOfRepetitions));
    }

    private static void compileAndMatchPatternManually(int limit) {
        for (int i = 0; i < limit; i++) {
            Pattern.compile("something").matcher(STRING_TO_MATCH).matches();
            Pattern.compile("something1").matcher(STRING_TO_MATCH).matches();
            Pattern.compile("something2").matcher(STRING_TO_MATCH).matches();
            Pattern.compile("something3").matcher(STRING_TO_MATCH).matches();
            Pattern.compile("something4").matcher(STRING_TO_MATCH).matches();
            Pattern.compile("something5").matcher(STRING_TO_MATCH).matches();
            Pattern.compile("something6").matcher(STRING_TO_MATCH).matches();
            Pattern.compile("something7").matcher(STRING_TO_MATCH).matches();
            Pattern.compile("something8").matcher(STRING_TO_MATCH).matches();
            Pattern.compile("something9").matcher(STRING_TO_MATCH).matches();
        }
    }

    private static void compileAndMatchPatternUsingCache(int limit) {
        for (int i = 0; i < limit; i++) {
            RegexpUtils.matches(STRING_TO_MATCH, "something");
            RegexpUtils.matches(STRING_TO_MATCH, "something1");
            RegexpUtils.matches(STRING_TO_MATCH, "something2");
            RegexpUtils.matches(STRING_TO_MATCH, "something3");
            RegexpUtils.matches(STRING_TO_MATCH, "something4");
            RegexpUtils.matches(STRING_TO_MATCH, "something5");
            RegexpUtils.matches(STRING_TO_MATCH, "something6");
            RegexpUtils.matches(STRING_TO_MATCH, "something7");
            RegexpUtils.matches(STRING_TO_MATCH, "something8");
            RegexpUtils.matches(STRING_TO_MATCH, "something9");
        }
    }

}

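We run a series of tests and check their execution time. Note that the results are not precise: the application does not run in isolation, so many conditions can affect the execution time. We are interested in showing the scale of the problem rather than exact execution times. For a given number of iterations (1, 10, 100, 1000, 10000, 100000, 1000000) we either compile 10 regular expressions or use Guava's cache to retrieve the compiled Pattern, and then match each of them against the string to match. These are the logs: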

pl.grzejszczak.marcin.guava.cache.GuavaCache:34 Time needed to compile and check regexp expression [1] ms, no of iterations [1]
pl.grzejszczak.marcin.guava.cache.GuavaCache:40 Time needed to compile and check regexp expression using Cache [35] ms, no of iterations [1]
pl.grzejszczak.marcin.guava.cache.GuavaCache:34 Time needed to compile and check regexp expression [1] ms, no of iterations [10]
pl.grzejszczak.marcin.guava.cache.GuavaCache:40 Time needed to compile and check regexp expression using Cache [0] ms, no of iterations [10]
pl.grzejszczak.marcin.guava.cache.GuavaCache:34 Time needed to compile and check regexp expression [8] ms, no of iterations [100]
pl.grzejszczak.marcin.guava.cache.GuavaCache:40 Time needed to compile and check regexp expression using Cache [3] ms, no of iterations [100]
pl.grzejszczak.marcin.guava.cache.GuavaCache:34 Time needed to compile and check regexp expression [10] ms, no of iterations [1000]
pl.grzejszczak.marcin.guava.cache.GuavaCache:40 Time needed to compile and check regexp expression using Cache [10] ms, no of iterations [1000]
pl.grzejszczak.marcin.guava.cache.GuavaCache:34 Time needed to compile and check regexp expression [83] ms, no of iterations [10000]
pl.grzejszczak.marcin.guava.cache.GuavaCache:40 Time needed to compile and check regexp expression using Cache [33] ms, no of iterations [10000]
pl.grzejszczak.marcin.guava.cache.GuavaCache:34 Time needed to compile and check regexp expression [800] ms, no of iterations [100000]
pl.grzejszczak.marcin.guava.cache.GuavaCache:40 Time needed to compile and check regexp expression using Cache [279] ms, no of iterations [100000]
pl.grzejszczak.marcin.guava.cache.GuavaCache:34 Time needed to compile and check regexp expression [7562] ms, no of iterations [1000000]
pl.grzejszczak.marcin.guava.cache.GuavaCache:40 Time needed to compile and check regexp expression using Cache [3067] ms, no of iterations [1000000]
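
To make the trend in these logs easier to read, the ratio between the two timings can be computed directly from the logged values. The sketch below only does arithmetic on the numbers printed above for the three largest runs; the class name SpeedupFromLogs and its speedup helper are hypothetical and not part of the original sources.

```java
import java.util.Locale;

public final class SpeedupFromLogs {

    // Times in ms copied from the logs above, for 10000, 100000 and 1000000 iterations.
    static final int[] ITERATIONS = {10_000, 100_000, 1_000_000};
    static final long[] MANUAL_MS = {83, 800, 7562};
    static final long[] CACHED_MS = {33, 279, 3067};

    // Ratio of the manual-compilation time to the cached time.
    public static double speedup(long manualMs, long cachedMs) {
        return (double) manualMs / cachedMs;
    }

    public static void main(String[] args) {
        for (int i = 0; i < ITERATIONS.length; i++) {
            System.out.println(String.format(Locale.ROOT, "%d iterations: %.2fx",
                    ITERATIONS[i], speedup(MANUAL_MS[i], CACHED_MS[i])));
        }
        // prints:
        // 10000 iterations: 2.52x
        // 100000 iterations: 2.87x
        // 1000000 iterations: 2.47x
    }
}
```

In this particular run the cached variant is roughly 2.5 to 2.9 times faster once the iteration count is large. The single-iteration run shows the opposite (35 ms against 1 ms), presumably because it includes one-time setup such as building the cache, so the small runs say little about steady-state cost.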

The sources are available under the Guava / Cache directory at https://bitbucket.org/gregorin1987/too-much-coding/src


Translated from: https://www.javacodegeeks.com/2013/04/google-guava-cache-with-regular-expression-patterns.html
