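In my previous article, "The Ultimate Elasticsearch Painless Tutorial", I walked through some basic Painless examples, and readers who like to think things through have probably already seen how powerful Painless, or more generally Elasticsearch scripting, can be. In another article, "How to Update Existing Elasticsearch Records with Logstash", I showed how to update data with Logstash; for simple requirements, that job can be done entirely with an Elasticsearch Painless script instead. So why do we still need Logstash? Because using regular expressions inside Elasticsearch to match and extract data for data transformation is very expensive.
Beyond the basic features covered in the previous article, Painless also supports regular expressions. However, regular expression support is disabled by default in Painless. Let's first see what the official documentation says: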
Regexes are disabled by default because they circumvent Painless’s protection against long running and memory hungry scripts. To make matters worse even innocuous looking regexes can have staggering performance and stack depth behavior. They remain an amazing powerful tool but are too scary to enable by default. To enable them yourself set script.painless.regex.enabled: true in elasticsearch.yml. We’d like very much to have a safe alternative implementation that can be enabled by default so check this space for later developments!
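As the quote shows, Painless's top priority is to run fast, whereas searching with regular expressions can consume a large amount of CPU and memory and may drastically reduce the efficiency of Painless scripts. That is why the feature is disabled by default (in fact, the official documentation also discourages using regex in ordinary queries). To turn it on, we have to manually add the following to the Elasticsearch configuration file elasticsearch.yml: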
script.painless.regex.enabled: true
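
To make the "match and extract" work mentioned above more concrete: Painless regex literals are compiled to java.util.regex.Pattern, so the operation is roughly what the plain Java below does. This is only a minimal sketch; the class name RegexExtractSketch, the pattern, and the sample log line are made up for illustration and are not part of any Elasticsearch API.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexExtractSketch {
    // Hypothetical pattern for illustration: captures the digits in "status=200".
    // Compiled once and reused, since compiling a pattern is itself not free.
    private static final Pattern STATUS = Pattern.compile("status=(\\d+)");

    /** Returns the captured status code, or null when the pattern finds nothing. */
    static String extractStatus(String line) {
        Matcher m = STATUS.matcher(line);
        // find() scans the whole input for the first occurrence of the pattern.
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Sample log line (assumed data, not taken from the article).
        System.out.println(extractStatus("GET /index status=200 took=12ms")); // prints 200
    }
}
```

Inside a script this scan would run for every document the script touches, which is one way to see why the cost adds up quickly on a large index.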