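This article uses the Elasticsearch 7.10 source code as its example (JDK 14 or later is required).
Plugin categories
Elasticsearch abstracts plugins into several categories, each serving a different purpose. The full list is in the Javadoc of the org.elasticsearch.plugins.Plugin class: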
/**
 * <ul>
 * <li>{@link ActionPlugin}
 * <li>{@link AnalysisPlugin}
 * <li>{@link ClusterPlugin}
 * <li>{@link DiscoveryPlugin}
 * <li>{@link IngestPlugin}
 * <li>{@link MapperPlugin}
 * <li>{@link NetworkPlugin}
 * <li>{@link RepositoryPlugin}
 * <li>{@link ScriptPlugin}
 * <li>{@link SearchPlugin}
 * <li>{@link ReloadablePlugin}
 * </ul>
 */
public abstract class Plugin implements Closeable {
    // omitted...
}
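Plugin types, as introduced in the official documentation: https://www.elastic.co/guide/en/elasticsearch/plugins/master/intro.html
The following summary is adapted from online sources:
- ActionPlugin: REST API plugin. Developers can add their own REST commands or add extra processing to REST requests. If Elasticsearch's built-in REST commands such as _all, _cat, and /_cat/health do not meet a requirement, developers can implement the commands they need.
- AnalysisPlugin: analysis plugin. It extends index analysis to cover what the built-in analysis lacks; the well-known IK tokenizer plugin is an example.
- IngestPlugin: pre-processing plugin. It processes data before the data is indexed, for example the geoip processor plugin, which adds geographic information based on an IP address.
- MapperPlugin: mapping plugin. It adds data types to Elasticsearch.
- NetworkPlugin: network transport plugin.
- ScriptPlugin: script plugin. It extends Elasticsearch's scripting, for example with custom scoring functions or support for other script languages.
- SearchPlugin: search plugin. It extends Elasticsearch's query capabilities.
Writing a filter plugin
If we need to process text, the plugin should be defined as an AnalysisPlugin. Elasticsearch ships with many built-in plugins; take a look at this class:
org.elasticsearch.analysis.common.CommonAnalysisPlugin
It registers many commonly used analyzers, tokenizers, char filters, and token filters, and it is a good model for writing a custom plugin.
Next we write a plugin that encrypts characters. Based on part of the ICU plugin's source code (icu_normalizer), the steps are as follows:
Add an EncPlugin class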
package com.xx.plugin.es.enc;

import com.xx.plugin.es.enc.character.EncCharFilterFactory;
import org.elasticsearch.index.analysis.CharFilterFactory;
import org.elasticsearch.indices.analysis.AnalysisModule;
import org.elasticsearch.plugins.AnalysisPlugin;
import org.elasticsearch.plugins.MapperPlugin;
import org.elasticsearch.plugins.Plugin;

import java.util.Map;
import java.util.TreeMap;

public class EncPlugin extends Plugin implements AnalysisPlugin, MapperPlugin {
    /**
     * Registers the char filter under the name "enc_filter".
     *
     * @return char filter factories keyed by the name used in index analysis settings
     */
    @Override
    public Map<String, AnalysisModule.AnalysisProvider<CharFilterFactory>> getCharFilters() {
        Map<String, AnalysisModule.AnalysisProvider<CharFilterFactory>> filters = new TreeMap<>();
        filters.put("enc_filter", EncCharFilterFactory::new);
        return filters;
    }
}
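By implementing AnalysisPlugin, this class can override methods that return char filters, token filters, tokenizers, analyzers, and so on to Elasticsearch. For how these concepts relate to one another, see:
- [Elastic Knowledge Brief] The difference between normalizer and analyzer: https://developer.aliyun.com/article/1082061
- What are tokenizer, analyzer, and filter in Elasticsearch: https://www.cnblogs.com/a-du/p/16272901.html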
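As a rough illustration of how these pieces relate, the sketch below mimics the order of the stages in plain Java, with no Elasticsearch or Lucene dependency: a char filter rewrites the raw text, a tokenizer splits it into tokens, and a token filter rewrites each token. The class name, the method names, and the `&` replacement rule are all assumptions made for illustration; they are not Elasticsearch APIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Plain-Java sketch of the order in which an analyzer applies its parts. */
public class AnalysisChainSketch {

    /** Char filter stage: rewrites the raw character stream before tokenization. */
    static String charFilter(String text) {
        return text.replace("&", " and "); // hypothetical mapping rule
    }

    /** Tokenizer stage: splits the filtered text into tokens on whitespace. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.trim().split("\\s+")) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    /** Token filter stage: transforms each token, here by lowercasing it. */
    static List<String> tokenFilter(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            out.add(token.toLowerCase(Locale.ROOT));
        }
        return out;
    }

    /** The analyzer is the composition: char filter, then tokenizer, then token filter. */
    static List<String> analyze(String text) {
        return tokenFilter(tokenize(charFilter(text)));
    }

    public static void main(String[] args) {
        System.out.println(analyze("Tom&Jerry RUN")); // prints [tom, and, jerry, run]
    }
}
```

The enc_filter registered by EncPlugin sits in the first stage, so it sees the raw text before any tokenizer runs.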
EncCharFilterFactory
package com.xx.plugin.es.enc.character;

import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.env.Environment;
import org.elasticsearch.index.IndexSettings;
import org.elasticsearch.index.analysis.AbstractCharFilterFactory;
import org.elasticsearch.index.analysis.NormalizingCharFilterFactory;

import java.io.Reader;

/**
 * Modeled on:
 * org.elasticsearch.analysis.common.PatternReplaceCharFilterFactory
 * org.elasticsearch.analysis.common.MappingCharFilterFactory
 */
public class EncCharFilterFactory extends AbstractCharFilterFactory implements NormalizingCharFilterFactory {
    private final EncNormalizer normalizer;

    public EncCharFilterFactory(IndexSettings indexSettings, String name) {
        super(indexSettings, name);
        normalizer = new EncNormalizerImpl();
    }

    // This four-argument constructor is the one that EncCharFilterFactory::new resolves to in EncPlugin.
    public EncCharFilterFactory(IndexSettings indexSettings, Environment environment, String name, Settings settings) {
        super(indexSettings, name);
        normalizer = new EncNormalizerImpl();
    }

    @Override
    public Reader create(Reader reader) {
        return new EncCharFilter(reader, normalizer);
    }
}
EncNormalizerImpl
An empty class for now; the logic will be implemented later.
package com.xx.plugin.es.enc.character;

public class EncNormalizerImpl extends EncNormalizer {
}
EncCharFilter
package com.xx.plugin.es.enc.character;

import org.apache.lucene.analysis.charfilter.BaseCharFilter;
import org.apache.lucene.analysis.pattern.PatternReplaceCharFilter;

import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Objects;
import java.util.stream.Stream;

/**
 * Modeled on:
 *
 * @see PatternReplaceCharFilter
 */
public class EncCharFilter extends BaseCharFilter {
    // Holds the fully buffered input after processing; null until the first read.
    private Reader transformedInput;

    public EncCharFilter(Reader in, EncNormalizer encNormalizer) {
        // encNormalizer is not used yet; the encryption logic is still to be implemented.
        super(in);
    }

    @Override
    public int read(char[] cbuf, int off, int len) throws IOException {
        // The real logic is still to be implemented. The reference source is hard to follow
        // and not yet fully understood, so for now this method only prints debug output.
        System.out.println("");
        System.out.println("--------------------------------------------------");
        // Buffer all input on the first call.
        System.out.printf("transformedInput == null:%s%n", transformedInput == null);
        if (transformedInput == null) {
            fill();
        }
        // Debug only: this shows the caller's buffer as it was before this read fills it.
        String str = new String(cbuf);
        System.out.printf("cbuf>>>:%s length:%s off>>>:%s len>>>:%s", str, str.length(), off, len);
        int read = transformedInput.read(cbuf, off, len);
        System.out.println(" read>>>" + read);
        return read;
    }

    // Reads the whole underlying input, runs it through processPattern, and keeps the result for later reads.
    private String fill() throws IOException {
        StringBuilder buffered = new StringBuilder();
        char[] temp = new char[1024];
        for (int cnt = input.read(temp); cnt > 0; cnt = input.read(temp)) {
            buffered.append(temp, 0, cnt);
        }
        String newStr = this.processPattern(buffered).toString();
        // Debug aid: dump the call stack when the input is exactly "110".
        if (Objects.equals(newStr, "110")) {
            Stream.of(Thread.currentThread().getStackTrace()).forEach(System.out::println);
        }
        transformedInput = new StringReader(newStr);
        return newStr;
    }

    @Override
    public int read() throws IOException {
        if (transformedInput == null) {
            fill();
        }
        return transformedInput.read();
    }

    @Override
    protected int correct(int currentOff) {
        return Math.max(0, super.correct(currentOff));
    }

    /**
     * Placeholder for the transformation step: currently returns the input unchanged.
     */
    CharSequence processPattern(CharSequence input) {
        return input;
    }
}
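The core idea in EncCharFilter, buffering the whole input on the first read and then serving every later read from a transformed copy, can be tried outside Elasticsearch. The following is a minimal sketch built only on java.io. BufferingTransformReader, transformAll, and the reversing transform are hypothetical stand-ins chosen for illustration; reversal is not the plugin's actual encryption, which the article leaves unimplemented.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

/** Buffers the wrapped reader fully on first use, then serves the transformed text. */
public class BufferingTransformReader extends FilterReader {
    private Reader transformed; // null until the first read triggers fill()

    public BufferingTransformReader(Reader in) {
        super(in);
    }

    /** Hypothetical placeholder for the encryption step: reverses the text. */
    static String transform(String text) {
        return new StringBuilder(text).reverse().toString();
    }

    /** Drains the wrapped reader and prepares the transformed copy. */
    private void fill() throws IOException {
        StringBuilder buffered = new StringBuilder();
        char[] temp = new char[1024];
        for (int cnt = in.read(temp); cnt != -1; cnt = in.read(temp)) {
            buffered.append(temp, 0, cnt);
        }
        transformed = new StringReader(transform(buffered.toString()));
    }

    @Override
    public int read(char[] cbuf, int off, int len) throws IOException {
        if (transformed == null) {
            fill();
        }
        return transformed.read(cbuf, off, len);
    }

    @Override
    public int read() throws IOException {
        if (transformed == null) {
            fill();
        }
        return transformed.read();
    }

    /** Convenience helper: runs a string through the reader using a deliberately small buffer. */
    static String transformAll(String input) {
        try (Reader reader = new BufferingTransformReader(new StringReader(input))) {
            StringBuilder out = new StringBuilder();
            char[] buf = new char[4];
            for (int n = reader.read(buf, 0, buf.length); n != -1; n = reader.read(buf, 0, buf.length)) {
                out.append(buf, 0, n);
            }
            return out.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transformAll("hello 110")); // prints 011 olleh
    }
}
```

Reversal keeps the text length unchanged, so character offsets stay aligned. A transform that changes the length inside a real Lucene char filter would also need to record offset corrections, which BaseCharFilter supports through addOffCorrectMap.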
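Configuration files
Add two configuration files under the resources directory:
- plugin-descriptor.properties. This project uses maven-assembly-plugin, and the variables in the file are defined in the POM: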
classname=${elasticsearch.plugin.classname}
name=${elasticsearch.plugin.name}
description=${project.description}
version=${project.version}
elasticsearch.version=${elasticsearch.version}
java.version=${maven.compiler.target}
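- plugin-security.policy, which declares the permissions the plugin requests. The following is only an example; java.security.AllPermission grants everything, so narrow it to what your plugin actually needs:
grant {
    permission java.security.AllPermission;
};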
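Because the values in plugin-descriptor.properties are filled in by Maven filtering, a descriptor that still contains a raw ${...} placeholder is an easy mistake to miss. The sketch below is one possible sanity check written with java.util.Properties. DescriptorCheck and its messages are hypothetical, and the key list simply mirrors the six keys shown above; it is not a statement of which keys Elasticsearch itself enforces.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Checks that a filtered plugin descriptor has every expected key resolved. */
public class DescriptorCheck {
    // Keys taken from the descriptor shown in this article (an assumption, not an official list).
    static final List<String> KEYS = List.of(
            "classname", "name", "description", "version", "elasticsearch.version", "java.version");

    /** Returns one message per missing key or unresolved placeholder; an empty list means no problems. */
    static List<String> problems(String descriptorText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(descriptorText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        List<String> problems = new ArrayList<>();
        for (String key : KEYS) {
            String value = props.getProperty(key, "").trim();
            if (value.isEmpty()) {
                problems.add(key + " is missing");
            } else if (value.contains("${")) {
                problems.add(key + " has an unresolved placeholder: " + value);
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        String unfiltered = String.join("\n",
                "classname=com.xx.plugin.es.enc.EncPlugin",
                "name=${elasticsearch.plugin.name}");
        problems(unfiltered).forEach(System.out::println);
    }
}
```

Running such a check against the descriptor extracted from the built zip, rather than the one in src/main/resources, is what would catch a filtering problem.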
In the assembly directory, add the packaging descriptor (src/main/assembly/plugin.xml, as referenced from the POM), which builds the zip file:
<?xml version="1.0"?>
<assembly>
    <dependencySets>
        <dependencySet>
            <outputDirectory>enc-filter</outputDirectory>
            <useProjectArtifact>true</useProjectArtifact>
            <useTransitiveFiltering>true</useTransitiveFiltering>
        </dependencySet>
    </dependencySets>
    <fileSets>
        <fileSet>
            <directory>${project.basedir}/config</directory>
            <outputDirectory>config</outputDirectory>
        </fileSet>
    </fileSets>
    <files>
        <file>
            <fileMode>0755</fileMode>
            <filtered>true</filtered>
            <outputDirectory>enc-filter/</outputDirectory>
            <source>${project.basedir}/src/main/resources/plugin-descriptor.properties</source>
        </file>
        <file>
            <fileMode>0755</fileMode>
            <filtered>true</filtered>
            <outputDirectory>enc-filter/</outputDirectory>
            <source>${project.basedir}/src/main/resources/plugin-security.policy</source>
        </file>
    </files>
    <formats>
        <format>zip</format>
    </formats>
    <id>plugin-develop</id>
    <includeBaseDirectory>false</includeBaseDirectory>
</assembly>
The directory structure is as follows: [Figure: project directory structure]
Packaging
Note: plugin-descriptor.properties and plugin-security.policy must not be packaged into the plugin jar. The resources section of the POM below excludes them from the jar, and the assembly descriptor copies them into the zip separately.
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>org.example</groupId>
    <artifactId>enc-plugin</artifactId>
    <version>1.0-SNAPSHOT</version>
    <properties>
        <maven.compiler.source>14</maven.compiler.source>
        <maven.compiler.target>14</maven.compiler.target>
        <elasticsearch.version>7.10.1</elasticsearch.version>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
        <elasticsearch.plugin.classname>com.xx.plugin.es.enc.EncPlugin</elasticsearch.plugin.classname>
        <elasticsearch.plugin.name>enc_plugin</elasticsearch.plugin.name>
        <project.description>this is a test</project.description>
    </properties>
    <dependencies>
        <dependency>
            <groupId>org.elasticsearch</groupId>
            <artifactId>elasticsearch</artifactId>
            <version>${elasticsearch.version}</version>
            <scope>provided</scope>
        </dependency>
    </dependencies>
    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-assembly-plugin</artifactId>
                <version>2.6</version>
                <configuration>
                    <appendAssemblyId>false</appendAssemblyId>
                    <descriptors>
                        <descriptor>${basedir}/src/main/assembly/plugin.xml</descriptor>
                    </descriptors>
                    <outputDirectory>${project.build.directory}/releases/</outputDirectory>
                </configuration>
                <executions>
                    <execution>
                        <goals>
                            <goal>single</goal>
                        </goals>
                        <phase>package</phase>
                    </execution>
                </executions>
            </plugin>
            <plugin>
                <artifactId>maven-compiler-plugin</artifactId>
                <groupId>org.apache.maven.plugins</groupId>
                <version>3.8.1</version>
                <configuration>
                    <encoding>${project.build.sourceEncoding}</encoding>
                    <source>${maven.compiler.source}</source>
                    <target>${maven.compiler.target}</target>
                </configuration>
            </plugin>
        </plugins>
        <resources>
            <resource>
                <directory>src/main/resources</directory>
                <excludes>
                    <exclude>*.properties</exclude>
                    <exclude>*.policy</exclude>
                </excludes>
                <filtering>false</filtering>
            </resource>
        </resources>
    </build>
</project>
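Installation
Running mvn clean install produces the file elasticsearch-encode-plugin-1.0-SNAPSHOT.zip under target/releases/ (with the artifactId shown in the POM above, the file name would instead be enc-plugin-1.0-SNAPSHOT.zip).
Copy the file into the Elasticsearch plugins directory, unzip it there, and delete the original archive. The end result looks like this: [Figure: plugin directory after unzipping]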
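Before copying the archive, it can help to confirm that it contains what the assembly descriptor is meant to produce: the two configuration files and at least one jar under the enc-filter directory. The sketch below lists zip entries with java.util.zip and reports what is absent. PluginZipLayoutCheck is a hypothetical helper, and sampleZip builds an in-memory archive only so the example runs without a real build.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

/** Inspects a plugin zip and reports which expected entries are absent. */
public class PluginZipLayoutCheck {

    /** Lists the entry names stored in the given zip bytes. */
    static List<String> entryNames(byte[] zipBytes) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zin = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            for (ZipEntry entry = zin.getNextEntry(); entry != null; entry = zin.getNextEntry()) {
                names.add(entry.getName());
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return names;
    }

    /** Returns the expected items that are not present under the given directory. */
    static List<String> missing(List<String> names, String dir) {
        List<String> missing = new ArrayList<>();
        for (String required : List.of("plugin-descriptor.properties", "plugin-security.policy")) {
            if (!names.contains(dir + "/" + required)) {
                missing.add(required);
            }
        }
        boolean hasJar = names.stream().anyMatch(n -> n.startsWith(dir + "/") && n.endsWith(".jar"));
        if (!hasJar) {
            missing.add("*.jar");
        }
        return missing;
    }

    /** Builds a small in-memory zip containing empty entries with the given names (demo data only). */
    static byte[] sampleZip(String... names) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zout = new ZipOutputStream(bytes)) {
            for (String name : names) {
                zout.putNextEntry(new ZipEntry(name));
                zout.closeEntry();
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        byte[] zip = sampleZip("enc-filter/plugin-descriptor.properties",
                "enc-filter/enc-plugin-1.0-SNAPSHOT.jar");
        System.out.println(missing(entryNames(zip), "enc-filter")); // prints [plugin-security.policy]
    }
}
```

For a real archive, the bytes could come from java.nio.file.Files.readAllBytes on the zip produced by the build.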
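Verification
Start Elasticsearch:
./elasticsearch-7.10.1/bin/elasticsearch
If the startup log shows the plugin being loaded [Figure: Elasticsearch startup log], the custom filter has been picked up.
TODO:
1. Create an index
2. Write data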
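As a starting point for these two steps, the sketch below prints request bodies that could be sent with any HTTP client: one creating an index whose custom analyzer uses the enc_filter char filter, one calling _analyze, and one indexing a document. The filter name comes from EncPlugin; the index name enc_test, the analyzer name enc_analyzer, and the field name content are assumptions made for illustration. The text values are embedded without JSON escaping, so the sketch assumes they contain no quotes or backslashes.

```java
/** Prints example REST requests for trying out the enc_filter char filter. */
public class EncIndexRequests {
    static final String INDEX = "enc_test";        // hypothetical index name
    static final String ANALYZER = "enc_analyzer"; // hypothetical analyzer name
    static final String FIELD = "content";         // hypothetical field name

    /** Body for PUT /enc_test: a custom analyzer that applies enc_filter before the standard tokenizer. */
    static String createIndexBody() {
        return String.join("\n",
                "{",
                "  \"settings\": {",
                "    \"analysis\": {",
                "      \"analyzer\": {",
                "        \"" + ANALYZER + "\": {",
                "          \"type\": \"custom\",",
                "          \"char_filter\": [\"enc_filter\"],",
                "          \"tokenizer\": \"standard\"",
                "        }",
                "      }",
                "    }",
                "  },",
                "  \"mappings\": {",
                "    \"properties\": {",
                "      \"" + FIELD + "\": { \"type\": \"text\", \"analyzer\": \"" + ANALYZER + "\" }",
                "    }",
                "  }",
                "}");
    }

    /** Body for POST /enc_test/_analyze: shows the tokens the analyzer produces for the text. */
    static String analyzeBody(String text) {
        return "{ \"analyzer\": \"" + ANALYZER + "\", \"text\": \"" + text + "\" }";
    }

    /** Body for POST /enc_test/_doc: indexes one document into the analyzed field. */
    static String documentBody(String text) {
        return "{ \"" + FIELD + "\": \"" + text + "\" }";
    }

    public static void main(String[] args) {
        System.out.println("PUT /" + INDEX);
        System.out.println(createIndexBody());
        System.out.println("POST /" + INDEX + "/_analyze");
        System.out.println(analyzeBody("110"));
        System.out.println("POST /" + INDEX + "/_doc");
        System.out.println(documentBody("110"));
    }
}
```

With the debug prints still in EncCharFilter, sending the _analyze request should make the filter's output appear in the Elasticsearch console, which is a quick way to confirm that the filter is actually invoked.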
The demo plugin from this article has been uploaded to Gitee: https://gitee.com/aqu415/elasticsearch-encode-plugin
Reference:
https://cloud.tencent.com/developer/article/2213726