Elasticsearch Full-Text Search (Part 2)

梦得溪

Category: Java Development

Article link: https://blog.csdn.net/pujiaolin/article/details/79874711

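I. Wrapping the Elasticsearch Client

The previous post covered how to use the Elasticsearch Java client. In a real project, though, the client should be wrapped sensibly, so this post walks through doing that in a Spring Boot project.

1. A configuration properties class: ElasticsearchConfig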

@Configuration
@ConfigurationProperties(
        prefix = "elasticsearch.config"
)
public class ElasticsearchConfig {
    private String hostname;
    private int port;
    private String certPassword;
    private String trustStorePath;
    private String keyStorePath;
    ...

Then set the corresponding properties in application.properties.
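
As a rough sketch, the entries might look like the following. The keys follow from the `elasticsearch.config` prefix and the field names in `ElasticsearchConfig` (Spring Boot's relaxed binding maps `cert-password` to `certPassword`); every value is a placeholder I've assumed, not something taken from the original project.

```properties
# Placeholder values for illustration only; replace them with your own.
elasticsearch.config.hostname=localhost
elasticsearch.config.port=9200
elasticsearch.config.cert-password=changeit
# The paths are read with Class.getResourceAsStream, so a leading "/" means the classpath root.
elasticsearch.config.trust-store-path=/certs/truststore.jks
elasticsearch.config.key-store-path=/certs/keystore.jks
```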

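When using @ConfigurationProperties, remember to add the spring-boot-configuration-processor dependency to your pom. It generates the metadata that gives IDE completion for these keys; the binding itself works without it.

2. Exposing the Elasticsearch client as a Spring bean

The goal is a single client that every caller shares: it can be injected wherever it is needed, and every call goes through the same (singleton) instance. In any Spring-managed class, annotate the method that creates the client with @Bean (see the previous post for how the client is built).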

@Configuration
public class ElasticsearchBean {

    private final ElasticsearchConfig elasticsearchConfig;

    @Autowired
    public ElasticsearchBean(ElasticsearchConfig elasticsearchConfig) {
        this.elasticsearchConfig = elasticsearchConfig;
    }

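    /**
     * Build a KeyStore from the JKS file at the given classpath location.
     */
    private KeyStore loadKeyStore(String keyStorePath) throws KeyStoreException, CertificateException, NoSuchAlgorithmException, IOException {
        KeyStore keyStore = KeyStore.getInstance("jks");
        // try-with-resources closes the stream once the store has been loaded
        try (InputStream is = ElasticsearchBean.class.getResourceAsStream(keyStorePath)) {
            if (is == null) {
                // load(null, ...) would silently create an empty store, so fail loudly instead
                throw new IOException("Keystore not found on classpath: " + keyStorePath);
            }
            keyStore.load(is, elasticsearchConfig.getCertPassword().toCharArray());
        }
        return keyStore;
    }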

    /**
     * Build the SSLContext from the trust store and the key store.
     */
    private SSLContext buildSSLContext() throws CertificateException, NoSuchAlgorithmException, KeyStoreException, IOException, UnrecoverableKeyException, KeyManagementException {
        SSLContextBuilder sslBuilder = SSLContexts.custom()
                .loadTrustMaterial(loadKeyStore(elasticsearchConfig.getTrustStorePath()), null)
                .loadKeyMaterial(loadKeyStore(elasticsearchConfig.getKeyStorePath()),
                        elasticsearchConfig.getCertPassword().toCharArray());
        return sslBuilder.build();
    }

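    /**
     * @return the RestHighLevelClient, configured for HTTPS with the SSLContext above
     */
    @Bean(destroyMethod = "close")
    public RestHighLevelClient restClient() throws CertificateException, NoSuchAlgorithmException, KeyStoreException, IOException, UnrecoverableKeyException, KeyManagementException {
        RestClientBuilder builder = RestClient.builder(new HttpHost(elasticsearchConfig.getHostname(), elasticsearchConfig.getPort(), "https"));
        // buildSSLContext() throws checked exceptions, so this method has to declare them too
        SSLContext sslContext = buildSSLContext();
        builder.setHttpClientConfigCallback(httpClientBuilder ->
                httpClientBuilder.setSSLContext(sslContext));
        return new RestHighLevelClient(builder);
    }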
}

RestClientBuilder also lets you register listener callbacks, set timeouts, and so on.

Because this uses RestHighLevelClient, remember to add the dependency
elasticsearch-rest-high-level-client
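
To make it clearer what the `SSLContexts.custom()` builder is doing, here is a minimal JDK-only sketch of the same idea: load a trust store and a key store, then initialise an `SSLContext` from them. The class name `JdkSslContextSketch`, its method names and the empty in-memory key store are all assumptions made so that the example runs without any certificate files; it is not the code used in the project.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.security.GeneralSecurityException;
import java.security.KeyStore;
import javax.net.ssl.KeyManagerFactory;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManagerFactory;

final class JdkSslContextSketch {

    /** Reads a key store of the given type from a stream; the caller closes the stream. */
    static KeyStore load(InputStream in, String type, char[] password)
            throws GeneralSecurityException, IOException {
        KeyStore store = KeyStore.getInstance(type);
        store.load(in, password);
        return store;
    }

    /** Builds a TLS context that trusts trustStore and presents keys from keyStore. */
    static SSLContext build(KeyStore trustStore, KeyStore keyStore, char[] keyPassword)
            throws GeneralSecurityException {
        TrustManagerFactory tmf =
                TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
        tmf.init(trustStore);
        KeyManagerFactory kmf =
                KeyManagerFactory.getInstance(KeyManagerFactory.getDefaultAlgorithm());
        kmf.init(keyStore, keyPassword);
        SSLContext context = SSLContext.getInstance("TLS");
        context.init(kmf.getKeyManagers(), tmf.getTrustManagers(), null);
        return context;
    }

    /** Serialises an empty key store so the demo needs no files on disk (demo helper only). */
    static byte[] emptyStoreBytes(String type, char[] password)
            throws GeneralSecurityException, IOException {
        KeyStore empty = KeyStore.getInstance(type);
        empty.load(null, password);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        empty.store(out, password);
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        char[] password = "changeit".toCharArray(); // placeholder password
        byte[] bytes = emptyStoreBytes("JKS", password);
        try (InputStream in = new ByteArrayInputStream(bytes)) {
            KeyStore store = load(in, "JKS", password);
            SSLContext context = build(store, store, password);
            System.out.println(context.getProtocol());
        }
    }
}
```

In a real setup the two stores would come from the JKS files configured earlier, and the resulting context would be handed to `setSSLContext` exactly as in the bean above.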

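3. An Elasticsearch operations service

This covers create, read, update and delete operations on documents. Set the index, type and so on to suit your own case; mine are the same everywhere, so I define them as constants.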

@Component
public class ElasticsearchService {

    private final RestHighLevelClient restClient;

    @Autowired
    public ElasticsearchService(RestHighLevelClient restClient) {
        this.restClient = restClient;
    }

    /**
     * Save (index) a document.
     * @param id document id
     * @param documentJson document as JSON
     * @throws IOException on I/O failure
     */
    private void saveDocument(String id,String documentJson) throws IOException {
        IndexRequest request = new IndexRequest(
                ElasticsearchConstant.INDEX_NAME,
                ElasticsearchConstant.INDEX_TYPE,
                id);
        request.source(documentJson, XContentType.JSON);
        restClient.index(request);
    }

    /**
     * Fetch a document by id.
     * @param id document id
     * @return the get response
     * @throws IOException on I/O failure
     */
    private GetResponse getDocument(String id) throws IOException {
        GetRequest getRequest = new GetRequest(
                ElasticsearchConstant.INDEX_NAME,
                ElasticsearchConstant.INDEX_TYPE,
                id);
        return restClient.get(getRequest);
    }

    /**
     * Delete a document.
     * @param id document id
     * @return the delete response
     * @throws IOException on I/O failure
     */
    private DeleteResponse deleteDocument(String id) throws IOException {
        DeleteRequest request = new DeleteRequest(
                ElasticsearchConstant.INDEX_NAME,
                ElasticsearchConstant.INDEX_TYPE,
                id);
        return restClient.delete(request);
    }

    /**
     * Update a document's content.
     * @param id document id
     * @param contentJson new content as JSON
     * @return the update response
     * @throws IOException on I/O failure
     */
    private UpdateResponse updateDocument(String id,String contentJson)throws IOException{
        UpdateRequest updateRequest = new UpdateRequest(
                ElasticsearchConstant.INDEX_NAME,
                ElasticsearchConstant.INDEX_TYPE,
                id);
        updateRequest.doc(contentJson, XContentType.JSON);
        return restClient.update(updateRequest);
    }

}
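
The service above refers to an `ElasticsearchConstant` class that isn't shown. A minimal sketch of what it might contain follows; the index and type values are placeholders I've made up, so substitute whatever your own index uses.

```java
/** Hypothetical holder for the index coordinates used by the service; the values are placeholders. */
final class ElasticsearchConstant {

    /** Name of the index that stores the documents (placeholder; index names must be lowercase). */
    static final String INDEX_NAME = "content_index";

    /** Mapping type within the index (placeholder). */
    static final String INDEX_TYPE = "content";

    private ElasticsearchConstant() {
        // constants only, never instantiated
    }

    public static void main(String[] args) {
        System.out.println(INDEX_NAME + "/" + INDEX_TYPE);
    }
}
```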

Wrap whatever you want to store in one common object (ContentWord) and save that object's JSON.

    /**
     * Save content.
     * @param contentWord the content to save
     * @throws IOException on I/O failure
     */
    public void saveContent(ContentWord contentWord) throws IOException {
        saveDocument(contentWord.getId(), Utility.toJson(contentWord));
    }
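
`ContentWord` itself isn't listed in this post. Judging only from the calls that appear here (`getId()`, `setTitle(...)`, `setContent(...)`), a minimal version could look like the sketch below. The real class may well carry more fields, and the `String` field types are my assumption.

```java
/** Minimal sketch of the document object; only id, title and content are inferred from the post. */
final class ContentWord {

    private String id;
    private String title;
    private String content;

    ContentWord() {
        // no-arg constructor so a JSON library can instantiate the class
    }

    ContentWord(String id, String title, String content) {
        this.id = id;
        this.title = title;
        this.content = content;
    }

    public String getId() { return id; }
    public void setId(String id) { this.id = id; }

    public String getTitle() { return title; }
    public void setTitle(String title) { this.title = title; }

    public String getContent() { return content; }
    public void setContent(String content) { this.content = content; }

    public static void main(String[] args) {
        ContentWord word = new ContentWord("1", "A title", "Some body text");
        System.out.println(word.getId() + ": " + word.getTitle());
    }
}
```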

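Create, update and delete are all straightforward, so let's move on to querying (highlighted search with pagination).

The search method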

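   /**
    * Search documents.
    * @param highlightBuilder highlight settings, or null for no highlighting
    * @param query the query body
    * @param from offset of the first hit to return
    * @param size number of hits to return
    * @return the search response
    * @throws IOException on I/O failure
    */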
   private SearchResponse searchDocument(HighlightBuilder highlightBuilder,QueryBuilder query,int from,int size) throws IOException {
       SearchRequest searchRequest = new SearchRequest(ElasticsearchConstant.INDEX_NAME);
       searchRequest.types(ElasticsearchConstant.INDEX_TYPE);
       SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
       if (highlightBuilder != null) {
           searchSourceBuilder.highlighter(highlightBuilder);
       }
       searchSourceBuilder.query(query);
       searchSourceBuilder.from(from);
       searchSourceBuilder.size(size);
       searchRequest.source(searchSourceBuilder);
       return restClient.search(searchRequest);
   }

Highlight field helper

    /**
     * Build the highlight settings for one field.
     * @param fieldName field name
     * @return the highlight field definition
     */
    private HighlightBuilder.Field makeHighlightContent(String fieldName){
        HighlightBuilder.Field highlightContent = new HighlightBuilder.Field(fieldName);
        highlightContent.highlighterType("unified");
        highlightContent.fragmentSize(FRAGMENT_SIZE);
        highlightContent.numOfFragments(FRAGMENT_NUM);
        return highlightContent;
    }

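Many properties can be set here: the number of fragments returned per search (a fragment is a snippet of context around a matched keyword), the fragment size (characters per fragment), the tags wrapped around matches (em by default), the highlighter type, and so on. Here I set the number of fragments and the fragment size, along with the highlighter type.

Highlighted search

    /**
     * Highlighted, paginated search.
     * @param page page number (zero-based)
     * @param size hits per page
     * @param text search text
     * @param fieldNames names of the fields to search
     * @return the search response
     * @throws IOException on I/O failure
     */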
    public SearchResponse searchHighlight(int page, int size,String text,String... fieldNames)throws IOException {
        int from = page * size;
        HighlightBuilder highlightBuilder = new HighlightBuilder();
        for (String fieldName:fieldNames){
            highlightBuilder.field(makeHighlightContent(fieldName));
        }
        return searchDocument(highlightBuilder,QueryBuilders.multiMatchQuery(text,fieldNames)
                .fuzziness(Fuzziness.AUTO),from,size);
    }
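
`Fuzziness.AUTO` lets Elasticsearch choose the allowed edit distance from the length of each term: by default, terms of up to 2 characters must match exactly, terms of 3 to 5 characters may differ by one edit, and longer terms by two. The helper below only illustrates that rule in plain Java. The class and method names are made up for the example, and the real logic lives inside Elasticsearch.

```java
/** Illustration of the default AUTO fuzziness thresholds (equivalent to AUTO:3,6). */
final class AutoFuzzinessSketch {

    /** Returns how many single-character edits a term of the given length may differ by. */
    static int allowedEdits(int termLength) {
        if (termLength < 0) {
            throw new IllegalArgumentException("termLength must not be negative");
        }
        if (termLength < 3) {
            return 0; // very short terms must match exactly
        }
        if (termLength < 6) {
            return 1; // 3 to 5 characters: one edit
        }
        return 2;     // 6 or more characters: two edits
    }

    public static void main(String[] args) {
        for (String term : new String[] {"es", "index", "elasticsearch"}) {
            System.out.println(term + " -> " + allowedEdits(term.length()));
        }
    }
}
```

This is why a slightly misspelled long keyword can still return hits, while a two-letter keyword has to be typed exactly.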

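Searches usually span multiple fields. Searching a blog by keyword, for example, matches the keyword against the title, content, organization, author and other fields all at once.

II. Using the Search

1. Converting highlight fragments to a String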

    /**
     * Join highlight fragments into a single string.
     * @param fragments array of fragments
     * @return the fragments joined by newlines
     */
    private String fragmentsToString(Text[] fragments){
        return Arrays.stream(fragments).map(Text::string)
                .collect(Collectors.joining("\n"));
    }
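
To show what that method produces, here is the same join applied to plain strings. `String` stands in for Elasticsearch's `Text` so that the snippet runs without the client library, and the sample fragments are invented.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

final class FragmentJoinSketch {

    /** Joins highlight fragments with a newline, mirroring fragmentsToString. */
    static String join(String[] fragments) {
        return Arrays.stream(fragments).collect(Collectors.joining("\n"));
    }

    public static void main(String[] args) {
        String[] fragments = {
            "wrapping the <em>Elasticsearch</em> client",
            "the <em>Elasticsearch</em> service"
        };
        // Each fragment ends up on its own line, with the <em> tags kept as-is.
        System.out.println(join(fragments));
    }
}
```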

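2. Converting a SearchHit from the results into our common object (ContentWord)

Fields that were searched (and therefore highlighted) are reassigned from the highlight fragments. The other, ordinary fields are left alone and keep the values they were stored with.

    /**
     * Convert a search hit into a ContentWord.
     * @param searchHit the search hit
     * @return ContentWord
     */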
    private ContentWord fromSearchHit(SearchHit searchHit){
        String json = searchHit.getSourceAsString();
        ContentWord contentWord = Utility.toClass(json,ContentWord.class);
        Map<String,HighlightField> highlightFields = searchHit.getHighlightFields();
        if (!highlightFields.isEmpty()){
            HighlightField titleField = highlightFields.get("title");
            if (titleField != null){
                contentWord.setTitle(fragmentsToString(titleField.fragments()));
            }
            HighlightField contentField = highlightFields.get("content");
            if (contentField != null){
                contentWord.setContent(fragmentsToString(contentField.fragments()));
            }
        }
        return contentWord;
    }

Here I search (and highlight) only two fields, title and content.
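
The overlay logic can be seen in isolation with plain maps. In this sketch a `Map<String, String>` stands in for the stored source and a `Map<String, List<String>>` for the highlight fragments. Those stand-ins, and the names `HighlightOverlaySketch` and `overlay`, are assumptions for illustration rather than part of the Elasticsearch API. Unlike `fromSearchHit`, which handles title and content explicitly, it overlays whichever fields happen to be highlighted.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class HighlightOverlaySketch {

    /**
     * Returns a copy of the stored fields in which every highlighted field is
     * replaced by its fragments joined with newlines; other fields are untouched.
     */
    static Map<String, String> overlay(Map<String, String> source,
                                       Map<String, List<String>> highlights) {
        Map<String, String> result = new LinkedHashMap<>(source);
        for (Map.Entry<String, List<String>> entry : highlights.entrySet()) {
            result.put(entry.getKey(), String.join("\n", entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> source = new LinkedHashMap<>();
        source.put("title", "Elasticsearch basics");
        source.put("author", "someone");
        Map<String, List<String>> highlights = Collections.singletonMap(
                "title", Arrays.asList("<em>Elasticsearch</em> basics"));
        // title comes back highlighted, author keeps its stored value
        System.out.println(overlay(source, highlights));
    }
}
```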

3. Searching content

    /**
     * Search content.
     * @param keyWord search keyword
     * @param pageable paging information
     * @return one page of results
     * @throws IOException on I/O failure
     */
    public Page<ContentWord> searchContent(String keyWord,Pageable pageable) throws IOException {
        SearchResponse searchResponse = elasticsearchService.searchHighlight(pageable.getPageNumber(), 
                pageable.getPageSize(),keyWord, "title","content");
        List<ContentWord> list = Arrays.stream(searchResponse.getHits().getHits())
                .map(this::fromSearchHit).collect(Collectors.toList());
        return new PageImpl<>(list,pageable,searchResponse.getHits().getTotalHits());
    }
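
Spring Data's `Pageable` uses zero-based page numbers, which is why `searchHighlight` can compute the offset as `page * size`. That arithmetic, along with the page count that follows from `getTotalHits()`, is sketched below in plain Java. The class and method names are invented for the example, and the 10,000 limit is Elasticsearch's default `index.max_result_window`, which your index may override.

```java
/** Paging arithmetic for from/size searches; a sketch, not part of any library. */
final class PagingSketch {

    /** Elasticsearch's default index.max_result_window (assumed not to be overridden). */
    static final int DEFAULT_MAX_RESULT_WINDOW = 10_000;

    /** Offset of the first hit on a zero-based page. */
    static int offset(int page, int size) {
        if (page < 0 || size <= 0) {
            throw new IllegalArgumentException("page must be >= 0 and size must be > 0");
        }
        int from = Math.multiplyExact(page, size); // throws ArithmeticException on int overflow
        if ((long) from + size > DEFAULT_MAX_RESULT_WINDOW) {
            throw new IllegalArgumentException("from + size exceeds the result window");
        }
        return from;
    }

    /** Number of pages needed to show totalHits results at size hits per page. */
    static long totalPages(long totalHits, int size) {
        if (totalHits < 0 || size <= 0) {
            throw new IllegalArgumentException("totalHits must be >= 0 and size must be > 0");
        }
        return (totalHits + size - 1) / size; // integer ceiling division
    }

    public static void main(String[] args) {
        System.out.println("from = " + offset(3, 20));
        System.out.println("pages = " + totalPages(101, 10));
    }
}
```

Checking the window up front gives callers a clear error for very deep pages instead of a failed search request.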

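III. Closing

There is a lot more to Elasticsearch: running a cluster, configuring analyzers, using it together with Hadoop, analysing service logs, analysing API call patterns, and so on. I'll stop here for now. This was only a simple zero-to-one introduction to getting started in a real project. There are plenty of related posts already, but when I needed this, some of the problems I hit had no answers on Chinese-language technical forums, so I wrote these two posts.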
