A Simple Guide to Using Elasticsearch

Author: 野生开发者

Last modified: 2022-10-13 14:32:46


First published: 2022-10-12 17:47:13

Article link: https://blog.csdn.net/qgnczmnmn/article/details/127285843


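This article is based on Elasticsearch 8.x and requires Java 8 (1.8) or later.
Elasticsearch 8.x differs somewhat from earlier versions. For the Java client, RestHighLevelClient is deprecated and replaced by the new Java API Client, ElasticsearchClient. For the setup, see the Elasticsearch configuration class later in this article.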

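1. Installing Elasticsearch

On Windows, Elasticsearch can be installed by downloading the zip archive from the official website and extracting it. It is convenient to download Kibana at the same time:
Official Elasticsearch download page: Elasticsearch
Official Kibana download page: Kibana
After extracting both archives, double-click bin/elasticsearch.bat and bin/kibana.bat to start them.

Kibana serves as the visualization tool for Elasticsearch. To switch its interface to Chinese, add i18n.locale: "zh-CN" to config/kibana.yml in the Kibana folder.
By default, Elasticsearch listens on port 9200 and Kibana on port 5601.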

2. Integrating Elasticsearch with Spring Boot

2.1. Adding the Maven dependencies

        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
        </dependency>
        <dependency>
            <groupId>jakarta.json</groupId>
            <artifactId>jakarta.json-api</artifactId>
        </dependency>

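2.2. Adding the Elasticsearch configuration class

import co.elastic.clients.elasticsearch.ElasticsearchClient;
import co.elastic.clients.json.jackson.JacksonJsonpMapper;
import co.elastic.clients.transport.ElasticsearchTransport;
import co.elastic.clients.transport.rest_client.RestClientTransport;
import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class ElasticSearchConfig {

    // Node address in host:port form, read from application.yml
    @Value("${spring.data.elasticsearch.custom.cluster-nodes}")
    private String clusterNodes;

    @Bean
    public ElasticsearchClient elasticsearchClient() {
        // Parse the host:port address and build the low-level REST client
        HttpHost httpHost = HttpHost.create(clusterNodes);
        RestClient restClient = RestClient.builder(httpHost).build();
        // Wrap the REST client in a transport that uses Jackson for JSON mapping
        ElasticsearchTransport transport = new RestClientTransport(restClient, new JacksonJsonpMapper());
        return new ElasticsearchClient(transport);

        // If a password is set, build the client as follows instead
        // (this also needs AuthScope, UsernamePasswordCredentials, CredentialsProvider
        // and BasicCredentialsProvider from Apache HttpClient)
        /*
        CredentialsProvider credentialsProvider = new BasicCredentialsProvider();
        credentialsProvider.setCredentials(AuthScope.ANY, new UsernamePasswordCredentials("username", "password"));
        RestClient restClient = RestClient.builder(new HttpHost("xx.xx.xx.xx", 9200)).setHttpClientConfigCallback(httpAsyncClientBuilder -> {
            httpAsyncClientBuilder.setDefaultCredentialsProvider(credentialsProvider);
            return httpAsyncClientBuilder;
        }).build();
        ElasticsearchTransport transport = new RestClientTransport(restClient, new JacksonJsonpMapper());
        return new ElasticsearchClient(transport);
        */
    }
}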

Note:

Using Elasticsearch in Spring Boot requires the jakarta.json dependency. For the reason, see the official documentation: jakarta-json

The corresponding YAML configuration is:

spring:
  data:
    elasticsearch:
      custom:
        cluster-nodes: 127.0.0.1:9200

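2.3. Working with Elasticsearch

import co.elastic.clients.elasticsearch.ElasticsearchClient;
import co.elastic.clients.elasticsearch._types.ElasticsearchException;
import co.elastic.clients.elasticsearch._types.query_dsl.MatchQuery;
import co.elastic.clients.elasticsearch._types.query_dsl.MultiMatchQuery;
import co.elastic.clients.elasticsearch._types.query_dsl.Query;
import co.elastic.clients.elasticsearch._types.query_dsl.TermQuery;
import co.elastic.clients.elasticsearch.core.SearchRequest;
import co.elastic.clients.elasticsearch.core.SearchResponse;
import co.elastic.clients.elasticsearch.core.search.Highlight;
import java.io.IOException;
import java.util.List;
import java.util.stream.Collectors;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

@Service
public class ElasticSearchService {

    @Autowired
    private ElasticsearchClient client;

    public List<ESProduction> searchByKeyword() throws ElasticsearchException, IOException {
        // Full-text (analyzed) query on the title field (match)
        Query titleQuery = MatchQuery.of(m -> m.field("title").query("手"))._toQuery();
        // Exact query on the category field (term)
        Query categoryQuery = TermQuery.of(m -> m.field("category.keyword").value("华为"))._toQuery();
        // Query across several fields (multi_match)
        Query multiFieldQuery = MultiMatchQuery.of(m -> m.fields("title", "category").query("机"))._toQuery();
        // Highlight the matched keywords
        Highlight highlight = Highlight.of(x -> x
                .preTags("<font color='red'>")
                .postTags("</font>")
                .fields("*", f -> f));
        // Build the search request; swap in one of the commented queries to try it
        SearchRequest searchRequest = new SearchRequest.Builder()
                .index("shopping")
                .query(titleQuery)
                //.query(categoryQuery)
                //.query(multiFieldQuery)
                .highlight(highlight)
                .build();
        SearchResponse<ESProduction> response = client.search(searchRequest, ESProduction.class);
        // Without highlighting, the sources can be collected directly:
        // List<ESProduction> esProductions = response.hits().hits().stream().map(Hit::source).collect(Collectors.toList());
        return response.hits().hits().stream().map(x -> {
            ESProduction esProduction = new ESProduction();
            // x.highlight() holds the highlighted fragments per field; fall back to the
            // stored title when this hit has no fragment for the title field
            List<String> titleFragments = x.highlight().get("title");
            esProduction.setTitle(titleFragments == null || titleFragments.isEmpty()
                    ? x.source().getTitle()
                    : titleFragments.get(0));
            esProduction.setCategory(x.source().getCategory());
            esProduction.setPrice(x.source().getPrice());
            return esProduction;
        }).collect(Collectors.toList());
    }
}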
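A highlight map only contains the fields that actually produced fragments, so reading the title entry is safer with a fallback to the stored value. The following is a minimal sketch in plain Java, assuming the per-hit highlight data has the shape "field name to list of fragments". The class HighlightFallback, the method highlightedOrSource, and the sample title 手机 are hypothetical and only serve as an illustration.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the Elasticsearch client.
final class HighlightFallback {

    // Returns the first highlighted fragment of the field, or the stored
    // source value when the hit carries no fragment for that field.
    static String highlightedOrSource(Map<String, List<String>> highlight,
                                      String field, String sourceValue) {
        if (highlight == null) {
            return sourceValue;
        }
        List<String> fragments = highlight.get(field);
        if (fragments == null || fragments.isEmpty()) {
            return sourceValue;
        }
        return fragments.get(0);
    }

    public static void main(String[] args) {
        // Illustrative data only: a hit whose title matched the keyword
        Map<String, List<String>> hit = Collections.singletonMap(
                "title", Collections.singletonList("<font color='red'>手</font>机"));
        System.out.println(highlightedOrSource(hit, "title", "手机"));
        // A hit without any highlight falls back to the stored title
        Map<String, List<String>> noHighlight = Collections.emptyMap();
        System.out.println(highlightedOrSource(noHighlight, "title", "手机"));
    }
}
```

The first call prints the highlighted fragment and the second prints the plain title, so the mapping code never dereferences a missing entry.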

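Note:

In the exact query on the category field above, the field must be written as category.keyword. With plain category, the query returns no data. Why?
Cause:
When data is indexed without an explicit mapping, Elasticsearch infers the data type of each field. A string field is mapped as text and additionally gets a sub-field named keyword, whose type is also keyword. String data in Elasticsearch comes in two types, text and keyword. The main difference is that text values are analyzed (split into tokens) when indexed, whereas keyword values are stored unanalyzed and are meant for exact matching. A term query does not analyze its input, so it looks for the whole string 华为 as a single token. The text field category only holds the tokens produced by analysis, so nothing matches.
Solutions:
Given the cause, there are two ways to solve the problem:
1. Query the field as field.keyword. This leaves the mapping that Elasticsearch created by default untouched.
2. Define a mapping for the index so that the field to be searched is not analyzed. Note that the mapping of an existing field cannot be changed, so create a new index with the desired mapping first and then move the data into it.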
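The mismatch between term and text can be reproduced with a toy model in plain Java. This is a minimal sketch, not Elasticsearch code: the class TermVsKeywordDemo and its methods are hypothetical, and analyze assumes that the default analyzer emits one token per character for Chinese text.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Toy model only: nothing here calls Elasticsearch.
final class TermVsKeywordDemo {

    // Assumption: a text field is indexed as one token per character for CJK input.
    static List<String> analyze(String value) {
        List<String> tokens = new ArrayList<>();
        value.codePoints()
             .filter(cp -> !Character.isWhitespace(cp))
             .forEach(cp -> tokens.add(new String(Character.toChars(cp))));
        return tokens;
    }

    // A keyword field indexes the whole value as a single token.
    static List<String> keyword(String value) {
        return Collections.singletonList(value);
    }

    // A term query does not analyze its input; it needs an identical indexed token.
    static boolean termMatches(List<String> indexedTokens, String input) {
        return indexedTokens.contains(input);
    }

    // A match query analyzes the input first and matches if any token is indexed.
    static boolean matchMatches(List<String> indexedTokens, String input) {
        for (String token : analyze(input)) {
            if (indexedTokens.contains(token)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String category = "华为";
        System.out.println(termMatches(analyze(category), "华为"));  // false: text field
        System.out.println(termMatches(keyword(category), "华为"));  // true: keyword sub-field
        System.out.println(matchMatches(analyze(category), "华"));   // true: match on text field
    }
}
```

In this model the text field holds the tokens 华 and 为, so the unanalyzed term 华为 finds nothing, while the keyword sub-field holds 华为 as a whole and matches.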

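The entity class is as follows:

import java.io.Serializable;
import lombok.Data;

// Lombok's @Data generates the getters and setters used by the service
@Data
public class ESProduction implements Serializable {
    private static final long serialVersionUID = 1L;
    private String title;
    private String category;
    private String price;
}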

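The data stored in Elasticsearch and its mapping are shown in the figure below:

[Figure: index data and mapping in Elasticsearch]

Reference blog post:

ES8(Java API Client)查询详解 (a detailed guide to queries with the ES8 Java API Client)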

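3. Building an Elasticsearch cluster on Windows

This section sets up a two-node Elasticsearch cluster on Windows. Start by making a copy of the single-node Elasticsearch folder used above, which gives two node folders:
[Figure: the two node folders]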

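3.1. Editing the Elasticsearch configuration files

Edit config/elasticsearch.yml in each node folder. The first node is configured as follows:

# Cluster name
cluster.name: my-application
# Node name; must be unique within the cluster
node.name: node-9200
# Node roles
# With these roles the node is both master-eligible and a data node that stores data
node.roles: [master, data]
# Bind address and HTTP port
network.host: localhost
http.port: 9200
# Port for node-to-node communication (default: 9300)
transport.port: 9300
# CORS settings
http.cors.enabled: true
http.cors.allow-origin: "*"
# Addresses of the other master-eligible nodes that this node contacts
discovery.seed_hosts: ["localhost:9301"]
# Master-eligible nodes used to bootstrap the cluster; the first master is chosen from this list on first startup
cluster.initial_master_nodes: ["node-9200", "node-9201"]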

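The second node is configured as follows:

# Cluster name
cluster.name: my-application
# Node name; must be unique within the cluster
node.name: node-9201
# Node roles
# With these roles the node is both master-eligible and a data node that stores data
node.roles: [master, data]
# Bind address and HTTP port
network.host: localhost
http.port: 9201
# Port for node-to-node communication (default: 9300)
transport.port: 9301
# CORS settings
http.cors.enabled: true
http.cors.allow-origin: "*"
# Addresses of the other master-eligible nodes that this node contacts
discovery.seed_hosts: ["localhost:9300"]
# Master-eligible nodes used to bootstrap the cluster; the first master is chosen from this list on first startup
cluster.initial_master_nodes: ["node-9200", "node-9201"]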

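3.2. Starting and testing the cluster

3.2.1. Starting the cluster

Before starting the cluster, delete everything in the data folder of each Elasticsearch node folder, so that the copied node does not reuse state left over from the single-node setup.

Then run bin/elasticsearch.bat in each Elasticsearch node folder to start the nodes.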
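Once both nodes are running, a quick way to confirm that they accept connections is to probe their HTTP ports. The following is a minimal sketch using only the JDK; the class PortCheck and the method isListening are hypothetical, and the ports 9200 and 9201 are the http.port values from the configuration above. A successful TCP connection only shows that something is listening, not that the cluster has formed.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: checks whether a TCP port accepts connections.
final class PortCheck {

    static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused or timed out: nothing is listening yet
            return false;
        }
    }

    public static void main(String[] args) {
        for (int port : new int[] {9200, 9201}) {
            System.out.println("localhost:" + port + " listening: "
                    + isListening("localhost", port, 1000));
        }
    }
}
```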

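3.2.2. Checking the cluster status

With the configuration above, open http://localhost:9200/_cluster/health in a browser to view the cluster health, or check it in Kibana, as shown below:
[Figure: cluster health response]

The status field in the response takes one of three values, green, yellow, or red:

green: the healthiest state; all primary and replica shards are available.
yellow: all primary shards are available, but some replica shards are not.
red: some primary shards are unavailable. Part of the data can still be queried, but this should be fixed as soon as possible.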
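The three status values can also be interpreted in code. The following is a minimal sketch in plain Java: the class ClusterHealthStatus and its methods are hypothetical, the sample response body is abbreviated and illustrative rather than captured from a real cluster, and the regular expression is a shortcut that a real application would replace with a JSON parser.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for reading the status field of a _cluster/health response.
final class ClusterHealthStatus {

    private static final Pattern STATUS =
            Pattern.compile("\"status\"\\s*:\\s*\"(green|yellow|red)\"");

    // Extracts the status value, or "unknown" if the field is missing.
    static String extractStatus(String healthJson) {
        Matcher matcher = STATUS.matcher(healthJson);
        return matcher.find() ? matcher.group(1) : "unknown";
    }

    // Maps each status to the meaning described in the list above.
    static String describe(String status) {
        switch (status) {
            case "green":
                return "all primary and replica shards are available";
            case "yellow":
                return "all primary shards are available, some replicas are not";
            case "red":
                return "some primary shards are unavailable";
            default:
                return "status not recognized";
        }
    }

    public static void main(String[] args) {
        // Illustrative, abbreviated response body
        String sample = "{\"cluster_name\":\"my-application\",\"status\":\"yellow\",\"number_of_nodes\":2}";
        String status = extractStatus(sample);
        System.out.println(status + ": " + describe(status));
    }
}
```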

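While configuring the cluster, I ran into a yellow status caused by unassigned shards (unassigned_shards). For a fix, see the blog post: elasticsearch 未分配分片unassigned导致集群为Yellow状态修复

To include the status of each index in the result, query GET /_cluster/health?level=indices.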

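3.3. Integrating an Elasticsearch cluster with Spring Boot

Using an Elasticsearch cluster in Spring Boot differs from using a single node mainly in the configuration file and the configuration class. Change the Spring Boot configuration file application.yml as follows:

spring:
  data:
    elasticsearch:
      custom:
        cluster-nodes: 127.0.0.1:9200,127.0.0.1:9201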

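The Elasticsearch configuration class is as follows:

import co.elastic.clients.elasticsearch.ElasticsearchClient;
import co.elastic.clients.json.jackson.JacksonJsonpMapper;
import co.elastic.clients.transport.ElasticsearchTransport;
import co.elastic.clients.transport.rest_client.RestClientTransport;
import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class ElasticSearchConfig {

    // Comma-separated host:port list, read from application.yml
    @Value("${spring.data.elasticsearch.custom.cluster-nodes}")
    private String clusterNodes;

    @Bean
    public ElasticsearchClient elasticsearchClient() {
        // Split the list and build one HttpHost per node
        String[] hosts = this.clusterNodes.split(",");
        HttpHost[] httpHosts = new HttpHost[hosts.length];
        for (int i = 0; i < hosts.length; i++) {
            String[] hostAndPort = hosts[i].trim().split(":");
            String host = hostAndPort[0];
            int port = Integer.parseInt(hostAndPort[1]);
            httpHosts[i] = new HttpHost(host, port, "http");
        }
        // The REST client is given every node of the cluster
        RestClient restClient = RestClient.builder(httpHosts).build();
        ElasticsearchTransport transport = new RestClientTransport(restClient, new JacksonJsonpMapper());
        return new ElasticsearchClient(transport);
    }
}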
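The parsing of the cluster-nodes property can be tried in isolation, without Spring or the Elasticsearch client. The following is a minimal sketch in plain Java; the class ClusterNodesParser and its nested Node type are hypothetical stand-ins for HttpHost, and the validation is an addition that the configuration class above does not perform.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that parses "host:port,host:port" into host/port pairs.
final class ClusterNodesParser {

    static final class Node {
        final String host;
        final int port;

        Node(String host, int port) {
            this.host = host;
            this.port = port;
        }

        @Override
        public String toString() {
            return host + ":" + port;
        }
    }

    static List<Node> parse(String clusterNodes) {
        List<Node> nodes = new ArrayList<>();
        for (String entry : clusterNodes.split(",")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate stray commas
            }
            int colon = trimmed.lastIndexOf(':');
            if (colon <= 0 || colon == trimmed.length() - 1) {
                throw new IllegalArgumentException(
                        "Expected host:port but got '" + trimmed + "'");
            }
            String host = trimmed.substring(0, colon);
            int port = Integer.parseInt(trimmed.substring(colon + 1));
            nodes.add(new Node(host, port));
        }
        return nodes;
    }

    public static void main(String[] args) {
        // prints [127.0.0.1:9200, 127.0.0.1:9201]
        System.out.println(parse("127.0.0.1:9200, 127.0.0.1:9201"));
    }
}
```

Trimming each entry makes the property tolerant of spaces after the commas, and the explicit check turns a missing port into a readable error instead of an ArrayIndexOutOfBoundsException.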