Spring Boot - Elasticsearch Query Summary (continuously updated)

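Contents

Official documentation

pom file

elasticsearchTemplate

1. termQuery: exact-match queries on strings

2. boolQuery

3. Nested queries

4. matchQuery for searching text fields

5. query vs. filter

6. Partial document update (Kibana)

7. Bulk insert

8. Bulk update

9. Delete part of an index's data

10. Add fields to an existing index

11. Query documents whose field length exceeds a threshold

12. nested aggregation (aggregate on inner fields, then on outer fields)

13. Aggregation error: too_many_buckets_exception

14. Settings

15. Add or remove an alias

16. Highlighted search

17. Index templates

18. routing

19. top_hits aggregation (fetch document details after aggregating)

20. Migrating ES data with elasticsearch-dump

21. _reindex: copy data from an old index to a new one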


Official documentation

https://www.elastic.co/guide/cn/elasticsearch/guide/current/full-body-search.html

pom file

<!-- elasticsearch -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
</dependency>

elasticsearchTemplate

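BoolQueryBuilder combines multiple conditions in one query; its should and must clauses roughly correspond to OR and AND in MySQL.

TermQueryBuilder takes the field name as its first constructor argument and the value as its second; it matches documents whose field contains exactly that value.

MatchQueryBuilder takes the field name as its first constructor argument and the user's keywords as its second; the keywords are tokenized before the search runs.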

1. termQuery: exact-match queries on strings

Build the query with termQuery to find exactly the records whose type is "bird":

QueryBuilders.termQuery("type", "bird");

Equivalent SQL:

select * from biological where type = 'bird';

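2. boolQuery

Create a boolQuery object and add logical conditions to it.

A boolQuery can contain the following clause types:

(1) must: the condition must match, like AND

(2) should: the condition may or may not match, like OR

(3) must_not: the condition must not match, like NOT

BoolQueryBuilder boolQuery = QueryBuilders.boolQuery();
boolQuery.should(QueryBuilders.termQuery("type", "bird"));
boolQuery.should(QueryBuilders.termQuery("type", "plant"));
boolQuery.must(QueryBuilders.termQuery("name", "demo"));
// With a must clause present, should clauses are optional by default;
// require at least one so the query matches the SQL below.
boolQuery.minimumShouldMatch(1);

Equivalent SQL:

select * from biological where (type = 'bird' OR type = 'plant') AND (name = 'demo');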
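Note that when a bool query also has a must or filter clause, should clauses are optional by default (minimum_should_match defaults to 0), so the SQL equivalence above depends on minimum_should_match being 1. The following plain-Java sketch models only that matching rule. It has no Elasticsearch dependency, it ignores filter and must_not, and the BoolSketch class and its methods are illustrative names, not part of any client library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Illustrative model of bool-query matching; not an Elasticsearch API. */
final class BoolSketch {
    private final List<Predicate<Map<String, String>>> must = new ArrayList<>();
    private final List<Predicate<Map<String, String>>> should = new ArrayList<>();
    private Integer minimumShouldMatch; // null means "use the default"

    /** Exact-match condition on one field, standing in for a term query. */
    static Predicate<Map<String, String>> term(String field, String value) {
        return doc -> value.equals(doc.get(field));
    }

    BoolSketch must(Predicate<Map<String, String>> p) { must.add(p); return this; }
    BoolSketch should(Predicate<Map<String, String>> p) { should.add(p); return this; }
    BoolSketch minimumShouldMatch(int n) { minimumShouldMatch = n; return this; }

    boolean matches(Map<String, String> doc) {
        // Every must clause has to match.
        if (!must.stream().allMatch(p -> p.test(doc))) {
            return false;
        }
        // Default: 1 if there are should clauses and no must clauses, otherwise 0.
        int required = minimumShouldMatch != null
                ? minimumShouldMatch
                : (!should.isEmpty() && must.isEmpty() ? 1 : 0);
        long matched = should.stream().filter(p -> p.test(doc)).count();
        return matched >= required;
    }

    public static void main(String[] args) {
        Map<String, String> fish = Map.of("type", "fish", "name", "demo");
        BoolSketch q = new BoolSketch()
                .should(term("type", "bird"))
                .should(term("type", "plant"))
                .must(term("name", "demo"));
        System.out.println(q.matches(fish));                       // true: should is optional here
        System.out.println(q.minimumShouldMatch(1).matches(fish)); // false: one should clause must match
    }
}
```

An alternative with the same effect is to wrap the two should clauses in their own inner bool query and add that inner query as a must clause.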

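wildcardQuery is similar to LIKE:

boolQuery.must(QueryBuilders.wildcardQuery("scientificname", searchMessage + "*"));

PS: the query fails when the data contains spaces or symbols. A likely cause is that wildcardQuery is not analyzed and is compared against individual indexed terms, so on an analyzed text field a pattern spanning several words never matches; querying a keyword field avoids this.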
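Because searchMessage is concatenated straight into the pattern, any * or ? typed by the user acts as a wildcard. A minimal sketch of escaping the input first, assuming the backslash escape that Lucene wildcard patterns use; the WildcardEscaper class and its method names are hypothetical helpers, not part of the Elasticsearch client.

```java
/** Hypothetical helper that escapes user input before it is used in a wildcard pattern. */
final class WildcardEscaper {

    /** Prefixes the wildcard metacharacters *, ? and \ with a backslash. */
    static String escape(String input) {
        StringBuilder sb = new StringBuilder(input.length());
        for (char c : input.toCharArray()) {
            if (c == '*' || c == '?' || c == '\\') {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    /** Builds a "starts with" pattern from raw user input. */
    static String prefixPattern(String userInput) {
        return escape(userInput) + "*";
    }

    public static void main(String[] args) {
        // The * typed by the user is escaped; only the trailing * is a wildcard.
        System.out.println(prefixPattern("Alauda a*")); // prints Alauda a\**
    }
}
```

The result would then be passed in place of searchMessage + "*", for example QueryBuilders.wildcardQuery("scientificname", WildcardEscaper.prefixPattern(searchMessage)).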

3. Nested queries

SQL:

select * from biological where (type = 'bird' AND name = 'test') OR (name = 'demo');

BoolQueryBuilder boolQuery = QueryBuilders.boolQuery();
// (type = 'bird' AND name = 'test')
boolQuery.should(QueryBuilders.boolQuery()
        .must(QueryBuilders.termQuery("type", "bird"))
        .must(QueryBuilders.termQuery("name", "test")));
// OR (name = 'demo')
boolQuery.should(QueryBuilders.termQuery("name", "demo"));

4. matchQuery for searching text fields

matchQuery tokenizes the search text with the field's analyzer (the standard analyzer by default) and then searches for each resulting term.

public Page<NameDataList> NameDataList(String typeId, String searchMessage, int offset, HttpServletRequest request) {
    TaxondataQuery query = new TaxondataQuery();

    // Convert the record offset to a page number, assuming 10 records per page
    query.setPage(offset / 10);
    query.setQueryString(searchMessage);

    // Compound query
    BoolQueryBuilder boolQuery = QueryBuilders.boolQuery();
    boolQuery.must(QueryBuilders.termQuery("type", typeId));
    //boolQuery.filter(QueryBuilders.rangeQuery("pageSize"));

    // Search conditions, combined with the must clause
    MultiMatchQueryBuilder matchQuery = QueryBuilders.multiMatchQuery(query.getQueryString(), "scientificname",
            "chinesename");
    boolQuery.must(matchQuery);

    PageRequest pageRequest = PageRequest.of(query.getPage(), query.getSize());

    NativeSearchQuery searchQuery = new NativeSearchQueryBuilder().withQuery(boolQuery)
            .withHighlightFields(
                    new HighlightBuilder.Field("scientificname"),
                    new HighlightBuilder.Field("chinesename"))
            .withPageable(pageRequest).build();
    // extResultMapper is a custom result mapper that copies the highlighted fragments into the results
    Page<NameDataList> NameDataLists = elasticsearchTemplate.queryForPage(searchQuery, NameDataList.class, extResultMapper);

    return NameDataLists;
}

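5. query vs. filter

A bool query has four clause types: must, filter, should, and must_not (mustNot in the Java API).

BoolQueryBuilder boolQuery = QueryBuilders.boolQuery();
// "云雀" is the Chinese name for skylark
boolQuery.must(QueryBuilders.termQuery("chinesename", "云雀"));
SearchQuery searchQuery = new NativeSearchQueryBuilder()
        .withQuery(boolQuery)
        .build();
List<NameDataList> NameDataLists = elasticsearchTemplate.queryForList(searchQuery, NameDataList.class);
System.out.println(NameDataLists.toString());

filter is faster than query

In query context, ES checks whether each document matches, then computes a relevance score, and finally returns the documents.
In filter context, ES only decides whether each document matches. The outcome, whether a match or a non-match, can be cached, and no score is computed.
In short, filter is faster for two reasons:
    1. Results are cached
    2. No scoring is performed

Related reading: "Mastering the difference between Elasticsearch filter and query", CSDN blog by 铭毅天下 (also the name of the author's WeChat official account)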

6. Partial document update (Kibana)

POST index_name/_doc/document_id/_update
{
  "doc":{
    "source_id" : 1369907879588814852
  }
}
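On Elasticsearch 7.x the typeless form of this endpoint is POST index_name/_update/document_id. As a minimal sketch of preparing that call from plain Java (java.net.http, Java 11+), the helper below only builds the request and does not send it; the class name, base URL, and index name are illustrative assumptions.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper that builds a partial-update request for the typeless 7.x endpoint. */
final class PartialUpdateRequest {

    /**
     * @param baseUrl cluster address, e.g. http://localhost:9200 (assumed)
     * @param docJson JSON object containing only the fields to change
     */
    static HttpRequest build(String baseUrl, String index, String id, String docJson) {
        // The partial document goes under the "doc" key, as in the Kibana example.
        String body = "{\"doc\":" + docJson + "}";
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index + "/_update/" + id))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body, StandardCharsets.UTF_8))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = build("http://localhost:9200", "my_index", "1",
                "{\"source_id\":1369907879588814852}");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Sending it would be a matter of passing the request to java.net.http.HttpClient#send; index names and ids containing special characters would need URL encoding first.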

7. Bulk insert

PUT index_name/_bulk?refresh
{"index":{"_id": "1"}}
{"it":1627959130532,"larea" : ["其它"]}
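The _bulk body is newline-delimited JSON: one action line followed by one document line per item, and the body must end with a newline. A minimal sketch of assembling such a body in plain Java; BulkBody is a hypothetical helper, and it assumes the ids contain no characters that need JSON escaping and that each document is already serialized on a single line.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that assembles a newline-delimited _bulk body of index actions. */
final class BulkBody {

    /** docsById maps a document id to an already-serialized, single-line JSON document. */
    static String build(Map<String, String> docsById) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> entry : docsById.entrySet()) {
            // Action line: which operation to perform and on which id.
            sb.append("{\"index\":{\"_id\":\"").append(entry.getKey()).append("\"}}\n");
            // Source line: the document itself, followed by the mandatory newline.
            sb.append(entry.getValue()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> docs = new LinkedHashMap<>();
        docs.put("1", "{\"it\":1627959130532}");
        docs.put("2", "{\"it\":1627959130533}");
        System.out.print(build(docs));
    }
}
```

In application code the client's BulkRequest handles this formatting; the sketch is only meant to show what is sent over the wire.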
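Java code (RestHighLevelClient):

import com.google.gson.Gson;
import org.elasticsearch.action.ActionListener;
import org.elasticsearch.action.bulk.BulkItemResponse;
import org.elasticsearch.action.bulk.BulkRequest;
import org.elasticsearch.action.bulk.BulkResponse;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.support.WriteRequest;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.common.xcontent.XContentType; // org.elasticsearch.xcontent.XContentType from 7.16
import org.elasticsearch.rest.RestStatus;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class EsTest {

    private static final Logger log = LoggerFactory.getLogger(EsTest.class);
    private static final Gson GSON = new Gson();

    private final RestHighLevelClient restHighLevelClient;

    public EsTest(RestHighLevelClient restHighLevelClient) {
        this.restHighLevelClient = restHighLevelClient;
    }

    // key is a caller-supplied label used only in log messages; doc is the object to index.
    // "INDEX_NAME" and "custom-id" are placeholders (real index names must be lowercase).
    public void saveAll(String key, Object doc) {
        BulkRequest bulkRequest = new BulkRequest();
        IndexRequest indexRequest = new IndexRequest("INDEX_NAME").id("custom-id")
                .source(GSON.toJson(doc), XContentType.JSON);
        bulkRequest.add(indexRequest);
        try {
            if (bulkRequest.numberOfActions() > 0) {
                log.info("{}: number of documents saved to ES: {}", key, bulkRequest.numberOfActions());
                // Adjust the refresh policy as needed
                bulkRequest.setRefreshPolicy(WriteRequest.RefreshPolicy.IMMEDIATE);
                BulkResponse response = restHighLevelClient.bulk(bulkRequest, RequestOptions.DEFAULT);
                if (response.hasFailures()) {
                    log.error("{}: ES save failed: {}", key, response.buildFailureMessage());
                }
            }
        } catch (Exception e) {
            log.error("ES save failed with exception", e);
        }
    }

    public void saveAsync(Object doc) {
        BulkRequest bulkRequest = new BulkRequest();
        IndexRequest indexRequest = new IndexRequest("INDEX_NAME").id("custom-id")
                .source(GSON.toJson(doc), XContentType.JSON);
        bulkRequest.add(indexRequest);
        // Adjust the refresh policy as needed
        bulkRequest.setRefreshPolicy(WriteRequest.RefreshPolicy.IMMEDIATE);
        restHighLevelClient.bulkAsync(bulkRequest, RequestOptions.DEFAULT, new ActionListener<BulkResponse>() {
            @Override
            public void onResponse(BulkResponse bulkItemResponses) {
                for (BulkItemResponse bulkItemResponse : bulkItemResponses) {
                    if (bulkItemResponse.isFailed()) { // did this individual operation fail?
                        // The failure object carries the cause and the HTTP status
                        BulkItemResponse.Failure failure = bulkItemResponse.getFailure();
                        log.error("[dblog] failed to insert data into elasticsearch:", failure.getCause());
                        if (failure.getStatus() == RestStatus.BAD_REQUEST) {
                            log.error("[dblog] id={} is an invalid request!", bulkItemResponse.getId());
                        }
                    }
                }
                log.info("INDEX ELASTICSEARCH SUCCESS BULK SIZE: {}", bulkItemResponses.getItems().length);
            }

            @Override
            public void onFailure(Exception e) {
                log.error("INDEX ELASTICSEARCH ERROR", e);
            }
        });
    }

}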

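8. Bulk update

Before ES 5.5, the script body goes in the inline field.
After ES 6.5, the script body goes in the source field.

Scripts:

Language     Sandboxed   Required plugin
painless     yes         built-in
groovy       no          built-in
javascript   no          lang-javascript
python       no          lang-python

The "script" keyword indicates that the document is modified by a script.

"lang" specifies the scripting language; "painless" is ES's built-in scripting language. ES has also supported other scripting languages, such as Python and JavaScript.

"inline" (or "source", depending on the version) holds the script body; "ctx" is the ES context and _source is the document.

POST /index_name/_update_by_query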
{
  "query": {
    "match": {
      "cont": "Hong Kong"
    }
  },
  "script": {
    "lang": "painless",
    "source": "ctx._source.msg_id = params.msg_id",
    "params": {
      "msg_id": "fe01ce2a7fbac8fafaed7c982a04e229"
    }
  }
}

Related link: https://blog.csdn.net/qq330983778/article/details/103539418

9. Delete part of an index's data

POST index_name/_delete_by_query
{
  "query": {
    "range": {
      "insert_time": {
        "lte": "now-15d"
      }
    }
  }
}

Delete all data in an index

POST index_name/_delete_by_query
{
  "query": { 
    "match_all": {
    }
  }
}

10. Add fields to an existing index

PUT /index_name/_mapping/
{
  "properties": {
    "media_id": {
      "type": "keyword",
      "ignore_above": 256
    },
    "media_name": {
      "type": "keyword",
      "ignore_above": 256
    }
  }
}

11. Query documents whose field length exceeds a threshold

Use a filter clause with a regexp query:

GET /index_name/_search
{
  "_source": "cont",
  "size": 100, 
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "_ch": {
              "value": "24"
            }
          }
        },
        {
          "term": {
            "lang": {
              "value": "en"
            }
          }
        }
      ],
      "filter": {
        "regexp": {
          "cont": {
            "value": ".{100,}"
          }
        }
      }
    }
  }
}
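The pattern .{100,} reads as "any character, repeated at least 100 times", and a regexp query has to match the whole term, much like Matcher.matches() in Java. Since regexp is a term-level query, on an analyzed text field it is tested against each token rather than the full value, so a keyword field is the safer target for a length check. The sketch below uses Java's own regex engine purely as a local stand-in for trying the pattern out (it is not Lucene's engine), and MinLengthPattern is a hypothetical helper.

```java
import java.util.regex.Pattern;

/** Hypothetical helper that builds an "at least n characters" pattern. */
final class MinLengthPattern {

    /** Returns a pattern such as ".{100,}" for n = 100. */
    static String atLeast(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must be >= 0");
        }
        return ".{" + n + ",}";
    }

    public static void main(String[] args) {
        Pattern pattern = Pattern.compile(atLeast(100));
        // matches() requires the whole input to match, mirroring whole-term matching.
        System.out.println(pattern.matcher("a".repeat(100)).matches()); // true
        System.out.println(pattern.matcher("a".repeat(99)).matches());  // false
    }
}
```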

12. nested aggregation (aggregate on inner fields, then on outer fields)

{
  "size": 0,
  "aggregations": {
    "aggByNest": {
      "nested": {
        "path": "nested_name"
      },
      "aggregations": {
        "termsAgg": {
          "terms": {
            "field": "nested_name.uid",
            "size": 10
          },
          "aggregations": {
            "reverse_path": {
              "reverse_nested": {},
              "aggregations": {
                "cardinalityAgg": {
                  "cardinality": {
                    "field": "title.keyword"
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

Java code:

NestedAggregationBuilder aggByNest = AggregationBuilders
        .nested(CommonConstant.AGG_NEST, CommonConstant.ES_NEST)
        .subAggregation(AggregationBuilders.terms(CommonConstant.AGG_TERMS)
                .size(10).field(CommonConstant.ES_NEST_NAME)
                .subAggregation(AggregationBuilders.reverseNested(CommonConstant.AGG_REVERSE_PATH)
                        // cardinality (distinct count) on a field of the outer document
                        .subAggregation(AggregationBuilders.cardinality(CommonConstant.AGG_CARDINALITY)
                                .field(CommonConstant.ES_TITLE))));

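13. Aggregation error: too_many_buckets_exception

Cause: in an aggregation query, very high data cardinality or a poorly designed query makes the query expensive for ES and can even trigger a too_many_buckets_exception. When the query really is needed as written, raise the max_buckets threshold.

Solutions

1. Raise search.max_buckets in the cluster settings; set it only as high as needed.

PUT /_cluster/settings
{"persistent": {"search.max_buckets": 200000}}

2. Or add query conditions to reduce the amount of data being aggregated (for example, start and end times).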

14. Settings

    "settings" : {
      "index" : {
        "max_result_window" : "100000",
        "refresh_interval" : "10s",
        "number_of_shards" : "5",
        "translog" : {
          "flush_threshold_size" : "1024mb",
          "sync_interval" : "30s",
          "durability" : "async"
        },
        "number_of_replicas" : "1"
      }
    }

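15. Add or remove an alias

ES has no operation for modifying an alias; remove it and then add it again. Both actions can be sent in a single _aliases request.

1. Add an alias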

POST _aliases
{
  "actions": [
    {
      "add": {
        "index": "index_name",
        "alias": "index_read"
      }
    }
  ]
}

2. Remove an alias

POST _aliases
{
  "actions": [
    {
      "remove": {
        "index": "index_name",
        "alias": "index_read"
      }
    }
  ]
}

3. Set is_write_index to false

POST _aliases
{
  "actions": [
    {
      "add": {
        "index": "index_name",
        "alias": "index_write",
        "is_write_index": false
      }
    }
  ]
}

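16. Highlighted search

import cn.hutool.core.util.ObjectUtil;
import cn.hutool.core.util.StrUtil;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.BoolQueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.elasticsearch.search.fetch.subphase.highlight.HighlightBuilder;
import org.elasticsearch.search.fetch.subphase.highlight.HighlightField;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.util.Map;

// StrUtil and ObjectUtil come from Hutool; SearchCondition and CommonConstant are project classes.
public class EsTest {

    private static final Logger log = LoggerFactory.getLogger(EsTest.class);

    private final static String htmlMarkFirst = "<span style=\"color:red;\">";

    private final static String htmlMarkLast = "</span>";

    private final RestHighLevelClient restHighLevelClient;

    public EsTest(RestHighLevelClient restHighLevelClient) {
        this.restHighLevelClient = restHighLevelClient;
    }

    private SearchSourceBuilder queryBuilderByKeyword(SearchCondition condition) {
        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
        final BoolQueryBuilder queryMustBuilder = QueryBuilders.boolQuery();
        // Declared outside the if block so it is in scope when attached to the source builder
        HighlightBuilder highlightBuilder = new HighlightBuilder();
        String mcObjectCode = condition.getMcObjectCode();
        String keyword = condition.getKeyword();
        if (StrUtil.isNotBlank(mcObjectCode) && StrUtil.isNotBlank(keyword)) {
            // Turn space-separated keywords into "a OR b OR c"
            String mustKey = keyword.trim().replaceAll(" +", " OR ");
            queryMustBuilder.should(QueryBuilders.queryStringQuery(mustKey).field("author_name"));
            queryMustBuilder.should(QueryBuilders.queryStringQuery(mustKey).field("author_desc"));
            highlightBuilder.field("author_name").field("author_desc")
                    .preTags(htmlMarkFirst).postTags(htmlMarkLast);
        }
        searchSourceBuilder.query(queryMustBuilder).highlighter(highlightBuilder);
//      log.info("request builder: {}", searchSourceBuilder.toString());
        return searchSourceBuilder;
    }

    public SearchHit[] query(SearchCondition condition) {
        try {
            SearchRequest searchRequest = new SearchRequest(CommonConstant.INDEX_NAME).source(
                    queryBuilderByKeyword(condition));
            SearchResponse search = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
            return search.getHits().getHits();
        } catch (Exception e) {
            log.error("SEARCH CONDITION: {} ELASTICSEARCH ERROR", condition, e);
            return null;
        }
    }

    // Runs the query and replaces the original field values with the highlighted fragments
    public SearchHit[] queryWithHighlight(SearchCondition condition) {
        SearchHit[] searchHits = query(condition);
        if (searchHits == null) {
            return new SearchHit[0];
        }
        for (SearchHit hit : searchHits) {
            Map<String, HighlightField> highlightFields = hit.getHighlightFields();
            if (ObjectUtil.isNotEmpty(highlightFields.get("author_name"))) {
                hit.getSourceAsMap().put("author_name",
                        highlightFields.get("author_name").getFragments()[0].toString());
            }
            if (ObjectUtil.isNotEmpty(highlightFields.get("author_desc"))) {
                hit.getSourceAsMap().put("author_desc",
                        highlightFields.get("author_desc").getFragments()[0].toString());
            }
        }
        return searchHits;
    }

}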

17. Index templates

PUT %3Cdemo-%7Bnow%2Fd%7D-000001%3E
{
  "aliases": {
    "demo_write": {
      "is_write_index": true
    }
  }
}
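The request path above is the percent-encoded form of the date math index name <demo-{now/d}-000001>, which ES resolves using the current date when the index is created. A small sketch of producing that encoding in Java; DateMathIndexName is a hypothetical helper, and java.net.URLEncoder is used here only because the name contains no spaces (URLEncoder would turn a space into +).

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper that percent-encodes a date math index name for use in a request path. */
final class DateMathIndexName {

    static String encode(String dateMathName) {
        // Encodes <, >, {, } and / so the name survives as a single path segment.
        return URLEncoder.encode(dateMathName, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(encode("<demo-{now/d}-000001>"));
        // prints %3Cdemo-%7Bnow%2Fd%7D-000001%3E
    }
}
```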

PUT /_template/demo_template
{
  "index_patterns": [
    "demo-*"
  ],
  "aliases": {
    "demo_read": {}
  },
  "settings": {
    "index": {
      "max_result_window": "100000",
      "refresh_interval": "5s",
      "number_of_shards": "5",
      "translog": {
        "flush_threshold_size": "1024mb",
        "sync_interval": "30s",
        "durability": "async"
      },
      "number_of_replicas": "1"
    }
  },
  "mappings": {
    "properties": {
      "demo_url": {
        "type": "keyword"
      },
      "demo_id": {
        "type": "keyword"
      },
      "demo_type": {
        "type": "short"
      },
      "pt": {
        "type": "date"
      },
      "labels": {
        "type": "nested",
        "properties": {
          "user_id": {
            "ignore_above": 256,
            "type": "keyword"
          },
          "score": {
            "index": false,
            "store": false,
            "type": "float",
            "doc_values": false
          }
        }
      }
    }
  }
}

18. routing

See: "Mastering the Elasticsearch routing feature" (Elastic 中文社区, the Elastic Chinese community)

19. top_hits aggregation (fetch document details after aggregating)

{
  "from": 0,
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "title": "十一"
          }
        }
      ]
    }
  },
  "aggregations": {
    "termsAgg": {
      "terms": {
        "field": "title.keyword",
        "size": 50,
        "min_doc_count": 1,
        "shard_min_doc_count": 0,
        "show_term_doc_count_error": false
      },
      "aggregations": {
        "topHitsAgg": {
          "top_hits": {
            "from": 0,
            "size": 1,
            "version": false,
            "seq_no_primary_term": false,
            "explain": false,
            "sort": [
              {
                "hot_value": {
                  "order": "desc"
                }
              }
            ]
          }
        }
      }
    }
  }
}

Java code:

    // client, mapper, log and searchRequestBuilder(...) are assumed to be provided by the enclosing class.
    public List<TrendingResponse> queryTrending(SearchCondition condition) {
        List<TrendingResponse> responseList = new ArrayList<>();
        BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery()
                .must(QueryBuilders.matchQuery("title", condition.getTitle()));
        // size 0: only the aggregation results are needed, as in the JSON request
        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder().query(boolQueryBuilder).size(0);

        TermsAggregationBuilder termsAggregationBuilder = termsAggregationBuilder("termsAgg",
                "title.keyword", 50)
                .subAggregation(topHitsAggregationBuilder("topHitsAgg", 1));
        SearchSourceBuilder aggregation = searchSourceBuilder.aggregation(termsAggregationBuilder);
        SearchRequest searchRequest = searchRequestBuilder(aggregation);
        try {
            SearchResponse search = client.search(searchRequest, RequestOptions.DEFAULT);
            ParsedTerms parsedTermsPlatform = search.getAggregations().get("termsAgg");
            List<TrendingResponse> collect = parsedTermsPlatform.getBuckets().stream().map(p -> {
                // Each bucket carries its top hit, so the document details are available directly
                ParsedTopHits parsedTopHits = p.getAggregations().get("topHitsAgg");
                SearchHit[] hits = parsedTopHits.getHits().getHits();
                String id = hits[0].getId();
                Map<String, Object> sourceAsMap = hits[0].getSourceAsMap();
                sourceAsMap.put("id", id);
                return mapper.map(sourceAsMap, TrendingResponse.class);
            }).collect(Collectors.toList());
            responseList.addAll(collect);
        } catch (IOException e) {
            // On failure, log the error and return the empty list
            log.error("SEARCH IS ERROR: ", e);
        }
        return responseList;
    }

    public TermsAggregationBuilder termsAggregationBuilder(String termsAggName, String fieldName, Integer size) {
        return AggregationBuilders.terms(termsAggName).field(fieldName).size(size).executionHint("map");
    }

    public TopHitsAggregationBuilder topHitsAggregationBuilder(String topHitsName, Integer size) {
        // Sort field matches the JSON request (hot_value, descending)
        return AggregationBuilders.topHits(topHitsName).sort("hot_value", SortOrder.DESC).size(size);
    }

20. Migrating ES data with elasticsearch-dump

"Using ElasticSearch-dump for data migration and backup" (CSDN blog by 刘李404not found): https://blog.csdn.net/qq_39680564/article/details/118539979

elasticsearch-dump documentation: https://github.com/elasticsearch-dump/elasticsearch-dump

21. _reindex: copy data from an old index to a new one

POST _reindex
{
  "source": {
    "index": "test_data_1"
  },
  "dest": {
    "index": "test_data_2"
  }
}

Related reading:

"elasticsearch (es) query API: sorting, pagination and range queries on result sets; querying a field that is neither null nor empty; group aggregation and distinct" (CSDN blog by 好大的月亮): https://blog.csdn.net/weixin_43944305/article/details/119560604

"Elasticsearch Java Rest Client API summary (2): Search API" (ReyCG, 博客园): https://www.cnblogs.com/reycg-blog/p/9946821.html

ES aggregation optimization:

"An ES performance optimization that showed me the truth about big data" (知乎, by 李猛): https://zhuanlan.zhihu.com/p/462391323
