Getting Started with ES and Basic Usage
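Project background:
Part of the project involves IM (instant messaging). Every message is stored as a JSON string in a single table, structured roughly as follows: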
id | session_id | Message |
---|---|---|
1 | 1 | [json] |
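The product team asked for the ability to search sessions by message keyword. Because the messages are stored as JSON, searching them directly with SQL is awkward, so we introduced Elasticsearch (ES).
Official documentation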
https://www.elastic.co/guide/en/elasticsearch/reference/6.0/getting-started.html
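At the time of writing the latest release is 7.8; our project uses 7.6.
The official site has detailed installation, configuration, and usage guides. Read the official documentation carefully if you want to go deeper.
Brief introduction
Elasticsearch is document oriented, which means it can store entire objects or documents. It does more than store them, though: it also **indexes** the content of each document to make it searchable. In Elasticsearch you index, search, sort, and filter documents rather than rows and columns of data. This way of thinking about data is very different from the traditional one, and it is one of the reasons Elasticsearch can perform complex full-text search.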
Comparison with a traditional database
Relational DB -> Databases -> Tables -> Rows -> Columns
Elasticsearch -> Indices -> Types -> Documents -> Fields
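An Elasticsearch cluster can contain multiple indices (databases), each index can contain multiple types (tables), each type holds multiple documents (rows), and each document has multiple fields (columns). Note that mapping types are deprecated in 7.x, where an index effectively holds a single type (`_doc`), so the Types level of this analogy mainly applies to older versions.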
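Basic usage
1. query
Single condition, single field
"query": {
  "match": {
    "event": { // field name
      "query": "BitSea 352439384", // text to search for; whitespace splits it into terms
      "operator": "and" // "and" means every term must match; the default is "or"
    }
  }
}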
Corresponding Java code
// Query condition
QueryBuilder queryBuilder = QueryBuilders
        .matchQuery("event", "BitSea 352439384").operator(Operator.AND);
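To make the difference between the two operators concrete, here is a minimal plain-Java sketch. It is a simplified model and not how Elasticsearch works internally: the whitespace-and-lowercase `tokenize` helper stands in for a real analyzer, and the class and method names are made up for this illustration.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Simplified model of a match query's "operator"; not Elasticsearch's implementation. */
class MatchOperatorSketch {

    /** Splits on whitespace and lowercases; a stand-in for a real analyzer (assumption). */
    static Set<String> tokenize(String text) {
        Set<String> tokens = new HashSet<>();
        for (String t : text.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    /** requireAll = true models "operator": "and"; false models the default "or". */
    static boolean matches(String fieldValue, String query, boolean requireAll) {
        Set<String> docTokens = tokenize(fieldValue);
        Set<String> queryTokens = tokenize(query);
        if (queryTokens.isEmpty()) {
            return false;
        }
        if (requireAll) {
            return docTokens.containsAll(queryTokens);
        }
        for (String t : queryTokens) {
            if (docTokens.contains(t)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String doc = "BitSea order created";
        System.out.println(matches(doc, "BitSea 352439384", true));  // prints "false"
        System.out.println(matches(doc, "BitSea 352439384", false)); // prints "true"
    }
}
```

With `and`, a document that contains only one of the two terms is rejected; with the default `or`, it is still a hit.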
Single condition, multiple fields
"query": {
  "multi_match": {
    "query": "search text",
    "fields": [ "field1", "field2" ]
  }
}
Java code
QueryBuilder queryBuilder = QueryBuilders.multiMatchQuery("BitSea 352439384", "field1", "field2").operator(Operator.AND);
Multiple conditions, multiple fields
"query": {
  "bool": {
    "should": [
      {"match": { "event": "TEXT"}},
      {"match": { "FIELD": "TEXT"}}
    ]
  }
}
Java code
// keyword is the text being searched for
BoolQueryBuilder queryBuilder = QueryBuilders.boolQuery();
queryBuilder.should(QueryBuilders.matchQuery("cname", keyword));
queryBuilder.should(QueryBuilders.matchQuery("industry", keyword));
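When a bool query contains only `should` clauses, a document is a hit if at least one clause matches. The following plain-Java sketch models that rule with predicates. It is an illustration under simplifying assumptions: the `fieldContains` helper uses a plain substring test instead of analyzed matching, and all names are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Simplified model of a bool query that has only "should" clauses. */
class BoolShouldSketch {

    /** A clause that is true when the named field contains the keyword (plain substring test, assumption). */
    static Predicate<Map<String, String>> fieldContains(String field, String keyword) {
        return doc -> doc.getOrDefault(field, "").contains(keyword);
    }

    /** A document matches if any one of the clauses matches. */
    static boolean should(Map<String, String> doc, List<Predicate<Map<String, String>>> clauses) {
        return clauses.stream().anyMatch(clause -> clause.test(doc));
    }

    public static void main(String[] args) {
        List<Predicate<Map<String, String>>> clauses = List.of(
                fieldContains("cname", "Acme"),
                fieldContains("industry", "Acme"));
        System.out.println(should(Map.of("cname", "Acme Ltd", "industry", "retail"), clauses)); // prints "true"
        System.out.println(should(Map.of("cname", "Other", "industry", "retail"), clauses));    // prints "false"
    }
}
```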
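2. Aggregations
I use aggregations for deduplication and for counting totals. They offer many more functions than that and are very powerful.
Continuing with the example above:
Deduplication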
{
  "query": {
    "match": {
      "event": {
        "query": "BitSea 352439384",
        "operator": "and"
      }
    }
  },
  "aggs": {
    "name": { // custom name for the deduplication aggregation
      "terms": {
        "field": "session_id",
        "size": 10 // number of buckets to return
      }
    }
  },
  "size": 1,
  "from": 0
}
aggs sits at the same level as query. The request above deduplicates the query results by the session_id field and returns up to 10 buckets. Here is the Java code:
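// Aggregation builder, mainly used for pagination and deduplication
// "去重字段名字" ("dedup field name") is a custom aggregation name; it is used again when reading the result
AggregationBuilder aggregation = AggregationBuilders
        .terms("去重字段名字").field("session_id").size(10);
// Build the query
QueryBuilder queryBuilder = QueryBuilders
        .matchQuery("event", "BitSea 352439384").operator(Operator.AND);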
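What the terms aggregation does to the matched documents can be pictured as a group-by followed by a top-N cut. The sketch below reproduces that idea in plain Java over an in-memory list of session IDs. It is only a model: the class, the `Bucket` type, and the sample data are made up, and buckets are ordered by document count descending (the default terms ordering), with ties broken by key to keep the output deterministic.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Simplified in-memory model of a terms aggregation on session_id. */
class TermsAggSketch {

    /** One bucket: a distinct key and the number of hits that carry it. */
    static final class Bucket {
        final long key;
        final long docCount;

        Bucket(long key, long docCount) {
            this.key = key;
            this.docCount = docCount;
        }
    }

    /** Groups hits by session_id and returns at most `size` buckets, largest doc count first. */
    static List<Bucket> termsBySessionId(List<Long> sessionIdsOfHits, int size) {
        Map<Long, Long> counts = new HashMap<>();
        for (Long id : sessionIdsOfHits) {
            counts.merge(id, 1L, Long::sum);
        }
        List<Bucket> buckets = new ArrayList<>();
        for (Map.Entry<Long, Long> e : counts.entrySet()) {
            buckets.add(new Bucket(e.getKey(), e.getValue()));
        }
        // Doc count descending; ties broken by key ascending
        buckets.sort((a, b) -> a.docCount != b.docCount
                ? Long.compare(b.docCount, a.docCount)
                : Long.compare(a.key, b.key));
        return new ArrayList<>(buckets.subList(0, Math.min(size, buckets.size())));
    }

    public static void main(String[] args) {
        // Six matching messages spread over three sessions
        List<Long> hits = List.of(7L, 3L, 7L, 9L, 3L, 7L);
        for (Bucket b : termsBySessionId(hits, 2)) {
            System.out.println("key: " + b.key + ", doc_count: " + b.docCount);
        }
    }
}
```

Each bucket key is one distinct session_id, which is why a terms aggregation works as a deduplication step: six hits collapse into one entry per session.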
Counting the total
"aggs": {
  "count": { // total number of distinct values
    "cardinality": {
      "field": "session_id"
    }
  },
  "test": {
    "terms": {
      "field": "session_id",
      "size": 10,
      "order": {
        "_key": "desc"
      }
    }
  }
}
Java code
AggregationBuilder aggregation = AggregationBuilders.cardinality("count").field("session_id");
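Conceptually, the cardinality aggregation answers "how many distinct session_id values are there among the hits". The plain-Java sketch below computes that exactly with a set. Keep in mind that Elasticsearch's cardinality aggregation is approximate (it is based on HyperLogLog++), so on large data the real value can differ slightly. The `totalPages` helper is an assumption about how such a total might feed pagination; it does not come from the project code.

```java
import java.util.HashSet;
import java.util.List;

/** Exact in-memory counterpart of a cardinality aggregation on session_id. */
class CardinalitySketch {

    /** Number of distinct session IDs among the hits. */
    static long distinctCount(List<Long> sessionIdsOfHits) {
        return new HashSet<>(sessionIdsOfHits).size();
    }

    /** Hypothetical helper: number of pages needed to list every distinct session. */
    static long totalPages(long distinctSessions, int pageSize) {
        return (distinctSessions + pageSize - 1) / pageSize;
    }

    public static void main(String[] args) {
        List<Long> hits = List.of(7L, 3L, 7L, 9L, 3L, 7L);
        long distinct = distinctCount(hits);
        System.out.println(distinct);                // prints "3"
        System.out.println(totalPages(distinct, 2)); // prints "2"
    }
}
```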
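3. Search
Once the query conditions are assembled, run the search. The Java code is as follows:
SearchSourceBuilder ssb = new SearchSourceBuilder()
        .query(queryBuilder)          // combined query conditions
        .sort(sortBuilder)            // sort conditions
        .aggregation(aggregation)     // the deduplication aggregation from above
        .aggregation(aggregation2);   // the total-count (cardinality) aggregation from above
SearchRequest sr = new SearchRequest(new String[]{"cs_session_session_event_sharding_alias"}); // your own index name
sr.source(ssb);
SearchResponse searchResponse = esClient.search("searchCluster", sr); // run the search and get the response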
Reading the results
Terms sessionIds = searchResponse.getAggregations().get("去重字段名字"); // the custom name chosen earlier; one bucket per distinct session_id
Cardinality cardinality = searchResponse.getAggregations().get("count"); // must match the name given to the cardinality aggregation
System.out.println(cardinality.getValue());
for (Terms.Bucket entry : sessionIds.getBuckets()) {
    String key = String.valueOf(entry.getKey()); // each session_id
    long docCount = entry.getDocCount();         // number of hits in that session
    System.out.println("key: " + key + ", doc_count: " + docCount);
}
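Important notes
Special characters
title+-&&||!(){}[]^\"~*?:\\
All of these need to be escaped.
You need to add the lucene-queryparser jar as a dependency, choosing the version that matches your ES version: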
<dependency>
<groupId>org.apache.lucene</groupId>
<artifactId>lucene-queryparser</artifactId>
<version>5.4.1</version>
</dependency>
Then call QueryParser.escape(String query) to escape the input.
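For readers who want to see what this escaping amounts to, here is a dependency-free sketch that prefixes each reserved character with a backslash. The character set reflects my understanding of what Lucene's QueryParser.escape covers and may differ between versions, so treat it as an illustration and prefer the library method in real code. The class name is made up.

```java
/** Dependency-free sketch of backslash-escaping query syntax characters. */
class EscapeSketch {

    // Characters treated as query syntax (assumed set; check your Lucene version)
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&/";

    /** Returns the input with every special character preceded by a backslash. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 8);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("title:(a+b)")); // prints "title\:\(a\+b\)"
    }
}
```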
Using analyzers
https://blog.csdn.net/ahwsk/article/details/101270300
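Aside
ES 7.x also supports SQL queries, and I recommend that approach, although more complex logic is sometimes not supported. The supported SQL keywords are listed in the official documentation.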
public static List<String> searchLike(EsClient esClient, Map<String, Object> queryParams, String uniqueField, @Nullable Integer size) {
    // Deduplication: one bucket per distinct value of uniqueField
    // (unique is an aggregation-name constant that is not shown in this article)
    AggregationBuilder doWeight = AggregationBuilders.terms(unique)
            .field(uniqueField)               // field to deduplicate on
            .order(BucketOrder.key(false))    // order buckets by key, descending
            .size(size == null ? 500 : size); // number of buckets to return
    // Count the total
    // AggregationBuilder sum = AggregationBuilders.cardinality(count).field(uniqueField);
    if (queryParams.isEmpty()) {
        throw new BizException("ES query parameters are empty");
    }
    QueryBuilder queryBuilder = null;
    // Note: each iteration overwrites queryBuilder, so only the last entry takes effect
    for (Map.Entry<String, Object> entry : queryParams.entrySet()) {
        queryBuilder = QueryBuilders.matchPhraseQuery("event", QueryParser.escape(entry.getValue().toString()));
        // queryBuilder = QueryBuilders.matchQuery(entry.getKey(), QueryParser.escape(entry.getValue().toString())).operator(Operator.AND);
    }
    SortBuilder sortBuilder = SortBuilders.fieldSort(uniqueField).order(SortOrder.DESC);
    SearchSourceBuilder ssb = new SearchSourceBuilder()
            .query(queryBuilder).sort(sortBuilder).aggregation(doWeight);
    SearchRequest sr = new SearchRequest(new String[]{EsIndexConstant.eventIndex});
    sr.source(ssb);
    List<String> result = Lists.newArrayList();
    try {
        SearchResponse searchResponse = esClient.search("searchCluster", sr);
        Terms terms = searchResponse.getAggregations().get(unique);
        // Cardinality cardinality = searchResponse.getAggregations().get(count);
        for (Terms.Bucket bucket : terms.getBuckets()) {
            result.add(String.valueOf(bucket.getKey()));
        }
    } catch (Exception e) {
        log.error("ES query failed, params {}, error {}", queryParams.toString(), e.getMessage());
        e.printStackTrace();
    }
    return result;
}
public static List<String> searchMatch(EsClient esClient, Map<String, Object> queryParams, String uniqueField, @Nullable Integer size) {
    // Deduplication: one bucket per distinct value of uniqueField
    AggregationBuilder doWeight = AggregationBuilders.terms(unique)
            .field(uniqueField)               // field to deduplicate on
            .order(BucketOrder.key(false))    // order buckets by key, descending
            .size(size == null ? 500 : size); // number of buckets to return
    if (queryParams.isEmpty()) {
        throw new BizException("ES query parameters are empty");
    }
    QueryBuilder queryBuilder = null;
    // Note: each iteration overwrites queryBuilder, so only the last entry takes effect
    for (Map.Entry<String, Object> entry : queryParams.entrySet()) {
        // queryBuilder = QueryBuilders.matchPhraseQuery("event", QueryParser.escape(entry.getValue().toString()));
        queryBuilder = QueryBuilders.matchQuery(entry.getKey(), QueryParser.escape(entry.getValue().toString())).operator(Operator.AND);
    }
    List<String> result = Lists.newArrayList();
    SortBuilder sortBuilder = SortBuilders.fieldSort(uniqueField).order(SortOrder.DESC);
    SearchSourceBuilder ssb = new SearchSourceBuilder()
            .query(queryBuilder).sort(sortBuilder).aggregation(doWeight);
    SearchRequest sr = new SearchRequest(new String[]{EsIndexConstant.eventIndex});
    sr.source(ssb);
    try {
        SearchResponse searchResponse = esClient.search("searchCluster", sr);
        Terms terms = searchResponse.getAggregations().get(unique);
        // Cardinality cardinality = searchResponse.getAggregations().get(count);
        for (Terms.Bucket bucket : terms.getBuckets()) {
            result.add(String.valueOf(bucket.getKey()));
        }
    } catch (Exception e) {
        log.error("ES query failed, params {}, error {}", queryParams.toString(), e.getMessage());
        e.printStackTrace();
    }
    return result;
}
More concrete examples can be found at https://www.cnblogs.com/ghj1976/p/5293250.html