Installing with Docker
Install Elasticsearch
docker pull elasticsearch
docker run -d --restart=always -p 9200:9200 -v ~/elasticsearch/data:/usr/share/elasticsearch/data --name elasticsearch elasticsearch
Install Kibana
docker pull kibana
docker run -d -p 5601:5601 --link elasticsearch:elasticsearch -e ELASTICSEARCH_URL=http://elasticsearch:9200 --name kibana kibana
Install ElasticSearch Head
Search for "ElasticSearch Head" in the Chrome Web Store
Concepts
- Analogy with MySQL
Create, read, update and delete all use JSON
Elasticsearch | MySQL |
---|---|
index | database |
type | table |
document | row |
field | column |
mapping | schema |
everything is indexed | index |
PUT index/type/id | insert |
DELETE index/type/id | delete |
PUT index/type/id | update |
GET index/type/id | select |
- By default each index has 10 shards: 5 primary shards and 5 replica shards (one replica per primary)
Cluster health check
GET /_cluster/health
Basic operations
In Java, use ElasticsearchCrudRepository for simple CRUD and ElasticsearchTemplate for more complex operations
- Insert three records
PUT /ecommerce/product/1
{
"name": "gaolujie yagao",
"desc": "gaoxiao meibai",
"price": 30,
"producer": "gaolujie producer",
"tags": [
"meibai",
"fangzhu"
]
}
PUT /ecommerce/product/2
{
"name": "jiajieshi yagao",
"desc": "youxiao fangzhu",
"price": 25,
"producer": "jiajieshi producer",
"tags": [
"fangzhu"
]
}
PUT /ecommerce/product/3
{
"name": "zhonghua yagao",
"desc": "caoben zhiwu",
"price": 40,
"producer": "zhonghua producer",
"tags": [
"qingxin"
]
}
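- Update a record
PUT replaces the whole document, so an update must include every field; otherwise only the fields sent in the request remain
PUT /ecommerce/product/1
{
"name": "gaolujie yagao"
}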
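The full-replacement behaviour of PUT can be modelled with plain maps. This is only a minimal sketch, not Elasticsearch client code: `FullReplaceDemo`, `put` and `mergeThenPut` are hypothetical names that mimic replacing a document versus reading it, merging the change, and writing everything back.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of document storage; class and method names are illustrative only.
final class FullReplaceDemo {

    // Mimics PUT /index/type/id: the request body becomes the whole document.
    static Map<String, Object> put(Map<String, Object> stored, Map<String, Object> body) {
        return new LinkedHashMap<>(body);
    }

    // Read-modify-write: copy the stored document, apply the changes, then PUT everything.
    static Map<String, Object> mergeThenPut(Map<String, Object> stored, Map<String, Object> changes) {
        Map<String, Object> merged = new LinkedHashMap<>(stored);
        merged.putAll(changes);
        return put(stored, merged);
    }

    public static void main(String[] args) {
        Map<String, Object> stored = new LinkedHashMap<>();
        stored.put("name", "gaolujie yagao");
        stored.put("desc", "gaoxiao meibai");
        stored.put("price", 30);

        Map<String, Object> changes = new LinkedHashMap<>();
        changes.put("price", 31);

        System.out.println(put(stored, changes));          // only the price field is left
        System.out.println(mergeThenPut(stored, changes)); // all fields kept, price updated
    }
}
```

Sending only the changed field through `put` loses `name` and `desc`, which is the pitfall noted above; `mergeThenPut` keeps them.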
- Get a record
GET /ecommerce/product/1
{
"_index": "ecommerce",
"_type": "product",
"_id": "1",
"_version": 3, //version number, incremented by 1 on every modification
"found": true,
"_source": { //the document body
"name": "gaolujie yagao",
"desc": "gaoxiao meibai",
"price": 31,
"producer": "gaolujie producer",
"tags": [
"meibai",
"fangzhu"
]
}
}
- Delete a record
DELETE /ecommerce/product/3
Search methods
1.query string search
Query all
GET /ecommerce/product/_search
Response:
{
"took": 1, //time taken, in ms
"timed_out": false, //whether the request timed out
"_shards": { //shards
"total": 5, //we only have one machine, so these are the 5 default primary shards
"successful": 5,
"failed": 0
},
"hits": {
"total": 2, //number of results: 2 documents
"max_score": 1, //highest score
"hits": [
{
"_index": "ecommerce",
"_type": "product",
"_id": "2",
"_score": 1, //relevance score
"_source": { //the document body
"name": "jiajieshi yagao",
"desc": "youxiao fangzhu",
"price": 25,
"producer": "jiajieshi producer",
"tags": [
"fangzhu"
]
}
},
{
"_index": "ecommerce",
"_type": "product",
"_id": "1",
"_score": 1,
"_source": {
"name": "gaolujie yagao",
"desc": "gaoxiao meibai",
"price": 31,
"producer": "gaolujie producer",
"tags": [
"meibai",
"fangzhu"
]
}
}
]
}
}
Conditional query
GET /ecommerce/product/_search?q=name:yagao&sort=price:desc
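A query string search is just a URL with `q` and `sort` parameters, so it can be assembled with the standard library. The sketch below is an assumption-laden illustration: `QueryStringSearchUrl` and `searchPath` are made-up names, and the values are percent-encoded so that characters such as `:` survive in a URL.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper that builds the path and query string shown above.
final class QueryStringSearchUrl {

    static String searchPath(String index, String type, String q, String sort) {
        return "/" + index + "/" + type + "/_search"
                + "?q=" + URLEncoder.encode(q, StandardCharsets.UTF_8)
                + "&sort=" + URLEncoder.encode(sort, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The ':' separators are emitted as %3A by URLEncoder.
        System.out.println(searchPath("ecommerce", "product", "name:yagao", "price:desc"));
    }
}
```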
2.query DSL
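DSL: Domain Specific Language
HTTP request body: the query is written as JSON in the request body, which is convenient and can express all kinds of complex queries; it is far more powerful than query string search
Basic syntax
GET /ecommerce/product/_search
1.Query
query: all search conditions go inside the query parameter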
{
"query":{
...
}
}
1.1. Query all
"match_all":{
}
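Java code using elasticsearchTemplate
term means an exact match: the search text is not analyzed before searching, so it must be exactly one of the tokens produced when the document was indexed
TermsBuilder: builds a terms aggregation
AggregationBuilders: utility class for creating aggregations
BoolQueryBuilder: combines query conditions
QueryBuilders: a simple static factory, usually used via static import; it builds query conditions such as range, exact and multi-value conditions
NativeSearchQueryBuilder: combines the query conditions, aggregations and so on
SearchQuery: the built query
elasticsearchTemplate.query: runs the query
Aggregations: represents a set of computed aggregations
Bucket: the set of documents that satisfy a given (aggregation) condition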
SearchQuery searchQuery = new NativeSearchQueryBuilder()
.withQuery(matchAllQuery()).build();
return elasticsearchTemplate.queryForList(searchQuery, Product.class);
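The glossary above says that term is an exact match against indexed tokens, while match analyzes the search text first. The following toy model illustrates that difference under a strong simplifying assumption: the "analyzer" here only lowercases and splits on whitespace, which is far cruder than a real Elasticsearch analyzer. All names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Toy model of term vs. match; not Elasticsearch code.
final class TermVsMatchDemo {

    // Stand-in analyzer: lowercase, then split on whitespace.
    static List<String> analyze(String text) {
        return Arrays.asList(text.toLowerCase(Locale.ROOT).trim().split("\\s+"));
    }

    // term: the search text is compared as-is against the indexed tokens.
    static boolean termMatches(List<String> indexedTokens, String term) {
        return indexedTokens.contains(term);
    }

    // match: the search text is analyzed first; any resulting token may match.
    static boolean matchMatches(List<String> indexedTokens, String queryText) {
        return analyze(queryText).stream().anyMatch(indexedTokens::contains);
    }

    public static void main(String[] args) {
        List<String> indexed = analyze("gaolujie yagao");
        System.out.println(termMatches(indexed, "yagao"));           // a single indexed token
        System.out.println(termMatches(indexed, "gaolujie yagao"));  // not a single token
        System.out.println(matchMatches(indexed, "Gaolujie Yagao")); // analyzed before matching
    }
}
```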
1.2. Match a field
"match":{
"FIELD": "TEXT" //field: value
}
SearchQuery searchQuery = new NativeSearchQueryBuilder()
.withQuery(matchQuery("name","gaolujie")).build();
return elasticsearchTemplate.queryForList(searchQuery, Product.class);
2.Pagination
Same level as query
{
"query": {
"match_all": {}
},
"from":1, //offset of the first result (0-based)
"size":1 //number of results to return
}
SearchQuery searchQuery = new NativeSearchQueryBuilder()
.withQuery(matchQuery("name","gaolujie"))
.withPageable(new PageRequest(1,1)).build(); //pagination: page index 1 (0-based), page size 1
return elasticsearchTemplate.queryForList(searchQuery, Product.class);
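The JSON request pages with `from` and `size`, while the Java code passes a zero-based page index and a page size. The relationship is `from = page * size`. A small sketch of that arithmetic follows; `PagingMath` and its methods are hypothetical helpers applied to an in-memory list rather than to search hits.

```java
import java.util.Arrays;
import java.util.List;

// Illustrates how a (page, size) pair maps onto from/size.
final class PagingMath {

    // Offset of the first hit for a zero-based page index.
    static int from(int page, int size) {
        return page * size;
    }

    // Returns the slice of hits that the given page would contain.
    static <T> List<T> page(List<T> hits, int page, int size) {
        int start = Math.min(from(page, size), hits.size());
        int end = Math.min(start + size, hits.size());
        return hits.subList(start, end);
    }

    public static void main(String[] args) {
        List<String> hits = Arrays.asList("doc-a", "doc-b", "doc-c");
        System.out.println(from(1, 1));       // offset for page 1 with size 1
        System.out.println(page(hits, 1, 1)); // the second document only
    }
}
```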
3.Sorting
Same level as query
{
"sort": [
{
"FIELD": {
"order": "desc"
}
}
]
}
SortBuilder sortBuilder = SortBuilders.fieldSort("price") //field to sort by
.order(SortOrder.DESC); //sort order
SearchQuery searchQuery = new NativeSearchQueryBuilder()
.withQuery(matchQuery("name", "gaolujie"))
.withSort(sortBuilder)
.build();
return elasticsearchTemplate.queryForList(searchQuery, Product.class);
4.Select the returned fields
{
"_source": ["name","price"] //the result contains only name and price
}
String[] include = {"name", "price"};
FetchSourceFilter fetchSourceFilter = new FetchSourceFilter(include, null); //the two arguments are the fields to include and the fields to exclude
SearchQuery searchQuery = new NativeSearchQueryBuilder()
.withSourceFilter(fetchSourceFilter)
.build();
return elasticsearchTemplate.queryForList(searchQuery, Product.class);
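Source filtering amounts to keeping only the listed fields of each returned document. The sketch below mimics that on a plain map; it is an illustration only, and `SourceFilterDemo.includeOnly` is a hypothetical helper, not part of any Elasticsearch or Spring API.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy version of an include-only source filter.
final class SourceFilterDemo {

    // Keeps only the fields named in includes, in the order they are listed.
    static Map<String, Object> includeOnly(Map<String, Object> source, List<String> includes) {
        Map<String, Object> filtered = new LinkedHashMap<>();
        for (String field : includes) {
            if (source.containsKey(field)) {
                filtered.put(field, source.get(field));
            }
        }
        return filtered;
    }

    public static void main(String[] args) {
        Map<String, Object> source = new LinkedHashMap<>();
        source.put("name", "gaolujie yagao");
        source.put("desc", "gaoxiao meibai");
        source.put("price", 31);
        System.out.println(includeOnly(source, Arrays.asList("name", "price")));
    }
}
```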
3.query filter
{
"bool":{
}
}
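Clauses of query
A bool query now has four kinds of clauses: must, filter, should, must_not
filter is faster than query
A query first checks the search condition, then computes a relevance score, and finally returns the matching documents;
a filter only decides whether each document satisfies the condition and caches that decision, whether the document matched or not.
In short, filter is faster for two reasons:
1.results are cached
2.no score is computed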
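The two reasons above can be pictured with a toy model: a filter is a yes/no test per document, so its outcome fits in a bit set that can be reused, and no score is ever computed. This is a deliberately simplified sketch and makes no claim about how Elasticsearch or Lucene implement their caches; `FilterCacheSketch` and the cache key are hypothetical.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntPredicate;

// Toy model of a cached filter: one bit per document, no scoring.
final class FilterCacheSketch {

    private final Map<String, BitSet> cache = new HashMap<>();
    int evaluations = 0; // how many per-document checks were actually executed

    BitSet filter(String cacheKey, int docCount, IntPredicate matches) {
        BitSet cached = cache.get(cacheKey);
        if (cached != null) {
            return cached; // cache hit: no document is re-checked
        }
        BitSet bits = new BitSet(docCount);
        for (int doc = 0; doc < docCount; doc++) {
            evaluations++;
            if (matches.test(doc)) {
                bits.set(doc); // matches and non-matches are both recorded
            }
        }
        cache.put(cacheKey, bits);
        return bits;
    }

    public static void main(String[] args) {
        int[] prices = {31, 25, 5};
        FilterCacheSketch sketch = new FilterCacheSketch();
        sketch.filter("price>6", prices.length, doc -> prices[doc] > 6);
        sketch.filter("price>6", prices.length, doc -> prices[doc] > 6);
        System.out.println(sketch.evaluations); // the second call reused the cached bits
    }
}
```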
- must
A clause of bool
{
"query": {
"bool": {
"must": {
"match":{
"price":31 //only a single field is supported
}
}
}
}
}
SearchQuery searchQuery = new NativeSearchQueryBuilder()
.withFilter(
boolQuery()
.must(matchQuery("price",80)) //same structure: bool -> must -> match
)
.build();
return elasticsearchTemplate.queryForList(searchQuery, Product.class);
- filter
Filtering
{
"query": {
"bool": {
"must": {
"match":{
"name":"yagao" }
},
"filter" : {
"range" : {
"price" : { "gt" : 6 } //price greater than 6
}
}
}
}
}
SearchQuery searchQuery = new NativeSearchQueryBuilder()
.withFilter(
boolQuery()
.must(
matchQuery("name", "heiren")
)
.filter(
rangeQuery("price")
.gt(6)
)
)
.build();
return elasticsearchTemplate.queryForList(searchQuery, Product.class);
4.full-text search
5.phrase search
{
"query" : {
"match_phrase": {
"FIELD": "PHRASE"
}
}
}
SearchQuery searchQuery = new NativeSearchQueryBuilder()
.withQuery(
matchPhraseQuery("name","heiren")
)
.build();
return elasticsearchTemplate.queryForList(searchQuery, Product.class);
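A phrase search requires the query terms to appear next to each other and in the same order, whereas an ordinary full-text match is satisfied by any of the terms. The toy comparison below assumes whitespace tokenization only and ignores options such as slop; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Toy comparison of phrase matching and any-term matching.
final class PhraseMatchDemo {

    static List<String> tokens(String text) {
        return Arrays.asList(text.trim().split("\\s+"));
    }

    // Phrase semantics: the query tokens must occur consecutively and in order.
    static boolean phraseMatches(String fieldText, String phrase) {
        return Collections.indexOfSubList(tokens(fieldText), tokens(phrase)) >= 0;
    }

    // Plain match semantics (default OR): any query token is enough.
    static boolean anyTermMatches(String fieldText, String query) {
        List<String> field = tokens(fieldText);
        return tokens(query).stream().anyMatch(field::contains);
    }

    public static void main(String[] args) {
        String field = "gaolujie yagao producer";
        System.out.println(phraseMatches(field, "yagao producer"));  // adjacent, in order
        System.out.println(phraseMatches(field, "producer yagao"));  // wrong order
        System.out.println(anyTermMatches(field, "producer yagao")); // order is irrelevant
    }
}
```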
6.highlight search (highlight matches in the results)
SearchQuery searchQuery = new NativeSearchQueryBuilder()
.withQuery(
matchPhraseQuery("name", "heiren")
)
.withHighlightFields(new HighlightBuilder.Field("name"))
.build();
highlight is at the same level as query
{
"query" : {
"match" : {
"producer" : "producer"
}
},
"highlight": { //highlighting
"fields" : { //fields to highlight
"producer" : {}
}
}
}
Response:
....
{
"_index": "ecommerce",
"_type": "product",
"_id": "2",
"_score": 0.25811607,
"_source": {
"name": "jiajieshi yagao",
"desc": "youxiao fangzhu",
"price": 25,
"producer": "jiajieshi producer",
"tags": [
"fangzhu"
]
},
"highlight": { //highlighted fragments
"producer": [
"jiajieshi <em>producer</em>"
]
}
},
....
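The `highlight` section of the response wraps each matched term in `<em>` tags. A minimal sketch of that effect on a single field value is shown here; it assumes space-separated words and one search term, and `HighlightDemo.highlight` is a hypothetical helper rather than an Elasticsearch API.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

// Toy highlighter: wraps every word equal to the term in <em>...</em>.
final class HighlightDemo {

    static String highlight(String fieldText, String term) {
        return Arrays.stream(fieldText.split(" "))
                .map(word -> word.equalsIgnoreCase(term) ? "<em>" + word + "</em>" : word)
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        System.out.println(highlight("jiajieshi producer", "producer"));
    }
}
```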
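Aggregation analysis
Since 5.x, sorting and aggregating on a text field rely on a separate in-memory data structure (fielddata), which is disabled by default and must be enabled explicitly
A nested aggs block is a child clause of the named aggregation it belongs to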
PUT my_index/_mapping/_doc
{
"properties": {
"my_field": {
"type": "text",
"fielddata": true
}
}
}
Here we need:
PUT /ecommerce/_mapping/product
{
"properties": {
"tags": {
"type": "text",
"fielddata": true
}
}
}
Requirement 1: for products whose name contains yagao, count the products under each tag
Use aggs, at the same level as query
GET /ecommerce/product/_search
{
"size": 0,
"query": {
"match": {
"name": "yagao"
}
},
"aggs": { //aggregations
"all_tags": { //aggregation name, chosen freely
"terms": {
"field": "tags" //field to aggregate on
}
}
}
}
Response
"aggregations": { //aggregation results
"all_tags": { //the name given in the request
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [ //where the results are stored
{
"key": "fangzhu", //tag
"doc_count": 2 //number of documents
},
{
"key": "meibai",
"doc_count": 1
}
]
}
}
Map<String, Long> map = new HashMap<>();
SearchQuery searchQuery = new NativeSearchQueryBuilder()
.withQuery(
matchQuery("name", "heiren")
)
.addAggregation(
AggregationBuilders.terms("all_tags")
.field("tags")
)
.build();
Aggregations aggregations = elasticsearchTemplate.query(searchQuery,
new ResultsExtractor<Aggregations>() {
@Override
public Aggregations extract(SearchResponse response) {
return response.getAggregations();
}
});
StringTerms modelTerms = (StringTerms) aggregations.asMap().get("all_tags");
for (Terms.Bucket actionTypeBucket : modelTerms.getBuckets()) {
//actionTypeBucket.getKey().toString() is the bucket key (the tag); actionTypeBucket.getDocCount() is the document count for that bucket
map.put(actionTypeBucket.getKey().toString(),
actionTypeBucket.getDocCount());
}
return map;
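Conceptually, a terms aggregation on tags is a group-by with a count per distinct value. The sketch below reproduces the bucket counts from the response above on in-memory data (documents 1 and 2 only, since document 3 was deleted). `TermsAggDemo` is a hypothetical name and nothing here talks to Elasticsearch.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory equivalent of a terms aggregation on a multi-valued field.
final class TermsAggDemo {

    // Counts how many documents carry each tag; a tag repeated within one document counts once.
    static Map<String, Long> countByTag(List<List<String>> tagsPerDocument) {
        Map<String, Long> counts = new TreeMap<>();
        for (List<String> tags : tagsPerDocument) {
            for (String tag : new HashSet<>(tags)) {
                counts.merge(tag, 1L, Long::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<List<String>> docs = Arrays.asList(
                Arrays.asList("meibai", "fangzhu"), // document 1
                Arrays.asList("fangzhu"));          // document 2
        System.out.println(countByTag(docs));
    }
}
```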
Requirement 2: group first, then average each group: compute the average price of the products under each tag
GET /ecommerce/product/_search
{
"size": 0,
"aggs": { //aggregate first
"group_by_tags": {
"terms": {
"field": "tags"
},
"aggs": { //then aggregate again
"avg_price": {
"avg": { //average function
"field": "price"
}
}
}
}
}
}
SearchQuery searchQuery = new NativeSearchQueryBuilder()
.withQuery(
matchQuery("name", "heiren")
)
.addAggregation(
AggregationBuilders
.terms("group_by_tags")
.field("tags")
.subAggregation(
AggregationBuilders
.avg("price")
.field("price")
)
)
.build();
return elasticsearchTemplate.query(searchQuery, response -> {
//return the raw ES response straight to the front end
return JSONObject.parseObject(response.toString());
});
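A terms aggregation with an avg sub-aggregation is a group-by followed by an average per group. The following sketch computes the same thing on in-memory data; the `Product` class and the sample prices (31 and 25, taken from the search results earlier in these notes) are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory equivalent of terms(tags) -> avg(price).
final class AvgPriceByTagDemo {

    static final class Product {
        final double price;
        final List<String> tags;

        Product(double price, List<String> tags) {
            this.price = price;
            this.tags = tags;
        }
    }

    static Map<String, Double> avgPriceByTag(List<Product> products) {
        // For each tag keep {sum of prices, number of products}.
        Map<String, double[]> sumAndCount = new TreeMap<>();
        for (Product p : products) {
            for (String tag : p.tags) {
                double[] acc = sumAndCount.computeIfAbsent(tag, k -> new double[2]);
                acc[0] += p.price;
                acc[1] += 1;
            }
        }
        Map<String, Double> averages = new TreeMap<>();
        for (Map.Entry<String, double[]> e : sumAndCount.entrySet()) {
            averages.put(e.getKey(), e.getValue()[0] / e.getValue()[1]);
        }
        return averages;
    }

    public static void main(String[] args) {
        List<Product> products = Arrays.asList(
                new Product(31, Arrays.asList("meibai", "fangzhu")),
                new Product(25, Arrays.asList("fangzhu")));
        System.out.println(avgPriceByTag(products));
    }
}
```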
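Requirement 3: compute the average price of the products under each tag, sorted by average price in descending order
Note that the ordering is specified in the outer aggregation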
GET /ecommerce/product/_search
{
"size": 0,
"aggs": {
"group_by_tags": {
"terms": {
"field": "tags",
"order": { //ordering
"avg_price": "desc" //refers to the sub-aggregation below
}
},
"aggs": {
"avg_price": {
"avg": {
"field": "price"
}
}
}
}
}
}
SearchQuery searchQuery = new NativeSearchQueryBuilder()
.withQuery(
matchQuery("name", "heiren")
)
.addAggregation(
AggregationBuilders
.terms("group_by_tags")
.field("tags")
.order(
Terms.Order.aggregation("price", false)
)
.subAggregation(
AggregationBuilders
.avg("price")
.field("price")
)
)
.build();
return elasticsearchTemplate.query(searchQuery, response -> {
//return the raw ES response straight to the front end
return JSONObject.parseObject(response.toString());
});
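Ordering the outer buckets by a sub-aggregation means sorting the groups by their computed average. A small in-memory sketch of that final step is given here; it takes already computed per-tag averages as input, and `OrderBucketsDemo.keysByValueDesc` is a hypothetical helper.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Sorts bucket keys by their aggregated value, highest first.
final class OrderBucketsDemo {

    static List<String> keysByValueDesc(Map<String, Double> avgByTag) {
        List<Map.Entry<String, Double>> entries = new ArrayList<>(avgByTag.entrySet());
        entries.sort(Map.Entry.<String, Double>comparingByValue().reversed());
        List<String> keys = new ArrayList<>();
        for (Map.Entry<String, Double> e : entries) {
            keys.add(e.getKey());
        }
        return keys;
    }

    public static void main(String[] args) {
        Map<String, Double> avgByTag = new TreeMap<>();
        avgByTag.put("fangzhu", 28.0);
        avgByTag.put("meibai", 31.0);
        System.out.println(keysByValueDesc(avgByTag)); // highest average first
    }
}
```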
Requirement 4: group by the given price ranges, then group by tag within each range, and finally compute the average price of each group
GET /ecommerce/product/_search
{
"size": 0,
"aggs": {
"group_by_price": { //first group by price
"range": { //range function
"field": "price",
"ranges": [
{
"from": 0,
"to": 20
},
{
"from": 20,
"to": 40
},
{
"from": 40,
"to": 50
}
]
},
"aggs": { //then group by tag
"group_by_tags": {
"terms": {
"field": "tags"
},
"aggs": {
"average_price": { //then compute the average
"avg": {
"field": "price"
}
}
}
}
}
}
}
}
//range first, then group, then average
SearchQuery searchQuery = new NativeSearchQueryBuilder()
.withQuery(
matchQuery("name", "heiren")
)
.addAggregation(
AggregationBuilders
.range("range_by_price")
.field("price") //the line I originally forgot; omitting it causes the error noted after this code
.addRange(0, 20)
.addRange(20, 40)
.addRange(40, 100)
.subAggregation(
AggregationBuilders
.terms("group_by_tags")
.field("tags")
.order(
Terms.Order.aggregation("price", false)
)
.subAggregation(
AggregationBuilders
.avg("price")
.field("price")
)
)
)
.build();
return elasticsearchTemplate.query(searchQuery, response -> {
//直接返回es原封结果到前端
return JSONObject.parseObject(response.toString());
});
The commented line above marks a problem caused by my own carelessness when writing the code. If field is not specified, an error is raised. Error message:
org.elasticsearch.search.aggregations.AggregationExecutionException: could not find the appropriate value context to perform aggregation [range_by_price]
Cause: missing target field parameter on aggregat
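The three-level aggregation of requirement 4 (price range, then tag, then average) can be traced on in-memory data. The sketch below is an illustration under assumptions: `Product`, the sample prices, and the bucket key format are made up for the example, and each range includes its lower bound and excludes its upper bound, as a range aggregation does.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory equivalent of range(price) -> terms(tags) -> avg(price).
final class RangeThenTagsDemo {

    static final class Product {
        final double price;
        final List<String> tags;

        Product(double price, List<String> tags) {
            this.price = price;
            this.tags = tags;
        }
    }

    static Map<String, Map<String, Double>> avgByRangeAndTag(List<Product> products, double[][] ranges) {
        Map<String, Map<String, Double>> result = new LinkedHashMap<>();
        for (double[] r : ranges) {
            Map<String, double[]> acc = new TreeMap<>(); // tag -> {sum, count}
            for (Product p : products) {
                if (p.price >= r[0] && p.price < r[1]) { // from inclusive, to exclusive
                    for (String tag : p.tags) {
                        double[] a = acc.computeIfAbsent(tag, k -> new double[2]);
                        a[0] += p.price;
                        a[1] += 1;
                    }
                }
            }
            Map<String, Double> avg = new TreeMap<>();
            acc.forEach((tag, a) -> avg.put(tag, a[0] / a[1]));
            result.put(r[0] + "-" + r[1], avg); // key such as "20.0-40.0"
        }
        return result;
    }

    public static void main(String[] args) {
        List<Product> products = Arrays.asList(
                new Product(31, Arrays.asList("meibai", "fangzhu")),
                new Product(25, Arrays.asList("fangzhu")));
        double[][] ranges = {{0, 20}, {20, 40}, {40, 50}};
        System.out.println(avgByRangeAndTag(products, ranges));
    }
}
```

Both sample products fall into the 20 to 40 range, so the other two range buckets stay empty.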