Elasticsearch02

加包辣条多放辣椒

Published 2023-07-01 10:59:42

Article link: https://blog.csdn.net/qq_45393339/article/details/131488123

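1. Elasticsearch queries

1.1 Basic queries

Syntax of a basic query

GET /index_name/_search
{
    "query":{
        "query_type":{
            "query_condition":"condition_value"
        }
    }
}

Query type:
- for example match_all, match, term, range, and so on.
Query condition: its form depends on the query type, as explained in detail below.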

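1.1.1 Match all (match_all)

query: the query object
match_all: matches every document

took: time the query took, in milliseconds
timed_out: whether the query timed out
_shards: shard information
hits: summary object for the search results
- total: total number of matching documents
- max_score: the highest document score among all results
- hits: array of result documents, one element per matching document
  - _index: the index
  - _type: the document type
  - _id: the document id
  - _score: the document score
  - _source: the document's source data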

1.1.2 Match query (match)

First insert a document

PUT /test/goods/3
{
    "title":"小米电视4A",
    "images":"http://image.test.com/12479122.jpg",
    "price":3899.00
}

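OR relationship (the default): a document matches if its title contains any of the terms produced by analyzing the query string

GET /test/_search
{
    "query":{
        "match":{
            "title":"小米电视"
        }
    }
}

AND relationship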

GET /test/_search
{
    "query":{
        "match": {
            "title": {
            	"query": "小米电视",
            	"operator": "and"
            }
        }
    }
}

In this example, only documents whose title contains both `小米` and `电视` are returned

Matching a percentage of the analyzed terms

GET /test/_search
{
    "query":{
        "match":{
            "title":{
                "query":"小米曲面电视",
                "minimum_should_match": "75%"
            }
        }
    }
}

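After analysis, a document matches as long as enough terms match to satisfy the percentage. Elasticsearch rounds the required count down, so if the query is analyzed into three terms, 75% means at least two of them must match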
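
The three match variants above differ only in how many analyzed terms a document must contain. The following is a minimal sketch of that counting rule in plain Java, not Elasticsearch code: the class MatchSketch, its methods, and the ASCII stand-in terms are hypothetical, and the real terms depend on the analyzer in use.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not an Elasticsearch API: models how many analyzed
// query terms a document must contain for a match query to accept it.
final class MatchSketch {

    // Required term count for a percentage such as 75. The computed count is
    // rounded down; this sketch assumes at least one term is always required.
    static int requiredTerms(int termCount, int percent) {
        return Math.max(1, termCount * percent / 100);
    }

    // Counts how many query terms occur among the document's terms.
    static int matchedTerms(List<String> queryTerms, Set<String> docTerms) {
        int matched = 0;
        for (String term : queryTerms) {
            if (docTerms.contains(term)) {
                matched++;
            }
        }
        return matched;
    }

    // operator "or" needs 1 matching term, "and" needs all of them,
    // minimum_should_match needs requiredTerms(...) of them.
    static boolean matches(List<String> queryTerms, Set<String> docTerms, int required) {
        return matchedTerms(queryTerms, docTerms) >= required;
    }

    public static void main(String[] args) {
        // ASCII stand-ins for an assumed three-term tokenization of the query.
        List<String> query = Arrays.asList("xiaomi", "curved", "tv");
        Set<String> doc = new HashSet<>(Arrays.asList("xiaomi", "tv", "4a"));
        System.out.println(matches(query, doc, 1));                               // or
        System.out.println(matches(query, doc, query.size()));                    // and
        System.out.println(matches(query, doc, requiredTerms(query.size(), 75))); // 75%
    }
}
```

With these stand-in terms the document contains two of the three query terms, so it passes the OR and 75% checks but not the AND check.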

1.1.3 Multi-field query (multi_match)

Search for the term 小米 in both the title and subTitle fields

GET /test/_search
{
    "query":{
        "multi_match": {
            "query": "小米",
            "fields": [ "title", "subTitle" ]
        }
    }
}

1.1.4 Term query (term)

term queries are used for exact-value matching; the exact values may be numbers, dates, booleans, or strings that are not analyzed

GET /test/_search
{
    "query":{
        "term":{
        	"price":2699.00
        }
    }
}

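1.1.5 Multi-term exact match (terms)

A terms query works like a term query, but lets you specify several values; a document matches if the field equals any of them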

GET /test/_search
{
    "query":{
        "terms":{
        	"price":[2699.00,2899.00,3899.00]
        }
    }
}

1.2 Result filtering

By default, Elasticsearch returns every field stored in _source for each hit.
To retrieve only some of the fields, add a _source filter

1.2.1 Specifying fields directly

GET /test/_search
{
    "_source": ["title","price"],
    "query": {
        "term": {
        	"price": 2699
        }
    }
}

1.2.2 Specifying includes and excludes

includes: the fields to return
excludes: the fields to leave out

GET /test/_search
{
    "_source": {
    	"excludes": ["images"]
    },
    "query": {
        "term": {
        	"price": 2699
        }
    }
}

1.3 Advanced queries

1.3.1 Boolean combination (bool)

bool combines other queries using must (AND), must_not (NOT), and should (OR)

GET /test/_search
{
    "query":{
        "bool":{
            "must": { "match": { "title": "大米" }},
            "must_not": { "match": { "title": "电视" }},
            "should": { "match": { "title": "手机" }}
        }
    }
}
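
One detail of the query above is easy to miss: when a bool query has a must clause, its should clauses do not decide whether a document matches; they only raise the score of documents that already match. A minimal sketch of that rule in plain Java follows. BoolQuerySketch and its methods are hypothetical names rather than Elasticsearch APIs, and the ASCII words stand in for the analyzed terms.

```java
import java.util.Collections;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical model of bool matching; none of these names are Elasticsearch APIs.
final class BoolQuerySketch {

    // A title is accepted when every must clause matches and no must_not clause matches.
    static boolean accepts(String title, List<Predicate<String>> must, List<Predicate<String>> mustNot) {
        return must.stream().allMatch(clause -> clause.test(title))
                && mustNot.stream().noneMatch(clause -> clause.test(title));
    }

    // Number of matching should clauses, used here as a stand-in for the extra score they add.
    static long shouldMatches(String title, List<Predicate<String>> should) {
        return should.stream().filter(clause -> clause.test(title)).count();
    }

    public static void main(String[] args) {
        // ASCII stand-ins for the terms used in the must, must_not, and should clauses.
        Predicate<String> hasRice = text -> text.contains("rice");
        Predicate<String> hasTv = text -> text.contains("tv");
        Predicate<String> hasPhone = text -> text.contains("phone");

        List<Predicate<String>> must = Collections.singletonList(hasRice);
        List<Predicate<String>> mustNot = Collections.singletonList(hasTv);
        List<Predicate<String>> should = Collections.singletonList(hasPhone);

        for (String doc : new String[] {"rice phone", "rice cooker", "rice tv", "phone"}) {
            System.out.println(doc + " accepted=" + accepts(doc, must, mustNot)
                    + " shouldMatches=" + shouldMatches(doc, should));
        }
    }
}
```

Here "rice cooker" is still accepted even though it matches no should clause; it would simply rank below "rice phone".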

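1.3.2 Range query (range)

range queries find numbers or dates that fall within a specified interval. In the query below, gte means greater than or equal to and lt means less than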

GET /test/_search
{
    "query":{
        "range": {
            "price": {
                "gte": 1000.0,
                "lt": 2800.00
            }
        }
    }
}

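1.3.3 Fuzzy query (fuzzy)

A fuzzy query tolerates small spelling differences; the edit distance between the search term and the indexed term must not exceed 2
Insert a test document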

POST /test/goods/4
{
    "title":"apple手机",
    "images":"http://image.com/12479122.jpg",
    "price":6899.00
}

GET /test/_search
{
    "query": {
        "fuzzy": {
        	"title": "appla"
        }
    }
}

The allowed edit distance can be set explicitly with fuzziness:

GET /test/_search
{
    "query": {
        "fuzzy": {
            "title": {
                "value":"app2a",
                "fuzziness":2
            }
        }
    }
}
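
To see why these two searches can find the apple document, it helps to compute the edit distances by hand. The sketch below uses the classic Levenshtein distance in plain Java; EditDistanceSketch is a hypothetical helper, not part of Elasticsearch. Elasticsearch's own measure additionally counts a swap of two adjacent characters as a single edit by default, which this sketch does not model.

```java
// Hypothetical helper, not an Elasticsearch API.
final class EditDistanceSketch {

    // Classic Levenshtein distance: the minimum number of single-character
    // insertions, deletions, and substitutions that turn a into b.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning the empty prefix of a into b[0..j) takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning a[0..i) into the empty string takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] swap = prev;
            prev = curr;
            curr = swap;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(distance("appla", "apple")); // 1
        System.out.println(distance("app2a", "apple")); // 2
    }
}
```

"appla" is one substitution away from "apple", and "app2a" is two substitutions away, which is why the second query sets fuzziness to 2.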

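1.3.4 Filtering (filter)

Every query clause affects document scoring and ranking. To filter the results without letting the filter condition affect the score, do not use the condition as a query clause; put it in filter instead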

GET /test/_search
{
    "query":{
        "bool":{
            "must":{ "match": { "title": "小米手机" }},
            "filter":{
            	"range":{"price":{"gt":2000.00,"lt":3800.00}}
            }
        }
    }
}

1.3.5 Sorting

sort lets us order results by one or more fields, with order specifying the sort direction

GET /test/_search
{
    "query": {
        "match": {
            "title": "小米手机"
        }
    },
    "sort": [
        {
            "price": {
                "order": "desc"
            }
        }
    ]
}

Several sort fields can be combined; here results are ordered by price first and then by _score:

GET /test/_search
{
    "query":{
        "bool":{
            "must":{ "match": { "title": "小米手机" }},
            "filter":{
                "range":{"price":{"gt":2000,"lt":3000}}
            }
        }
    },
    "sort": [
        { "price": { "order": "desc" }},
        { "_score": { "order": "desc" }}
    ]
}

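2. Aggregations

Aggregations make it very convenient to compute statistics over data and analyze it. For example:

Which phone brand is the most popular?
What are the average, highest, and lowest prices of these phones?
How do these phones sell month by month?

Elasticsearch has many types of aggregation; the two most commonly used are buckets and metrics.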

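2.1 Basic concepts

2.1.1 Buckets (bucket)

A bucket groups data in some way; each group is called a bucket in ES. For example, grouping people by nationality gives a China bucket, a UK bucket, a Japan bucket, and so on. We could also group people by age range: 0~10, 10~20, 20~30, 30~40, etc.

Common ways to divide buckets:
Date Histogram Aggregation: groups by date interval; for example, with an interval of one week, each week automatically becomes a group
Histogram Aggregation: groups by numeric interval, similar to the date version
Terms Aggregation: groups by term; documents whose term is exactly the same fall into one group
Range Aggregation: groups by numeric or date ranges; you specify the start and end of each range

Bucket aggregations only group the data and do not compute anything, so a bucket usually nests another kind of aggregation: metrics aggregations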

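2.1.2 Metrics

Once the data is grouped, we usually run a calculation over each group, such as the average, maximum, minimum, or sum. In ES these are called metrics

Avg Aggregation: computes the average
Max Aggregation: computes the maximum
Min Aggregation: computes the minimum
Percentiles Aggregation: computes percentiles
Stats Aggregation: returns avg, max, min, sum, and count together
Sum Aggregation: computes the sum
Top hits Aggregation: returns the top few documents
Value Count Aggregation: counts the values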

2.1.3 Importing test data

1. Create the index

PUT /cars
{
    "settings": {
        "number_of_shards": 1,
        "number_of_replicas": 0
    },
    "mappings": {
        "transactions": {
            "properties": {
                "color": {
                	"type": "keyword"
                },
                "make": {
                	"type": "keyword"
                }
            }
        }
    }
}

In ES, fields used for aggregation, sorting, or filtering are handled specially and therefore must not be analyzed. Here we set the two text fields color and make to the keyword type, which is not analyzed, so they can take part in aggregations later

2. Import the data

The index action creates the document if it does not exist and replaces it if it does. No _id is given here, so Elasticsearch generates one and every action creates a new document. In a bulk request, each action and each document must be on a single line

POST /cars/transactions/_bulk
{ "index": {}}
{ "price" : 10000, "color" : "red", "make" : "honda", "sold" : "2019-10-28" }
{ "index": {}}
{ "price" : 20000, "color" : "red", "make" : "honda", "sold" : "2019-11-05" }
{ "index": {}}
{ "price" : 30000, "color" : "green", "make" : "ford", "sold" : "2019-05-18" }
{ "index": {}}
{ "price" : 15000, "color" : "blue", "make" : "toyota", "sold" : "2019-07-02" }
{ "index": {}}
{ "price" : 12000, "color" : "green", "make" : "toyota", "sold" : "2019-08-19" }
{ "index": {}}
{ "price" : 20000, "color" : "red", "make" : "honda", "sold" : "2019-11-05" }
{ "index": {}}
{ "price" : 80000, "color" : "red", "make" : "bmw", "sold" : "2019-01-01" }
{ "index": {}}
{ "price" : 25000, "color" : "blue", "make" : "ford", "sold" : "2019-02-12" }

2.2 Aggregating into buckets

We divide the cars into buckets by their color

GET /cars/_search
{
    "size" : 0,
    "aggs" : {
        "popular_colors" : {
            "terms" : {
            	"field" : "color"
            }
        }
    }
}

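size: number of hits to return; set to 0 here because we only care about the aggregation results, not the matching documents, which is more efficient
aggs: declares an aggregation; short for aggregations
- popular_colors: a name for this aggregation; any name will do
  - terms: how to divide the buckets; here, by term
    - field: the field to bucket on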

{
    "took": 1,
    "timed_out": false,
    "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": 8,
        "max_score": 0,
        "hits": []
    },
    "aggregations": {
        "popular_colors": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
                {
                    "key": "red",
                    "doc_count": 4
                },
                {
                    "key": "blue",
                    "doc_count": 2
                },
                {
                    "key": "green",
                    "doc_count": 2
                }
            ]
        }
    }
}

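hits: the list of hits is empty because we set size to 0
aggregations: the aggregation results
popular_colors: the aggregation name we defined
buckets: the buckets found; each distinct value of the color field forms one bucket
- key: the value of the color field for this bucket
- doc_count: the number of documents in this bucket
The aggregation results show that red cars are currently the best sellers.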

2.3 Metrics within buckets

What is the average price of the cars of each color?

GET /cars/_search
{
    "size" : 0,
    "aggs" : {
        "popular_colors" : {
            "terms" : {
            	"field" : "color"
            },
            "aggs":{
                "avg_price": {
                    "avg": {
                        "field": "price"
                    }
                }
            }
        }
    }
}

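aggs: we add a new aggs inside the previous aggregation (popular_colors), which shows that a metric is also an aggregation
avg_price: the name of the aggregation
avg: the metric type; here, the average
field: the field the metric is computed on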
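
The bucket step and the metric step can be reproduced on the eight sample documents with plain Java streams. This is only an illustrative sketch of the grouping logic, not how Elasticsearch executes it; ColorBucketsSketch and Car are hypothetical names, and only the color and price fields are modelled.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical in-memory model of the terms + avg aggregation.
final class ColorBucketsSketch {

    // Minimal stand-in for a document in the cars index.
    static final class Car {
        final String color;
        final double price;

        Car(String color, double price) {
            this.color = color;
            this.price = price;
        }
    }

    // The eight sample documents from the bulk request (color and price only).
    static List<Car> sampleCars() {
        return Arrays.asList(
                new Car("red", 10000), new Car("red", 20000), new Car("green", 30000),
                new Car("blue", 15000), new Car("green", 12000), new Car("red", 20000),
                new Car("red", 80000), new Car("blue", 25000));
    }

    // Bucket step: group by color and count the documents (like doc_count).
    static Map<String, Long> docCountByColor(List<Car> cars) {
        return cars.stream().collect(
                Collectors.groupingBy((Car c) -> c.color, TreeMap::new, Collectors.counting()));
    }

    // Metric step: average price inside each color bucket (like avg_price).
    static Map<String, Double> avgPriceByColor(List<Car> cars) {
        return cars.stream().collect(
                Collectors.groupingBy((Car c) -> c.color, TreeMap::new,
                        Collectors.averagingDouble((Car c) -> c.price)));
    }

    public static void main(String[] args) {
        System.out.println(docCountByColor(sampleCars()));
        System.out.println(avgPriceByColor(sampleCars()));
    }
}
```

The TreeMap orders buckets alphabetically by key, whereas the terms aggregation orders them by doc_count descending by default. For the sample data the red bucket holds four cars with an average price of 32500.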

2.4 Nesting buckets within buckets

We want to know which manufacturers the cars of each color belong to, so we bucket again by the make field

GET /cars/_search
{
    "size" : 0,
    "aggs" : {
        "popular_colors" : {
        	"terms" : {
        		"field" : "color"
        	},
            "aggs":{
            	"avg_price": {
            		"avg": {
            			"field": "price"
            		}
            	},
            	"maker":{
            		"terms":{
            			"field":"make"
            		}
            	}
            }
        }
    }
}

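The original color buckets and avg calculation are unchanged
maker: a new bucket aggregation, named maker, added under the nested aggs
terms: the bucketing type is still by term
field: bucket on the make field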

2.5 Interval buckets (histogram)

Group prices into buckets at intervals of 5000

GET /cars/_search
{
    "size":0,
    "aggs":{
        "price":{
            "histogram": {
                "field": "price",
                "interval": 5000
            }
        }
    }
}

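The result contains many buckets in the middle whose document count is 0
Add the min_doc_count parameter with a value of 1 so that only buckets containing at least one document are returned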

GET /cars/_search
{
    "size":0,
    "aggs":{
        "price":{
            "histogram": {
                "field": "price",
                "interval": 5000,
                "min_doc_count": 1
            }
        }
    }
}
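
A histogram assigns each value to the bucket whose key is the value rounded down to a multiple of the interval, and by default it also returns the empty buckets between the smallest and largest key. The sketch below reproduces that behaviour in plain Java for the sample prices. HistogramSketch is a hypothetical helper, not an Elasticsearch API, and it assumes an offset of 0.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory model of the histogram aggregation (offset assumed to be 0).
final class HistogramSketch {

    // Bucket key for a value: the value rounded down to a multiple of the interval.
    static long bucketKey(long value, long interval) {
        return Math.floorDiv(value, interval) * interval;
    }

    // Builds every bucket from the smallest to the largest key, including empty ones,
    // then keeps only buckets holding at least minDocCount documents.
    static Map<Long, Integer> histogram(long[] values, long interval, int minDocCount) {
        TreeMap<Long, Integer> counts = new TreeMap<>();
        for (long value : values) {
            counts.merge(bucketKey(value, interval), 1, Integer::sum);
        }
        TreeMap<Long, Integer> buckets = new TreeMap<>();
        if (counts.isEmpty()) {
            return buckets;
        }
        for (long key = counts.firstKey(); key <= counts.lastKey(); key += interval) {
            int count = counts.getOrDefault(key, 0);
            if (count >= minDocCount) {
                buckets.put(key, count);
            }
        }
        return buckets;
    }

    public static void main(String[] args) {
        long[] prices = {10000, 20000, 30000, 15000, 12000, 20000, 80000, 25000};
        System.out.println(histogram(prices, 5000, 0)); // includes the empty buckets
        System.out.println(histogram(prices, 5000, 1)); // only non-empty buckets
    }
}
```

For the sample prices the keys run from 10000 to 80000, giving 15 buckets in total, of which only 6 contain documents; setting the minimum count to 1 leaves just those 6.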

3. Spring Data Elasticsearch

3.1 Creating the ES project

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
</dependency>

3.2 yml configuration

spring:
  data:
    elasticsearch:
      client:
        reactive:
          endpoints: 192.168.200.129:9200
      # The two properties below are deprecated in newer versions. Port 9300 is used for communication between the nodes of an Elasticsearch cluster; using the RestHighLevelClient on port 9200 is recommended instead.
      # cluster-name: elasticsearch
      # cluster-nodes: 192.168.200.129:9300
  elasticsearch:
    rest:
      uris: 192.168.200.129:9200

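3.3 Entity class

import java.util.Date;
import org.springframework.data.annotation.Id;
import org.springframework.data.elasticsearch.annotations.DateFormat;
import org.springframework.data.elasticsearch.annotations.Document;
import org.springframework.data.elasticsearch.annotations.Field;
import org.springframework.data.elasticsearch.annotations.FieldType;

// indexName: name of the index to build; shards: number of primary shards; replicas: number of replica shards
@Document(indexName = "item",shards = 1,replicas = 0)
public class Goods {

    @Id   // primary key
    private int id;
    // field mapping settings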
    @Field(type = FieldType.Text,analyzer = "ik_max_word")
    private String title;
    @Field(type = FieldType.Text,analyzer = "ik_max_word")
    private String sell_point;
    @Field(type = FieldType.Double)
    private double price;
    @Field(type = FieldType.Integer)
    private int num;
    @Field(type = FieldType.Keyword)
    private String image;
    //@Field(type = FieldType.Date,format = DateFormat.year_month_day)
    @Field(type = FieldType.Date,format = DateFormat.custom,pattern = "uuuu-MM-dd'T'HH:mm:ss.SSSX")
    private Date createTime;
    @Field(type = FieldType.Text, analyzer = "ik_max_word")
    private String category;
    @Field(type = FieldType.Text, analyzer = "ik_max_word")
    private String brand;

    public Goods() {
    }

    public Goods(int id, String title, String sell_point, double price, int num, String image, Date createTime, String category, String brand) {
        this.id = id;
        this.title = title;
        this.sell_point = sell_point;
        this.price = price;
        this.num = num;
        this.image = image;
        this.createTime = createTime;
        this.category = category;
        this.brand = brand;
    }

    public int getId() {
        return id;
    }

    public void setId(int id) {
        this.id = id;
    }

    public String getTitle() {
        return title;
    }

    public void setTitle(String title) {
        this.title = title;
    }

    public String getSell_point() {
        return sell_point;
    }

    public void setSell_point(String sell_point) {
        this.sell_point = sell_point;
    }

    public double getPrice() {
        return price;
    }

    public void setPrice(double price) {
        this.price = price;
    }

    public int getNum() {
        return num;
    }

    public void setNum(int num) {
        this.num = num;
    }

    public String getImage() {
        return image;
    }

    public void setImage(String image) {
        this.image = image;
    }

    public Date getCreateTime() {
        return createTime;
    }

    public void setCreateTime(Date createTime) {
        this.createTime = createTime;
    }

    public String getCategory() {
        return category;
    }

    public void setCategory(String category) {
        this.category = category;
    }

    public String getBrand() {
        return brand;
    }

    public void setBrand(String brand) {
        this.brand = brand;
    }

    @Override
    public String toString() {
        return "Goods{" +
                "id=" + id +
                ", title='" + title + '\'' +
                ", sell_point='" + sell_point + '\'' +
                ", price=" + price +
                ", num=" + num +
                ", image='" + image + '\'' +
                ", createTime=" + createTime +
                ", category='" + category + '\'' +
                ", brand='" + brand + '\'' +
                '}';
    }
}

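3.4 Writing a test Controller

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.elasticsearch.core.ElasticsearchRestTemplate;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class EsController {

    @Autowired
    private ElasticsearchRestTemplate elasticsearchRestTemplate;

    // Check that the injected template is available
    @RequestMapping("/access")
    public String access(){
        System.out.println(elasticsearchRestTemplate);
        return "access success";
    }

    // Create the index and set its mapping
    @RequestMapping("/create")
    public String create(){
        // Create the index from the @Document annotation on Goods; returns whether the operation succeeded
        boolean created = elasticsearchRestTemplate.indexOps(Goods.class).create();
        // Configure the mapping from the @Id and @Field annotations on Goods; returns whether the operation succeeded
        boolean mapped = elasticsearchRestTemplate.indexOps(Goods.class).putMapping(Goods.class);
        // Report failure if either step did not succeed
        return (created && mapped) ? "create success" : "create failed";
    }
}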

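3.5 Running the test

Start the application, then request /access and /create to check the connection and create the index.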

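3.6 Repository document operations

The strength of Spring Data is that you do not need to write any DAO code; CRUD operations are derived automatically from method names or class information. You only need to define an interface that extends one of the sub-interfaces of Repository to get the basic CRUD functions.

3.6.1 Create the following Repository interface

import org.springframework.data.elasticsearch.repository.ElasticsearchRepository;

// Repository for Elasticsearch operations on Goods
public interface GoodsRepository extends ElasticsearchRepository<Goods,Long> {
}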

3.6.2 Adding a document

Add the following to EsController.java.

@Autowired
private GoodsRepository goodsRepository;

@RequestMapping("/insert")
public String insertOne() {
    Goods goods = new Goods(1, "金立手机", "88K镀金工艺", 998, 100,
            "http://fastdfs/jinli.jpg", new Date(), "手机", "金立");
    goodsRepository.save(goods);
    return "insert one success";
}

3.6.3 Bulk insert

@RequestMapping("/insertMany")
public String insertMany() {
    Goods goods1 = new Goods(2, "小米", "徕卡相机", 1998, 100,
            "http://fastdfs/xiaomi.jpg", new Date(), "手机", "小米");
    Goods goods2 = new Goods(3, "华为", "鸿蒙系统", 2998, 100,
            "http://fastdfs/huawei.jpg", new Date(), "手机", "华为");
    Goods goods3 = new Goods(4, "手机", "世界我最牛逼", 3998, 100,
            "http://fastdfs/shouji.jpg", new Date(), "手机", "手机");
    Goods goods4 = new Goods(5, "苹果", "苹果生态圈", 4998, 100,
            "http://fastdfs/apple.jpg", new Date(), "手机", "苹果");
    List<Goods> list = new ArrayList<>();
    list.add(goods1);
    list.add(goods2);
    list.add(goods3);
    list.add(goods4);
    // Save all four documents in one call
    goodsRepository.saveAll(list);
    return "insert many success";
}

3.6.4 Basic queries

/**
* Find a single document
*/
@RequestMapping("/findOne")
public Goods findOne(){
    Optional<Goods> optional = goodsRepository.findById(2L);
    // Return null instead of throwing when no document with id 2 exists
    return optional.orElse(null);
}
/**
* Find all documents
*/
@RequestMapping("/findAll")
public Iterable<Goods> findAll(){
    Iterable<Goods> iterable = goodsRepository.findAll();
    return iterable;
}

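3.6.5 Custom methods

Another powerful feature of Spring Data is that it implements methods automatically based on their names. For example, if your method is called findByTitle, Spring Data knows you are querying by title and generates the query for you, with no implementation class needed.

Of course, the method name must follow certain conventions

public interface GoodsRepository extends ElasticsearchRepository<Goods,Long> {
    // Custom method: query derived from the method name
    public List<Goods> findByTitle(String title);
}

/**
* Query using the custom method
*/
@RequestMapping("/findByTitle")
public List<Goods> findByTitle(){
    List<Goods> list = goodsRepository.findByTitle("手机");
    return list;
}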
