Elasticsearch Basics: Getting Started

Qiu-F

Category: Elasticsearch  Tags: elasticsearch

Original link: https://mp.weixin.qq.com/s?__biz=MzA3MjQ1MTQzMQ==&mid=2247490657&idx=1&sn=fd56391ee51d5ccef1873b04ea4e0060&chksm=9f1f4869a868c17ffffc38899b2d387aad002da3713c246047d1c7b3b70bad97c3a9ca1d30b5&mpshare=1&scene=23&srcid=0808VofXdcGj07miqJgyp2w4&sharer_sharetim

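Core concepts of ES

Elasticsearch is a document-oriented, non-relational database.

Relational DB	Elasticsearch
Database	Index (indices)
Table	Type (types)
Row	Document (documents)
Column	Field (fields)

An Elasticsearch cluster can contain multiple indices (databases).
Each index can contain multiple types (tables).
Each type contains multiple documents (rows).
Each document contains multiple fields (columns).

Note that mapping types are deprecated in Elasticsearch 7.x; the examples below use the single default type _doc.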

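Basic ES operations

IK analyzer

ik_max_word performs fine-grained segmentation: it emits every word it can find in a sentence, so tokens may overlap.

ik_smart performs coarse-grained segmentation: it prefers the longest matching words and produces no overlapping tokens.

// Coarse-grained segmentation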
GET _analyze
{
  "analyzer":"ik_smart",
  "text": "念念不忘，必有回响"
}

// Fine-grained segmentation
GET _analyze
{
  "analyzer":"ik_max_word",
  "text": "念念不忘，必有回响"
}
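
To make the difference in granularity concrete, here is a toy Python sketch. It is not IK's actual algorithm: the dictionary WORDS and the function names all_words and longest_first are assumptions made up for this illustration. all_words lists every dictionary word found in the text (fine-grained, tokens may overlap), while longest_first greedily takes the longest dictionary word at each position (coarse-grained, no overlaps).

```python
# Toy dictionary (hypothetical sample); a real IK dictionary is far larger.
WORDS = {"念念不忘", "念念", "不忘", "必有", "回响"}

def all_words(text, words=WORDS):
    """Fine-grained: every dictionary word occurring anywhere in the text."""
    found = []
    for start in range(len(text)):
        for end in range(len(text), start, -1):
            if text[start:end] in words:
                found.append(text[start:end])
    return found

def longest_first(text, words=WORDS):
    """Coarse-grained: greedy longest match, tokens never overlap."""
    tokens, pos = [], 0
    while pos < len(text):
        for end in range(len(text), pos, -1):
            if text[pos:end] in words:
                tokens.append(text[pos:end])
                pos = end
                break
        else:
            pos += 1  # skip characters (e.g. punctuation) not in the dictionary
    return tokens

print(all_words("念念不忘，必有回响"))      # ['念念不忘', '念念', '不忘', '必有', '回响']
print(longest_first("念念不忘，必有回响"))  # ['念念不忘', '必有', '回响']
```

Adding a word to the toy dictionary changes both results, which is the same idea as extending IK with a custom dictionary.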

Custom dictionary

Go to the elasticsearch/plugins/ik/config directory.
Create a file named my.dic with the following content:

念念不忘
必有回响

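Edit IKAnalyzer.cfg.xml (in the same config directory):

<properties>
   <comment>IK Analyzer extension configuration</comment>
   <!-- Users can configure their own extension dictionary here -->
   <entry key="ext_dict">my.dic</entry>
   <!-- Users can configure their own extension stopword dictionary here -->
   <entry key="ext_stopwords"></entry>
</properties>

After changing the configuration, restart Elasticsearch and run the test again.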

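Basic REST commands

Method	URL	Description
PUT	localhost:9200/index_name/type_name/doc_id	Create a document (with a specified ID)
POST	localhost:9200/index_name/type_name	Create a document (with a random ID)
POST	localhost:9200/index_name/type_name/doc_id/_update	Update a document
DELETE	localhost:9200/index_name/type_name/doc_id	Delete a document
GET	localhost:9200/index_name/type_name/doc_id	Get a document by ID
POST	localhost:9200/index_name/type_name/_search	Query all data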
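
The table above uses the older URL form that includes a type name. As a rough sketch, the typeless forms used in ES 7 (the ones the later examples in this article rely on, such as /_doc/ and /_update/) can be summarised in a small lookup. The function name endpoint and the operation keys are hypothetical names chosen for this illustration.

```python
def endpoint(op, index, doc_id=None):
    """Return (HTTP method, path) for a document operation, ES 7 typeless form."""
    routes = {
        "create_with_id": ("PUT", "/{index}/_doc/{id}"),
        "create_auto_id": ("POST", "/{index}/_doc"),
        "update": ("POST", "/{index}/_update/{id}"),
        "delete": ("DELETE", "/{index}/_doc/{id}"),
        "get": ("GET", "/{index}/_doc/{id}"),
        "search": ("POST", "/{index}/_search"),
    }
    method, template = routes[op]
    if "{id}" in template and doc_id is None:
        raise ValueError(f"operation {op!r} needs a document id")
    return method, template.format(index=index, id=doc_id)

print(endpoint("update", "student", 2))  # ('POST', '/student/_update/2')
print(endpoint("search", "test"))        # ('POST', '/test/_search')
```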

Description	Field types
String	text, keyword
Numeric	long, integer, short, byte, double, float, half_float, scaled_float
Date	date
Boolean	boolean
Binary	binary

Basic index operations

Create an index in ES7 and specify the field types

PUT /student
{
  "settings":{
    "number_of_shards":3,
    "number_of_replicas":2
  },
  "mappings":{
    "properties":{
      "id":{"type":"long"},
      "name":{"type":"text","analyzer":"ik_smart"},
      "text":{"type":"text","analyzer":"ik_max_word"}
    }
  }
}
 
PUT /test
{
  "settings":{
    "number_of_shards":3,
    "number_of_replicas":2
  },
  "mappings":{
    "properties":{
      "id":{"type":"long"},
      "name":{"type":"text"},
      "text":{"type":"text"}
    }
  }
}

The following response indicates that the index was created successfully:

{
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "index" : "test"
}

// If the index does not exist, it is created automatically with default (dynamically mapped) field types
PUT /test2/_doc/1
{
  "name":"大数据梦想家",
  "age":21,
  "birthday":"2000-02-06"
}

GET: viewing data

// View index information
GET /test

// View a document
GET /test2/_doc/1

UPDATE: modifying documents

// Overwrite with PUT or POST
// Any field left out of the request body is lost
PUT /test2/_doc/1
{
  "name":"大数据梦想家1",
  "age":21,
  "birthday":"2000-02-06"
}

Updating a document with _update

How can age be updated without losing name? Use _update.

// Request 1
POST student/_doc/2
{
  "name": "李四"
}

// Request 2: add age
POST student/_doc/2/_update
{
  "doc": {
    "age": 10
  }
}

// Query
GET student/_doc/2

// Query result
{
  "_index" : "student",
  "_type" : "_doc",
  "_id" : "2",
  "_version" : 6,
  "_seq_no" : 6,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "name" : "李四",
    "age" : 10
  }
}

// In ES 7, POST student/_doc/2/_update still works but returns a deprecation warning:
// Deprecation: [types removal] Specifying types in document update requests is deprecated, use the endpoint /{index}/_update/{id} instead.

So the following form is recommended:

POST student/_update/2
{
  "doc": {
    "age": 11
  }
}
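
The difference between overwriting a document and updating it with "doc" can be simulated locally with plain dictionaries. This is only a sketch of the semantics, not Elasticsearch code: the function names overwrite and partial_update are hypothetical, and the sample data mirrors the student document above.

```python
import copy

def overwrite(stored, body):
    """PUT /index/_doc/id: the new body replaces the stored _source entirely."""
    return copy.deepcopy(body)

def partial_update(stored, doc):
    """POST /index/_update/id with {"doc": ...}: fields are merged into _source."""
    merged = copy.deepcopy(stored)
    for key, value in doc.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Nested objects are merged recursively.
            merged[key] = partial_update(merged[key], value)
        else:
            # Scalars and arrays are replaced.
            merged[key] = copy.deepcopy(value)
    return merged

stored = {"name": "李四"}
print(overwrite(stored, {"age": 10}))       # {'age': 10}  (name is lost)
print(partial_update(stored, {"age": 10}))  # {'name': '李四', 'age': 10}
```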

Updating documents with _update_by_query

POST student/_update_by_query
{
  "query": { 
    "match": {
      "_id": 2
    }
  },
  "script": {
    "source": "ctx._source.age = 20"
  }
}

DELETE: deleting an index

DELETE /test

Basic document operations

Next, let's look at basic document operations. First, recreate the test index and add some data.

PUT /test/_create/1
{
  "name":"爱丽丝",
  "age":21,
  "desc":"在最美的年华，做最好的自己！",
  "tags":["技术宅","温暖","思维活跃"]
}

PUT /test/_create/2
{
  "name":"张三",
  "age":23,
  "desc":"法外狂徒",
  "tags":["渣男","交友"]
}

PUT /test/_create/3
{
  "name":"路人甲",
  "age":24,
  "desc":"不可描述",
  "tags":["靓仔","网游"]
}

PUT /test/_create/4
{
  "name":"爱丽丝学Java",
  "age":25,
  "desc":"技术成就自我！",
  "tags":["思维敏捷","喜欢学习"]
}

PUT /test/_create/5
{
  "name":"爱丽丝学Python",
  "age":26,
  "desc":"人生苦短，我用Python！",
  "tags":["好学","勤奋刻苦"]
}

PUT test/_create/6
{
  "name":"大数据老K",
  "age":25,
  "desc":"技术成就自我！",
  "tags":["男","学习","技术"]
}

PUT test/_create/7
{
  "name":"Python女侠",
  "age":26,
  "desc":"人生苦短，我用Python！",
  "tags":["靓女","勤奋学习","善于交际"]
}

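Open the elasticsearch-head UI to confirm that the data was added to ES.

Query all data

POST /test/_search

Query by ID

A GET request retrieves the document with the specified ID.

GET test/_doc/1

Conditional query

GET test/_search?q=name:张三

Complex searches (sorting, pagination, highlighting, fuzzy matching, exact matching)

Building a query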

GET test/_search
{
  "query":{
    "match": {
      "name": "爱丽丝"
    }
  }
}

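By default, ES returns all fields of each document. To return only some of them, restrict the fields with _source.

This gives the effect of select field_name from test.

// In the query, use _source to return only the name and desc fields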
GET test/_search
{
  "query":{
    "match": {
      "name": "爱丽丝"
    }
  },
  "_source":["name","desc"]
}

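Sorting

Note: sort only on fields that support it, such as numeric, date, and keyword fields; text fields cannot be sorted by default.

// Descending order
GET test/_search
{
  "query":{
    "match": {
      "name": "爱丽丝"
    }
  },
  "sort": [
    {
      "age": {
        "order": "desc"
      }
    }
  ]
}
// For ascending order, replace desc with asc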

Pagination

// from: offset of the first hit to return (0-based); size: number of hits to return
GET test/_search
{
    "query":{
        "match_all": {}
    },
    "from":0,
    "size":4
}
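
Applications usually think in page numbers rather than offsets. The sketch below shows one way to translate a 1-based page number into the from/size pair. The function name page_params is hypothetical, and the limit of 10,000 assumes the default index.max_result_window setting has not been changed.

```python
def page_params(page, size=4, max_window=10_000):
    """Translate a 1-based page number into the from/size pair used by _search."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive")
    start = (page - 1) * size
    if start + size > max_window:
        # Assumes the default index.max_result_window of 10,000.
        raise ValueError("from + size exceeds the result window")
    return {"from": start, "size": size}

body = {"query": {"match_all": {}}, **page_params(page=2)}
print(body)  # {'query': {'match_all': {}}, 'from': 4, 'size': 4}
```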

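Boolean queries

Multi-condition queries, for example name matching 爱丽丝 and age equal to 25.

MySQL	Elasticsearch
and	must
or	should
not	must_not

filter restricts results by a condition without contributing to the relevance score. Range conditions are expressed with range, using the following operators:

Operator	Meaning
gt	greater than
gte	greater than or equal to
lt	less than
lte	less than or equal to

// Using must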
GET test/_search
{
  "query":{
    "bool": {
      "must":[
         {
          "match":{
            "name":"爱丽丝"
          }
        },
        {
          "match":{
            "age":25
          }
        }
      ]
    }
  }
}
// Swap in must, should, or must_not as needed

Filtering: to find documents whose name matches 爱丽丝 and whose age is greater than 24, add a filter clause.

GET test/_search
{
  "query":{
    "bool":{
      "must": [
        {
          "match": {
            "name": "爱丽丝"
          }
        }
      ],
      "filter": [
        {
          "range": {
            "age": {
              "gt": 24
            }
          }
        }
      ]
    }
  }
}

Query documents with age between 24 and 26 (inclusive)

GET test/_search
{
  "query":{
    "bool":{
      "filter": [
        {
          "range": {
            "age": {
              "gte": 24,
              "lte": 26
            }
          }
        }
      ]
    }
  }
}
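
The meaning of the four range operators can be checked locally against the ages of the seven sample documents. This is a plain Python simulation, not a call to Elasticsearch; RANGE_OPS and in_range are hypothetical names used only for this sketch.

```python
import operator

# Range operators from the table above mapped to Python comparisons.
RANGE_OPS = {"gt": operator.gt, "gte": operator.ge, "lt": operator.lt, "lte": operator.le}

def in_range(value, bounds):
    """True when value satisfies every bound, e.g. {"gte": 24, "lte": 26}."""
    return all(RANGE_OPS[name](value, limit) for name, limit in bounds.items())

# id -> age for the seven sample documents in the test index
ages = {1: 21, 2: 23, 3: 24, 4: 25, 5: 26, 6: 25, 7: 26}

print([i for i, age in ages.items() if in_range(age, {"gte": 24, "lte": 26})])  # [3, 4, 5, 6, 7]
print([i for i, age in ages.items() if in_range(age, {"gt": 24})])              # [4, 5, 6, 7]
```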

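Phrase search

Now query for documents whose tags contain "男". This example uses a match query; match_phrase would additionally require the terms to appear adjacent and in order.

GET test/_search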
{
  "query":{
    "match":{
      "tags":"男"
    }
  }
}

Matching multiple tags

// In the results, a document is returned as long as it matches any one of the terms
GET test/_search
{
  "query":{
    "match":{
      "tags":"男 学习"
    }
  }
}
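
The "any one term is enough" behaviour comes from the match query's default OR operator. The toy simulation below contrasts it with the "and" operator. It assumes tags is a dynamically mapped text field analyzed by the standard analyzer, approximated here as one token per Chinese character; the function names tokens and match are hypothetical, and only three of the sample documents are included.

```python
def tokens(text):
    """Rough stand-in for the standard analyzer on Chinese text: one token per character."""
    return {ch for ch in text if not ch.isspace()}

def match(tags, query, operator="or"):
    """Toy match query over a multi-valued field; operator mirrors the match query's option."""
    indexed = set().union(*(tokens(tag) for tag in tags))
    wanted = tokens(query)
    return wanted <= indexed if operator == "and" else bool(wanted & indexed)

docs = {
    2: ["渣男", "交友"],
    4: ["思维敏捷", "喜欢学习"],
    6: ["男", "学习", "技术"],
}
print([i for i, t in docs.items() if match(t, "男 学习")])         # [2, 4, 6]
print([i for i, t in docs.items() if match(t, "男 学习", "and")])  # [6]
```

In a real request the stricter behaviour would be written as "tags": {"query": "男 学习", "operator": "and"} inside the match clause.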

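Exact queries

A term query looks up the specified term directly in the inverted index.

About analysis:

term: the query value is not analyzed; it is matched exactly as given.
match: the query text is run through the analyzer first, and the resulting terms are then searched.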

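// Create an index and specify the field types
// (test2 was auto-created earlier, so run DELETE test2 first; otherwise this request fails because the index already exists)
PUT test2
{
  "mappings": {
    "properties": {
      "name": {
        "type": "text"
      },
      "desc": {
        "type": "keyword"
      }
    }
  }
}

// Insert data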
PUT test2/_doc/1
{
  "name":"爱丽丝学大数据name",
  "desc":"爱丽丝学大数据desc"
}

PUT test2/_doc/2
{
  "name":"爱丽丝学大数据name2",
  "desc":"爱丽丝学大数据desc2"
}

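Difference between text and keyword

A text field is run through the analyzer at index time, and queries match the resulting terms.
A keyword field is not analyzed; the whole value is indexed as a single term.

First, run an exact query against the text field.

// name is a text field, so its value was split into terms by the analyzer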
GET test2/_search         
{
  "query": {
    "term": {
        "name": "爱"
      }
    }
}
// Result: both documents match

Exact query against the keyword field

// desc is a keyword field, so only the complete value matches
GET test2/_search          
{
  "query": {
    "term": {
        "desc": "爱"
      }
    }
}
// No results
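
Why the same term query hits the text field but not the keyword field can be shown with a toy inverted index. This is a simplified local model, not Elasticsearch internals: analyze is a rough stand-in for the standard analyzer (single Chinese characters, lowercased ASCII words), and both function names are hypothetical.

```python
import re

def analyze(text):
    """Rough stand-in for the standard analyzer: single CJK characters, lowercased ASCII words."""
    return [t.lower() for t in re.findall(r"[A-Za-z0-9]+|[\u4e00-\u9fff]", text)]

def index_field(values, field_type):
    """Build term -> set(doc ids); keyword fields index the whole value as one term."""
    inverted = {}
    for doc_id, value in values.items():
        terms = analyze(value) if field_type == "text" else [value]
        for term in terms:
            inverted.setdefault(term, set()).add(doc_id)
    return inverted

name_index = index_field({1: "爱丽丝学大数据name", 2: "爱丽丝学大数据name2"}, "text")
desc_index = index_field({1: "爱丽丝学大数据desc", 2: "爱丽丝学大数据desc2"}, "keyword")

# A term query looks the given term up as-is, without analyzing it.
print(sorted(name_index.get("爱", set())))                  # [1, 2]
print(sorted(desc_index.get("爱", set())))                  # []
print(sorted(desc_index.get("爱丽丝学大数据desc", set())))  # [1]
```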

Finding multiple exact values

To make testing easier, add the following data:

PUT test2/_doc/3
{
  "t1":"22",
  "t2":"2021-03-01"
}

PUT test2/_doc/4
{
  "t1":"33",
  "t2":"2021-03-01"
}

GET test2/_search
{
  "query": {
    "bool":{
      "should": [
        {
          "term": {
            "t1":"22"
          }
        },
        {
          "term": {
            "t1":"33"
          }
        }
      ]
    }
  }
}

Both documents are returned, showing that term queries can also be combined to match several values.

Besides a bool query, a terms query achieves the same result:

GET test2/_search
{
  "query":{
    "terms":{
      "t1":["22","33"]
    }
  }
}

Highlighting

The highlight property highlights matches in the specified fields of the search results.

GET test/_search
{
  "query":{
     "match": {
       "name": "爱丽丝"
     }
  },
  "highlight":{
    "fields": {
      "name": {}
    }
  }
}

In the response, the matching terms are wrapped in the highlight tag <em>.

Custom highlight style

Define the opening tag in pre_tags and the closing tag in post_tags.

GET test/_search
{
  "query":{
     "match": {
       "name": "爱丽丝"
     }
  },
  "highlight":{
    "pre_tags": "<b class='key' style='color:red'>", 
    "post_tags": "</b>",
    "fields": {
      "name": {}
    }
  }
}

Recap

Simple matching
Conditional matching
Exact matching
Range matching
Field filtering
Multi-condition queries
Highlighted queries

