Elasticsearch Basics

kgduu

Article link: https://blog.csdn.net/xiexingshishu/article/details/115189243

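This article walks through the basic operations of Elasticsearch: creating, reading, updating, and deleting documents, plus index configuration and management. It covers the curl commands for adding, getting, updating, and deleting documents, explains write consistency and automatic index creation, and describes mapping configuration, including numeric detection, date detection, and field mappings. It also looks at analyzers and at bulk data operations.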

1. Basic Concepts

2. CRUD Operations

Add

curl -XPUT 'http://host:port/{index}/{type}/{id}' -d '{JSON data}'

curl -XPOST 'http://host:port/{index}/{type}' -d '{JSON data}' // the id is generated automatically

curl -XPUT 'http://host:port/{index}/{type}/{id}?version={version}&version_type=external' -d '{JSON data}' // specify a version number when adding the document
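With version_type=external, a write is accepted only when the supplied version is strictly greater than the version already stored for that document. The following is a minimal Python sketch of that rule, not Elasticsearch code: the function name and the stored_version argument are illustrative assumptions.

```python
from typing import Optional


def accepts_external_version(stored_version: Optional[int], incoming_version: int) -> bool:
    """Mimic the version_type=external check (illustrative only)."""
    if stored_version is None:
        # No document with this id yet, so any external version is accepted.
        return True
    # An existing document is overwritten only by a strictly newer version.
    return incoming_version > stored_version


print(accepts_external_version(None, 5))
print(accepts_external_version(5, 5))
print(accepts_external_version(5, 6))
```

This is why an external system (a database row version, for example) can drive the version number: replaying an old write is rejected instead of overwriting newer data.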

Get

curl -XGET 'http://host:port/{index}/{type}/{id}[?pretty]'

Update

curl -XPOST 'http://host:port/{index}/{type}/{id}/_update' -d '{JSON data}'

With this form the JSON body carries a script and its params, for example:

{
   "script": {
       "inline": "ctx._source.content = params.new_content",
       "params": {
           "new_content": "test"
       }
   }
}

The other form uses a doc section:

{
   "doc": {
       "content": "value"
   }
}

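Updating a document that may not exist

curl -XPOST 'http://host:port/{index}/{type}/{id}/_update' -d '{JSON data}'

Here the JSON body contains two fields: script and upsert. The upsert content is indexed as a new document when the id does not exist yet.

Adding a new field

curl -XPOST 'http://host:port/{index}/{type}/{id}/_update' -d '{JSON data}'

Here the JSON body contains two fields: doc and doc_as_upsert.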
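The update variants in this section differ only in the request body. As a rough Python sketch, the three bodies could be built like this. The helper names and the content and tags fields are hypothetical, and the script key is written as inline to match the example in this article (newer releases name it source, so treat that key as version-dependent).

```python
import json


def script_update(field, value):
    """Body for a scripted update; params are nested inside script."""
    return {
        "script": {
            "inline": f"ctx._source.{field} = params.new_value",
            "params": {"new_value": value},
        }
    }


def script_upsert(field, value, initial_doc):
    """Scripted update plus an upsert document used when the id is missing."""
    body = script_update(field, value)
    body["upsert"] = initial_doc
    return body


def doc_upsert(partial_doc):
    """Partial-document update that creates the document if it is missing."""
    return {"doc": partial_doc, "doc_as_upsert": True}


print(json.dumps(script_update("content", "test")))
print(json.dumps(script_upsert("content", "test", {"content": "initial"})))
print(json.dumps(doc_upsert({"tags": ["new"]})))
```

Each printed string can be passed as the -d argument of the corresponding _update request.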

Delete

curl -XDELETE 'http://host:port/{index}/{type}/{id}'

curl -XDELETE 'http://host:port/{index}/{type}/{id}?version=n' // delete only if the document is at version n

Search

(1) Search everything in an index

curl -XGET 'localhost:9200/{index}/_search?pretty'

(2) Multi-index search

curl -XGET 'localhost:9200/{index1,index2}/_search?pretty'

A multi-index search fails with an error if one of the indices does not exist. Add the parameter ignore_unavailable=true to avoid this:

curl -XGET 'localhost:9200/{index1,index2}/_search?pretty&ignore_unavailable=true'

(3) Search all indices

curl -XGET 'localhost:9200/_search?pretty' or curl -XGET 'localhost:9200/_all/_search?pretty'

(4) Search by type

curl -XGET 'localhost:9200/{index}/{type}/_search?pretty'

Query analysis

Use _analyze:

curl -XGET 'localhost:9200/{index}/_analyze?pretty&field={field}' -d 'data'

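Query parameters

Parameter	Description
q	The query to match, in the form q={field}:{data}
df	Default field to search when the query does not name a field
analyzer	Name of the analyzer to use
default_operator	AND or OR. Sets the default boolean operator for the query
explain	Boolean. When true, the results include explain information
fields	By default each hit returns the index, type, document id, score, and _source. Set fields to a comma-separated list to choose the returned fields
sort	Custom sort order. Defaults to _score descending when omitted. The form is sort={field}:{desc|asc}
timeout	Search timeout, in the form timeout=5s
size and from	Define the result window, equivalent to limit and offset
terminate_after	Maximum number of documents collected from each shard, in the form terminate_after=100
ignore_unavailable	Boolean. Whether to ignore indices that do not exist
search_type	Search type. Defaults to query_then_fetch when omitted. Possible values are dfs_query_then_fetch and query_then_fetch
lowercase_expanded_terms	Boolean. Whether expanded terms are lowercased
analyze_wildcard	Boolean. Whether wildcard and prefix queries are analyzed. Default is false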
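These parameters are passed in the query string of a _search request. The sketch below assembles such a URL with the Python standard library. The host, the posts index, and the title and published field names are assumptions made for illustration.

```python
from urllib.parse import urlencode


def build_search_url(base, index, params):
    """Join base URL, index, and URL-encoded query parameters."""
    return f"{base}/{index}/_search?{urlencode(params)}"


url = build_search_url(
    "http://localhost:9200",
    "posts",
    {
        "q": "title:test",         # match "test" in the title field
        "sort": "published:desc",  # newest first instead of _score
        "size": 10,                # page size (limit)
        "from": 0,                 # starting offset
    },
)
print(url)
```

Encoding the parameters this way avoids quoting mistakes when a query contains spaces, colons, or other reserved characters.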

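3. Indices

3.1 Configuring the number of shards and replicas

Set in elasticsearch.yml. The defaults are number_of_shards = 5 and number_of_replicas = 1 (from version 7.0 on, the default number_of_shards is 1).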

index.number_of_shards

index.number_of_replicas

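3.2 Write consistency

Configured in elasticsearch.yml.

action.write_consistency accepts the following values:

quorum	Requires half of the shard copies plus one to be active for the write to succeed
one	Requires only one copy to accept the write
all	Requires every copy to accept the write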
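The "half plus one" rule for quorum can be made concrete with a little arithmetic. The sketch below assumes the copies counted are one primary plus number_of_replicas replicas, which gives int((primary + number_of_replicas) / 2) + 1. The function name is illustrative.

```python
def quorum(number_of_replicas: int) -> int:
    """Shard copies that must be active for a quorum write (sketch)."""
    copies = 1 + number_of_replicas  # one primary plus its replicas
    return copies // 2 + 1


for replicas in (1, 2, 3, 4):
    print(f"replicas={replicas} quorum={quorum(replicas)}")
```

With two replicas there are three copies of each shard, so two of them must be available before a write proceeds.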

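3.3 Index creation

3.3.1 Automatic index creation

Set action.auto_create_index in elasticsearch.yml. It accepts either a boolean (whether the feature is enabled) or a list of patterns that select index names.

action.auto_create_index: true (or false), or a pattern list such as +logs*,-*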
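To show how a pattern list such as +logs*,-* reads, here is a rough Python sketch. It assumes patterns are tried left to right, that the first matching pattern decides (+ allows, - denies), and that a name matching no pattern is not auto-created. The function name is hypothetical and this is not Elasticsearch's own implementation.

```python
from fnmatch import fnmatchcase


def may_auto_create(index_name: str, setting: str) -> bool:
    """Evaluate an action.auto_create_index value for one index name (sketch)."""
    value = setting.strip().lower()
    if value in ("true", "false"):
        return value == "true"
    for raw in setting.split(","):
        pattern = raw.strip()
        allow = not pattern.startswith("-")
        # Drop a single leading + or - to get the glob itself.
        glob = pattern[1:] if pattern and pattern[0] in "+-" else pattern
        if fnmatchcase(index_name, glob):
            return allow
    # Assumption: nothing matched, so the index is not created automatically.
    return False


print(may_auto_create("logs-2021.03", "+logs*,-*"))
print(may_auto_create("users", "+logs*,-*"))
```

Under these assumptions, +logs*,-* lets indices whose names start with logs be created on first write and blocks everything else.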

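3.3.2 Index settings

3.3.3 Index deletion

3.4 Mapping configuration

ES detects field types automatically: a value in quotes is treated as a string, true and false as booleans, and a number as a numeric type.

When creating an index, index.mapper.dynamic can be used to switch dynamic mapping on or off.

ES 7.x no longer supports specifying mapping types.

3.4.1 Automatic numeric type detection

Use numeric_detection.

Add the mapping setting to the index: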

curl -XPUT 'http://localhost:9200/users' -d '{"mappings":{"numeric_detection":true}}' -H 'Content-Type:application/json'

Then add a document:

curl -XPUT 'http://localhost:9200/users/_doc/1' -d '{"name":"User 1", "age":"20"}' -H 'Content-Type:application/json'

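Then view the mapping, for example with curl -XGET 'http://localhost:9200/users/_mapping?pretty'. With numeric_detection enabled, the quoted value "20" in age should be mapped as a numeric type rather than as a string.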

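3.4.2 Automatic date type detection

date_detection is the on/off switch; its value is a boolean.

dynamic_date_formats sets the date formats; its value is an array of strings.

First set the date format in the mapping (note that hh is the 12-hour clock pattern; HH is the 24-hour form):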

curl -XPUT 'http://localhost:9200/blog' -d '{"mappings":{"dynamic_date_formats":["yyyy-MM-dd hh:mm"]}}' -H 'Content-Type:application/json'

Then add data:

curl -XPUT 'http://localhost:9200/blog/_doc/1' -d '{"name":"Test", "test_field":"2015-10-01 11:22"}' -H 'Content-Type:application/json'

Then view the mapping, for example with curl -XGET 'http://localhost:9200/blog/_mapping?pretty'.

3.4.3 Index structure mapping

Define the mapping structure in the following form. Here id, name, published, and contents are field names; define your own as needed.

{
"mappings":{
"properties":{
"id":{"type":"long"},
"name":{"type":"text"},
"published":{"type":"date"},
"contents":{"type":"text"}
}
}
}

Save it to a file named posts.json.

Run curl -XPUT 'http://localhost:9200/posts' -d @posts.json -H 'Content-Type:application/json' to define the mapping. The PUT method is required.

A field definition includes type, store, and index.

Core types

The types are:

String
Number	integer,float,double,long
Date
Boolean
Binary

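Common attributes of core types

index_name	Name under which the field is stored in the index
index	How the field is indexed: analyzed (default), no, or not_analyzed (strings only)
store	Whether the original value is stored in the index: yes or no (default). With no, the value is kept only in _source
doc_values	true or false. With true, values of non-analyzed fields are kept on disk; with false, they are kept in the field data cache
boost	Importance of the field within the document. The larger the value, the more important. Default is 1
null_value	Value to write to the index when the field is missing from the document. By default the field is skipped
copy_to	Target field that the original value should be copied to
include_in_all	Whether the field is included in _all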

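String attributes

term_vector	no (default), yes, with_offsets, with_positions, or with_positions_offsets. Controls whether term vectors are computed for the field
analyzer	Name of the analyzer used for indexing and searching
search_analyzer	Name of the analyzer applied to the query string at search time
norms.enabled	Whether norms are loaded for the field. Default is true
norms.loading	eager or lazy. Controls how norms are loaded
position_offset_gap	Default is 0. Position gap between instances of a field with the same name
index_options	What is kept in the postings: docs, freqs, positions, or offsets. Default is positions for analyzed fields and docs for fields that are indexed but not analyzed
ignore_above	Maximum field length. Values longer than this are ignored by the analyzer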

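Numeric attributes

precision_step	Controls how many terms are generated per value. The smaller the value, the more terms
coerce	Default is true. Whether strings are converted to numbers
ignore_malformed	true or false. With true, malformed values are ignored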

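Date attributes

format	Date format. Default is dateOptionalTime
precision_step	Controls how many terms are generated per value
numeric_resolution	Time unit to use when the date is given as a number
ignore_malformed	true or false. Whether malformed values are ignored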

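Multi-fields

Used when the same value has to be processed in different ways through different fields. Use the fields attribute and define the sub-fields under it.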
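As an illustration, a name field could be analyzed for full-text search while a sub-field keeps the exact value for sorting or aggregations. The sketch below builds such a mapping body in Python. The sub-field name raw and the text/keyword types are assumptions that follow the 7.x-style mapping used in section 3.4.3.

```python
import json

# Mapping body with a multi-field: "name" is analyzed, "name.raw" is exact.
mapping = {
    "mappings": {
        "properties": {
            "name": {
                "type": "text",
                "fields": {
                    "raw": {"type": "keyword"}
                },
            }
        }
    }
}

print(json.dumps(mapping, indent=2))
```

Queries would then address the analyzed value as name and the exact value as name.raw.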

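Other types

These include ip and token_count.

3.4.4 Analyzers

The analyzers available out of the box are:

standard	A general-purpose analyzer that suits most European languages
simple	Splits on non-letter characters and lowercases the tokens
whitespace	Splits on whitespace characters
stop	Similar to simple, but also filters out stop words
keyword	Passes the supplied value through unchanged
pattern	Splits using a regular expression
language	Analyzers for specific languages
snowball	Similar to standard, but adds a stemming algorithm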
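To make the difference between a few of these analyzers concrete, here is a rough Python approximation of simple, whitespace, and keyword. It only mimics the tokenization described in the table and is not how Elasticsearch implements them. The function names are made up for this example.

```python
import re


def simple_analyze(text):
    """Split on anything that is not a letter, then lowercase."""
    return [token.lower() for token in re.findall(r"[^\W\d_]+", text)]


def whitespace_analyze(text):
    """Split on whitespace only; case and punctuation are kept."""
    return text.split()


def keyword_analyze(text):
    """Emit the whole input as a single token."""
    return [text]


sample = "Hello, World-2 times"
print(simple_analyze(sample))
print(whitespace_analyze(sample))
print(keyword_analyze(sample))
```

The same input yields three lowercase word tokens, three raw whitespace-separated tokens, or one unchanged token, which is why the analyzer choice directly affects what a search can match.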