Native Elasticsearch operations

冰码达

Category: Notes. Tags: elasticsearch, python

Article link: https://blog.csdn.net/bslydhs/article/details/104616706

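Importing MySQL data into Elasticsearch

Create

Bulk import from a JSON file

curl -XPOST -H Content-Type:application/json -s localhost:19200/phone/user_info/_bulk --data-binary @foo.json

Format of the JSON file: each document takes two lines, an action line followed by the document source, and every line (including the last) must end with a newline. If the request path already contains the index and type, do not repeat _index and _type in the action line. _id is optional.

{"index": {"_index": "a2", "_type": "log", "_id": 1}}
{"phoneNumber": "54810043", "phoneCode": "852", "phomeImsi": "454120609000009", "username": "", "cardID": "", "address": "", "registerTime": "", "birthday": "", "sex": 0, "company": "", "country": "CN", "city": "HK", "email": "", "type": "HK_MOBILE"}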
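The two-line bulk format is easy to get wrong by hand, so it can help to generate it. The following is a minimal sketch, assuming the MySQL rows have already been fetched as Python dictionaries; the function name `build_bulk_payload` and the `id_field` parameter are hypothetical, not part of any Elasticsearch client.

```python
import json


def build_bulk_payload(rows, id_field=None):
    """Return an NDJSON string suitable as the body of a _bulk request.

    rows: iterable of dicts, e.g. rows fetched from MySQL as dictionaries.
    id_field: optional key whose value becomes the document _id (assumption).
    """
    lines = []
    for row in rows:
        meta = {}
        if id_field is not None:
            meta["_id"] = row[id_field]
        # Action line: index and type come from the request path, so they are omitted.
        lines.append(json.dumps({"index": meta}, ensure_ascii=False))
        # Source line: the document itself, serialized on a single line.
        lines.append(json.dumps(row, ensure_ascii=False))
    # Every line, including the last one, is terminated by a newline.
    return "".join(line + "\n" for line in lines)


rows = [
    {"phoneNumber": "54810043", "phoneCode": "852", "country": "CN", "city": "HK"},
]
payload = build_bulk_payload(rows, id_field="phoneNumber")
print(payload, end="")
```

Writing `payload` to foo.json gives a file that can be sent with the curl command above.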

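Create a single document directly

The _create endpoint only succeeds when no document with that ID exists yet.

PUT /website/blog/123/_create { … }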

Delete

Delete a single document

DELETE /website/blog/123

Delete only the data (the index itself is kept)

Delete by query:
curl -u '<username>:<password>' -XPOST '192.168.0.201:9200/quality_control/my_type/_delete_by_query?refresh&slices=5&pretty' -H 'Content-Type: application/json' -d '{
  "query": {
    "match_all": {}
  }
}'
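The same delete-by-query request can be prepared from Python with only the standard library. This is a sketch that builds the request without sending it; the helper name `delete_by_query_request` is hypothetical, and the `http://` scheme is an assumption (the curl command leaves it implicit).

```python
import base64
import json
from urllib.request import Request


def delete_by_query_request(base_url, path, query, username=None, password=None, slices=5):
    """Build (but do not send) a POST request for <path>/_delete_by_query."""
    url = f"{base_url}/{path}/_delete_by_query?refresh&slices={slices}"
    headers = {"Content-Type": "application/json"}
    if username is not None:
        # HTTP Basic auth, the equivalent of curl's -u option.
        token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
        headers["Authorization"] = f"Basic {token}"
    data = json.dumps({"query": query}).encode("utf-8")
    return Request(url, data=data, headers=headers, method="POST")


req = delete_by_query_request(
    "http://192.168.0.201:9200",   # scheme added as an assumption
    "quality_control/my_type",
    {"match_all": {}},
)
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would actually run the deletion, so it is left out here.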

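Update

Modify a single document

Elasticsearch does not change the stored document in place: the request below indexes a new version of the document under the same ID, which replaces the whole previous document.
PUT /website/blog/123
{
  "title": "My first blog entry",
  "text": "I am starting to get the hang of this...",
  "date": "2014/01/02"
}

Partially update a single document

POST /website/blog/1/_update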
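The _update request needs a body that wraps the changed fields in a "doc" object, the same envelope used by the update action in a bulk request. A minimal sketch (the helper name `build_partial_update` is hypothetical):

```python
import json


def build_partial_update(fields):
    """Wrap the changed fields in the {"doc": ...} envelope expected by _update."""
    return json.dumps({"doc": fields}, ensure_ascii=False)


body = build_partial_update({"title": "My updated blog post"})
print(body)
```

Only the fields listed under "doc" are merged into the existing document; other fields are left unchanged.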

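Query

List all indices

http://192.168.144.231:9200/_cat/indices?v

When searching, it is best to add a sort clause: when results are sorted on a field, relevance scores are not computed by default. A request has the general form:
curl -X<VERB> '<PROTOCOL>://<HOST>:<PORT>/<PATH>?<QUERY_STRING>' -d '<BODY>'

VERB: the appropriate HTTP method or verb: GET, POST, PUT, HEAD, or DELETE.

PROTOCOL: http or https (if there is an https proxy in front of Elasticsearch).

HOST: the hostname of any node in the Elasticsearch cluster, or localhost for a node on the local machine.
PORT: the port of the Elasticsearch HTTP service, 9200 by default.
PATH: the API endpoint (for example, _count returns the number of documents in the cluster).
The path may contain multiple components, for example _cluster/stats and _nodes/stats/jvm.
QUERY_STRING: any optional query-string parameters (for example, pretty formats the JSON response so it is easier to read, and scroll=1m starts a scroll whose context is kept alive for one minute).
BODY: a JSON-encoded request body (if the request needs one).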
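To make the roles of these components concrete, the sketch below assembles a request URL from them using Python's standard library. The function name `build_url` is hypothetical.

```python
from urllib.parse import urlencode, urlunsplit


def build_url(protocol, host, port, path, query=None):
    """Assemble PROTOCOL://HOST:PORT/PATH?QUERY_STRING from its parts."""
    query_string = urlencode(query or {})
    return urlunsplit((protocol, f"{host}:{port}", path, query_string, ""))


url = build_url("http", "localhost", 9200, "/_count", {"pretty": "true"})
print(url)
```

When no query parameters are given, the `?` separator is omitted.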

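## Using Postman
http://localhost:9200/test/table_user
Here test is the _index and table_user is the _type.
Set the header Content-Type to application/json.
To search, append /_search?timeout=10ms to the path. The timeout parameter is optional and sets the search timeout.
Request body (data):
Match all documents:
{
  "query": {
    "match_all": {}
  }
}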

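Querying by condition. Inside bool, must requires every clause in the array to match, and must_not requires that none of its clauses match. should is satisfied when any one of its clauses matches (by default, when must or filter clauses are present, should clauses are optional and only raise the score). Clauses under filter must also appear in matching documents, but unlike must they do not contribute to the score: they run in filter context, so scoring is skipped and the clauses are considered for caching. One level down, match is an analyzed (tokenized) match: the query text has to match whole terms, so a fragment of a word will not match.
{
  "query": {
    "bool": {
      "must": [
        { "match": { "name": "admin" } },
        { "match": { "state": 1 } }
      ],
      "must_not": [],
      "should": []
    }
  },
  "from": 0,
  "size": 50,
  "sort": [ { "account_number": "asc" } ],
  "aggs": {}
}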
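A bool query like the one above can be assembled programmatically, which avoids quoting mistakes when clauses are built from user input. This is a sketch only; the helper name `bool_query` and its parameters are hypothetical, and moving the state condition into filter is an illustrative choice, not something the original request does.

```python
import json


def bool_query(must=None, should=None, must_not=None, filter_=None,
               from_=0, size=10, sort=None):
    """Build a search body with a bool query, omitting empty clause lists."""
    bool_body = {}
    for name, clauses in (("must", must), ("should", should),
                          ("must_not", must_not), ("filter", filter_)):
        if clauses:
            bool_body[name] = list(clauses)
    body = {"query": {"bool": bool_body}, "from": from_, "size": size}
    if sort:
        body["sort"] = list(sort)
    return body


query = bool_query(
    must=[{"match": {"name": "admin"}}],
    filter_=[{"term": {"state": 1}}],   # filter context: no scoring for this clause
    size=50,
    sort=[{"account_number": "asc"}],
)
print(json.dumps(query, indent=2))
```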

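term is an exact match: the query value is not analyzed.
"terms": { "tag": [ "search", "full_text", "nosql" ] } matches any of several values.
"match": { "content": "hello world" } matches documents that contain hello or world.
"match_phrase": { "content": "hello world" } requires hello and world to be adjacent.
"match_phrase": { "content": { "query": "hello world", "slop": 2 } } also matches hello hh world: the terms may be up to 2 positions out of place.
"prefix": { "name": "a" } matches values that start with a.
"match_phrase_prefix": { "name": "hello world bl" } requires hello world to be adjacent and immediately followed by a word starting with bl. slop can be used here as well.
"range": { "age": { "gte": 20, "lt": 30 } } is a range query: gt is greater than, gte is greater than or equal to, lt is less than, lte is less than or equal to.
"exists": { "field": "title" } matches documents that have the field title.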

Highlighted search:
{
  "query": {
    "match_phrase": {
      "about": "rock climbing"
    }
  },
  "highlight": {
    "fields": {
      "about": {}
    }
  }
}
Fuzzy field query (wildcard in the field name):
{
  "query": {
    "query_string": {
      "query": "35288753",
      "default_field": "*phoneNumber"
    }
  }
}

1. Check cluster health: GET /_cluster/health
2. Retrieve only certain fields: GET /website/blog/123?_source=title,text
3. Retrieve multiple documents: POST /_mget
{ "docs": [
  { "_index": "website", "_type": "blog", "_id": 2 },
  { "_index": "website", "_type": "pageviews", "_id": 1, "_source": "views" }
] }
4. Bulk-process documents: POST /_bulk (every line, including the last, must end with a newline)
{ "delete": { "_index": "website", "_type": "blog", "_id": "123" }}
{ "create": { "_index": "website", "_type": "blog", "_id": "123" }}
{ "title": "My first blog post" }
{ "index": { "_index": "website", "_type": "blog" }}
{ "title": "My second blog post" }
{ "update": { "_index": "website", "_type": "blog", "_id": "123", "_retry_on_conflict": 3 }}
{ "doc": { "title": "My updated blog post" }}

5. Analyze how search text is tokenized:
GET /_analyze?analyzer=standard&text=Text to analyze
6. View the mapping of a type:
GET /gb/_mapping/tweet
7. Disjunction max (dis_max) query: https://blog.csdn.net/lijingjingchn/article/details/88654714

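Meaning of the search response

took: how long Elasticsearch took to run the query, in milliseconds
timed_out: whether the search request timed out
_shards: how many shards were searched, and how many succeeded, failed, or were skipped
max_score: the score of the most relevant document found
hits.total.value: how many matching documents were found
hits.sort: the document's sort position (when not sorting by relevance score)
hits._score: the document's relevance score (not applicable when using match_all)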
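The fields above can be pulled out of a parsed response with a few dictionary lookups. The sketch below uses a hand-written sample response (hypothetical data, not real output) and a hypothetical helper `summarize_response`. It accepts both shapes of hits.total: an object with a value key, as in Elasticsearch 7 and later, and a bare integer, as in older versions.

```python
def summarize_response(resp):
    """Extract the most commonly used fields from a parsed search response."""
    hits = resp.get("hits", {})
    total = hits.get("total", 0)
    # Elasticsearch 7+ reports {"value": N, "relation": ...}; older versions a bare integer.
    if isinstance(total, dict):
        total = total.get("value", 0)
    return {
        "took_ms": resp.get("took"),
        "timed_out": resp.get("timed_out"),
        "failed_shards": resp.get("_shards", {}).get("failed", 0),
        "total": total,
        "max_score": hits.get("max_score"),
        "ids": [hit.get("_id") for hit in hits.get("hits", [])],
    }


# Hand-written sample for illustration only.
sample_response = {
    "took": 5,
    "timed_out": False,
    "_shards": {"total": 5, "successful": 5, "skipped": 0, "failed": 0},
    "hits": {
        "total": {"value": 2, "relation": "eq"},
        "max_score": 1.0,
        "hits": [
            {"_id": "1", "_score": 1.0, "_source": {"name": "admin"}},
            {"_id": "2", "_score": 0.5, "_source": {"name": "admin2"}},
        ],
    },
}
summary = summarize_response(sample_response)
print(summary)
```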

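For concurrent writes, optimistic concurrency control can be used

PUT /website/blog/1?version=1
The request names the version being modified. If the document's current version does not match, the modification is rejected.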
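The idea behind the version check can be illustrated with a toy in-memory store. This is not the Elasticsearch API, only a sketch of the compare-then-write rule; the names `ToyStore` and `VersionConflict` are hypothetical.

```python
class VersionConflict(Exception):
    """Raised when the caller's expected version is stale."""


class ToyStore:
    """In-memory stand-in for an index, keeping (version, source) per document ID."""

    def __init__(self):
        self._docs = {}

    def get(self, doc_id):
        return self._docs[doc_id]

    def put(self, doc_id, source, version=None):
        current_version = self._docs.get(doc_id, (0, None))[0]
        # Reject the write if the caller saw a different version than the stored one.
        if version is not None and version != current_version:
            raise VersionConflict(f"expected {version}, current is {current_version}")
        self._docs[doc_id] = (current_version + 1, source)
        return current_version + 1


store = ToyStore()
store.put("1", {"title": "first"})                             # creates version 1
new_version = store.put("1", {"title": "second"}, version=1)   # matches, becomes version 2
try:
    store.put("1", {"title": "stale"}, version=1)              # stale: current version is 2
    conflict = False
except VersionConflict:
    conflict = True
print(new_version, conflict)
```

A client that hits the conflict would typically re-read the document, reapply its change, and retry with the new version.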

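Notes

1. Deep paging: if you really need to fetch a large number of documents from the cluster, you can do so efficiently by setting the search type to scan, which disables sorting. (Deprecated in newer versions.)
2. Although you can add new types to an index, or new fields to a type, you cannot add new analyzers or modify existing fields.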
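On the deep-paging note: the cost comes from every shard having to produce its own top from + size results, which the coordinating node then merges and mostly throws away. A rough sketch of that arithmetic, assuming every shard holds at least from + size matching documents (the function name `deep_paging_cost` is hypothetical):

```python
def deep_paging_cost(from_, size, shards):
    """Estimate how many sorted entries a from/size search handles."""
    per_shard = from_ + size          # each shard returns its own top from+size entries
    collected = per_shard * shards    # the coordinating node merges all of them
    return {
        "per_shard": per_shard,
        "collected": collected,
        "discarded": collected - size,  # only `size` entries are finally returned
    }


cost = deep_paging_cost(from_=10000, size=10, shards=5)
print(cost)
```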

Checking the number of shards

GET http://192.168.144.115:9200/facebook/_settings?pretty

Setting the default number of shards

POST http://192.168.144.115:9200/_template/template_http_request_record
{
  "index_patterns": ["*"],
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1
  }
}

Migrating Elasticsearch data

curl -XPOST 'http://192.168.144.115:9200/_reindex?pretty' -H 'Content-Type: application/json' -d '{"source":{"remote":{"host":"http://192.168.144.109:19200","username":"aaa","password":"qqq"},"index":"phone"},"dest":{"index":"phone"}}'

curl -u aaa:qqq -XPOST 'http://:19200/_reindex?pretty' -H 'Content-Type: application/json' -d '{"source":{"remote":{"host":"http://192.168.144.231:19200","username":"aaaa","password":"qqq"},"index":"packets-123","query":{"bool":{"filter":[{"term":{"layers.eth.address.keyword":"00:21:cc:d4:25:a6"}}]}}},"dest":{"index":"packets-123"}}'
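Nested JSON inside a shell command is easy to misquote, so one option is to build the reindex body in Python and serialize it with json.dumps. A minimal sketch (the helper name `build_reindex_body` is hypothetical; the hosts and credentials are the placeholders from the commands above):

```python
import json


def build_reindex_body(remote_host, source_index, dest_index,
                       username=None, password=None, query=None):
    """Build the JSON body for a _reindex request that reads from a remote cluster."""
    remote = {"host": remote_host}
    if username is not None:
        remote["username"] = username
        remote["password"] = password
    source = {"remote": remote, "index": source_index}
    if query is not None:
        # Optional query restricting which documents are copied.
        source["query"] = query
    return {"source": source, "dest": {"index": dest_index}}


body = build_reindex_body(
    "http://192.168.144.109:19200", "phone", "phone",
    username="aaa", password="qqq",
)
print(json.dumps(body))
```

The printed string can be saved to a file and passed to curl with `-d @file`, which sidesteps shell quoting entirely.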

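Common problems

not whitelisted in reindex.remote.whitelist
Add a whitelist to the elasticsearch.yml configuration file. The whitelist names the remote Elasticsearch hosts (host:port) that this cluster is allowed to reindex from.
Add to elasticsearch.yml: reindex.remote.whitelist: ["ip:9200", "ip2:9200"]
Notes: 1. Separate multiple addresses with commas. 2. Configure it on both the source and the destination Elasticsearch.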
