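Continued from the previous post.
Elasticsearch is now up and running, and it can index JSON documents as soon as it has started.
Indexing a document
A PUT request adds a document to a given index. It needs a unique document ID and a request body made of key-value pairs.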
curl -X PUT "localhost:9200/customer/_doc/1?pretty" -H 'Content-Type: application/json' -d'
{
"name": "John Doe"
}
'
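This request creates an index named customer if it does not already exist, adds a document with ID 1, stores the document, and indexes its name field.
Response: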
{
"_index" : "customer",
"_type" : "_doc",
"_id" : "1",
"_version" : 2,
"result" : "updated",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 1,
"_primary_term" : 4
}
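A newly created document gets _version 1 and result "created". The response above shows _version 2 and result "updated", which means a document with ID 1 already existed in customer when this request ran, so the PUT replaced it and incremented the version.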
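To tell a create from an update in a script, the result and _version fields of the response can be inspected. The following is a minimal sketch that works offline on an abridged copy of the response shown above; the variable names are illustrative, and the sed patterns assume pretty-printed output with one field per line (a JSON-aware tool would be more robust).

```shell
# Abridged copy of the pretty-printed response shown above.
response='{
  "_index" : "customer",
  "_id" : "1",
  "_version" : 2,
  "result" : "updated"
}'

# Pull out the two fields; assumes one field per line, as ?pretty produces.
result=$(printf '%s\n' "$response" | sed -n 's/.*"result" *: *"\([^"]*\)".*/\1/p')
version=$(printf '%s\n' "$response" | sed -n 's/.*"_version" *: *\([0-9][0-9]*\).*/\1/p')

if [ "$result" = "created" ]; then
  echo "new document, version $version"
else
  echo "existing document was $result, now at version $version"
fi
```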
Retrieving a document
At this point the document can be retrieved from any node in the Elasticsearch cluster.
curl -X GET "localhost:9200/customer/_doc/1?pretty"
{
"_index" : "customer",
"_type" : "_doc",
"_id" : "1",
"_version" : 2,
"_seq_no" : 1,
"_primary_term" : 4,
"found" : true,
"_source" : {
"name" : "John Doe"
}
}
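Indexing documents in bulk
When there are many documents to index, submit them in batches with the bulk API to reduce network round trips.
The optimal batch size depends on several factors:
- Document size and complexity
- Indexing and search load
- Resources available to the cluster
If you have no baseline, start with batches of 1,000 to 5,000 documents and adjust from there.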
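One way to apply a batch size is to split a large bulk file before sending it. A bulk file is newline-delimited JSON in which each indexed document takes two lines: an action line followed by the document source. The sketch below assumes the file contains only index actions (so exactly two lines per document); the file names, the batch_ prefix, and the tiny batch size of 2 are illustrative choices, not values from the reference.

```shell
# Work in a scratch directory; remove it when done.
workdir=$(mktemp -d)
bulk_file="$workdir/bulk.ndjson"

# Build a small bulk file: an action line, then the document source.
i=1
while [ "$i" -le 5 ]; do
  printf '{"index":{"_id":"%s"}}\n' "$i"
  printf '{"account_number":%s}\n' "$i"
  i=$((i + 1))
done > "$bulk_file"

docs_per_batch=2                          # use 1000 to 5000 for real data
lines_per_batch=$((docs_per_batch * 2))   # two lines per document

# An even line count per chunk keeps each action/source pair in one file.
split -l "$lines_per_batch" "$bulk_file" "$workdir/batch_"

batch_count=$(ls "$workdir"/batch_* | wc -l | tr -d ' ')
echo "wrote $batch_count batch files"
```

Each batch file could then be posted to the _bulk endpoint with curl and --data-binary "@file", one request per file.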
- Download the accounts.json file
It contains account records such as:
{ "account_number": 0, "balance": 16623, "firstname": "Bradshaw", "lastname": "Mckenzie", "age": 29, "gender": "F", "address": "244 Columbus Place", "employer": "Euron", "email": "bradshawmckenzie@euron.com", "city": "Hobucken", "state": "CO" }
- Bulk import
curl -H "Content-Type: application/json" -XPOST "localhost:9200/bank/_bulk?pretty&refresh" --data-binary "@accounts.json"
- Check the indices
curl "localhost:9200/_cat/indices?v"
health status index    uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   bank     2XbBvzsVS3ql-a_wT8rK1w   1   1       1000            0    427.6kb        427.6kb
yellow open   customer KQ7n5lVLQe2McxqqX32JqQ   1   1          1            0      3.5kb          3.5kb
customer is the index created at the start; it holds one document.
bank was created by the bulk import; it holds 1,000 documents.
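The document counts can also be checked in a script rather than by eye. The sketch below runs awk over a copy of the _cat/indices output shown above, so it works without a cluster; the variable names are illustrative, and it assumes the default column order in which the index name is column 3 and docs.count is column 7.

```shell
# Sample _cat/indices?v output, copied from the listing above.
cat_output='health status index    uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   bank     2XbBvzsVS3ql-a_wT8rK1w   1   1       1000            0    427.6kb        427.6kb
yellow open   customer KQ7n5lVLQe2McxqqX32JqQ   1   1          1            0      3.5kb          3.5kb'

# Skip the header row, then print index name (column 3) and docs.count (column 7).
doc_counts=$(printf '%s\n' "$cat_output" | awk 'NR > 1 { print $3 "=" $7 }')
echo "$doc_counts"

# Sum docs.count across all listed indices.
total_docs=$(printf '%s\n' "$cat_output" | awk 'NR > 1 { sum += $7 } END { print sum }')
echo "total documents: $total_docs"
```

Against a live cluster, the same pipeline could read from curl instead of the saved text. The cat APIs also accept an h query parameter (for example h=index,docs.count) to select columns on the server side, which avoids depending on column positions.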
Source:
Elasticsearch Reference [7.5]