Getting Started with Elasticsearch

ONEMELON

Article link: https://blog.csdn.net/qq_28327891/article/details/88649752

Index
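Elastic indexes every field and, after processing, writes the result into an inverted index. Searches look up that index directly.
For this reason, the top-level unit of data management in Elastic is called an Index. It is roughly the counterpart of a single database. The name of every Index (that is, every database) must be lowercase.
The following request lists all indices on the current node.
List all indices: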

http://192.168.88.126:9200/_cat/indices?v
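
To make the idea of an inverted index more concrete, here is a minimal Python sketch. It is only an illustration of the data structure: the whitespace tokenizer and the sample documents are assumptions made for this example, and Elasticsearch's real analysis and storage (handled by Lucene) are far more elaborate.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase token to the sorted list of document IDs containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Naive analysis step (assumption): lowercase and split on whitespace.
        for token in text.lower().split():
            index[token].add(doc_id)
    return {token: sorted(ids) for token, ids in index.items()}


# Hypothetical sample documents, not taken from the article.
docs = {1: "quick brown fox", 2: "quick red fox", 3: "lazy brown dog"}
inverted = build_inverted_index(docs)

print(inverted["quick"])
print(inverted["brown"])
```

A search for a term then becomes a dictionary lookup instead of a scan over every document.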

Document
A single record in an Index is called a Document. An Index is made up of many Documents.
Documents are expressed in JSON. Here is an example.


 {
     "CP5178": "18000.00",
     "CP5177": "16000.00",
     "CP5176": "14000.00",
     "CP5175": "12000.00",
     "CP5179": "20000.00"
 }

Documents in the same Index are not required to share the same structure (schema), but keeping it consistent is recommended because it helps search efficiency.

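Type
Documents can be grouped. In the card_nature Index, for example, they could be grouped by card number. Such a group is called a Type: a virtual, logical grouping used to filter Documents.
Different Types should have similar structures (schemas). For example, an id field cannot be a string in one group and a number in another. This is one difference from tables in a relational database. Data of a completely different nature (such as products and logs) should be stored in two Indices rather than as two Types in one Index (although the latter is possible).
Note that indices created in version 6.0 or later can contain only one Type.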

{
  "took": 11,
  "timed_out": false,
  "_shards": {
    "total": 6,
    "successful": 6,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 100010,
    "max_score": 1,
    "hits": [
      {
        "_index": "card_nature",
        "_type": "card_nature",
        "_id": "287FB369FE60D6FB639115C36C9BCF1E93451D39254068ED0194F1326892099D",
        "_score": 1,
        "_source": {
          "CP5178": "18000.00",
          "CP5177": "16000.00",
          "CP5176": "14000.00",
          "CP5175": "12000.00",
          "CP5179": "20000.00"
        }
      }
    ]
  }
}

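In the search response above, the most important part is hits. It contains a total field giving the total number of matching documents.

Each result in the hits array contains the _index, _type and _id of the document, and the document itself is included in the _source field. This means the whole document can be used directly from the search results, unlike search engines that return only document IDs and require you to fetch the documents separately.

Each hit has a _score field. This is the relevance score, which measures how well the document matches the query. By default the most relevant documents come first, that is, results are sorted by _score in descending order. In this case no query was specified, so all documents are equally relevant and every result gets the neutral _score of 1.

max_score is the highest _score among all documents that match the query.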

took

took tells us how many milliseconds the whole search request took.

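_shards

The _shards element tells us how many shards took part in the query (the total field), how many succeeded (the successful field) and how many failed (the failed field). We normally do not expect shards to fail, but it can happen. If a major failure takes down both the primary and the replica copies of a shard, the data in that shard cannot be returned for the search request. In that case Elasticsearch reports the shard as failed but still returns the results from the remaining shards.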

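timed_out

timed_out tells us whether the query timed out. Search requests normally do not time out. If response time matters more to you than complete results, you can set the timeout parameter to a value such as 10ms (10 milliseconds) or 1s (1 second).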
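
As a rough illustration of how a client might read the fields just described, the following Python sketch parses an abridged response and collects took, timed_out, the shard status and the returned documents. The `summarize` function and the placeholder `_id` are assumptions made for this example. It also assumes the 6.x response shape shown in this article, where hits.total is a plain number (in 7.x and later it is an object).

```python
import json

# Abridged response in the shape shown in this article (6.x style).
# "example-id" is a placeholder, not a real document ID.
raw = """
{
  "took": 11,
  "timed_out": false,
  "_shards": {"total": 6, "successful": 6, "skipped": 0, "failed": 0},
  "hits": {
    "total": 100010,
    "max_score": 1,
    "hits": [
      {"_index": "card_nature", "_type": "card_nature", "_id": "example-id",
       "_score": 1, "_source": {"CP5178": "18000.00"}}
    ]
  }
}
"""


def summarize(response):
    """Pull the fields discussed above out of a parsed search response."""
    return {
        "took_ms": response["took"],
        "timed_out": response["timed_out"],
        # Results may be partial if any shard failed.
        "all_shards_ok": response["_shards"]["failed"] == 0,
        "total_hits": response["hits"]["total"],
        # Each hit carries the full document in _source.
        "documents": [hit["_source"] for hit in response["hits"]["hits"]],
    }


summary = summarize(json.loads(raw))
print(summary)
```

Checking `all_shards_ok` and `timed_out` before trusting `total_hits` is one simple way to notice partial results.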

List the nodes in the cluster:
http://192.168.88.126:9200/_cat/nodes?v

Create an index

curl -XPUT '192.168.88.126:9200/customer1?pretty'

The index listing (shown as a screenshot in the original post) reports an index named customer1 with five primary shards and one replica, containing 0 documents.

Create a document

Kibana:

PUT /customer1/external/12
{
  "name":"oj",
  "sex":"1",
  "map":"{1,2,3,3}",
  "ms":{"sjjs":"eje","sjjsf":"eje"}
}

The request above sets the document ID explicitly (12). If you omit the ID, Elasticsearch generates a random ID for the document and returns it as part of the index API response. In that case, use POST instead of PUT.

Command line:
curl -H "Content-Type: application/json" -XPOST 'http://192.168.88.126:9200/customer1/external/2' -d '{"name":"dididache"}'
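
The choice between PUT and POST described above can be sketched as a small helper. This is only an illustration under the 6.x index/type/id URL layout used in this article; `index_request` is a hypothetical name, not part of any Elasticsearch client. POST with an explicit ID, as in the curl command above, is also accepted; the helper simply follows the convention of PUT for a known ID and POST for a generated one.

```python
from urllib.parse import quote


def index_request(index, doc_type, doc_id=None):
    """Return (method, path) for indexing one document in the index/type/id layout."""
    base = f"/{quote(index, safe='')}/{quote(doc_type, safe='')}"
    if doc_id is None:
        # No ID given: Elasticsearch generates one, so the request must be a POST.
        return "POST", base
    # Explicit ID: PUT to the full document path.
    return "PUT", f"{base}/{quote(str(doc_id), safe='')}"


print(index_request("customer1", "external", 12))
print(index_request("customer1", "external"))
```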

View the mapping

GET /mnchcollect02/_mapping

Retrieve a document:

curl -XGET 'localhost:9200/customer1/external/1?pretty'

This retrieves the document with ID 1 and type external from the customer1 index. The pretty parameter asks for a pretty-printed response.

Delete an index:

  curl -XDELETE 'http://192.168.88.126:9200/testdelete?pretty'

Delete several indices at once:
The following deletes every index whose name starts with index19022814.

  curl -XDELETE 'http://host.IP.address:9200/index19022814*'

Delete an index (Kibana):

DELETE /uuuindex?pretty

Update a document:

The following example changes the name field of the document with ID 2 to "bytwo":

curl -H "Content-Type: application/json;charset=UTF-8" -XPOST 'http://192.168.88.126:9200/customer1/external/2/_update?pretty' -d '{"doc": { "name": "bytwo" }}'

Add a field
The following example adds an age field to the document with ID 2:

curl -H "Content-Type: application/json;charset=UTF-8" -XPOST 'http://192.168.88.126:9200/customer1/external/2/_update?pretty' -d '{"doc": { "name": "bytwo","age": 20 }}'

Kibana:

POST /customer1/external/2/_update?pretty
{"doc": { "name": "bytwoqqq" }}

Delete a document
Shell command:

curl -XDELETE 'http://192.168.88.126:9200/customer1/external/2?pretty'

Kibana:

DELETE /customer1/external/FApDpmQBy3hqqGE8_XeN?pretty

We can also delete, in one request, all documents that match a query. The following example deletes every customer whose name contains "oj".
The 2.x form:

curl -H "Content-Type: application/json" -XDELETE 'http://192.168.88.126:9200/customer1/external' -d '{"query": { "match":{ "name": "oj" } }}'

The 6.x form:

curl  -H "Content-Type: application/json" -XPOST 'http://192.168.88.126:9200/customer1/external/_delete_by_query?pretty' -d '{"query": {"match":{"name": "oj"}}}'

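Bulk operations:

A bulk request whose body does not end with a newline is rejected with the following error: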

{
  "error" : {
    "root_cause" : [
      {
        "type" : "illegal_argument_exception",
        "reason" : "The bulk request must be terminated by a newline [\n]"
      }
    ],
    "type" : "illegal_argument_exception",
    "reason" : "The bulk request must be terminated by a newline [\n]"
  },
  "status" : 400
}

Cause: the JSON body of a bulk import must end with \n, that is, it needs a trailing newline.
Fix: add a line break at the end of the JSON file.

curl -H "Content-Type: application/json" -XPOST 'http://192.168.88.126:9200/customer1/external/_bulk?pretty' -d '{"index":{"_id":"1"}}
{"name": "update1" }
{"index":{"_id":"4"}}
{"name": "update4" }
'

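Note the format: the body must end with a line break, and each operation normally consists of two lines. The first line gives the action and its metadata, and the second line is the document body for that action. The body can also be stored in a file such as bulkdata.json. As an example, the following request updates one document and deletes another in a single call: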

curl -H "Content-Type: application/json" -XPOST 'http://192.168.88.126:9200/customer1/external/_bulk?pretty' -d '
{"update":{"_id":"1"}}
{"doc": { "name": "John Doe becomes Jane Doe" } }
{"delete":{"_id":"4"}}
'

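Note the delete action above: deleting only requires the ID of the document, so there is no corresponding source line.

The bulk API executes the actions in order. If one action fails for some reason, it continues with the actions after it. When the bulk API returns, it reports the status of each action (in the same order), so you can see whether a given action succeeded.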
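
The formatting rules for the bulk body (one JSON object per line, no source line for delete, a final newline) are easy to get wrong by hand. The following Python sketch builds such a body programmatically. `bulk_body` is a hypothetical helper written for this example, not a function of any Elasticsearch client library.

```python
import json


def bulk_body(actions):
    """Serialize (action_line, source_or_None) pairs as newline-delimited JSON."""
    lines = []
    for action, source in actions:
        lines.append(json.dumps(action))
        if source is not None:  # delete actions have no second line
            lines.append(json.dumps(source))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


body = bulk_body([
    ({"update": {"_id": "1"}}, {"doc": {"name": "John Doe becomes Jane Doe"}}),
    ({"delete": {"_id": "4"}}, None),
])
print(body, end="")
```

The resulting string can be sent as the request body of the _bulk endpoint shown above. Serializing each line with `json.dumps` also guarantees that no line contains an embedded line break.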

Query language

curl -H "Content-Type: application/json" -XPOST 'http://192.168.88.126:9200/customer1/_search?pretty' -d '
{
"query": { "match_all": {} },
"size": 4
}'

Kibana:

POST /customer1/_search?pretty
{
"query": { "match_all": {} },
"size": 4
}

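Breaking this request down: the query part holds the query definition, and match_all is the type of query we want to run. A match_all query simply matches every document in the specified index. If size is not specified, it defaults to 10.

The following example runs a match_all and returns documents 11 through 20: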

curl -H "Content-Type: application/json" -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
  "query": { "match_all": {} },
  "from": 10,
  "size": 10
}'

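The from parameter (0-based) specifies which document to start from, and the size parameter specifies how many documents to return starting at from. This is very useful for paging through search results. If from is not specified, it defaults to 0.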
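
The relationship between a page number and the from and size parameters is simple arithmetic, sketched below in Python. `page_to_from_size` is a hypothetical helper written for this example, and it assumes 1-based page numbers.

```python
def page_to_from_size(page, per_page=10):
    """Translate a 1-based page number into the from/size search parameters."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be positive")
    # from is 0-based, so page 1 starts at offset 0.
    return {"from": (page - 1) * per_page, "size": per_page}


print(page_to_from_size(1))
print(page_to_from_size(2))
```

Page 2 with 10 results per page yields from 10 and size 10, which matches the request above that returns documents 11 through 20.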

The following example runs a match_all, sorts by account balance in descending order, and returns the top ten documents (the default size):

curl -H "Content-Type: application/json" -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
  "query": { "match_all": {} },
  "sort": { "balance": { "order": "desc" } }
}'

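By default, the full JSON document is returned with every hit, in the _source field of each search hit. If we do not want the whole source document, we can ask for only a few of its fields.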

curl -H "Content-Type: application/json" -XPOST 'http://192.168.88.126:9200/customer1/_search?pretty' -d '
{
"query": { "match_all": {} },
"_source": ["name","sex"]
}'
Kibana:
POST /customer1/_search?pretty
{
"query": { "match_all": {} },
"_source": ["name","sex"]
}

match query
This can be thought of as a simple field search query, that is, a search against one or more specific fields.

curl -H "Content-Type: application/json" -XPOST 'http://192.168.88.126:9200/customer1/_search?pretty' -d '
{
"query": { "match": {"name": "uu"} }
}'

Kibana:

POST /customer1/_search?pretty
{
"query": { "match": {"name": "uu"} }
}

Logical operators
If a match query contains several search terms, Elastic treats them as an OR by default.

curl -H "Content-Type: application/json" 'http://192.168.88.126:9200/customer1/external/_search?pretty'  -d '
{
  "query" : { "match" : { "name" : "update4 女" }}
}'

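To require that several conditions all match (an AND search), use a bool query with must clauses. For several terms on the same field, the match query's operator parameter can also be set to and.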

curl -H "Content-Type: application/json" 'http://192.168.88.126:9200/customer1/external/_search?pretty'  -d '
{
  "query": {
    "bool": {
      "must": [
        { "match": { "sex": "1" } },
        { "match": { "name": "孙红梅" } }
      ]
    }
  }
}'
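
When the set of conditions is only known at run time, the bool query body can be assembled programmatically. The Python sketch below builds the same structure as the request above. `and_query` is a hypothetical helper written for this example, and it produces only the request body, which would still have to be sent to the _search endpoint.

```python
import json


def and_query(conditions):
    """Build a bool query whose must clauses require every field to match."""
    must = [{"match": {field: value}} for field, value in conditions.items()]
    return {"query": {"bool": {"must": must}}}


# Same conditions as the curl example above.
q = and_query({"sex": "1", "name": "孙红梅"})
print(json.dumps(q, ensure_ascii=False, indent=2))
```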

https://blog.csdn.net/bigkeen/article/details/45294691
