[Plus One] Elasticsearch 6 (ES 6) study notes

某Zz

Article link: https://blog.csdn.net/zhuyuanfu/article/details/85094188


2018.12.19, Wednesday, cloudy

The curtains stayed closed again today; it would have been nicer with them open.

Elasticsearch 6 study notes:

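Concepts:

cluster: a group of nodes (servers) forms a cluster. The cluster manages and searches across all of its nodes. A cluster has a name, which must be unique; clusters are told apart by name. When a node starts and is told to join the cluster named xxx, it looks for that cluster on the network and joins it; if it cannot find one, it goes its own way, forming a new cluster of that name with itself as the only member.

node: a node is a single server that is part of a cluster. A node's name must be unique, because the name is its identifier. Nodes take part in the cluster's indexing and search.

index: a collection of documents (pieces of information) that have something in common. As comes up later, when compared with a relational database an index resembles a database and a type resembles a table (though only very loosely). I could create one index for user data, one for a product catalog, and another for all orders. An index name is also unique and must be all lowercase.

Within one cluster I can define as many indices (indexes) as I like.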

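type: a logical category inside an index. Before 6.0 one index could hold several types; an index created in Elasticsearch 6.0 or later can contain only one type, and the whole concept is being phased out.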

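document: a JSON string that represents a piece of information. A document can describe a user, a merchant, or an order. I can say which index a document belongs to, and I can (in fact must) say which type it belongs to.

shard: when an index holds more data than a single machine's hardware (disk space, for example) can handle, its contents have to be spread across several machines. Each of those pieces is called a shard.

replica: = replica shard, essentially a copy of a shard. (I have not read the details, but my guess is a primary/replica structure with election.) Shards and replicas are managed entirely by ES, and the management details are transparent to an ordinary application programmer.

When an index is created, I can specify how many shards and replicas it has.

Every ES shard is a Lucene index (?? what does that mean ??).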
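To make the shard and replica counts concrete, here is a minimal Python sketch of the request body that would go with a create-index PUT. The helper names index_settings and total_shard_copies are made up for this note; number_of_shards and number_of_replicas are the setting names Elasticsearch uses. Nothing is sent to a cluster here, the code only builds the JSON and does the arithmetic.

```python
import json


def index_settings(primary_shards: int, replicas: int) -> str:
    """Build a create-index body with explicit shard and replica counts.

    Hypothetical helper for illustration; the body would be sent with
    PUT to /<index name>.
    """
    if primary_shards < 1 or replicas < 0:
        raise ValueError("need at least 1 primary shard and 0 or more replicas")
    body = {
        "settings": {
            "index": {
                "number_of_shards": primary_shards,
                "number_of_replicas": replicas,
            }
        }
    }
    return json.dumps(body, indent=2)


def total_shard_copies(primary_shards: int, replicas: int) -> int:
    """Each primary shard exists once, plus `replicas` copies of it."""
    return primary_shards * (1 + replicas)


print(index_settings(5, 1))
print(total_shard_copies(5, 1))  # 5 primaries + 5 replica shards = 10
```

With 5 primaries and 1 replica, the cluster ends up holding 10 shard copies in total.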

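With the basic concepts covered, time to run it.

My machine is a Mac, so brew install elasticsearch is all it takes to install Elasticsearch.

After installation, typing elasticsearch in a terminal (with no arguments) starts an ES cluster (is this single-machine mode? What would I do to bring up a cluster across several machines?).

Typing elasticsearch -Enode.name=node2 -Ecluster.name=clusterName000 starts a single node named node2 that joins the cluster clusterName000.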

Check cluster health:

curl -X GET "localhost:9200/_cat/health?v"

Get info on all nodes:

curl -X GET "localhost:9200/_cat/nodes?v"

List all indices:

curl -X GET "localhost:9200/_cat/indices?v"

The list is empty, because we have not created any index yet.

Create an index:

curl -X PUT "localhost:9200/customer?pretty"

List all indices again:

curl -X GET "localhost:9200/_cat/indices?v"

Now there is an index named customer.

Insert a document into this index:

curl -X PUT "localhost:9200/customer/_doc/1?pretty" -H 'Content-Type: application/json' -d'
{
  "name": "zyf"
}
'

The document is inserted. Fetch it back:

curl -X GET "localhost:9200/customer/_doc/1?pretty"

This is what comes back:

{
  "_index" : "customer",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 1,
  "found" : true,
  "_source" : {
    "name" : "zyf"
  }
}

Here "found" : true means the document with id 1 was found, and the value to the right of the colon in "_source" : {...} is the complete JSON data.
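A small Python sketch of reading that reply from code: check found first, then take _source. The function name extract_source is invented for this note, and the reply text is simply the response shown above pasted in as a string.

```python
import json
from typing import Any, Dict, Optional

# The GET reply shown above, pasted as a string.
RESPONSE = """
{
  "_index" : "customer",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 1,
  "found" : true,
  "_source" : {
    "name" : "zyf"
  }
}
"""


def extract_source(raw: str) -> Optional[Dict[str, Any]]:
    """Return the stored document, or None when the reply says it was not found."""
    reply = json.loads(raw)
    if not reply.get("found", False):
        return None
    return reply["_source"]


print(extract_source(RESPONSE))  # {'name': 'zyf'}
```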

Delete the index (the whole index, not just the document with id 1):

curl -X DELETE "localhost:9200/customer?pretty"
curl -X GET "localhost:9200/_cat/indices?v"

The customer index is gone; since it was the only index, the list is empty again.

The commands covered so far:

curl -X PUT "localhost:9200/customer"
curl -X PUT "localhost:9200/customer/_doc/1" -H 'Content-Type: application/json' -d'
{
  "name": "zyf"
}
'
curl -X GET "localhost:9200/customer/_doc/1"
curl -X DELETE "localhost:9200/customer"

In summary, these commands roughly follow this pattern:

<HTTP Verb> /<Index>/<Type>/<ID> [options]

Or, with the command-line curl tool, the pattern looks like this:

curl -X VERB "ipAddress:port/index/type/documentId?pretty" [options]
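The same pattern as a minimal Python sketch that assembles the URL from its parts. The function es_url and its parameters are hypothetical names for this note; it only builds a string and does not contact any server.

```python
from typing import Optional
from urllib.parse import quote


def es_url(host: str, index: str, doc_type: Optional[str] = None,
           doc_id: Optional[str] = None, pretty: bool = False) -> str:
    """Assemble http://host/index/type/id following the pattern above."""
    if doc_id is not None and doc_type is None:
        raise ValueError("a document id only makes sense together with a type")
    parts = [index]
    if doc_type is not None:
        parts.append(doc_type)
    if doc_id is not None:
        parts.append(doc_id)
    # Percent-encode each path segment so odd characters in an id stay in one segment.
    path = "/".join(quote(part, safe="") for part in parts)
    url = f"http://{host}/{path}"
    if pretty:
        url += "?pretty"
    return url


print(es_url("localhost:9200", "customer"))
print(es_url("localhost:9200", "customer", "_doc", "1", pretty=True))
```

The second call yields the address used by the GET and PUT examples for document 1.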

When we PUT a new document using the id of an existing one, the new doc replaces the old doc.

To insert a document without specifying an id explicitly, use the POST method:

curl -X POST "localhost:9200/customer/_doc?pretty" -H 'Content-Type: application/json' -d'
{
  "name": "Jane Doe"
}
'

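In this case ES assigns a new id for us. It is not auto-incrementing: the generated id is a random-looking unique string.

So far our document operations have been insert and replace. Next is update (it looks like an update, but what ES actually does is delete the old doc and index a new doc with the same id):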

curl -X POST "localhost:9200/customer/_doc/1/_update?pretty" -H 'Content-Type: application/json' -d'
{
  "doc": { "name": "zhuyuanfu" }
}
'

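⤴️ This approach works by passing in JSON. The "doc" key holds a partial document: its fields are merged into the existing document, so fields it does not mention are kept. The next example uses the "script" key instead, and ES executes the script given as its value: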
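A rough Python sketch of what merging a partial document means: nested objects are merged recursively, everything else is overwritten. This is my own illustration of the idea, not ES code; merge_partial is an invented name and the age field is made-up sample data to show that untouched fields survive.

```python
import copy
from typing import Any, Dict


def merge_partial(source: Dict[str, Any], partial: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new document: `source` with the fields of `partial` merged in."""
    merged = copy.deepcopy(source)
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Objects are merged field by field.
            merged[key] = merge_partial(merged[key], value)
        else:
            # Plain values and arrays simply replace the old value.
            merged[key] = copy.deepcopy(value)
    return merged


old_doc = {"name": "zyf", "age": 20}  # "age" is hypothetical sample data
new_doc = merge_partial(old_doc, {"name": "zhuyuanfu"})
print(new_doc)  # {'name': 'zhuyuanfu', 'age': 20}
```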

curl -X POST "localhost:9200/customer/_doc/1/_update?pretty" -H 'Content-Type: application/json' -d'
{
  "script" : "ctx._source.age += 5"
}
'

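⤴️ This approach performs the "update" with a script. The script adds 5 to age, so the document needs an age field beforehand (doc 1 above only has name); otherwise there is nothing to add to.

An example of deleting a document. Very direct, simple and blunt:

curl -X DELETE "localhost:9200/customer/_doc/2?pretty"

To delete all documents matching some condition "xxx", use the _delete_by_query API: https://www.elastic.co/guide/en/elasticsearch/reference/6.5/docs-delete-by-query.html

Everything above inserts, deletes, or modifies a single document. The _bulk API performs batched index, update, and delete operations on many documents.

Example: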

curl -X POST "localhost:9200/customer/_doc/_bulk?pretty" -H 'Content-Type: application/json' -d'
{"index":{"_id":"1"}}
{"name": "John Doe" }
{"index":{"_id":"2"}}
{"name": "Jane Doe" }
'

The example above indexes two new documents, one with id 1 and the other with id 2, and sets the name field of each.

The next example updates the document with id 1 and deletes the document with id 2:

curl -X POST "localhost:9200/customer/_doc/_bulk?pretty" -H 'Content-Type: application/json' -d'
{"update":{"_id":"1"}}
{"doc": { "name": "John Doe becomes Jane Doe" } }
{"delete":{"_id":"2"}}
'

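(At first I wondered why the details of index and update are not written inside their own JSON. It turns out to be a convention: for index and update, the JSON on the following line describes the operation; for delete, the doc id alone is enough, so no following JSON is needed.)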
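The convention is easy to get wrong by hand, so here is a minimal Python sketch that generates a _bulk body from a list of operations. The function bulk_body and its tuple format are assumptions of this note, not part of any ES client. It writes one action line per operation, a payload line for everything except delete, and a trailing newline at the end, which _bulk requires.

```python
import json
from typing import Any, Dict, Iterable, Optional, Tuple

Operation = Tuple[str, str, Optional[Dict[str, Any]]]  # (action, doc id, payload)


def bulk_body(operations: Iterable[Operation]) -> str:
    """Render operations as newline-delimited JSON for the _bulk API."""
    lines = []
    for action, doc_id, payload in operations:
        if action not in ("index", "create", "update", "delete"):
            raise ValueError(f"unknown bulk action: {action}")
        # First line: which action, on which document.
        lines.append(json.dumps({action: {"_id": doc_id}}))
        if action == "delete":
            if payload is not None:
                raise ValueError("delete takes no payload line")
            continue
        if payload is None:
            raise ValueError(f"{action} needs a payload line")
        # Second line: the document itself, wrapped in "doc" for a partial update.
        lines.append(json.dumps({"doc": payload} if action == "update" else payload))
    return "\n".join(lines) + "\n"


print(bulk_body([
    ("update", "1", {"name": "John Doe becomes Jane Doe"}),
    ("delete", "2", None),
]), end="")
```

The printed body has the same three lines as the update-and-delete example above.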

Next, load some data to simulate a real environment. Download accounts.json (download address: someAddr), cd into the folder that contains it, then:

curl -H "Content-Type: application/json" -XPOST "localhost:9200/bank/_doc/_bulk?pretty&refresh" --data-binary "@accounts.json"

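(What is -XPOST for? What does the @ mean? As far as curl goes: -XPOST is just -X POST written without the space, and a leading @ in the data argument makes curl read the request body from the named file. --data-binary sends the file unmodified, keeping the newlines that _bulk relies on.)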
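For comparison, a minimal Python sketch of the same request using only the standard library: read the file as raw bytes (the job of @ plus --data-binary) and build a POST with the JSON content type. The function build_bulk_request is a name I made up, and the demo writes a tiny temporary file instead of the real accounts.json. The request is only constructed here; passing it to urllib.request.urlopen would actually send it to a running node.

```python
import os
import tempfile
import urllib.request


def build_bulk_request(file_path: str, url: str) -> urllib.request.Request:
    """Build (but do not send) a POST whose body is the file's raw bytes."""
    with open(file_path, "rb") as handle:
        payload = handle.read()  # raw bytes, newlines untouched
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Stand-in for accounts.json: two lines in bulk format.
with tempfile.NamedTemporaryFile("wb", suffix=".json", delete=False) as tmp:
    tmp.write(b'{"index":{"_id":"1"}}\n{"name":"John Doe"}\n')

request = build_bulk_request(
    tmp.name, "http://localhost:9200/bank/_doc/_bulk?pretty&refresh")
os.remove(tmp.name)

print(request.get_method(), request.full_url)
```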

Then list the indices to check the result:

curl "localhost:9200/_cat/indices?v"

This gives:

health status index    uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   bank     3KL1RmyMTEiaH_RRm61QhA   5   1       1000            0     95.5kb         95.5kb
yellow open   customer -ZjOKf1bQ0mdaZu91l319w   5   1          1            0      4.4kb          4.4kb

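This shows that 1000 documents have been loaded into bank.

Next comes querying. One way is to put the search criteria in the URI (not recommended); the other is to put them in the JSON body passed with -d, like this: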

curl -X GET "localhost:9200/bank/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "query": { "match_all": {} },
  "sort": [
    { "account_number": "asc" }
  ]
}
'

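(The example at https://www.elastic.co/guide/en/elasticsearch/reference/6.5/getting-started-search-API.html leaves out ?pretty.)

Note that once the search results are returned, the search leaves no cursor pointing at anything inside the ES cluster; this differs from the MySQL result set concept.

Another search example: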

curl -X GET "localhost:9200/bank/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "query": { "match_all": {} },
  "from": 10,
  "size": 10,
  "_source":["account_number","balance"],
  "sort": { 
    "balance": { 
        "order": "desc" 
    } 
  }
}
'

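This search means: sort everyone by balance in descending order and return the 11th through 20th highest (from is zero-based, so from 10 skips the top ten), showing only the account number and balance and not name, age, gender, and so on. The results: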

{
  "took" : 5,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1000,
    "max_score" : null,
    "hits" : [
      {
        "_index" : "bank",
        "_type" : "_doc",
        "_id" : "255",
        "_score" : null,
        "_source" : {
          "account_number" : 255,
          "balance" : 49339
        },
        "sort" : [
          49339
        ]
      },
      {
        "_index" : "bank",
        "_type" : "_doc",
        "_id" : "524",
        "_score" : null,
        "_source" : {
          "account_number" : 524,
          "balance" : 49334
        },
        "sort" : [
          49334
        ]
      },
      {
        "_index" : "bank",
        "_type" : "_doc",
        "_id" : "751",
        "_score" : null,
        "_source" : {
          "account_number" : 751,
          "balance" : 49252
        },
        "sort" : [
          49252
        ]
      },
      {
        "_index" : "bank",
        "_type" : "_doc",
        "_id" : "119",
        "_score" : null,
        "_source" : {
          "account_number" : 119,
          "balance" : 49222
        },
        "sort" : [
          49222
        ]
      },
      {
        "_index" : "bank",
        "_type" : "_doc",
        "_id" : "504",
        "_score" : null,
        "_source" : {
          "account_number" : 504,
          "balance" : 49205
        },
        "sort" : [
          49205
        ]
      },
      {
        "_index" : "bank",
        "_type" : "_doc",
        "_id" : "878",
        "_score" : null,
        "_source" : {
          "account_number" : 878,
          "balance" : 49159
        },
        "sort" : [
          49159
        ]
      },
      {
        "_index" : "bank",
        "_type" : "_doc",
        "_id" : "921",
        "_score" : null,
        "_source" : {
          "account_number" : 921,
          "balance" : 49119
        },
        "sort" : [
          49119
        ]
      },
      {
        "_index" : "bank",
        "_type" : "_doc",
        "_id" : "825",
        "_score" : null,
        "_source" : {
          "account_number" : 825,
          "balance" : 49000
        },
        "sort" : [
          49000
        ]
      },
      {
        "_index" : "bank",
        "_type" : "_doc",
        "_id" : "591",
        "_score" : null,
        "_source" : {
          "account_number" : 591,
          "balance" : 48997
        },
        "sort" : [
          48997
        ]
      },
      {
        "_index" : "bank",
        "_type" : "_doc",
        "_id" : "160",
        "_score" : null,
        "_source" : {
          "account_number" : 160,
          "balance" : 48974
        },
        "sort" : [
          48974
        ]
      }
    ]
  }
}
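The from/size arithmetic is easy to get off by one, so here is a minimal Python sketch of it. The helper names rank_range and local_page are invented for this note, and the balances list is made-up data, not the bank index. Slicing a sorted list stands in for what the search does after sorting.

```python
from typing import List, Tuple


def rank_range(from_: int, size: int) -> Tuple[int, int]:
    """1-based ranks covered by a zero-based `from` offset and a page size."""
    return from_ + 1, from_ + size


def local_page(sorted_items: List[int], from_: int, size: int) -> List[int]:
    """Skip `from_` items, then take `size` items, like from/size on sorted hits."""
    return sorted_items[from_:from_ + size]


# Made-up balances, already sorted in descending order: 100, 99, ..., 1.
balances = list(range(100, 0, -1))

print(rank_range(10, 10))            # (11, 20)
print(local_page(balances, 10, 10))  # [90, 89, 88, 87, 86, 85, 84, 83, 82, 81]
```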

The next example finds 10 (the default size) people who are 40 years old and do not live in ID (Idaho):

curl -X GET "localhost:9200/bank/_search" -H 'Content-Type: application/json' -d'
{
  "query": {
    "bool": {
      "must": [
        { "match": { "age": "40" } }
      ],
      "must_not": [
        { "match": { "state": "ID" } }
      ]
    }
  }
}
'

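Besides conditions such as match, match_all, and bool, we can also use filter, which narrows the results without affecting their relevance score. Example: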

curl -X GET "localhost:9200/bank/_search" -H 'Content-Type: application/json' -d'
{
  "query": {
    "bool": {
      "must": { "match_all": {} },
      "filter": {
        "range": {
          "balance": {
            "gte": 20000,
            "lte": 30000
          }
        }
      }
    }
  }
}
'
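To see how the bool clauses combine, here is a rough Python sketch that applies the same logic to a few in-memory records: every must condition has to hold, no must_not condition may hold, and the range filter has to pass. It is a simplification of my own: it uses exact equality where a real match query analyzes text, and it ignores scoring entirely. The function names and the three accounts are made up.

```python
from typing import Any, Dict, Iterable, Optional, Tuple

Condition = Tuple[str, Any]                               # (field, expected value)
RangeFilter = Tuple[str, Optional[int], Optional[int]]    # (field, gte, lte)


def in_range(value: int, gte: Optional[int], lte: Optional[int]) -> bool:
    """Inclusive bounds, like gte / lte in a range clause."""
    if gte is not None and value < gte:
        return False
    if lte is not None and value > lte:
        return False
    return True


def bool_match(doc: Dict[str, Any], must: Iterable[Condition] = (),
               must_not: Iterable[Condition] = (),
               range_filter: Optional[RangeFilter] = None) -> bool:
    if any(doc.get(field) != value for field, value in must):
        return False  # a must clause failed
    if any(doc.get(field) == value for field, value in must_not):
        return False  # a must_not clause matched
    if range_filter is not None:
        field, gte, lte = range_filter
        if field not in doc or not in_range(doc[field], gte, lte):
            return False
    return True


ACCOUNTS = [  # made-up sample records
    {"account_number": 1, "age": 40, "state": "ID", "balance": 25000},
    {"account_number": 2, "age": 40, "state": "WA", "balance": 21000},
    {"account_number": 3, "age": 33, "state": "WA", "balance": 45000},
]

aged_40_not_idaho = [a["account_number"] for a in ACCOUNTS
                     if bool_match(a, must=[("age", 40)], must_not=[("state", "ID")])]
print(aged_40_not_idaho)  # [2]

mid_balance = [a["account_number"] for a in ACCOUNTS
               if bool_match(a, range_filter=("balance", 20000, 30000))]
print(mid_balance)  # [1, 2]
```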

A first look at aggregations; this only scratches the surface. Example:

curl -X GET "localhost:9200/bank/_search" -H 'Content-Type: application/json' -d'
{
  "size": 0,
  "aggs": {
    "group_by_state": {
      "terms": {
        "field": "state.keyword"
      }
    }
  }
}
'

The example above is roughly equivalent to this SQL:

SELECT state, COUNT(*) FROM bank GROUP BY state ORDER BY COUNT(*) DESC LIMIT 10;
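The same grouping as a minimal Python sketch over in-memory records, using collections.Counter: count documents per state, order by count descending, keep the top 10. The function group_by_state and the sample records are invented for this note. The key / doc_count bucket shape mirrors what the aggregation response contains, and breaking ties alphabetically is an assumption of this sketch.

```python
from collections import Counter
from typing import Any, Dict, Iterable, List


def group_by_state(docs: Iterable[Dict[str, Any]], size: int = 10) -> List[Dict[str, Any]]:
    """Count docs per state and return the `size` most common as buckets."""
    counts = Counter(doc["state"] for doc in docs)
    # Highest count first; ties ordered by state name (assumption of this sketch).
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [{"key": state, "doc_count": count} for state, count in ordered[:size]]


sample = [{"state": "WA"}, {"state": "ID"}, {"state": "WA"}, {"state": "TX"}]
print(group_by_state(sample))
# [{'key': 'WA', 'doc_count': 2}, {'key': 'ID', 'doc_count': 1}, {'key': 'TX', 'doc_count': 1}]
```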

For more aggregation examples, see the aggregations reference guide.

Summary: ES is both simple and complex, and very powerful.

Adapted from: https://www.elastic.co/guide/en/elasticsearch/reference/6.5/getting-started-conclusion.html
