Learning Elasticsearch (0): Installation and Basic Operations

阿团团

Category: Learning Elasticsearch. Tags: Elasticsearch

Article link: https://blog.csdn.net/jiangxuege/article/details/88182642

Official documentation

https://www.elastic.co/guide/en/elasticsearch/reference/current/getting-started-install.html

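Everything below was done on a CentOS 6.10 virtual machine.

1 Installation

Installation is simple: download a tar archive and extract it.

cd /path/to/workdir  # your working directory
curl -L -O https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.6.1.tar.gz
tar -xvf elasticsearch-6.6.1.tar.gz
cd elasticsearch-6.6.1/bin
./elasticsearch  # start Elasticsearch

However, startup fails with this error: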

OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x00000000c5330000, 986513408, 0) failed; error='Cannot allocate memory' (errno=12)

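The JVM cannot allocate the heap it asks for, so the JVM settings have to be changed. In config/jvm.options under the Elasticsearch directory, lower the two heap size options (-Xms and -Xmx, the initial and maximum heap, kept at the same value) until Elasticsearch starts.

The following command starts Elasticsearch in the background with a given cluster name and node name, sending its output to logs/elasticsearch.out: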

./elasticsearch -Ecluster.name=my_cluster_name -Enode.name=my_node_name > ../logs/elasticsearch.out 2>&1 &

Check the cluster health through the API:

curl -X GET "localhost:9200/_cat/health?v"

List all indices. Of course there are none yet:

curl -X GET "localhost:9200/_cat/indices?v"

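A cluster started this way can only be reached from localhost, which is essentially a test mode. For real use, edit config/elasticsearch.yml so that the node binds to the machine's IP address (the network.host setting).

Restart the cluster. Note that once an IP is set, Elasticsearch may refuse to start and report several errors of the form "... too low".

The first two errors, about the maximum number of threads and the maximum number of open files, mean that the limits for the user running Elasticsearch are too restrictive. Fix them by editing /etc/security/limits.conf as root; the first column of each entry is the user name. The change takes effect only after the user logs in again.

The last one, about vm.max_map_count, requires editing /etc/sysctl.conf: add the line vm.max_map_count=262144, then reload the configuration.

sudo sysctl -p  # reload the configuration

Starting again now reports a new error: system call filters failed to install

This happens because CentOS 6 does not support the feature. Disable it in the Elasticsearch configuration by setting bootstrap.system_call_filter to false in config/elasticsearch.yml.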

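2 Indices

2.1 Creating an index

Now create an index named customer. The pretty parameter pretty-prints the JSON response.

curl -X PUT "localhost:9200/customer?pretty"  # create an index named customer
curl -X GET "localhost:9200/_cat/indices?v"  # list indices

The health status is now yellow. Yellow means that some replica shards have nowhere to be placed: a replica is never allocated on the same node as its primary, and this cluster has only one node, so yellow is expected.

Now add a document with ID 1 to the customer index. The document has a single field, name. The ID is optional; if it is left out (using POST instead of PUT), Elasticsearch generates one.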

curl -X PUT "localhost:9200/customer/_doc/1?pretty" -H 'Content-Type: application/json' -d '{"name": "John Doe"}'

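The response is a JSON object confirming that the document was created.

The document can be fetched through the API: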

curl -X GET "localhost:9200/customer/_doc/1?pretty"

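Fetching the document with ID 2, which does not exist, shows what a failed lookup returns: the response contains "found" : false.

2.2 Deleting an index

curl -X DELETE "localhost:9200/customer?pretty"  # delete the customer index
curl -X GET "localhost:9200/_cat/indices?v"  # list all indices

A successful delete returns a short JSON acknowledgement.

Trying to delete an index that does not exist returns an error JSON instead (an index_not_found_exception with status 404).

To summarize, the Elasticsearch RESTful API follows this format: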

<HTTP Verb> /<Index>/<Type>/<ID>
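
To make the pattern concrete, here is a small Python sketch that assembles such URLs from their parts. The helper name doc_url and the base address http://localhost:9200 are assumptions made for this illustration; they are not part of any Elasticsearch client library.

```python
from urllib.parse import quote

BASE_URL = "http://localhost:9200"  # assumed local node, as in the curl examples


def doc_url(index, doc_type=None, doc_id=None, base_url=BASE_URL):
    """Join <Index>/<Type>/<ID> into a URL (hypothetical helper for illustration)."""
    parts = [p for p in (index, doc_type, doc_id) if p is not None]
    # Percent-encode each segment so an ID containing "/" stays a single segment.
    return base_url + "/" + "/".join(quote(str(p), safe="") for p in parts)


# The HTTP verb decides what happens to the resource the URL names.
print("PUT   ", doc_url("customer"))             # create the index
print("PUT   ", doc_url("customer", "_doc", 1))  # index document 1
print("GET   ", doc_url("customer", "_doc", 1))  # fetch document 1
print("DELETE", doc_url("customer"))             # delete the index
```

The same URL is reused with different verbs, which is why the curl commands in this article differ mostly in the -X option.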

2.3 Modifying data

Now recreate the customer index and the document:

curl -X PUT "localhost:9200/customer?pretty"  # create an index named customer
curl -X PUT "localhost:9200/customer/_doc/1?pretty" -H 'Content-Type: application/json' -d '{"name": "John Doe"}'

Now index a document with ID 1 again. The new document replaces the existing one, and the _version in the response increases.

curl -X PUT "localhost:9200/customer/_doc/1?pretty" -H 'Content-Type: application/json' -d '{"name": "John Doe"}'
curl -X GET "localhost:9200/customer/_doc/1?pretty"  # fetch the document with ID 1

An existing document can also be modified in place, for example to change a field and add a new one:

curl -X POST "localhost:9200/customer/_doc/1/_update?pretty" -H 'Content-Type: application/json' -d' { "doc": { "name": "Jane Doe", "age": 20 }}'  # update the document
curl -X GET "localhost:9200/customer/_doc/1?pretty"

Updates can also be done with a script, for example to increase age by 5:

curl -X POST "localhost:9200/customer/_doc/1/_update?pretty" -H 'Content-Type: application/json' -d' {"script" : "ctx._source.age += 5"}'
curl -X GET "localhost:9200/customer/_doc/1?pretty"
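
The effect of these two updates on the stored document can be modeled locally. The Python sketch below is only a mental model of the result, not Elasticsearch code; the function names partial_update and add_years are made up for this example, and the merge is shallow, which is enough for a flat document like this one.

```python
import copy


def partial_update(source, doc):
    """Model an update with a "doc" body: the given fields are merged into the source."""
    merged = copy.deepcopy(source)
    merged.update(doc)  # shallow merge; existing fields are overwritten, new ones added
    return merged


def add_years(source, years):
    """Model the script ctx._source.age += 5 for an arbitrary increment."""
    updated = copy.deepcopy(source)
    updated["age"] += years
    return updated


original = {"name": "John Doe"}
after_doc = partial_update(original, {"name": "Jane Doe", "age": 20})
after_script = add_years(after_doc, 5)

print(after_doc)     # {'name': 'Jane Doe', 'age': 20}
print(after_script)  # {'name': 'Jane Doe', 'age': 25}
```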

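2.4 Batch processing

Now use the bulk API to add two documents with IDs 1 and 2. The lines of the body must be separated by newlines, and the last line must end with one too, otherwise the request fails.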

curl -X POST "localhost:9200/customer/_doc/_bulk?pretty" -H 'Content-Type: application/json' -d'
{"index":{"_id":"1"}}
{"name": "John Doe" }
{"index":{"_id":"2"}}
{"name": "Jane Doe" }
'
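
If the bulk body is generated by a program rather than typed by hand, the newline rule is easy to get wrong. Here is a minimal Python sketch that builds the same body; the function name bulk_index_body is an assumption for this example.

```python
import json


def bulk_index_body(docs):
    """Build a newline-delimited body: one action line plus one source line per document."""
    lines = []
    for doc_id, source in docs:
        lines.append(json.dumps({"index": {"_id": str(doc_id)}}))  # action line
        lines.append(json.dumps(source))                           # document line
    # Terminate every line, including the last one, with a newline.
    return "".join(line + "\n" for line in lines)


body = bulk_index_body([(1, {"name": "John Doe"}), (2, {"name": "Jane Doe"})])
print(body, end="")
```

Each JSON object has to stay on a single line, so json.dumps is called without the indent argument.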

A single bulk request can also mix different operations. This one changes the name of document 1 to "John Doe becomes Jane Doe" and then deletes document 2:

curl -X POST "localhost:9200/customer/_doc/_bulk?pretty" -H 'Content-Type: application/json' -d'
{"update":{"_id":"1"}}
{"doc": { "name": "John Doe becomes Jane Doe" } }
{"delete":{"_id":"2"}}
'

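Now query again to see the result.

Now for something closer to real use. The JSON is getting long, so it is more convenient to write it to a file and have curl read it. Create a file named accounts.json with the content below, or download the complete file with 1000 records from the official repository: https://raw.githubusercontent.com/elastic/elasticsearch/master/docs/src/test/resources/accounts.json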

{"index":{"_id":"1"}}
{"account_number":1,"balance":39225,"firstname":"Amber","lastname":"Duke","age":32,"gender":"M","address":"880 Holmes Lane","employer":"Pyrami","email":"amberduke@pyrami.com","city":"Brogan","state":"IL"}
{"index":{"_id":"6"}}
{"account_number":6,"balance":5686,"firstname":"Hattie","lastname":"Bond","age":36,"gender":"M","address":"671 Bristol Street","employer":"Netagy","email":"hattiebond@netagy.com","city":"Dante","state":"TN"}
{"index":{"_id":"13"}}
{"account_number":13,"balance":32838,"firstname":"Nanette","lastname":"Bates","age":28,"gender":"F","address":"789 Madison Street","employer":"Quility","email":"nanettebates@quility.com","city":"Nogal","state":"VA"}

Bulk-index the file:

curl -H "Content-Type: application/json" -XPOST "localhost:9200/bank/_doc/_bulk?pretty&refresh" --data-binary "@accounts.json"

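The response is very long.

Checking the cluster again shows that the new index bank has been added.

3 Search

3.1 Queries

Finally, the most important part: search. Search parameters can be passed in the URL or in a JSON request body.

Start with the URL form: search the bank index, match all documents with the wildcard * (a query-string wildcard rather than a regular expression), and sort by the account_number field in ascending order.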

curl -X GET "localhost:9200/bank/_search?q=*&sort=account_number:asc&pretty"

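The _score field in the response indicates how well a document matches the search criteria. It is null in this response because the results are sorted by a field rather than by relevance.

Now try the same search with a JSON request body:

curl -X GET "localhost:9200/bank/_search?pretty" -H 'Content-Type: application/json' -d '{"query": { "match_all": {}},"sort": [{"account_number": "asc"}]}'

The query element of the request body describes the search criteria, and it can be written in many ways:

# account_number equals 20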
{
  "query": { "match": { "account_number": 20 } }
}

# address contains mill or lane
{
  "query": { "match": { "address": "mill lane" } }
}

# address contains the phrase "mill lane"
{
  "query": { "match_phrase": { "address": "mill lane" } }
}

# address contains both mill and lane; must means every clause has to match (logical AND)
{
  "query": {
    "bool": {
      "must": [
        { "match": { "address": "mill" } },
        { "match": { "address": "lane" } }
      ]
    }
  }
}

# address contains mill or lane; should means at least one clause has to match (logical OR)
{
  "query": {
    "bool": {
      "should": [
        { "match": { "address": "mill" } },
        { "match": { "address": "lane" } }
      ]
    }
  }
}

# address contains neither mill nor lane
{
  "query": {
    "bool": {
      "must_not": [
        { "match": { "address": "mill" } },
        { "match": { "address": "lane" } }
      ]
    }
  }
}
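
The difference between must, should and must_not is easier to see on concrete data. The Python sketch below evaluates the three bool queries above in memory against the addresses of the three sample accounts from accounts.json. It is a rough model only: splitting on whitespace stands in for Elasticsearch's text analysis, scoring is ignored, and the function names are made up for this example.

```python
def tokens(text):
    """Lowercase and split on whitespace, a rough stand-in for text analysis."""
    return set(text.lower().split())


def match(field_value, query_text):
    """match: true if the field shares at least one term with the query text."""
    return bool(tokens(field_value) & tokens(query_text))


def bool_query(address, must=(), should=(), must_not=()):
    """Evaluate lists of match terms in must / should / must_not against one address."""
    if any(match(address, term) for term in must_not):
        return False
    if not all(match(address, term) for term in must):
        return False
    # Without a must clause, at least one should clause has to match.
    if should and not must:
        return any(match(address, term) for term in should)
    return True


addresses = {1: "880 Holmes Lane", 6: "671 Bristol Street", 13: "789 Madison Street"}


def search(**clauses):
    """Return the IDs of the sample accounts whose address satisfies the clauses."""
    return sorted(i for i, addr in addresses.items() if bool_query(addr, **clauses))


print(search(must=["mill", "lane"]))      # []
print(search(should=["mill", "lane"]))    # [1]
print(search(must_not=["mill", "lane"]))  # [6, 13]
```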

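3.2 Filters

As mentioned above, the _score field in the response expresses how relevant a document is to the query. A filter clause narrows the results without changing _score.

Create a file named filter.json to find the accounts whose balance is between 10,000 and 50,000, both ends included: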

{
  "query": {
    "bool": {
      "must": { "match_all": {} },
      "filter": {
        "range": {
          "balance": {
            "gte": 10000,
            "lte": 50000
          }
        }
      }
    }
  }
}

Run the query:

curl -X GET "localhost:9200/bank/_search?pretty" -H 'Content-Type: application/json' -d@"filter.json"
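
As a sanity check on what the range clause selects, the same bounds can be applied locally to the three sample accounts from accounts.json. This Python sketch is a model of the gte and lte semantics, not a call to Elasticsearch, and in_range is a name invented for this example.

```python
accounts = [
    {"account_number": 1, "balance": 39225},
    {"account_number": 6, "balance": 5686},
    {"account_number": 13, "balance": 32838},
]


def in_range(value, gte=None, lte=None):
    """Mirror the gte / lte bounds of a range clause; both bounds are inclusive."""
    if gte is not None and value < gte:
        return False
    if lte is not None and value > lte:
        return False
    return True


matching = [
    a["account_number"]
    for a in accounts
    if in_range(a["balance"], gte=10000, lte=50000)
]
print(matching)  # [1, 13]
```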

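3.3 Aggregations

Now for some statistics, for example counting how many accounts belong to men and how many to women.

Write aggregator.json. Setting size to 0 returns only the aggregation results, without the matching documents: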

{
  "size": 0,
  "aggs": {
    "group_by_gender": {
      "terms": {
        "field": "gender.keyword"
      }
    }
  }
}

curl -X GET "localhost:9200/bank/_search?pretty" -H 'Content-Type: application/json' -d@"aggregator.json"

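The response contains one bucket per gender, each with its document count.

The average account balance of each group can be computed as well, by modifying the JSON: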

{
  "size": 0,
  "aggs": {
    "group_by_gender": {
      "terms": {
        "field": "gender.keyword"
      },
      "aggs": {
        "average_balance": {
          "avg": {
            "field": "balance"
          }
        }
      }
    }
  }
}
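
To see what this nested aggregation computes, here is the same grouping and averaging done locally in Python on the three sample accounts from accounts.json. It only mirrors the logic of terms plus avg; the shape of the buckets dictionary is a simplification of the real response.

```python
from collections import defaultdict

accounts = [
    {"gender": "M", "balance": 39225},
    {"gender": "M", "balance": 5686},
    {"gender": "F", "balance": 32838},
]

# terms aggregation: one bucket per distinct gender value
balances_by_gender = defaultdict(list)
for account in accounts:
    balances_by_gender[account["gender"]].append(account["balance"])

# avg sub-aggregation: computed separately inside each bucket
buckets = {
    gender: {"doc_count": len(values), "average_balance": sum(values) / len(values)}
    for gender, values in balances_by_gender.items()
}
print(buckets["M"])  # {'doc_count': 2, 'average_balance': 22455.5}
print(buckets["F"])  # {'doc_count': 1, 'average_balance': 32838.0}
```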

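Things can get more complex still: group the accounts by balance range, into those below 10,000 and those at 10,000 or above, then group by gender within each range, and finally compute the average balance of each group.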
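
One possible request body for this, assuming a range aggregation on balance with the gender terms aggregation and the avg aggregation nested inside it, can be generated with the Python sketch below. The aggregation name group_by_balance is my own choice; the other names follow the earlier examples.

```python
import json

request_body = {
    "size": 0,  # return only the aggregation results
    "aggs": {
        "group_by_balance": {  # assumed name for the outer aggregation
            "range": {
                "field": "balance",
                # "to" is exclusive and "from" is inclusive, so the two ranges do not overlap
                "ranges": [{"to": 10000}, {"from": 10000}],
            },
            "aggs": {
                "group_by_gender": {
                    "terms": {"field": "gender.keyword"},
                    "aggs": {"average_balance": {"avg": {"field": "balance"}}},
                }
            },
        }
    },
}

print(json.dumps(request_body, indent=2))
```

The printed JSON can be saved to a file and sent with the same curl command used for aggregator.json.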
