Test usage: bulk-inserting data into an ES index, plus a summary of ES usage.

Article link: https://blog.csdn.net/wyl9527/article/details/76039686

# encoding:utf8
from datetime import datetime
from elasticsearch import Elasticsearch
import elasticsearch.helpers
import random

es = Elasticsearch(['172.18.1.22:9200', '172.18.1.23:9200', '172.18.1.24:9200', '172.18.1.25:9200', '172.18.1.26:9200'])


# Create the index; ignore=400 suppresses the "index already exists" error.
es.indices.create(index='test_index', ignore=400)
# es.index(index="skynet_social_twitter_v6", doc_type="test-type", id=42, body={"any": "data", "timestamp": datetime.now()})

# Build 10 sample documents, each with a timestamp and a random count.
package = []
for i in range(10):
    row = {
        "@timestamp": datetime.now().strftime("%Y-%m-%dT%H:%M:%S.000+0800"),
        "count": random.randint(1, 100)
    }
    package.append(row)

# Wrap each document in a bulk "index" action.
actions = [
    {
        '_op_type': 'index',
        '_index': "test_index",
        '_type': "test-type",
        '_source': d
    }
    for d in package
]

# Send all actions in a single bulk request.
elasticsearch.helpers.bulk(es, actions)
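
For larger batches, the action list above could be produced lazily instead of being built in memory, since helpers.bulk accepts any iterable of actions. The sketch below only builds the actions and does not talk to a cluster; make_rows and make_actions are hypothetical helper names, not part of the elasticsearch package.

```python
import random
from datetime import datetime


def make_rows(n):
    """Yield n sample documents shaped like the rows in the script above."""
    for _ in range(n):
        yield {
            "@timestamp": datetime.now().strftime("%Y-%m-%dT%H:%M:%S.000+0800"),
            "count": random.randint(1, 100),
        }


def make_actions(rows, index, doc_type):
    """Lazily wrap each row in a bulk 'index' action (hypothetical helper)."""
    for row in rows:
        yield {
            "_op_type": "index",
            "_index": index,
            "_type": doc_type,
            "_source": row,
        }


# Materialised here only so the result can be inspected; in real use the
# generator itself would be passed to elasticsearch.helpers.bulk(es, ...).
actions = list(make_actions(make_rows(3), "test_index", "test-type"))
print(len(actions), actions[0]["_index"])
```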

The ES usage summary below was compiled from another author's blog.

Give an index an alias, so that users only need to be told the alias.

curl -XPOST 'http://172.18.1.22:9200/_aliases' -d '
{
    "actions": [
        {"add": {"index": "info-test", "alias": "wyl"}}
    ]
}'

Remove an alias:

curl -XPOST 'http://localhost:9200/_aliases' -d '
{
    "actions": [
        {"remove": {"index": "test1", "alias": "alias1"}}
    ]
}'

Renaming an alias is simply a remove followed by an add, sent in the same API call. The operation is atomic.

Rename:

curl -XPOST 'http://localhost:9200/_aliases' -d '
{
    "actions": [
        {"remove": {"index": "test1", "alias": "alias1"}},
        {"add": {"index":"test1", "alias": "alias2"}}
    ]
}'
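
The rename request above is just a two-element actions list, so it is easy to generate from code. A minimal sketch, assuming the body is later sent to the _aliases endpoint by whatever HTTP client is at hand; rename_alias_body is a hypothetical helper name.

```python
import json


def rename_alias_body(index, old_alias, new_alias):
    """Build the _aliases body that swaps old_alias for new_alias in one call."""
    return {
        "actions": [
            {"remove": {"index": index, "alias": old_alias}},
            {"add": {"index": index, "alias": new_alias}},
        ]
    }


body = rename_alias_body("test1", "alias1", "alias2")
print(json.dumps(body, indent=4))
```

Keeping both actions in one request is what makes the swap atomic; sending them as two separate requests would leave a moment with no alias at all.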

Associate one alias with multiple indices:

curl -XPOST 'http://localhost:9200/_aliases' -d '
{
    "actions": [
        {"add": {"index": "test1", "alias":"alias1"}},
        {"add": {"index": "test2", "alias":"alias1"}}
    ]
}'

Indexing a document through an alias that points to multiple indices raises an error.

1. List all nodes in the cluster

http://172.24.5.149:9200/_cat/nodes?v

2. Check the health of the cluster

http://172.24.5.149:9200/_cat/health?v

3. List all indices in the cluster

http://172.24.5.149:9200/_cat/indices?v

4. Delete the info-test index

curl -XDELETE 'http://172.24.5.149:9200/info-test'

5. Create the info-test index

curl -XPUT 'http://172.24.5.149:9200/info-test'

6. Insert a document with ID 1 into the index

    curl -XPUT 'localhost:9200/info-test/people/1?pretty' -d '
    {
        "name": "John Doe"
    }'

7. Insert a document without specifying an ID; ES generates a random ID:

    curl -XPOST 'localhost:9200/info-test/people?pretty' -d '
    {
        "name": "John Doe"
    }'

8. Retrieve a document by ID

curl -XGET 'localhost:9200/info-test/people/1?pretty'

9. Update the document with ID 1, changing the name field to Jane Doe

curl -XPOST 'localhost:9200/info-test/people/1/_update?pretty' -d '
        {
          "doc": { "name": "Jane Doe" }
        }'

10. Update the document with ID 1, changing the name field to Jane Doe and adding an age field

curl -XPOST 'localhost:9200/info-test/people/1/_update?pretty' -d '
        {
          "doc": { "name": "Jane Doe", "age": 20 }
        }'

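11. Use a script to add 5 to the age field of the document with ID 1

curl -XPOST 'localhost:9200/info-test/people/1/_update?pretty' -d '
        {
          "script" : "ctx._source.age += 5"
        }'

In the example above, ctx._source refers to the document that is about to be updated.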
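
Instead of hard-coding the increment in the script string, the value can be passed as a script parameter, in the same inline / lang / params shape that item 33 uses. A minimal sketch, assuming an ES 5.x cluster with Painless; scripted_increment is a hypothetical helper that only builds the request body.

```python
import json


def scripted_increment(field, delta):
    """Build an _update body that adds delta to field via a script parameter."""
    return {
        "script": {
            # ctx._source is the document being updated; params.delta is
            # supplied through the "params" object below.
            "inline": "ctx._source.%s += params.delta" % field,
            "lang": "painless",
            "params": {"delta": delta},
        }
    }


body = scripted_increment("age", 5)
print(json.dumps(body, indent=2))
```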

12. Delete the document with ID 2

curl -XDELETE 'localhost:9200/info-test/people/2?pretty'

A timeout can also be set:

curl -XDELETE 'http://localhost:9200/twitter/tweet/1?timeout=5m'

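13. Delete all documents whose name contains "John" (the _query endpoint belongs to old ES versions; see item 45 for _delete_by_query)

curl -XDELETE 'localhost:9200/info-test/people/_query?pretty' -d '
        {
          "query": { "match": { "name": "John" } }
        }'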

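14. Bulk-insert the documents with ID 1 and ID 2 (each action and each document must be on its own line, and the body must end with a newline)

curl -XPOST 'localhost:9200/info-test/people/_bulk?pretty' -d '
{"index":{"_id":"1"}}
{"name": "John Doe" }
{"index":{"_id":"2"}}
{"name": "Jane Doe" }
'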

15. In one bulk request, update the document with ID 1 and delete the document with ID 2

curl -XPOST 'localhost:9200/customer/external/_bulk?pretty' -d '
{"update":{"_id":"1"}}
{"doc": { "name": "John Doe becomes Jane Doe" } }
{"delete":{"_id":"2"}}
'
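
The _bulk body is newline-delimited JSON: one action line, optionally followed by one source line, and a trailing newline at the very end. A small sketch of producing that payload from Python; to_bulk_payload is a hypothetical helper and nothing is sent to a cluster here.

```python
import json


def to_bulk_payload(operations):
    """Serialise (action, source) pairs into a newline-delimited _bulk body.

    source is None for actions such as delete that carry no document line.
    """
    lines = []
    for action, source in operations:
        lines.append(json.dumps(action))
        if source is not None:
            lines.append(json.dumps(source))
    # The bulk endpoint expects the body to end with a newline.
    return "\n".join(lines) + "\n"


payload = to_bulk_payload([
    ({"update": {"_id": "1"}}, {"doc": {"name": "John Doe becomes Jane Doe"}}),
    ({"delete": {"_id": "2"}}, None),
])
print(payload, end="")
```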

16. Search all documents in the info-test index

curl 'localhost:9200/info-test/_search?q=*'

17. Search all documents in the info-test index using a POST request body

curl -XPOST 'localhost:9200/info-test/_search?pretty' -d '
            {
              "query": { "match_all": {} }
            }'

18. Search all documents in the info-test index using a POST request body, but return only one document (10 are returned by default)

curl -XPOST 'localhost:9200/info-test/_search?pretty' -d '
        {
          "query": { "match_all": {} },
          "size": 1
        }'

19. Search all documents in the info-test index using a POST request body, returning documents 11 through 20

curl -XPOST 'localhost:9200/info-test/_search?pretty' -d '
        {
          "query": { "match_all": {} },
          "from": 10,
          "size": 10
        }'

If from is not specified, it defaults to 0.
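
Since from is a zero-based offset, a 1-based page number maps to from = (page - 1) * size. A short sketch of that arithmetic; match_all_page is a hypothetical helper that only builds the search body.

```python
def match_all_page(page, page_size=10):
    """Build a match_all search body for a 1-based page number."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    return {
        "query": {"match_all": {}},
        "from": (page - 1) * page_size,  # zero-based offset of the first hit
        "size": page_size,
    }


# Page 2 with 10 hits per page covers documents 11 through 20.
body = match_all_page(2)
print(body["from"], body["size"])
```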

20. Search all documents in the info-test index using a POST request body, sorted by the name field in descending order

curl -XPOST 'localhost:9200/info-test/_search?pretty' -d '
        {
          "query": { "match_all": {} },
          "sort": { "name": { "order": "desc" } }
        }'

21. Search all documents in the info-test index using a POST request body, but return only some of the fields

curl -XPOST 'localhost:9200/info-test/_search?pretty' -d '
        {
          "query": { "match_all": {} },
          "_source": ["age", "name"]
        }'

22. Search the info-test index using a POST request body for documents whose age is 20

curl -XPOST 'localhost:9200/info-test/_search?pretty' -d '
        {
          "query": { "match": { "age": 20 } }
        }'

23. Search the info-test index using a POST request body for documents whose address contains "mill lane" (here "mill lane" is treated as a phrase)

curl -XPOST 'localhost:9200/info-test/_search?pretty' -d '
        {
          "query": { "match_phrase": { "address": "mill lane" } }
        }'

24. Search the info-test index using a POST request body for documents whose address contains both "mill" and "lane"

curl -XPOST 'localhost:9200/info-test/_search?pretty' -d '
        {
          "query": {
            "bool": {
              "must": [
                { "match": { "address": "mill" } },
                { "match": { "address": "lane" } }
              ]
            }
          }
        }'
must: AND. should: OR. must_not: NOT.
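
The three clause types can be combined freely inside one bool query. A minimal sketch of a builder for such bodies; bool_query and match are hypothetical helper names and the output is only the request body.

```python
def match(field, text):
    """Build a single match clause."""
    return {"match": {field: text}}


def bool_query(must=(), should=(), must_not=()):
    """Combine clauses: must = AND, should = OR, must_not = NOT."""
    clauses = {}
    if must:
        clauses["must"] = list(must)
    if should:
        clauses["should"] = list(should)
    if must_not:
        clauses["must_not"] = list(must_not)
    return {"query": {"bool": clauses}}


# Same query as item 24: address must contain both "mill" and "lane".
body = bool_query(must=[match("address", "mill"), match("address", "lane")])
print(body)
```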

25. Search the info-test index using a POST request body for documents whose balance is greater than or equal to 20000 and less than or equal to 30000

curl -XPOST 'localhost:9200/info-test/_search?pretty' -d '
        {
          "query": {
            "filtered": {
              "query": { "match_all": {} },
              "filter": {
                "range": {
                  "balance": {
                    "gte": 20000,
                    "lte": 30000
                  }
                }
              }
            }
          }
        }'
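
Note that the filtered query was removed in Elasticsearch 5.0; on newer versions the same condition is usually expressed as a bool query with a filter clause. A sketch of the equivalent body, assuming such a version; balance_range_query is a hypothetical helper name.

```python
def balance_range_query(low, high, field="balance"):
    """Build a bool/filter body selecting low <= field <= high."""
    return {
        "query": {
            "bool": {
                "must": {"match_all": {}},
                # filter clauses restrict the result set without scoring
                "filter": {"range": {field: {"gte": low, "lte": high}}},
            }
        }
    }


body = balance_range_query(20000, 30000)
print(body)
```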

26. Search the info-test index using a POST request body and group the documents by the state field

curl -XPOST 'localhost:9200/info-test/_search?pretty' -d '
        {
          "size": 0,
          "aggs": {
            "group_by_state": {
              "terms": {
                "field": "state"
              }
            }
          }
        }'

The response (partial) is:

"hits" : {
            "total" : 1000,
            "max_score" : 0.0,
            "hits" : [ ]
          },
          "aggregations" : {
            "group_by_state" : {
              "buckets" : [ {
                "key" : "al",
                "doc_count" : 21
              }, {
                "key" : "tx",
                "doc_count" : 17
              }, {
                "key" : "id",
                "doc_count" : 15
              }, {
                "key" : "ma",
                "doc_count" : 15
              }, {
                "key" : "md",
                "doc_count" : 15
              }, {
                "key" : "pa",
                "doc_count" : 15
              }, {
                "key" : "dc",
                "doc_count" : 14
              }, {
                "key" : "me",
                "doc_count" : 14
              }, {
                "key" : "mo",
                "doc_count" : 14
              }, {
                "key" : "nd",
                "doc_count" : 14
              } ]
            }
          }
        }
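
On the client side, the buckets in a response like the one above can be flattened into a plain mapping from key to count. A small sketch using a trimmed copy of that response (only the first three buckets are kept); bucket_counts is a hypothetical helper name.

```python
def bucket_counts(response, agg_name):
    """Turn the buckets of a terms aggregation into {key: doc_count}."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}


# Trimmed copy of the response shown above.
response = {
    "hits": {"total": 1000, "max_score": 0.0, "hits": []},
    "aggregations": {
        "group_by_state": {
            "buckets": [
                {"key": "al", "doc_count": 21},
                {"key": "tx", "doc_count": 17},
                {"key": "id", "doc_count": 15},
            ]
        }
    },
}

counts = bucket_counts(response, "group_by_state")
print(counts)
```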

27. Building on the previous aggregation, this example computes the average account balance for each state

curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
        {
          "size": 0,
          "aggs": {
            "group_by_state": {
              "terms": {
                "field": "state"
              },
              "aggs": {
                "average_balance": {
                  "avg": {
                    "field": "balance"
                  }
                }
              }
            }
          }
        }'

28. Building on the previous aggregation, now sort by average balance:

  curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
        {
          "size": 0,
          "aggs": {
            "group_by_state": {
              "terms": {
                "field": "state",
                "order": {
                  "average_balance": "desc"
                }
              },
              "aggs": {
                "average_balance": {
                  "avg": {
                    "field": "balance"
                  }
                }
              }
            }
          }
        }'

29. Group by age range (20-29, 30-39, 40-49), then by gender, and compute the average account balance for each gender within each age range:

 curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
        {
          "size": 0,
          "aggs": {
            "group_by_age": {
              "range": {
                "field": "age",
                "ranges": [
                  {
                    "from": 20,
                    "to": 30
                  },
                  {
                    "from": 30,
                    "to": 40
                  },
                  {
                    "from": 40,
                    "to": 50
                  }
                ]
              },
              "aggs": {
                "group_by_gender": {
                  "terms": {
                    "field": "gender"
                  },
                  "aggs": {
                    "average_balance": {
                      "avg": {
                        "field": "balance"
                      }
                    }
                  }
                }
              }
            }
          }
        }'
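
In a range aggregation, from is inclusive and to is exclusive, which is why from 20 / to 30 covers ages 20 to 29. The ranges and the nested body above can be generated rather than typed out; age_ranges and balance_by_age_and_gender are hypothetical helper names in this sketch.

```python
def age_ranges(start, stop, step):
    """Build consecutive [from, to) ranges of width step."""
    return [{"from": low, "to": low + step} for low in range(start, stop, step)]


def balance_by_age_and_gender(ranges):
    """Nest aggregations: age range, then gender, then average balance."""
    return {
        "size": 0,
        "aggs": {
            "group_by_age": {
                "range": {"field": "age", "ranges": ranges},
                "aggs": {
                    "group_by_gender": {
                        "terms": {"field": "gender"},
                        "aggs": {
                            "average_balance": {"avg": {"field": "balance"}}
                        },
                    }
                },
            }
        },
    }


body = balance_by_age_and_gender(age_ranges(20, 50, 10))
print(body["aggs"]["group_by_age"]["range"]["ranges"])
```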

30. Add a new field to an existing mapping

POST /information/_mapping/email1
{
  "properties": {
    "name": {
      "type": "text",
      "index": "analyzed"
    }
  }
}

31. Configure the settings of an index

PUT /atom/_settings
{
  "settings": {
    "index.mapping.total_fields.limit": 4000
  },
  "index": {
    "refresh_interval": "30s",
    "number_of_replicas": "0"
  }
}

32. View the mapping of a given type (if no type is given, the mappings of all types under the index are shown)

GET /atom/_mapping/人类

33. Conditional update with _update_by_query

POST /index/type/_update_by_query?conflicts=proceed
{
  "script": {
    "inline": "ctx._source.ontology_type=(params.tag)",
    "lang": "painless",
    "params": {
      "tag": "event"
    }
  },
  "query": {
    "match_all": {}
  }
}

34. Query all documents under a given type

POST /atom/欧洲排球锦标赛/_search
{
  "query": {
    "match_all": {}
  }
}

35. Create a document with a version number

PUT twitter/tweet/1?version=2
{
    "message" : "elasticsearch now has versioning support, double cool!"
}

Version types: internal, external (or external_gt), external_gte
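
The two external version types differ only in how the supplied version is compared with the stored one: external (external_gt) requires a strictly greater version, while external_gte also accepts an equal one. The sketch below is a simplified model of that rule for illustration, not Elasticsearch code; accepts_write is a hypothetical function.

```python
def accepts_write(version_type, stored_version, incoming_version):
    """Simplified model of the external version checks.

    stored_version is None when the document does not exist yet.
    """
    if stored_version is None:
        return True
    if version_type in ("external", "external_gt"):
        return incoming_version > stored_version
    if version_type == "external_gte":
        return incoming_version >= stored_version
    raise ValueError("unsupported version type: %s" % version_type)


print(accepts_write("external", 2, 2))      # equal version is rejected
print(accepts_write("external_gte", 2, 2))  # equal version is accepted
```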

36. Create a document with the op_type parameter

PUT twitter/tweet/1?op_type=create
{
    "user" : "kimchy",
    "post_date" : "2011-11-15T14:12:12",
    "message" : "trying out Elasticsearch"
}

Or:

PUT twitter/tweet/1/_create
{
    "user" : "kimchy",
    "post_date" : "2011-11-15T14:12:12",
    "message" : "trying out Elasticsearch"
}

37. Create a document with an automatically generated ID

POST twitter/tweet/
{
    "user" : "kimchy",
    "post_date" : "2009-11-15T14:12:12",
    "message" : "trying out Elasticsearch"
}

38. Create a document with a routing value

POST twitter/tweet?routing=kimchy
{
    "user" : "kimchy",
    "post_date" : "2011-11-15T14:12:12",
    "message" : "trying out Elasticsearch"
}

39. Set a timeout when creating a document

PUT twitter/tweet/1?timeout=5m
{
    "user" : "kimchy",
    "post_date" : "2011-11-15T14:12:12",
    "message" : "trying out Elasticsearch"
}

40. Retrieve a document without its _source field

GET twitter/tweet/0?_source=false

41. Retrieve a document with only selected _source fields

GET twitter/tweet/0?_source_include=*.id&_source_exclude=entities

Or:

GET twitter/tweet/0?_source=*.id,retweeted

42. Retrieve only the _source of a document

GET twitter/tweet/1/_source

A subset of the _source fields can also be selected:

GET twitter/tweet/1/_source?_source_include=*.id&_source_exclude=entities

43. Custom routing

GET twitter/tweet/2?routing=user1

If a document was created with a routing value, the same routing value must be supplied when retrieving it.
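
The reason is that the routing value decides which shard holds the document, roughly shard = hash(routing) % number_of_primary_shards, with the document ID used when no routing is given. The sketch below illustrates the idea only: zlib.crc32 is a stand-in hash (Elasticsearch uses its own Murmur3-based hash), so the shard numbers will not match a real cluster, and pick_shard is a hypothetical function.

```python
import zlib


def pick_shard(routing, num_primary_shards):
    """Illustrative shard selection: stand-in hash modulo the shard count."""
    # crc32 is NOT the hash Elasticsearch uses; it only makes the sketch
    # deterministic and self-contained.
    return zlib.crc32(routing.encode("utf-8")) % num_primary_shards


shards = 5
indexed_on = pick_shard("user1", shards)   # routing supplied at index time
looked_up_on = pick_shard("user1", shards)  # same routing at read time
print(indexed_on == looked_up_on)
```

A read that omits the routing would hash the document ID instead and may look on a different shard, which is why the lookup can miss.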

44. Create a mapping for a given type

POST /information/_mapping/email1
{
  "properties": {
    "name": {
      "type": "text",
      "index": "analyzed"
    }
  }
}

45. delete_by_query

POST atom_v3/news/_delete_by_query?conflicts=proceed
{
  "query": { 
    "match": {
      "docType": "news"
    }
  }
}

46. Force-merge the segments of an index

POST atom_v3/_forcemerge?max_num_segments=5

47. View the segments of an index

http://172.24.8.83:9200/atom_v3/_segments

Or:

http://172.24.8.83:9200/_cat/segments/atom_v3

48. Create the mapping together with the index

PUT my_index
{
  "mappings": {
    "user": {
      "_all": {
        "enabled": false
      },
      "properties": {
        "title": {
          "type": "text"
        },
        "name": {
          "type": "text"
        },
        "age": {
          "type": "integer"
        }
      }
    },
    "blogpost": {
      "_all": {
        "enabled": false
      },
      "properties": {
        "title": {
          "type": "text"
        },
        "body": {
          "type": "text"
        },
        "user_id": {
          "type": "keyword"
        },
        "created": {
          "type": "date",
          "format": "strict_date_optional_time||epoch_millis"
        }
      }
    }
  }
}

49. reindex: copy data from one index to another

POST _reindex
{
  "source": {
    "index": "twitter"
  },
  "dest": {
    "index": "new_twitter"
  }
}