Elasticsearch Basic Operations

wankunde

Article link: https://blog.csdn.net/wankunde/article/details/78893071

Reference article on basic ES operations

Check instance status
curl http://10.25.164.23:9200
List all indices in the cluster
curl -X GET '10.25.164.23:9200/_cat/indices?v'
View the Types contained in each Index

curl -X GET '10.25.164.23:9200/_mapping?pretty=true'

Create an Index

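The ik_max_word analyzer used for the user field requires the IK analysis plugin to be installed. The mapping-type level (person) is the ES 5.x/6.x syntax; mapping types were removed in ES 7.0.

curl -X PUT '10.25.164.23:9200/accounts' -H 'Content-Type: application/json' -d '
{
  "mappings": {
    "person": {
      "properties": {
        "user": {
          "type": "text",
          "analyzer": "ik_max_word",
          "search_analyzer": "ik_max_word"
        },
        "title": {
          "type": "text"
        },
        "desc": {
          "type": "text"
        },
        "join_time": {
          "type": "date",
          "format": "date_optional_time"
        }
      }
    }
  }
}'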


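Alternatively, create the Index with custom settings (shard count, replica count, and a default analyzer). An Index cannot be created twice, so delete accounts first if it already exists.

curl -X PUT '10.25.164.23:9200/accounts' -H 'Content-Type: application/json' -d '
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1,
    "index": {
      "analysis": {
        "analyzer": {
          "default": {
            "tokenizer": "standard",
            "filter": [
              "asciifolding",
              "lowercase",
              "ourEnglishFilter"
            ]
          }
        },
        "filter": {
          "ourEnglishFilter": {
            "type": "kstem"
          }
        }
      }
    }
  }
}'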

-- Template
curl -XPUT [address]/blog/ -H 'Content-Type: application/json' -d '{
    "settings":{
           "number_of_shards":1,     // number of shards
           "number_of_replicas":2,  // number of replicas
           // custom default analyzer for the index
           "index":{
                  "analysis":{
                         "analyzer":{
                                "default":{
                                       "tokenizer":"standard",     // tokenizer
                                       "filter":[ // token filters
                                              "asciifolding",
                                              "lowercase",
                                              "ourEnglishFilter"
                                       ]
                                }
                         },
                         "filter":{
                                "ourEnglishFilter":{
                                       "type":"kstem"
                                }
                         }
                  }
           }
    }
}'
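
To see what the custom default analyzer does to a piece of text, the _analyze API can be called against the index. The following is only a sketch, assuming an ES 5.x or later node. The ES_HOST variable and the analyze_text function name are hypothetical and not part of the original article.

```shell
#!/bin/sh
# Hypothetical node address; replace with your own.
ES_HOST="${ES_HOST:-10.25.164.23:9200}"

# analyze_text INDEX TEXT
# Sends TEXT to the _analyze endpoint of INDEX. No analyzer is named in the
# request body, so the default analyzer of the index is applied.
# This sketch does no JSON escaping: TEXT must not contain double quotes
# or backslashes.
analyze_text() {
  index="$1"
  text="$2"
  curl -s -X POST "http://${ES_HOST}/${index}/_analyze?pretty=true" \
    -H 'Content-Type: application/json' \
    -d "{\"text\": \"${text}\"}"
}

# Example call (needs a reachable node and an existing index named blog):
# analyze_text blog "Running Foxes"
```

The response lists the tokens produced, which makes it easy to confirm that the lowercase, asciifolding and kstem filters are actually applied.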

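Index mapping properties
- type specifies the data type of a field. For example, an interests field whose type is string holds plain text. The string type and the analyzed / not_analyzed values described below are the ES 2.x syntax; from ES 5.0 on, string is split into text (full text) and keyword (exact values).
  
  Elasticsearch supports the following data types:
  Text: string
  Integers: byte, short, integer, long
  Floating point: float, double
  Boolean: boolean
  Date: date
  For fields of type string, the two most important properties are index and analyzer.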
  - index
    
    The index property controls how a string is indexed. It takes one of three values:
    
    analyzed: First analyze the string, then index it. In other words, index this field as full text.
    not_analyzed: Index this field, so it is searchable, but index the value exactly as specified. Do not analyze it.
    no: Don't index this field at all. This field will not be searchable.
    For string fields, index defaults to analyzed. To match values exactly, set it to not_analyzed.
    For example:
    { "tag": { "type": "string", "index": "not_analyzed" } }
  - analyzer
    
    For string fields, the analyzer property specifies which analyzer is used at both index time and search time. By default Elasticsearch uses the standard analyzer, but you can specify another built-in analyzer such as whitespace, simple, or english.
    For example:
    { "tweet": { "type": "string", "analyzer": "english" } }
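
In ES 5.0 and later the string type is replaced by text and keyword, so the tag and tweet examples above translate roughly as shown below. This is a sketch under stated assumptions: the index name tags_demo, the type name doc, the ES_HOST variable and the function name are all hypothetical, and the type level (doc) applies to ES 5.x/6.x only (in 7.0 and later, properties sits directly under mappings).

```shell
#!/bin/sh
# Hypothetical node address; replace with your own.
ES_HOST="${ES_HOST:-10.25.164.23:9200}"

# Request body: "tag" is stored as an exact value (the old not_analyzed),
# "tweet" is analyzed as English full text (the old analyzed string).
MAPPING_BODY='{
  "mappings": {
    "doc": {
      "properties": {
        "tag":   { "type": "keyword" },
        "tweet": { "type": "text", "analyzer": "english" }
      }
    }
  }
}'

# create_demo_index INDEX
# Creates INDEX with the mapping above.
create_demo_index() {
  curl -s -X PUT "http://${ES_HOST}/$1" \
    -H 'Content-Type: application/json' \
    -d "$MAPPING_BODY"
}

# Example call (needs a reachable node):
# create_demo_index tags_demo
```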
Delete an Index

curl -X DELETE '10.25.164.23:9200/weather'

Insert a document

curl -X PUT '10.25.164.23:9200/accounts/person/1' -H 'Content-Type: application/json' -d '
{
  "user": "张三",
  "title": "工程师",
  "desc": "数据库管理"
}'

View the inserted document

curl '10.25.164.23:9200/accounts/person/1?pretty=true'

Query all records

curl '10.25.164.23:9200/accounts/person/_search?pretty=true'

Delete a record

curl -X DELETE '10.25.164.23:9200/accounts/person/1'

Full-text search

curl '10.25.164.23:9200/accounts/person/_search' -H 'Content-Type: application/json' -d '
{
  "query" : { "match" : { "desc" : "软件" }}
}'
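
When the search string contains several words, a match query treats them as OR by default. To require all of them, one option is a bool query with one match clause per word. The sketch below reuses the accounts/person index from above; the function name search_desc_all and the ES_HOST variable are hypothetical.

```shell
#!/bin/sh
# Hypothetical node address; replace with your own.
ES_HOST="${ES_HOST:-10.25.164.23:9200}"

# search_desc_all TERM [TERM...]
# Builds a bool/must query with one match clause on "desc" per TERM, so a
# document is returned only if every term matches.
# This sketch does no JSON escaping: terms must not contain double quotes
# or backslashes.
search_desc_all() {
  clauses=""
  for term in "$@"; do
    # Append a comma separator before every clause except the first.
    clauses="${clauses}${clauses:+, }{ \"match\": { \"desc\": \"${term}\" } }"
  done
  body="{ \"query\": { \"bool\": { \"must\": [ ${clauses} ] } } }"
  curl -s "http://${ES_HOST}/accounts/person/_search?pretty=true" \
    -H 'Content-Type: application/json' \
    -d "$body"
}

# Example call (needs a reachable node):
# search_desc_all 软件 系统
```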

Managing an ES external table with Hive

set mapreduce.map.java.opts=-Xmx1024m -XX:+PrintGCDetails -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/@taskid@.hprof;
add jar hdfs://pasc/metadata/libs/es/elasticsearch-hadoop-hive-2.3.1.jar;
set hive.execution.engine=mr;

CREATE EXTERNAL TABLE test_person(
  user string,
  title string,
  desc  string)
ROW FORMAT SERDE 
  'org.elasticsearch.hadoop.hive.EsSerDe' 
STORED BY 
  'org.elasticsearch.hadoop.hive.EsStorageHandler' 
TBLPROPERTIES (
  'es.mapping.names'='user:user,title:title,desc:desc', 
  'es.nodes'='10.25.164.23:9200', 
  'es.read.metadata'='true', 
  'es.resource'='accounts/person');