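Elasticsearch: Installation, Terminology, Indices, Search API, and Query DSL

Inverted index

Elasticsearch uses a structure called an inverted index to make full-text search fast. An inverted index consists of a list of the unique words that appear in the documents and, for each word, the documents and positions in which it occurs.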
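
As a rough illustration of the structure described above, the following Python sketch builds a toy inverted index. The sample documents and the helper names (build_inverted_index, search) are made up for this example, and the whitespace-and-lowercase splitting stands in for real analysis; Lucene's on-disk format is far more elaborate.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each unique word to {doc_id: [positions of the word in that doc]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for position, word in enumerate(text.lower().split()):
            index[word].setdefault(doc_id, []).append(position)
    return index

def search(index, word):
    """Return the sorted ids of the documents that contain the word."""
    return sorted(index.get(word.lower(), {}))

# Hypothetical sample documents, not taken from the article.
docs = {
    1: "the quick brown fox",
    2: "the lazy dog",
    3: "quick quick dog",
}
index = build_inverted_index(docs)

print(search(index, "quick"))  # [1, 3]
print(index["quick"])          # {1: [1], 3: [0, 1]}
```

Looking up a word is a single dictionary access, which is why a search does not need to scan every document.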

Download and installation

Install or upgrade the JDK (Java SE Development Kit 8)

java -version
echo $JAVA_HOME

Download the Elasticsearch archive and extract it, using version 2.2.1 as an example:

curl -L -O https://download.elastic.co/elasticsearch/release/org/elasticsearch/distribution/tar/elasticsearch/2.2.1/elasticsearch-2.2.1.tar.gz
tar -xvf elasticsearch-2.2.1.tar.gz
cd elasticsearch-2.2.1

Start Elasticsearch as a daemon (requires a non-root account)

bin/elasticsearch -d

Start Logstash

bin/logstash -f config/jdbc.conf

Here jdbc.conf is the configuration file.

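Terminology

Index

Comparable to a database in a SQL system.

Type

Comparable to a table in a SQL database.

Document

Comparable to a record or row in a SQL table.

Field

Comparable to a column in a SQL table.

Mapping

Comparable to a schema in a SQL database.

Boosting

Raises the weight of a field or clause. For example, title^5 multiplies the relevance score contributed by matches on title by 5.

Analyzer

Converts text into the terms that are stored in the index. An analyzer contains exactly one tokenizer.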

Elasticsearch ships with a wide range of built-in analyzers, which can be used in any index without further configuration.
Analyzers are composed of a single Tokenizer and zero or more TokenFilters. The tokenizer may be preceded by one or more CharFilters. The analysis module allows you to register Analyzers under logical names which can then be referenced either in mapping definitions or in certain APIs.

POST _analyze
{
    "analyzer": "whitespace",
    "text": "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
}
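
The composition described above (char filters, then one tokenizer, then token filters) can be sketched in plain Python. This is only a conceptual model: the function names are invented for the example, and the regex-based tag stripping is a simplification rather than Elasticsearch's html_strip implementation.

```python
import re

def strip_html(text):
    """Char filter: drop anything that looks like an HTML tag."""
    return re.sub(r"<[^>]+>", "", text)

def whitespace_tokenizer(text):
    """Tokenizer: split the text on runs of whitespace."""
    return text.split()

def lowercase_filter(tokens):
    """Token filter: lowercase every token."""
    return [t.lower() for t in tokens]

def analyze(text, char_filters, tokenizer, token_filters):
    """Apply zero or more char filters, one tokenizer, zero or more token filters."""
    for char_filter in char_filters:
        text = char_filter(text)
    tokens = tokenizer(text)
    for token_filter in token_filters:
        tokens = token_filter(tokens)
    return tokens

tokens = analyze(
    "The <b>QUICK</b> Brown-Foxes",
    char_filters=[strip_html],
    tokenizer=whitespace_tokenizer,
    token_filters=[lowercase_filter],
)
print(tokens)  # ['the', 'quick', 'brown-foxes']
```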

Tokenizer

Breaks a string into terms (tokens).

A tokenizer receives a stream of characters, breaks it up into individual tokens (usually individual words), and outputs a stream of tokens.
Tokenizers are used to break a string down into a stream of terms or tokens. A simple tokenizer might split the string up into terms wherever it encounters whitespace or punctuation.

POST _analyze
{
  "tokenizer": "whitespace",
  "text": "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
}
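
To see how the choice of split rule changes the tokens, the sketch below compares whitespace splitting with a hypothetical rule that also splits on punctuation. The second function is an invented stand-in for illustration and does not reproduce any specific built-in Elasticsearch tokenizer.

```python
import re

TEXT = "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."

def whitespace_tokens(text):
    """Split on whitespace only; punctuation stays attached to the tokens."""
    return text.split()

def word_tokens(text):
    """Split on anything that is not an ASCII letter, digit, or apostrophe."""
    return [t for t in re.split(r"[^0-9A-Za-z']+", text) if t]

print(whitespace_tokens(TEXT))
# ['The', '2', 'QUICK', 'Brown-Foxes', 'jumped', 'over', 'the', 'lazy', "dog's", 'bone.']
print(word_tokens(TEXT))
# ['The', '2', 'QUICK', 'Brown', 'Foxes', 'jumped', 'over', 'the', 'lazy', "dog's", 'bone']
```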

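Filter

A token filter modifies, removes, or adds tokens in the stream produced by the tokenizer.

Suggester

Provides search suggestions. There are four types:
+ Term Suggester
+ Phrase Suggester
+ Completion Suggester
+ Context Suggester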

Index operations

Create an index

PUT test
{
    "settings" : {
        "number_of_shards" : 1
    },
    "mappings" : {
        "type1" : {
            "properties" : {
                "field1" : { "type" : "text" }
            }
        }
    }
}

Delete an index

DELETE /twitter

Get an index

GET /twitter
GET twitter/_settings,_mappings

Check whether an index exists

HEAD twitter

Open and close an index

POST /my_index/_close
POST /my_index/_open

Set a mapping (Put Mapping)

PUT twitter
{
  "mappings": {
    "tweet": {
      "properties": {
        "message": {
          "type": "text"
        }
      }
    }
  }
}

Get a mapping (Get Mapping)

GET /_mapping/tweet,kimchy
GET /_all/_mapping/tweet,book
GET /twitter,kimchy/_mapping/field/message
GET /_all/_mapping/tweet,book/field/message,user.id
GET /_all/_mapping/tw*/field/*.id

Check whether a type exists (Types Exists)

HEAD twitter/_mapping/tweet

Update index settings (the index must be closed before its analysis settings can be updated)

POST /twitter/_close
PUT /twitter/_settings
{
  "analysis" : {
    "analyzer":{
      "content":{
        "type":"custom",
        "tokenizer":"whitespace"
      }
    }
  }
}
POST /twitter/_open

Get Settings

GET /twitter,kimchy/_settings

Analyze

GET _analyze
{
  "tokenizer" : "keyword",
  "filter" : ["lowercase"],
  "char_filter" : ["html_strip"],
  "text" : "this is a <b>test</b>"
}

Detailed analyzer output (Explain Analyze)

GET _analyze
{
  "tokenizer" : "standard",
  "filter" : ["snowball"],
  "text" : "detailed output",
  "explain" : true,
  "attributes" : ["keyword"] 
}

Index status information (Indices Stats, Segments, Recovery, Shard Stores)

GET /index1,index2/_stats
GET twitter/_stats?level=shards
GET /index1,index2/_segments
GET index1,index2/_recovery?human
GET index1,index2/_shard_stores?status=green

Clear Cache

POST /twitter/_cache/clear
POST /kimchy,elasticsearch/_cache/clear
POST /_cache/clear

Flush (writes data held in memory to index storage)

POST twitter/_flush
POST kimchy,elasticsearch/_flush
POST _flush

Refresh

POST /twitter/_refresh
POST /kimchy,elasticsearch/_refresh
POST /_refresh

Force Merge (can reduce the number of segments)

POST /twitter/_forcemerge
POST /kimchy,elasticsearch/_forcemerge
POST /_forcemerge

Search API

GET /twitter/_search?q=user:kimchy
GET /twitter/tweet,user/_search?q=user:kimchy
GET /kimchy,elasticsearch/tweet/_search?q=tag:wow
GET /_all/tweet/_search?q=tag:wow
GET /_search?q=tag:wow
GET twitter/tweet/_search?q=user:kimchy

q

The query string (maps to the query_string query, see Query String Query for more details).
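
When a URI search like the ones above is issued from code, the q value has to be URL-encoded. A minimal sketch using only the Python standard library follows; the base address http://localhost:9200 and the helper name search_url are assumptions made for the example.

```python
from urllib.parse import urlencode

def search_url(base, index, query, **params):
    """Build a URI-search URL; urlencode escapes characters such as ':'."""
    query_string = urlencode({"q": query, **params})
    return f"{base}/{index}/_search?{query_string}"

# The host and port are assumed; adjust them to the actual cluster address.
print(search_url("http://localhost:9200", "twitter", "user:kimchy"))
# http://localhost:9200/twitter/_search?q=user%3Akimchy
print(search_url("http://localhost:9200", "_all", "tag:wow", size=5))
# http://localhost:9200/_all/_search?q=tag%3Awow&size=5
```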

Request body search

GET /twitter/tweet/_search
{
    "explain": true,
    "version": true,
    "query" : {
        "term" : { "user" : "kimchy" }
    },
    "from" : 0,
    "size" : 10,
    "sort" : [
        { "post_date" : {"order" : "asc"}},
        "user",
        { "name" : "desc" },
        { "age" : "desc" },
        "_score"
    ],
    "_source": [ "obj1.*", "obj2.*" ],
    "script_fields" : {
        "test1" : {
            "script" : "params['_source']['message']"
        }
    },
    "post_filter": { 
        "term": { "color": "red" }
    },
    "highlight" : {
        "pre_tags" : ["<tag1>", "<tag2>"],
        "post_tags" : ["</tag1>", "</tag2>"],
        "fields" : {
            "_all" : {}
        }
    }
}
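
The from and size fields in the request above control paging. The sketch below shows one way a client might derive them from a page number; page_request is a hypothetical helper, and the 1-based page numbering is a convention chosen for this example.

```python
import json

def page_request(query, page, page_size=10):
    """Build a search body for a 1-based page number."""
    if page < 1:
        raise ValueError("page must be >= 1")
    return {"query": query, "from": (page - 1) * page_size, "size": page_size}

body = page_request({"term": {"user": "kimchy"}}, page=3)
print(json.dumps(body))
# {"query": {"term": {"user": "kimchy"}}, "from": 20, "size": 10}
```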

Inner hits

Query DSL

Full text queries

match

The standard full-text query.

GET /_search
{
    "query": {
        "match" : {
            "message" : "this is a test"
        }
    }
}
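
match_phrase

Phrase query. The query text is analyzed, and an analyzer can be specified.

GET /_search
{
    "query": {
        "match_phrase" : {
            "message" : {
                "query" : "this is a test",
                "analyzer" : "my_analyzer"
            }
        }
    }
}

match_phrase_prefix

Like match_phrase, but the last term of the query text is treated as a prefix.

GET /_search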
{
    "query": {
        "match_phrase_prefix" : {
            "message" : {
                "query" : "quick brown f",
                "max_expansions" : 10
            }
        }
    }
}
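
The effect of max_expansions can be approximated as follows: the last word of the query ("f" above) is expanded to at most that many indexed terms starting with it. This sketch assumes that terms are taken in sorted order, and the term list is invented for the example; the real expansion happens inside Lucene.

```python
def expand_prefix(prefix, terms, max_expansions):
    """Return up to max_expansions terms, in sorted order, that start with prefix."""
    matches = [t for t in sorted(terms) if t.startswith(prefix)]
    return matches[:max_expansions]

# Hypothetical terms indexed for the "message" field.
terms = ["fast", "fox", "fly", "far", "quick", "brown"]

print(expand_prefix("f", terms, 3))   # ['far', 'fast', 'fly']
print(expand_prefix("f", terms, 10))  # ['far', 'fast', 'fly', 'fox']
```

With a limit of 3, "fox" is never considered, so a small max_expansions can cause a prefix query to miss documents.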

multi_match

The multi-field version of the match query.

GET /_search
{
  "query": {
    "multi_match" : {
      "query" : "this is a test",
      "fields" : [ "subject^3", "message" ] 
    }
  }
}
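
The fields list above uses the field^boost notation. A small, hypothetical helper that assembles such a request body from a mapping of field names to boost factors might look like this:

```python
def multi_match(query, boosts):
    """boosts maps a field name to its boost factor; 1 means no ^ suffix."""
    fields = [name if boost == 1 else f"{name}^{boost}" for name, boost in boosts.items()]
    return {"query": {"multi_match": {"query": query, "fields": fields}}}

body = multi_match("this is a test", {"subject": 3, "message": 1})
print(body["query"]["multi_match"]["fields"])  # ['subject^3', 'message']
```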

common_terms

A stopword-aware query that gives more weight to uncommon terms.

GET /_search
{
    "query": {
        "common": {
            "body": {
                "query": "this is bonsai cool",
                "cutoff_frequency": 0.001
            }
        }
    }
}
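
Conceptually, cutoff_frequency divides the query terms into a low-frequency group (the important terms) and a high-frequency group (stopword-like terms). The sketch below models that split with invented document frequencies; treating the cutoff as a fraction of all documents, and the exact boundary comparison, are assumptions of this example.

```python
def split_by_frequency(query_terms, doc_freq, total_docs, cutoff_frequency):
    """Split terms into (low, high) groups by the fraction of docs containing them."""
    low, high = [], []
    for term in query_terms:
        ratio = doc_freq.get(term, 0) / total_docs
        (high if ratio > cutoff_frequency else low).append(term)
    return low, high

# Hypothetical document frequencies in an index of 100,000 documents.
doc_freq = {"this": 90000, "is": 95000, "bonsai": 40, "cool": 80}
low, high = split_by_frequency("this is bonsai cool".split(), doc_freq, 100_000, 0.001)

print(low)   # ['bonsai', 'cool']
print(high)  # ['this', 'is']
```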

query_string

Parses the query text using a query syntax with operators such as AND and OR.

GET /_search
{
    "query": {
        "query_string" : {
            "default_field" : "content",
            "query" : "this AND that OR thus"
        }
    }
}

simple_query_string

Uses SimpleQueryParser to parse the query text.

GET /_search
{
  "query": {
    "simple_query_string" : {
        "query": "\"fried eggs\" +(eggplant | potato) -frittata",
        "analyzer": "snowball",
        "fields": ["body^5","_all"],
        "default_operator": "and"
    }
  }
}

Term level queries

Lower-level queries that operate on the exact terms stored in the inverted index.

  • term
  • terms
  • range
  • exists
  • prefix
  • wildcard
  • regexp
  • fuzzy
  • type
  • ids

GET /_search
{
    "query": {
        "constant_score" : {
            "filter" : {
                "terms" : { "user" : ["kimchy", "elasticsearch"]}
            }
        }
    }
}
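
Term-level queries do not analyze the value they are given, which is a common source of confusion. The sketch below models this with a toy set of indexed terms, assuming the field was analyzed with lowercasing at index time; all function names are invented for the example.

```python
def index_terms(text):
    """Stand-in for index-time analysis: split on whitespace and lowercase."""
    return {t.lower() for t in text.split()}

def term_match(indexed_terms, value):
    """Term-level lookup: compare the value exactly as given, without analysis."""
    return value in indexed_terms

def match_query(indexed_terms, text):
    """Full-text style lookup: analyze the input first, then match any term."""
    return any(t in indexed_terms for t in index_terms(text))

indexed = index_terms("Kimchy likes Elasticsearch")

print(term_match(indexed, "Kimchy"))   # False
print(term_match(indexed, "kimchy"))   # True
print(match_query(indexed, "Kimchy"))  # True
```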

Compound queries

  • constant_score
  • bool
  • dis_max
  • function_score
  • boosting
  • indices

GET /_search
{
    "query": {
        "constant_score" : {
            "filter" : {
                "term" : { "user" : "kimchy"}
            },
            "boost" : 1.2
        }
    }
}
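
Of the compound queries listed above, bool is probably the one used most often. The following sketch assembles a bool request body in Python; bool_query is a hypothetical helper, and the clauses reuse field names from earlier examples.

```python
import json

def bool_query(must=(), should=(), must_not=(), filters=()):
    """Build a bool query body, omitting clause types that are empty."""
    clauses = {
        "must": list(must),
        "should": list(should),
        "must_not": list(must_not),
        "filter": list(filters),
    }
    return {"query": {"bool": {k: v for k, v in clauses.items() if v}}}

body = bool_query(
    must=[{"match": {"message": "this is a test"}}],
    filters=[{"term": {"user": "kimchy"}}],
)
print(json.dumps(body, indent=2))
```

Clauses under must contribute to the score, while clauses under filter only include or exclude documents.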

Joining queries

  • Nested
  • Has Child
  • Has Parent
  • Parent Id
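
As a sketch of what joining query bodies look like, the helpers below build nested and has_child requests. The field and type names (comments, comments.author, answer) are hypothetical and assume a mapping that declares the corresponding nested field or parent/child relationship.

```python
def nested_query(path, inner_query):
    """Wrap a query so that it runs against the nested objects under path."""
    return {"query": {"nested": {"path": path, "query": inner_query}}}

def has_child_query(child_type, inner_query):
    """Match parent documents that have a child of child_type matching the query."""
    return {"query": {"has_child": {"type": child_type, "query": inner_query}}}

# Hypothetical field and type names, for illustration only.
nested_body = nested_query("comments", {"match": {"comments.author": "kimchy"}})
child_body = has_child_query("answer", {"term": {"user": "kimchy"}})

print(nested_body)
print(child_body)
```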