Elasticsearch Basics 6: Getting Started, Part 3

I. Multi-fields and Configuring a Custom Analyzer

1.Multi-field types

3e7dd8b060edec7c424f92ecb9bf074fcbd.jpg

2.Exact Values & Full Text

a3e040a63b84f49cc200219ea4f0e5b8025.jpg

3.Exact Values do not need to be tokenized

aff1405dce8955282f71b00254bb7ea3079.jpg

4.Custom analysis

c3646724ba7f36737a28d23cc5460acc430.jpg

5.Character Filters

c98b69ca75c592c261bb4726fb14fb413cd.jpg

6.Tokenizer

a35634bbc77b515210dfd750da8b649c56e.jpg

7.Token Filters

8b2616c1d90dbc7d2dff56b62ab241b5f3f.jpg

 


## Use the keyword tokenizer together with the html_strip character filter
POST _analyze
{
  "tokenizer":"keyword",
  "char_filter":["html_strip"],
  "text": "<b>hello world</b>"
}

## Use the path_hierarchy tokenizer
POST _analyze
{
  "tokenizer":"path_hierarchy",
  "text":"/user/local/a/b/c/d/e"
}
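
To make the path_hierarchy output easier to picture, here is a minimal Python sketch that emulates what the tokenizer emits for the request above. It is not Elasticsearch code: `path_hierarchy_tokens` is a hypothetical helper, and it assumes the default `/` delimiter and an input that starts with the delimiter.

```python
def path_hierarchy_tokens(path, delimiter="/"):
    """Emit one token per ancestor path, shortest first.

    Assumption: the input starts with the delimiter, as in the request above.
    """
    tokens = []
    current = ""
    for part in path.split(delimiter):
        if not part:
            continue  # skip the empty piece produced by the leading delimiter
        current = current + delimiter + part
        tokens.append(current)
    return tokens


tokens = path_hierarchy_tokens("/user/local/a/b/c/d/e")
for token in tokens:
    print(token)
```

Each directory level becomes its own token, which is why this tokenizer suits prefix-style lookups on file paths.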

# Use a char filter to replace characters
POST _analyze
{
  "tokenizer": "standard",
  "char_filter": [
      {
        "type" : "mapping",
        "mappings" : [ "- => _"]
      }
    ],
  "text": "123-456, I-test! test-990 650-555-1234"
}


# Use a char filter to replace emoticons
POST _analyze
{
  "tokenizer": "standard",
  "char_filter": [
      {
        "type" : "mapping",
        "mappings" : [ ":) => happy", ":( => sad"]
      }
    ],
  "text": ["I am feeling :)", "Feeling :( today"]
}

### whitespace tokenizer with the stop filter
GET _analyze
{
  "tokenizer": "whitespace",
  "filter": ["stop"],
  "text": ["The rain in Spain falls mainly on the plain."]
}

## With lowercase added before stop, "The" is lowercased and then removed as a stopword
GET _analyze
{
  "tokenizer": "whitespace",
  "filter": ["lowercase","stop"],
  "text": ["The girls in China are playing this game!"]
}
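
The two requests above differ only in whether lowercase runs before stop. The following Python sketch emulates that difference; it is an illustration, not the Elasticsearch implementation. `STOPWORDS` is a small hypothetical subset of the English stopword list, and the stop filter is assumed to match case-sensitively.

```python
# Hypothetical subset of the English stopword list, for illustration only.
STOPWORDS = {"a", "an", "and", "are", "in", "is", "on", "the", "this"}


def whitespace_tokenize(text):
    """Split on whitespace only; punctuation stays attached to the token."""
    return text.split()


def lowercase_filter(tokens):
    return [token.lower() for token in tokens]


def stop_filter(tokens):
    """Drop stopwords using a case-sensitive comparison."""
    return [token for token in tokens if token not in STOPWORDS]


text = "The girls in China are playing this game!"
stop_only = stop_filter(whitespace_tokenize(text))
lower_then_stop = stop_filter(lowercase_filter(whitespace_tokenize(text)))

print(stop_only)        # "The" survives because it is capitalized
print(lower_then_stop)  # "the" is removed after lowercasing
```

The order of the entries in `filter` matters: token filters run in sequence, so stop only sees lowercased tokens when lowercase comes first.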

# Use a pattern_replace char filter (regular expression) to strip the http:// prefix
GET _analyze
{
  "tokenizer": "standard",
  "char_filter": [
      {
        "type" : "pattern_replace",
        "pattern" : "http://(.*)",
        "replacement" : "$1"
      }
    ],
    "text" : "http://www.elastic.co"
}


# When creating the index, define a custom analyzer, tokenizer, char filter, and token filter
# (the character class in the tokenizer pattern starts with a space)
PUT my_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_custom_analyzer" : {
          "type": "custom",
          "char_filter":[
            "emoticons"
          ],
          "tokenizer":"punctuation",
          "filter":[
            "lowercase",
            "english_stop"
          ]
        }
      },
      "tokenizer": {
        "punctuation": {
          "type" : "pattern",
          "pattern" : "[ .,!?]"
        }
      },
      "char_filter": {
        "emoticons": {
          "type" : "mapping",
          "mappings" : [ ":) => _happy_", ":( => _sad_"]
        }
      },
      "filter": {
        "english_stop" : {
          "type" : "stop",
          "stopwords" : "_english_"
        }
      }
    }
  }
}

# Test the custom analyzer
POST my_index/_analyze 
{
  "analyzer": "my_custom_analyzer",
  "text" : "I'm a :) person , and you ?"
}
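
The custom analyzer chains three stages: char filter, tokenizer, token filters. The Python sketch below walks the test sentence through the same stages so the expected tokens are easy to follow. It only emulates the pipeline; `analyze` is a hypothetical helper and `STOPWORDS` is a small illustrative subset of the `_english_` list.

```python
import re

# Stage 1: char filter mappings, as in the "emoticons" char filter.
CHAR_MAPPINGS = {":)": "_happy_", ":(": "_sad_"}
# Stage 2: the "punctuation" tokenizer splits on space . , ! ?
SPLIT_PATTERN = re.compile(r"[ .,!?]")
# Stage 3b: hypothetical subset of the _english_ stopword list.
STOPWORDS = {"a", "an", "and", "are", "in", "is", "the", "this", "to"}


def analyze(text):
    # Char filters rewrite the raw text before tokenization.
    for source, target in CHAR_MAPPINGS.items():
        text = text.replace(source, target)
    # The tokenizer splits the text; empty pieces are discarded.
    tokens = [piece for piece in SPLIT_PATTERN.split(text) if piece]
    # Token filters run in order: lowercase first, then stop.
    tokens = [token.lower() for token in tokens]
    return [token for token in tokens if token not in STOPWORDS]


result = analyze("I'm a :) person , and you ?")
print(result)
```

Replacing the emoticon with `_happy_` before tokenization is what lets it survive as a token; the tokenizer would otherwise have no way to recognize `:)`.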

II. Index Template and Dynamic Template

Both are especially useful when you manage a large number of indices.

79ae314999c76c91b616af562f982d8d84b.jpg

1.What is an Index Template?

793d0bcc25b56e5524980c77ae431cdf7a1.jpg

9b23fae6baa915f0d89a51a93da5e3b3ed5.jpg

date_detection: when set to false, strings that look like dates are mapped as strings (text) instead of date.

numeric_detection: when set to true, strings that contain numbers are mapped as numeric types (for example, "1" becomes long).

2.How Index Templates work

442472e84096fa53fe0f453c4850d2f26b0.jpg

#### By default, the numeric string is mapped as text and the date string is mapped as date
PUT ttemplate/_doc/1
{
	"someNumber":"1",
	"someDate":"2019/01/01"
}

## View the mapping
GET ttemplate/_mapping


################## Create a default template ##########
PUT _template/template_default
{
  "index_patterns": ["*"],
  "order" : 0,
  "version": 1,
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas":1
  }
}

# Template applied to indices whose names start with test
PUT /_template/template_test
{
    "index_patterns" : ["test*"],
    "order" : 1,
    "settings" : {
    	"number_of_shards": 1,
        "number_of_replicas" : 2
    },
    "mappings" : {
    	"date_detection": false,
    	"numeric_detection": true
    }
}

# View template information
GET /_template/template_default
GET /_template/template_test
GET /_template/temp*

# Index a new document into an index whose name starts with test; someNumber becomes long and someDate becomes text
PUT testtemplate/_doc/1
{
	"someNumber":"1",
	"someDate":"2019/01/01"
}

## The mapping shows someNumber as long and someDate as text
GET testtemplate/_mapping
GET testtemplate/_settings

############### Settings given explicitly when creating an index take precedence over templates ######
PUT testmy
{
	"settings":{
		"number_of_replicas":5
	}
}

# Insert test data
PUT testmy/_doc/1
{
  "key":"value"
}

# View the settings of the testmy index
GET testmy/_settings

DELETE testmy
DELETE /_template/template_default
DELETE /_template/template_test
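
The precedence shown by the requests above can be summarized in a small Python sketch: matching templates are merged from lowest `order` to highest, and settings passed when the index is created win over all of them. This only emulates the merge logic; `resolve_settings` is a hypothetical helper, and index patterns are approximated with shell-style wildcards.

```python
from fnmatch import fnmatchcase

# The two templates from the example above, reduced to their settings.
TEMPLATES = [
    {"name": "template_default", "index_patterns": ["*"], "order": 0,
     "settings": {"number_of_shards": 1, "number_of_replicas": 1}},
    {"name": "template_test", "index_patterns": ["test*"], "order": 1,
     "settings": {"number_of_shards": 1, "number_of_replicas": 2}},
]


def resolve_settings(index_name, explicit_settings=None):
    """Merge template settings by ascending order, then apply explicit settings."""
    matching = [
        template for template in TEMPLATES
        if any(fnmatchcase(index_name, pattern)
               for pattern in template["index_patterns"])
    ]
    merged = {}
    for template in sorted(matching, key=lambda t: t["order"]):
        merged.update(template["settings"])  # higher order overrides lower
    merged.update(explicit_settings or {})   # the create-index request wins
    return merged


print(resolve_settings("ttemplate"))
print(resolve_settings("testtemplate"))
print(resolve_settings("testmy", {"number_of_replicas": 5}))
```

Under these assumptions, `testtemplate` ends up with 2 replicas from `template_test`, while `testmy` keeps the 5 replicas given explicitly.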

3.What is a Dynamic Template?

623d6b449ef982c2327c2c00b9d549b0975.jpg

b8021b3d26f290bd567670809dc1620f24e.jpg

87dd9029ed0b7c88eb6dc2dcbd4cf7e6f65.jpg

############################# Dynamic Mapping by type and field name #############
DELETE my_index

## Insert a test document; both firstName and isVIP are mapped as text
PUT my_index/_doc/1
{
  "firstName":"Han",
  "isVIP":"true"
}

GET my_index/_mapping
DELETE my_index

######## Define custom mapping rules
PUT my_index
{
  "mappings": {
    "dynamic_templates": [
      {
        "strings_as_boolean": {
          "match_mapping_type":   "string",
          "match":"is*",
          "mapping": {
            "type": "boolean"
          }
        }
      },
      {
        "strings_as_keywords": {
          "match_mapping_type":   "string",
          "mapping": {
            "type": "keyword"
          }
        }
      }
    ]
  }
}

## Insert the test document again; firstName is now mapped as keyword and isVIP as boolean
PUT my_index/_doc/1
{
  "firstName":"Han",
  "isVIP":"true"
}

GET my_index/_mapping
DELETE my_index
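
The order of the entries in `dynamic_templates` matters because the first template that matches a new field is the one applied. The Python sketch below emulates that selection for the two templates above. `pick_type` is a hypothetical helper, and the `match` pattern is approximated with a shell-style wildcard.

```python
from fnmatch import fnmatchcase

# The two dynamic templates from the example, in declaration order.
DYNAMIC_TEMPLATES = [
    {"name": "strings_as_boolean", "match_mapping_type": "string",
     "match": "is*", "type": "boolean"},
    {"name": "strings_as_keywords", "match_mapping_type": "string",
     "type": "keyword"},
]


def pick_type(field_name, detected_type):
    """Return the mapping type of the first matching template, or None."""
    for template in DYNAMIC_TEMPLATES:
        if template["match_mapping_type"] != detected_type:
            continue
        pattern = template.get("match")
        if pattern is not None and not fnmatchcase(field_name, pattern):
            continue
        return template["type"]
    return None  # no template matched; default dynamic mapping would apply


print(pick_type("isVIP", "string"))
print(pick_type("firstName", "string"))
```

If the two templates were declared in the opposite order, `strings_as_keywords` would match every string field first and `isVIP` would never become a boolean.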

###################### Matching by field path ##############
PUT my_index
{
  "mappings": {
    "dynamic_templates": [
      {
        "full_name": {
          "path_match":   "name.*",
          "path_unmatch": "*.middle",
          "mapping": {
            "type":       "text",
            "copy_to":    "full_name"
          }
        }
      }
    ]
  }
}

## Insert test data
PUT my_index/_doc/1
{
  "name": {
    "first":  "John",
    "middle": "Winston",
    "last":   "Lennon"
  }
}

## Query the full_name field that copy_to populates
GET my_index/_search?q=full_name:John
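
To see which sub-fields end up in `full_name`, here is a short Python sketch of the `path_match` / `path_unmatch` rule from the template above. It is an approximation using shell-style wildcards, and `matches_full_name_template` is a hypothetical helper.

```python
from fnmatch import fnmatchcase


def matches_full_name_template(path):
    """True when the path matches name.* but not *.middle."""
    return fnmatchcase(path, "name.*") and not fnmatchcase(path, "*.middle")


doc = {"name": {"first": "John", "middle": "Winston", "last": "Lennon"}}

# Collect the values of the sub-fields that the template would copy.
copied_to_full_name = [
    value for key, value in doc["name"].items()
    if matches_full_name_template("name." + key)
]
print(copied_to_full_name)
```

Under this rule `name.first` and `name.last` are copied, so searching `full_name` for John finds the document, while a search for Winston in `full_name` would not.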

Index Templates: https://www.elastic.co/guide/en/elasticsearch/reference/7.1/indices-templates.html

Dynamic Template: https://www.elastic.co/guide/en/elasticsearch/reference/7.1/dynamic-mapping.html

III. Introduction to Elasticsearch Aggregations

1.What is an Aggregation?

7475cc1734028a826a39e2a2985ffbccc32.jpg

2.Types of aggregations

34f10361e5962cdb0963995f1405e9c7ec7.jpg

3.Bucket & Metric

123ccbd975868d90fe8440b923c1054a1b3.jpg

4.Bucket

22682af37fe6b34dbd3ecbdb011a1aa0d62.jpg

5.Metric

6d12a1d9e0d6a79204a2bdb351a7db363f9.jpg

+++++++++++++++++++++++++++++++++++++++++++

39c73114baaf9783f9d09d1cf9aa99f1eac.jpg

6.A Bucket example

d980d6be6a85e732c1804e0b26097aa2af3.jpg

Note: set size to 0; otherwise the response also includes the search hits in addition to the aggregation results.

7.Adding Metrics

35e0a70fa845f8d61e711799fe339b00df3.jpg

8.Nesting

6bec909e45564033f15c5f24b77a9169014.jpg



# Bucket the flights by destination country
GET kibana_sample_data_flights/_search
{
	"size": 0,
	"aggs":{
		"flight_dest":{
			"terms":{
				"field":"DestCountry"
			}
		}
	}
}

# Per-destination statistics: add the average, maximum, and minimum ticket price
GET kibana_sample_data_flights/_search
{
	"size": 0,
	"aggs":{
		"flight_dest":{
			"terms":{
				"field":"DestCountry"
			},
			"aggs":{
				"avg_price":{
					"avg":{
						"field":"AvgTicketPrice"
					}
				},
				"max_price":{
					"max":{
						"field":"AvgTicketPrice"
					}
				},
				"min_price":{
					"min":{
						"field":"AvgTicketPrice"
					}
				}
			}
		}
	}
}


# Price statistics plus weather; note the use of the stats aggregation
GET kibana_sample_data_flights/_search
{
	"size": 0,
	"aggs":{
		"flight_dest":{
			"terms":{
				"field":"DestCountry"
			},
			"aggs":{
				"stats_price":{
					"stats":{
						"field":"AvgTicketPrice"
					}
				},
				"weather":{
				  "terms": {
				    "field": "DestWeather",
				    "size": 5
				  }
				}

			}
		}
	}
}
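
Conceptually, a `terms` bucket groups documents by a field value and a nested `stats` metric summarizes a numeric field inside each group. The Python sketch below emulates that on a few made-up records; `FLIGHTS` is hypothetical data rather than the Kibana sample set, and `terms_with_stats` is a hypothetical helper.

```python
from collections import defaultdict

# Hypothetical records that reuse the field names from the sample index.
FLIGHTS = [
    {"DestCountry": "IT", "AvgTicketPrice": 100.0},
    {"DestCountry": "IT", "AvgTicketPrice": 300.0},
    {"DestCountry": "US", "AvgTicketPrice": 500.0},
    {"DestCountry": "IT", "AvgTicketPrice": 200.0},
]


def terms_with_stats(docs, bucket_field, metric_field):
    """Group docs by bucket_field and compute stats over metric_field."""
    grouped = defaultdict(list)
    for doc in docs:
        grouped[doc[bucket_field]].append(doc[metric_field])

    buckets = []
    for key, values in grouped.items():
        buckets.append({
            "key": key,
            "doc_count": len(values),
            "stats_price": {
                "count": len(values),
                "min": min(values),
                "max": max(values),
                "avg": sum(values) / len(values),
                "sum": sum(values),
            },
        })
    # Like a terms aggregation by default, list the largest buckets first.
    buckets.sort(key=lambda bucket: bucket["doc_count"], reverse=True)
    return buckets


buckets = terms_with_stats(FLIGHTS, "DestCountry", "AvgTicketPrice")
for bucket in buckets:
    print(bucket)
```

The single `stats` metric returns count, min, max, avg, and sum together, which is why it can replace the three separate avg, max, and min aggregations in the earlier request.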

Documentation: https://www.elastic.co/guide/en/elasticsearch/reference/7.1/search-aggregations.html

 

 

Reposted from: https://my.oschina.net/hanchao/blog/3080191
