note
- The analyzer used at index time should be consistent with the one used at search time.
- An analyzer is made of: character filters, a tokenizer, and token filters.
- Character filters: transform or strip characters in the raw input (e.g. removing HTML markup).
- An analyzer has zero or more character filters, applied in order.
- Tokenizer: splits the input into terms (words) and records each term's position information.
- An analyzer must have exactly one tokenizer.
- Token filters: add, remove, or modify tokens. The lowercase token filter lowercases tokens; the stop token filter removes stop words; the synonym token filter adds synonyms. Token filters do not change the character offsets of tokens.
- An analyzer has zero or more token filters.
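The three stages above can be combined ad hoc in one _analyze call — a sketch using the built-in html_strip character filter:

# char filter strips the <b> tags, standard tokenizer splits the words,
# lowercase token filter normalizes case
POST _analyze
{
  "char_filter": [ "html_strip" ],
  "tokenizer": "standard",
  "filter": [ "lowercase" ],
  "text": "<b>The QUICK Fox</b>"
}
# → tokens: the, quick, fox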
- Analyzers demo: built-in, then custom
GET analyzer_index/_mapping
POST _analyze
{
  "analyzer": "whitespace",
  "text": "The quick brown fox."
}
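The response lists each token with its offsets and position — this is where the tokenizer's position info shows up. For the whitespace request above it looks roughly like:

# Response: the whitespace analyzer keeps case and punctuation,
# so "fox." is emitted with its trailing period.
{
  "tokens": [
    { "token": "The",   "start_offset": 0,  "end_offset": 3,  "type": "word", "position": 0 },
    { "token": "quick", "start_offset": 4,  "end_offset": 9,  "type": "word", "position": 1 },
    { "token": "brown", "start_offset": 10, "end_offset": 15, "type": "word", "position": 2 },
    { "token": "fox.",  "start_offset": 16, "end_offset": 20, "type": "word", "position": 3 }
  ]
}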
POST _analyze
{
  "tokenizer": "standard",
  "filter": [ "lowercase", "asciifolding" ],
  "text": "Is this déja vu?"
}
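The stop token filter from the notes can be tried the same way — a sketch using the built-in stop filter, whose default English stop-word list includes "is" and "this":

POST _analyze
{
  "tokenizer": "standard",
  "filter": [ "lowercase", "stop" ],
  "text": "Is this déja vu?"
}
# → tokens: déja, vu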
DELETE analyzer_index
#
# Custom (user-defined) analyzer
#
PUT analyzer_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "std_folded": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "asciifolding"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "my_text": {
        "type": "text",
        "analyzer": "std_folded"
      }
    }
  }
}
GET analyzer_index/_mapping
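With the index created, the custom analyzer can be exercised directly against the index — either by name, or via the field that is mapped to it:

GET analyzer_index/_analyze
{
  "analyzer": "std_folded",
  "text": "Is this déja vu?"
}

# The "my_text" field resolves to std_folded automatically:
GET analyzer_index/_analyze
{
  "field": "my_text",
  "text": "Is this déja vu?"
}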