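When you use the logstash-input-jdbc plugin to sync MySQL data into Elasticsearch, a default dynamic mapping template named logstash is used. During Logstash startup you will see messages like the following: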
Using mapping template from {:path=>nil}
Attempting to install template {:manage_template=>{"template"=>"logstash-*","version"=>50001,"settings"=>{"index.refresh_interval"=>"5s"},"mappings"=>{"_default_"=>{"_all"=>{"enabled"=>true,"norms"=>false},"dynamic_templates"=>[{"message_field"=>{"path_match"=>"message","match_mapping_type"=>"string", "mapping"=>{"type"=>"text","norms"=>false}}},{"string_fields"=>{"match"=>"*","match_mapping_type"=>"string","mapping"=>{"type"=>"text","norms"=>false,"fields"=>{"keyword"=>{"type"=>"keyword"}}}}}],"properties"=>{"@timestamp"=>{"type"=>"date","include_in_all"=>false},"@version"=>{"type"=>"keyword","include_in_all"=>false},"geoip"=>{"dynamic"=>true,"properties"=>{"ip"=>{"type"=>"ip"},"location"=>{"type"=>"geo_point"},"latitude"=>{"type"=>"half_float"},"longitude"=>{"type"=>"half_float"}}}}}}}}
Installing elasticsearch template to _template/logstash
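The path=>nil on the first line means no custom template was found, so the default template is used; it is then stored under Elasticsearch's template path with the name logstash. The template content: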
{
"template":"logstash-*",
"version": 50001,
"settings": {
"index.refresh_interval":"5s"
},
"mappings": {
"_default_": {
"_all": {
"enabled": true,
"norms": false
},
"dynamic_templates": [
{
"message_field":{
"path_match":"message",
"match_mapping_type": "string",
"mapping": {
"type":"text",
"norms":false
}
}
},
{
"string_fields":{
"match": "*",
"match_mapping_type": "string",
"mapping": {
"type":"text",
"norms":false,
"fields":{
"keyword": {
"type": "keyword"
}
}
}
}
}
],
"properties": {
"@timestamp": {
"type":"date",
"include_in_all":false
},
"@version": {
"type":"keyword",
"include_in_all":false
},
"geoip": {
"dynamic": true,
"properties": {
"ip": {
"type":"ip"
},
"location": {
"type":"geo_point"
},
"latitude": {
"type":"half_float"
},
"longitude":{
"type":"half_float"
}
}
}
}
}
}
}
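It automatically maps the fields that are synced over, but the drawback is that most text fields are analyzed (tokenized), whereas I mostly need them not analyzed, so a custom mapping is required. At first I was not aware of template precedence: I left the template configuration at its defaults and, before starting Logstash, used curl -XPUT to create a not-analyzed mapping on the ES cluster. After the sync finished, that mapping had not taken effect. From this I concluded that the template installed by the Logstash output plugin took precedence over the mapping I had created on the cluster. So the next step is to modify the template and overwrite the default one.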
First, delete the default template:
curl -XDELETE -u elastic '192.168.110.31:8011/_template/logstash'
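Then create a new file es-template.json (the name is arbitrary), start from the default template's content, modify it, and paste it in. Here I make all text fields not analyzed. The template content is as follows: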
{
"template": "my_index",
"settings" : {
"index.refresh_interval" :"5s"
},
"mappings" : {
"_default_" : {
"_all" : {"enabled":false, "omit_norms" : true},
"dynamic_templates" : [ {
"message_field" : {
"match" :"message",
"match_mapping_type" :"string",
"mapping" : {
"type" :"string", "index" : "not_analyzed","omit_norms" : true,
"fielddata" : {"format" : "disabled" }
}
}
}, {
"string_fields" : {
"match" :"*",
"match_mapping_type" :"string",
"mapping" : {
"type" :"string", "index" : "not_analyzed","omit_norms" : true,
"fielddata" : {"format" : "disabled" },
"fields" : {
"raw" :{"type": "string", "index" :"not_analyzed", "ignore_above" : 256}
}
}
}
} ]
}
}
}
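The template above uses the older string / not_analyzed syntax. Since the default template logged at startup already uses the text and keyword types, an equivalent written with the keyword type might look like the sketch below. It assumes an Elasticsearch 5.x cluster and is not the configuration used in this article; the file name es-template.json and the index name my_index are taken from the text, and the "template" value is the pattern matched against index names, so it needs to match the index set in the Logstash output.

```shell
#!/bin/sh
# Sketch: write a template that maps every string field to `keyword`
# (the value is indexed as-is, without analysis).
# Assumption: Elasticsearch 5.x, where `keyword` replaces string + not_analyzed.
cat > es-template.json <<'EOF'
{
  "template": "my_index",
  "settings": {
    "index.refresh_interval": "5s"
  },
  "mappings": {
    "_default_": {
      "_all": { "enabled": false },
      "dynamic_templates": [
        {
          "string_fields": {
            "match": "*",
            "match_mapping_type": "string",
            "mapping": { "type": "keyword" }
          }
        }
      ]
    }
  }
}
EOF

# Quick sanity check: count the lines that declare the keyword type.
grep -c '"type": "keyword"' es-template.json
```

With a keyword mapping there is no separate raw sub-field, because the main field itself already holds the exact value.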
Next is the output section of the Logstash startup file jdbc.conf:
if [type] == "my_type" {
  elasticsearch {
    hosts => ["192.168.110.31:8011", "192.168.110.31:8012", "192.168.110.31:8013"]
    user => "elastic"
    password => "abc123qwer"
    index => "my_index"
    document_id => "%{id}"
    # manage_template => false
    # Path to the custom template file
    template => "/home/lvyuan/elasticsearch/logstash-5.5.3/template/es-template.json"
    # Name under which the template is stored in _template
    template_name => "my_index"
    # Replace an existing template that has the same name
    template_overwrite => true
  }
}
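Before starting, delete the previously created index and template. If the change has not taken effect after startup, be sure to delete the index and the template first (the template stored under _template, not the physical template file), then modify the file and run again.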
curl -XDELETE -u elastic 'http://192.168.110.31:8011/_template/my_index'
curl -XDELETE -u elastic 'http://192.168.110.31:8011/my_index'
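My original goal was for Elasticsearch to replace MySQL SQL queries, not to do full-text search, so analysis can actually get in the way. For example, one field in MySQL stores a sequence of mixed upper- and lower-case letters (e.g. HTZG5jjhffdwe). With the default mapping template (which analyzes), the sequence is lowercased before it is stored as a token, so a termQuery (not analyzed, exact match) with the original value finds nothing. Some will say matchPhraseQuery can be used, and that does find it. But prefix matching with prefixQuery (also not analyzed) runs into the same problem: a lowercase prefix "htzg5" matches, while the uppercase prefix does not, because every token in the index is lowercase. So it depends on the situation; not every field should be analyzed. You can try it yourself: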
http://localhost:8011/_analyze?pretty&analyzer=standard&text=HTZG5jjhffdEX7w52r37880
The whole value is lowercased and stored as a single token:
{"tokens":[{"token":"htzg5jjhffdex7w52r37880","start_offset":0,"end_offset":22,"type":"<ALPHANUM>","position":0}]}
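To make the prefix behaviour concrete, here is a small local simulation in shell; it does not call Elasticsearch. It assumes that, for a single alphanumeric value like the one above, the only effect of the standard analyzer is lowercasing, and that a prefix query compares its prefix with the stored term case-sensitively. has_prefix is a hypothetical helper written for this illustration.

```shell
#!/bin/sh
# Local simulation only: no Elasticsearch calls are made.
value='HTZG5jjhffdwe'

# An analyzed text field stores the lowercased token;
# a keyword (not analyzed) field stores the value unchanged.
text_token=$(printf '%s' "$value" | tr '[:upper:]' '[:lower:]')
keyword_token=$value

# has_prefix TERM PREFIX: succeeds when TERM starts with PREFIX (case-sensitive).
has_prefix() {
  case "$1" in
    "$2"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Try an uppercase and a lowercase prefix against both stored forms.
for prefix in HTZG5 htzg5; do
  if has_prefix "$text_token" "$prefix"; then t=hit; else t=miss; fi
  if has_prefix "$keyword_token" "$prefix"; then k=hit; else k=miss; fi
  echo "prefix=$prefix text=$t keyword=$k"
done
```

The uppercase prefix only hits the unanalyzed value and the lowercase prefix only hits the analyzed token, which is why a field queried by its exact original value is better left not analyzed.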