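Use the Elasticsearch _reindex API to copy data from a source index into a new index, using a specified field as the routing value in the new index
Source index
1. Mapping used to create the source index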
PUT /regular_address
{
"mappings" : {
"properties" : {
"_class" : {
"type" : "keyword",
"index" : false,
"doc_values" : false
},
"address" : {
"type" : "text",
"fields" : {
"accurate" : {
"type" : "keyword"
}
},
"analyzer" : "ik_max_word"
},
"branchCode" : {
"type" : "keyword"
},
"cityName" : {
"type" : "keyword"
},
"companyCode" : {
"type" : "keyword"
},
"countyName" : {
"type" : "keyword"
},
"createBy" : {
"type" : "keyword"
},
"createTime" : {
"type" : "date",
"format" : "yyyy-MM-dd HH:mm:ss"
},
"distributionCode" : {
"type" : "keyword"
},
"id" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"mdCodeGuiJi" : {
"type" : "keyword"
},
"mdCodeOther" : {
"type" : "keyword"
},
"mdCodeYiZhan" : {
"type" : "keyword"
},
"provinceName" : {
"type" : "keyword"
},
"sendCode" : {
"type" : "keyword"
},
"sendTime" : {
"type" : "date",
"format" : "yyyy-MM-dd HH:mm:ss"
},
"trackingNoLatest" : {
"type" : "keyword"
},
"updateBy" : {
"type" : "keyword"
},
"updateTime" : {
"type" : "date",
"format" : "yyyy-MM-dd HH:mm:ss"
}
}
},
"settings" : {
"refresh_interval" : "120s",
"number_of_shards" : 6,
"number_of_replicas" : "0"
}
}
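2. Documents in the source index were written without a routing value; the document below, fetched by ID, has no _routing field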
{
"_index" : "regular_address",
"_type" : "_doc",
"_id" : "28ce46ba1665f73f4fca6722e3bdedaa",
"_version" : 1,
"_seq_no" : 420109,
"_primary_term" : 1,
"found" : true,
"_source" : {
"_class" : "com.yd.addr.regular.core.entity.RegularAddress",
"id" : "28ce46ba1665f73f4fca6722e3bdedaa",
"address" : "安徽省合肥市肥西县桃花工业园丹霞路281号南城创谷后侧院内联强运筹三楼",
"provinceName" : "安徽省",
"cityName" : "合肥市",
"countyName" : "肥西县",
"distributionCode" : "230001",
"companyCode" : "230017",
"branchCode" : "231081",
"sendCode" : "J4",
"mdCodeGuiJi" : "",
"mdCodeYiZhan" : "",
"mdCodeOther" : "MD300003874223",
"trackingNoLatest" : "462247184479337",
"sendTime" : "2022-02-20 13:50:32",
"createBy" : "system",
"createTime" : "2022-03-10 14:43:08",
"updateBy" : "system",
"updateTime" : "2022-03-10 14:43:08"
}
}
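Destination index
1. Creating the destination index
Compared with the source index, the mapping adds _routing.required=true, and the settings add routing_partition_size=4 to reduce the data skew that custom routing can cause. With this setting, each routing value maps to a subset of 4 shards instead of a single shard; the value must be greater than 1 and less than number_of_shards. number_of_shards is also raised from 6 to 8.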
PUT /regular_address_test
{
"mappings" : {
"_routing": {
"required": true
},
"properties" : {
"_class" : {
"type" : "keyword",
"index" : false,
"doc_values" : false
},
"address" : {
"type" : "text",
"fields" : {
"accurate" : {
"type" : "keyword"
}
},
"analyzer" : "ik_max_word"
},
"branchCode" : {
"type" : "keyword"
},
"cityName" : {
"type" : "keyword"
},
"companyCode" : {
"type" : "keyword"
},
"countyName" : {
"type" : "keyword"
},
"createBy" : {
"type" : "keyword"
},
"createTime" : {
"type" : "date",
"format" : "yyyy-MM-dd HH:mm:ss"
},
"distributionCode" : {
"type" : "keyword"
},
"id" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"mdCodeGuiJi" : {
"type" : "keyword"
},
"mdCodeOther" : {
"type" : "keyword"
},
"mdCodeYiZhan" : {
"type" : "keyword"
},
"provinceName" : {
"type" : "keyword"
},
"sendCode" : {
"type" : "keyword"
},
"sendTime" : {
"type" : "date",
"format" : "yyyy-MM-dd HH:mm:ss"
},
"trackingNoLatest" : {
"type" : "keyword"
},
"updateBy" : {
"type" : "keyword"
},
"updateTime" : {
"type" : "date",
"format" : "yyyy-MM-dd HH:mm:ss"
}
}
},
"settings" : {
"routing_partition_size": 4,
"refresh_interval" : "120s",
"number_of_shards" : 8,
"number_of_replicas" : "0"
}
}
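The effect of routing_partition_size can be sketched in a few lines of Python. This is only an illustration of the shard arithmetic described in the Elasticsearch routing documentation, not the real implementation: stand_in_hash is a hypothetical replacement for the Murmur3 hash Elasticsearch uses, and the sketch assumes number_of_routing_shards equals number_of_shards (routing factor 1), which may not match a real cluster's default.

```python
import zlib


def stand_in_hash(value: str) -> int:
    # Assumption: any stable integer hash is enough to show the arithmetic.
    # Elasticsearch itself uses Murmur3, so real shard numbers will differ.
    return zlib.crc32(value.encode("utf-8"))


def shard_for(routing: str, doc_id: str,
              num_primary_shards: int = 8,
              routing_partition_size: int = 4) -> int:
    # routing_value = hash(_routing) + hash(_id) % routing_partition_size
    offset = stand_in_hash(doc_id) % routing_partition_size
    routing_value = stand_in_hash(routing) + offset
    # Simplification: routing factor is 1, so the shard is a plain modulo.
    return routing_value % num_primary_shards


def shards_used(routing: str, doc_ids) -> set:
    # Collect every shard that documents sharing one routing value land on.
    return {shard_for(routing, doc_id) for doc_id in doc_ids}


if __name__ == "__main__":
    ids = [f"doc-{i}" for i in range(1000)]
    # At most routing_partition_size distinct shards for one branchCode.
    print(sorted(shards_used("231081", ids)))
```

Under these assumptions, all documents with the same branchCode spread over at most 4 of the 8 shards, so a very large branch no longer overloads one shard, while a search with that routing value still touches only a subset of the index.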
Importing data with _reindex and setting the routing
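Here the branchCode field is used as the routing value. The script assigns each document's branchCode to ctx._routing. The dest.routing option only accepts keep, discard, or =<literal text>, so it cannot reference ctx._routing and is omitted; the script alone sets the value.
POST _reindex
{
"source": {
"index": "regular_address"
},
"dest": {
"index": "regular_address_test"
},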
"script": {
"source": "ctx._routing = ctx._source.branchCode"
}
}
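When the same migration has to be repeated for other indices or routing fields, the request body can be generated instead of edited by hand. The following is a minimal sketch: build_reindex_body is a hypothetical helper name, and it only builds the JSON body; sending it to the cluster is left out.

```python
import json


def build_reindex_body(source_index: str, dest_index: str,
                       routing_field: str) -> dict:
    # Body for POST _reindex: copy source_index into dest_index and let a
    # Painless script set each document's routing from one of its fields.
    return {
        "source": {"index": source_index},
        "dest": {"index": dest_index},
        "script": {
            "lang": "painless",
            "source": f"ctx._routing = ctx._source.{routing_field}",
        },
    }


if __name__ == "__main__":
    body = build_reindex_body("regular_address", "regular_address_test",
                              "branchCode")
    print(json.dumps(body, indent=2))
```

Because the destination mapping requires routing, documents whose routing field is missing or null would presumably be rejected during the reindex, so it is worth checking the failures array in the response.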
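Result
Documents in the destination index are distributed across shards according to their routing value
The _routing field of each document holds the value of the specified field (branchCode)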
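One way to check the result is to fetch a document and run a search with its routing value, and to look at the per-shard document counts. The sketch below only builds the request paths; the helper names are hypothetical, and the sample ID and branchCode are taken from the source document shown earlier. Since _routing is required on the destination index, a GET by ID without the routing parameter is rejected with a routing_missing_exception.

```python
from urllib.parse import quote, urlencode


def doc_path(index: str, doc_id: str, routing: str) -> str:
    # GET a single document; routing must be supplied on this index.
    return f"/{quote(index)}/_doc/{quote(doc_id)}?{urlencode({'routing': routing})}"


def search_path(index: str, routing: str) -> str:
    # Search only the shards that the routing value maps to.
    return f"/{quote(index)}/_search?{urlencode({'routing': routing})}"


def shard_overview_path(index: str) -> str:
    # _cat/shards lists each shard with its document count.
    return f"/_cat/shards/{quote(index)}?v"


if __name__ == "__main__":
    index = "regular_address_test"
    print("GET " + doc_path(index, "28ce46ba1665f73f4fca6722e3bdedaa", "231081"))
    print("GET " + search_path(index, "231081"))
    print("GET " + shard_overview_path(index))
```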