2019.09.17 16:30:00
- Creating an index / changing settings
// Assumes `use Elasticsearch\ClientBuilder;` at the top of the file.

// Create an index
public function create_index(){
    $params = [
        'index' => 'my_index',
        'body' => [
            'settings' => [
                'number_of_shards' => 2,
                'number_of_replicas' => 0,
            ]
        ]
    ];
    $client = ClientBuilder::create()->build();
    $response = $client->indices()->create($params);
    var_dump($response);
}

// Change settings
public function put_setting(){
    $params = [
        'index' => 'person',
        'body' => [
            'settings' => [
                'number_of_replicas' => 10,
            ]
        ],
    ];
    $client = ClientBuilder::create()->build();
    var_dump($client->indices()->putSettings($params));
}
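The number of shards of an index that has already been created cannot be changed through put_setting. This is a pitfall: the structure and capacity have to be planned carefully when the index is first created, otherwise scaling out later will be painful.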
- Changing the mapping
// Change the mapping
public function put_mapping(){
    $mapping = [
        'properties' => [
            'address' => [
                'type' => 'keyword',
            ],
            'email' => [
                'type' => 'keyword',
            ]
        ]
    ];
    $params = [
        'index' => 'person',
        'type' => 'doc',
        'body' => $mapping,
    ];
    $client = ClientBuilder::create()->build();
    var_dump($client->indices()->putMapping($params));
}
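Changing the mapping of an existing index differs from setting it at creation time: you have to name the mapping type being modified. One more thing to note is that a mapping update is additive. New fields are appended to what ES already has, and the fields that existed before do not disappear.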
- Bulk operations
// Bulk-create documents
public function bulk_create_another(){
    $params = [
        'index' => 'person',
        'type' => 'doc',
        'body' => [],
    ];
    for ($i = 1; $i <= 10; $i++){
        $params['body'][] = [
            'create' => [ // `index` works here too; both add a document
                '_id' => $i,
            ]
        ];
        $params['body'][] = [
            'name' => 'PHPerJiang'.$i,
            'age' => $i,
            'sex' => $i % 2,
        ];
    }
    $client = ClientBuilder::create()->build();
    var_dump($client->bulk($params));
}

// Bulk-update documents
public function bulk_update_another(){
    $params = [
        'index' => 'person',
        'type' => 'doc',
        'body' => []
    ];
    for ($i = 1; $i <= 10; $i++){
        $params['body'][] = [
            'update' => [
                '_id' => $i
            ]
        ];
        $params['body'][] = [
            'doc' => [
                'name' => 'PHPerJiang'.($i * 2),
                'age' => $i * 3,
                'sex' => $i % 2,
            ]
        ];
    }
    $client = ClientBuilder::create()->build();
    var_dump($client->bulk($params));
}

// Bulk-delete documents
public function bulk_delete_another(){
    $params = [
        'index' => 'person',
        'type' => 'doc',
        'body' => [],
    ];
    for ($i = 1; $i <= 10; $i++){
        $params['body'][] = [
            'delete' => [
                '_id' => $i,
            ]
        ];
    }
    $client = ClientBuilder::create()->build();
    var_dump($client->bulk($params));
}
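For bulk create, update and delete, pay attention to how the bulk parameters are written: you specify the index, the type and the body. Each operation in the body has up to two parts: the first is the action with its condition (such as the _id), the second is the data. The other thing to watch is that the data part differs by action. For update, the data must be wrapped in a doc key, as in the code above, otherwise ES returns an error. For create, the data part is just the raw fields with no wrapper. Delete has no data part at all.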
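The two-part layout of the bulk body can be sketched as a small helper that builds the body array without touching the client. This is only an illustration: `build_bulk_body` and the triple format it takes are hypothetical names made up here, not part of elasticsearch-php.

```php
<?php
// Hypothetical helper (not part of elasticsearch-php): builds the 'body'
// array of a bulk request from a list of [action, id, fields] triples.
function build_bulk_body(array $ops): array
{
    $body = [];
    foreach ($ops as [$action, $id, $fields]) {
        // Part 1: the action and its metadata (here only the _id).
        $body[] = [$action => ['_id' => $id]];
        if ($action === 'update') {
            // update: the partial document is wrapped in a 'doc' key.
            $body[] = ['doc' => $fields];
        } elseif ($action !== 'delete') {
            // create / index: the raw fields, no wrapper.
            $body[] = $fields;
        }
        // delete: no data part is appended.
    }
    return $body;
}

$body = build_bulk_body([
    ['create', 1, ['name' => 'PHPerJiang1', 'age' => 1]],
    ['update', 1, ['age' => 3]],
    ['delete', 1, null],
]);
echo json_encode($body), PHP_EOL;
```

The result would be passed as `$params['body']` alongside `index` and `type`, exactly as in the functions above.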
- Partially updating a document
// Partial update: when body contains a doc parameter, the fields inside doc
// are merged with the fields that already exist.
public function update_doc(){
    $params = [
        'index' => 'person',
        'type' => 'doc',
        'id' => 2,
        'body' => [
            'doc' => [
                'bbb' => '3'
            ]
        ]
    ];
    $client = ClientBuilder::create()->build();
    var_dump($client->update($params));
}
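If the body contains a doc parameter, ES merges the fields in doc with the fields the document already has, much like PHP's array_merge(): a field that does not yet exist in ES gets created.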
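The array_merge() analogy can be tried locally without ES. The stored document below is invented sample data, and the comparison only holds for top-level fields: array_merge() is a shallow merge, whereas ES merges nested objects recursively.

```php
<?php
// Invented sample data standing in for the document stored in ES.
$stored = ['name' => 'PHPerJiang2', 'age' => 2];

// The 'doc' part of a partial update: one new field, one existing field.
$doc = ['bbb' => '3', 'age' => 6];

// New keys are added, existing keys are overwritten, the rest is kept.
$merged = array_merge($stored, $doc);

echo json_encode($merged), PHP_EOL;
```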
Updated 2019-09-19
- Updating a doc with a script
$params = [
    'index' => 'my_index',
    'type' => 'my_type',
    'id' => 'my_id',
    'body' => [
        'script' => 'ctx._source.counter += count',
        'params' => [
            'count' => 4
        ]
    ]
];
$response = $client->update($params);
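This is how the PHP-ElasticSearch docs write it, but when I actually used it, it turned out to be a pitfall: written this way, the request fails with an error saying the parameter count cannot be found. The correct form is as follows: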
// Update data with a script
public function update_doc_by_script(){
    $params = [
        'index' => 'person',
        'type' => 'doc',
        'id' => 2,
        'body' => [
            'script' => [
                'lang' => 'painless',
                'source' => 'ctx._source.age += params.count',
                'params' => ['count' => 1],
            ]
        ]
    ];
    $client = ClientBuilder::create()->build();
    var_dump($client->update($params));
}
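The parameters only work when they are placed inside the script parameter, and the script has to read them through params. At this point I started to have serious doubts about the docs.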
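To avoid repeating the nesting mistake, the request array can be built by a small function that always puts params inside script. This is just a sketch: `script_update_params` is a hypothetical helper name, not something the client library provides.

```php
<?php
// Hypothetical helper (not part of elasticsearch-php): builds the argument
// for $client->update() with 'params' nested inside 'script'.
function script_update_params(string $index, string $type, $id, string $source, array $scriptParams): array
{
    return [
        'index' => $index,
        'type'  => $type,
        'id'    => $id,
        'body'  => [
            'script' => [
                'lang'   => 'painless',
                'source' => $source,        // must read values via params.<name>
                'params' => $scriptParams,  // nested here, not next to 'script'
            ],
        ],
    ];
}

$params = script_update_params('person', 'doc', 2, 'ctx._source.age += params.count', ['count' => 1]);
echo json_encode($params['body']), PHP_EOL;
```

The returned array would then be handed to `$client->update($params)` as in the function above.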
Updated 2019-09-20
The official php-es docs contain quite a few errors, so use them selectively.
- Updating data with a script and falling back to a default value when the field is missing. The docs do it like this:
$params = [
    'index' => 'my_index',
    'type' => 'my_type',
    'id' => 'my_id',
    'body' => [
        'script' => 'ctx._source.counter += count',
        'params' => [
            'count' => 4
        ],
        'upsert' => [
            'counter' => 1
        ]
    ]
];
$response = $client->update($params);
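First, the way the docs use script is wrong, so let's fix the script before anything else, as in the code below. Note that the age1 field used in this code does not exist in ES.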
$params = [
    'index' => 'person',
    'type' => 'doc',
    'id' => 8,
    'body' => [
        'script' => [
            'lang' => 'painless',
            'source' => "ctx._source.age1 += params.count",
            'params' => [
                'count' => 5,
            ],
        ],
        'upsert' => [
            'count' => 1
        ]
    ],
];
Running the script above fails with an error because the field cannot be found:
Message: {"error":{"root_cause":[{"type":"remote_transport_exception","reason":"[first-node][127.0.0.1:9300][indices:data/write/update[s]]"}],"type":"illegal_argument_exception","reason":"failed to execute script","caused_by":{"type":"script_exception","reason":"runtime error","script_stack":["ctx._source.age1 += params.count"," ^---- HERE"],"script":"ctx._source.age1 += params.count","lang":"painless","caused_by":{"type":"null_pointer_exception","reason":null}}},"status":400}
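In effect the upsert parameter did nothing here, which is the second problem with the docs' example. upsert is only applied when the document itself does not exist; document 8 exists and merely lacks the age1 field, so the script still runs and hits a null value. The missing field has to be handled inside the script instead: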
$params = [
    'index' => 'person',
    'type' => 'doc',
    'id' => 8,
    'body' => [
        'script' => [
            'lang' => 'painless',
            'source' => "ctx._source.age1 = (ctx._source.age1 ?: 2) + params.count",
            'params' => [
                'count' => 5,
            ],
        ],
    ],
];
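Inside the script we check whether the age1 field exists. If it does, the increment is applied to it; if it does not, it is given the default value 2 before params.count is added, and the field is added to the document in the ES index. Note that the ?: in the script is syntax specific to painless; for details see https://www.elastic.co/guide/en/elasticsearch/reference/5.4/modules-scripting-painless-syntax.html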
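The fallback logic of that script can be mimicked in plain PHP to see what value ends up in the field. This is only a local illustration with invented sample documents, and `apply_increment` is a hypothetical name. PHP's `??` is used because it falls back on a missing or null value, which is the closest match to how painless's `?:` is used above; PHP's own `?:` tests truthiness instead.

```php
<?php
// Hypothetical local stand-in for the painless script:
//   ctx._source.age1 = (ctx._source.age1 ?: 2) + params.count
function apply_increment(array $source, string $field, int $default, int $count): array
{
    // Use the stored value if present and non-null, else the default.
    $source[$field] = ($source[$field] ?? $default) + $count;
    return $source;
}

// Field missing: starts from the default 2, then adds 5.
$missing = apply_increment(['age' => 8], 'age1', 2, 5);

// Field present: the existing value is incremented.
$present = apply_increment(['age' => 8, 'age1' => 10], 'age1', 2, 5);

echo json_encode([$missing, $present]), PHP_EOL;
```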
- The bool query in search: filter\should\must\must_not
public function search_complex(){
    $params = [
        'index' => 'person',
        'type' => 'doc',
        'body' => [
            'query' => [
                'bool' => [
                    'filter' => [
                        'term' => ['age1' => 22]
                    ],
                    'must' => [
                        ['term' => ['age' => 8]],
                        ['term' => ['sex' => 0]]
                    ],
                ],
            ],
        ],
    ];
    $client = ClientBuilder::create()->build();
    echo json_encode($client->search($params));
}
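Search is split into filtering (filter) and querying (must\must_not\should). Using filter on its own under bool produces no score. Using must\should on their own, or combining them with filter, returns scores (must_not, like filter, only excludes documents and does not contribute to the score). If you want to filter and still get a relevance score, there are two ways to do it: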
// Approach 1
$params = [
    'index' => 'person',
    'type' => 'doc',
    'body' => [
        'query' => [
            'bool' => [
                'filter' => [
                    'term' => ['age1' => 22]
                ],
                'must' => [
                    'match_all' => new stdClass()
                ]
            ],
        ],
    ],
];

// Approach 2
$params = [
    'index' => 'person',
    'type' => 'doc',
    'body' => [
        'query' => [
            'constant_score' => [
                'boost' => 2,
                'filter' => [
                    'term' => ['sex' => 0]
                ],
            ],
        ],
    ],
];
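Approach 1 combines must with filter: must uses match_all to match everything, which amounts to all of the documents left after the filter. Approach 2 uses constant_score in place of bool, so every document that passes the filter gets a score of 1; combined with the boost weight, this lets you give a particular filter more weight and assign different scores.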
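The two approaches can be wrapped in small builder functions so the query part is reusable. Both function names are hypothetical and made up for this sketch. The `new stdClass()` matters because an empty PHP array is encoded by json_encode() as `[]`, while match_all expects an empty object `{}`.

```php
<?php
// Hypothetical builders (not part of elasticsearch-php) for the 'query'
// part of a search request.

// Approach 1: bool filter plus a match_all in must.
function filter_with_match_all(array $filter): array
{
    return [
        'bool' => [
            'filter' => $filter,
            // stdClass encodes as {}, an empty array would encode as [].
            'must'   => ['match_all' => new stdClass()],
        ],
    ];
}

// Approach 2: constant_score, every match gets the boost as its score.
function filter_constant_score(array $filter, float $boost = 1.0): array
{
    return [
        'constant_score' => [
            'filter' => $filter,
            'boost'  => $boost,
        ],
    ];
}

$q1 = filter_with_match_all(['term' => ['age1' => 22]]);
$q2 = filter_constant_score(['term' => ['sex' => 0]], 2.0);

echo json_encode($q1), PHP_EOL;
```

Either result would go under `$params['body']['query']` in place of the hand-written arrays above.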