PHP ES bulk search: a PHP demo for bulk-storing 5 million index entries in Elasticsearch

Installing Elasticsearch-PHP

Required files

dict.txt.cache.json and pinyin.php, cloud drive download link:

Index generation test code (PHP)

Indexing one document at a time (slow): elastic.php

<?php

use Elasticsearch\ClientBuilder;

require 'vendor/autoload.php';
include 'pinyin.php';

ini_set('memory_limit', '10280M');

// Keep the start time as a number so the elapsed time can be computed at the end.
$startTime = time();
echo "Start time: " . $startTime . PHP_EOL;

$client = ClientBuilder::create()->build();

// The dictionary words are the keys of the JSON object; strip any dots from them.
$dict = array_map(
    function ($str) {
        return str_ireplace('.', '', $str);
    },
    array_keys(json_decode(file_get_contents('./dict.txt.cache.json'), true))
);

$mailDomains = [
    '163.com', '126.com', 'yeah.net', 'qq.com', 'foxmail.com', 'gmail.com',
    'yahoo.com', 'hotmail.com', 'sina.com', 'sina.cn', 'sina.com.cn',
];

$total = 5000;

for ($id = 1; $id <= $total; $id++) {
    // The index range assumes the dictionary holds more than 360,000 words.
    $name = $dict[mt_rand(1000, 360000)];

    $params = [
        'index' => 'my_index',
        // Mapping types are deprecated in Elasticsearch 7.x and removed in 8.x.
        'type' => 'my_type',
        'id' => $id,
        'body' => [
            'user_id' => mt_rand(100000000, 2000000000) . '@qq.com',
            'name' => $name,
            'py' => \Utils\Pinyin::conv($name),
            // Two-digit prefix plus nine random digits.
            'phone' => array_rand(array_flip([13, 15, 17, 18])) . mt_rand(100000000, 999999999),
            // Random local part of 5 to 20 characters plus a random domain.
            'mail' => substr(str_shuffle("0123456789abcdefghijklmnopqrstuvwxyz"), 0, mt_rand(5, 20))
                . '@' . array_rand(array_flip($mailDomains)),
            // 2 to 5 random phrases joined by '-', each made of 1 to 3 dictionary words.
            'appoint' => implode(
                '-',
                array_map(
                    function () use ($dict) {
                        return $dict[mt_rand(1000, 360000)]
                            . (mt_rand(0, 100) > 60 ? $dict[mt_rand(1000, 360000)] : '')
                            . (mt_rand(0, 100) > 80 ? $dict[mt_rand(1000, 360000)] : '');
                    },
                    array_pad([], mt_rand(2, 5), '')
                )
            ),
        ],
    ];

    // Index the document.
    $response = $client->index($params);

    // Report progress once per 1% (every 50 documents).
    if ($id % 50 === 0) {
        echo intdiv($id, 50) . '%' . PHP_EOL;
    }
}

$endTime = time();
echo "End time: " . $endTime . PHP_EOL;
echo "Elapsed: " . ($endTime - $startTime) . " s" . PHP_EOL;
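Elapsed time only comes out right when the start and end values are kept as numbers rather than as formatted strings. A minimal sketch of timing a run and deriving documents per second is shown here; `measureThroughput` is a hypothetical helper, not part of the original scripts, and the `usleep` call merely stands in for the indexing loop.

```php
<?php
// Hypothetical helper (not in the original scripts): runs $work once and
// returns the elapsed seconds plus the resulting documents per second.
function measureThroughput(callable $work, int $docCount): array
{
    $startedAt = microtime(true); // float seconds, sub-second resolution
    $work();
    $elapsed = microtime(true) - $startedAt;

    return [
        'seconds' => $elapsed,
        // Guard against division by zero for extremely short runs.
        'docs_per_second' => $elapsed > 0 ? $docCount / $elapsed : INF,
    ];
}

// Stand-in workload: sleep about 20 ms instead of indexing 100 documents.
$stats = measureThroughput(function () {
    usleep(20000);
}, 100);

printf(
    "Elapsed: %.3f s, throughput: %.0f docs/s\n",
    $stats['seconds'],
    $stats['docs_per_second']
);
```

Using `microtime(true)` instead of `time()` matters mostly for short runs such as the 5,000-document test, where whole-second resolution would distort the rate.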

Bulk indexing: elasticBulk.php

<?php

/**
 * Demo: bulk-insert 5 million simulated user records into Elasticsearch.
 */

use Elasticsearch\ClientBuilder;

require 'vendor/autoload.php';
include 'pinyin.php';

ini_set('memory_limit', '10280M');

$connectionPool = '\Elasticsearch\ConnectionPool\StaticNoPingConnectionPool';
$selector = '\Elasticsearch\ConnectionPool\Selectors\StickyRoundRobinSelector';

$client = ClientBuilder::create()
    ->setRetries(10) // number of retries; the default is the number of nodes in the cluster
    ->setConnectionPool($connectionPool)
    ->setSelector($selector)
    ->build();

// The dictionary words are the keys of the JSON object; strip any dots from them.
$dict = array_map(
    function ($str) {
        return str_ireplace('.', '', $str);
    },
    array_keys(json_decode(file_get_contents('./dict.txt.cache.json'), true))
);

// Keep the timestamps as numbers so the elapsed time can be computed.
$startTime = time();
echo "Start time: " . $startTime . PHP_EOL;

createData($client, $dict);

$endTime = time();
echo "End time: " . $endTime . PHP_EOL;
echo "Elapsed: " . ($endTime - $startTime) . " s" . PHP_EOL;

function createData($client, $dict)
{
    $mailDomains = [
        '163.com', '126.com', 'yeah.net', 'qq.com', 'foxmail.com', 'gmail.com',
        'yahoo.com', 'hotmail.com', 'sina.com', 'sina.cn', 'sina.com.cn',
    ];

    // 100 bulk requests of 50,000 documents each = 5 million documents.
    for ($j = 0; $j <= 99; $j++) {
        // Start every batch with an empty body; otherwise each request would
        // resend all documents from the previous batches.
        // Mapping types are deprecated in Elasticsearch 7.x and removed in 8.x.
        $bulk = ['index' => 'my_index4', 'type' => 'my_type4', 'body' => []];

        for ($i = $j * 50000 + 1; $i <= $j * 50000 + 50000; $i++) {
            // Action line: index the following document under this _id.
            $bulk['body'][] = ['index' => ['_id' => $i]];

            // The index range assumes the dictionary holds more than 360,000 words.
            $name = $dict[mt_rand(1000, 360000)];

            // Document line.
            $bulk['body'][] = [
                'user_id' => mt_rand(100000000, 2000000000) . '@qq.com',
                'name' => $name,
                'py' => \Utils\Pinyin::conv($name),
                'phone' => array_rand(array_flip([13, 15, 17, 18])) . mt_rand(100000000, 999999999),
                'mail' => substr(str_shuffle("0123456789abcdefghijklmnopqrstuvwxyz"), 0, mt_rand(5, 20))
                    . '@' . array_rand(array_flip($mailDomains)),
                // 2 to 5 random phrases joined by '-', each made of 1 to 3 dictionary words.
                'appoint' => implode(
                    '-',
                    array_map(
                        function () use ($dict) {
                            return $dict[mt_rand(1000, 360000)]
                                . (mt_rand(0, 100) > 60 ? $dict[mt_rand(1000, 360000)] : '')
                                . (mt_rand(0, 100) > 80 ? $dict[mt_rand(1000, 360000)] : '');
                        },
                        array_pad([], mt_rand(2, 5), '')
                    )
                ),
            ];
        }

        $client->bulk($bulk);

        // Progress: each batch is 1% of the total.
        echo ($j + 1) . '%' . PHP_EOL;
    }
}
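A bulk request can succeed as a whole while individual documents in it fail, so it may be worth inspecting what `$client->bulk()` returns. The sketch below assumes the client hands back the decoded response as a PHP array (as the 7.x and earlier clients do) with the Bulk API's `errors` flag and `items` list; `collectBulkFailures` and the sample response are hypothetical and written by hand for illustration.

```php
<?php
/**
 * Hypothetical helper: collect the failed items of a bulk response.
 * Returns a map of document _id => error reason.
 */
function collectBulkFailures(array $response): array
{
    $failures = [];

    // 'errors' is false when every item in the request succeeded.
    if (empty($response['errors'])) {
        return $failures;
    }

    foreach ($response['items'] ?? [] as $item) {
        // Each item is keyed by its action name (index, create, update, delete).
        $result = reset($item);
        if (isset($result['error'])) {
            $failures[$result['_id']] = is_array($result['error'])
                ? ($result['error']['reason'] ?? 'unknown')
                : (string) $result['error'];
        }
    }

    return $failures;
}

// Hand-written sample response, for illustration only.
$sample = [
    'took' => 30,
    'errors' => true,
    'items' => [
        ['index' => ['_id' => '1', 'status' => 201]],
        ['index' => ['_id' => '2', 'status' => 400, 'error' => [
            'type' => 'mapper_parsing_exception',
            'reason' => 'failed to parse',
        ]]],
    ],
];

$failed = collectBulkFailures($sample);
echo count($failed) . " failed item(s)" . PHP_EOL;
```

In the bulk script this would be called as `collectBulkFailures($client->bulk($bulk))` after each batch, so that failed documents can be logged or retried.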

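Single-node test conclusions

1. Indexing one document at a time reaches about 100 documents/s; bulk indexing reaches about 5,000 documents/s.

2. On a single node, one bulk request can carry at most about 350,000 documents; beyond that the client fails with: PHP Fatal error:  Uncaught exception 'Elasticsearch\Common\Exceptions\NoNodesAvailableException' with message 'No alive nodes found in your cluster'

3. The 5 million realistic simulated records take up 2 GB of disk space.

[Image: e2273f66e95d5c07a4d1d6e86de486c9.png]

4. Query performance:

The first query is slightly slow, but still under 1 second.

[Image: b0701b9f36dd1f66473864acadbf10c0.png]

The second query is extremely fast.

[Image: ac8f2666546749b1146de430b7d6af00.png]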
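Given the per-request limit noted in conclusion 2, one way to stay under it is to flush the bulk body every fixed number of documents and start again with an empty one. The following is only a sketch under that assumption: `bulkInChunks`, `fakeUsers`, and the `$send` callback are hypothetical names, and in a real run the callback would wrap `$client->bulk()`.

```php
<?php
/**
 * Hypothetical helper: send documents in fixed-size bulk batches.
 * $send receives one bulk body (alternating action and document entries).
 * Returns the number of batches sent.
 */
function bulkInChunks(iterable $docs, int $chunkSize, callable $send): int
{
    $body = [];
    $pending = 0;
    $batches = 0;

    foreach ($docs as $id => $doc) {
        $body[] = ['index' => ['_id' => $id]]; // action line
        $body[] = $doc;                        // document line

        if (++$pending >= $chunkSize) {
            $send($body);
            $body = [];   // reset so the next request does not resend old documents
            $pending = 0;
            $batches++;
        }
    }

    // Flush the final, partially filled batch.
    if ($pending > 0) {
        $send($body);
        $batches++;
    }

    return $batches;
}

// A generator yields documents one by one, so only one batch is held in memory.
function fakeUsers(int $count): Generator
{
    for ($i = 1; $i <= $count; $i++) {
        yield $i => ['name' => "user$i"];
    }
}

$sizes = [];
$batches = bulkInChunks(fakeUsers(25), 10, function (array $body) use (&$sizes) {
    $sizes[] = intdiv(count($body), 2); // documents in this batch
});

echo "Sent $batches batches: " . implode(', ', $sizes) . PHP_EOL;
```

With the real client, the callback might look like `function (array $body) use ($client) { $client->bulk(['index' => 'my_index4', 'body' => $body]); }`, with the chunk size chosen well below the observed 350,000-document limit.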
