Elasticsearch-PHP 安装
参考上一篇文档:http://my.oschina.net/sunmin/blog/726285
引用文件
dict.txt.cache.json, pinyin.php 云盘下载地址:
http://pan.baidu.com/s/1c1W6fQC
索引生成测试代码 php
单条索引生成,速度较慢 elastic.php
<?php
use Elasticsearch\ClientBuilder;
require 'vendor/autoload.php';
include('pinyin.php');
ini_set('memory_limit','10280M');
$start = "开始时间:" . time() . PHP_EOL;
echo $start;
$client = ClientBuilder::create()->build();
$dict = array_map(function($str){
return str_ireplace('.', '', $str);},
array_keys(json_decode(file_get_contents('./dict.txt.cache.json'),
true)
)
);
$a = 1;
do {
$params = [
'index' => 'my_index',
'type' => 'my_type',
'id' => $a,
'body' => [
'user_id' => mt_rand(100000000, 2000000000).'@qq.com',
'name' => $tmpname = $dict[mt_rand(1000,360000)],
'py' => \Utils\Pinyin::conv($tmpname),
'phone' => array_rand(array_flip([13,15,17,18])) . mt_rand(100000000, 999999999),
'mail' => substr(
str_shuffle(
"0123456789abcdefghijklmnopqrstuvwxyz"),
0, mt_rand(5,20)
) . '@' . array_rand(
array_flip(
[
'163.com',
'126.com',
'yeah.net',
'qq.com',
'foxmail.com',
'gmail.com',
'yahoo.com',
'hotmail.com',
'sina.com',
'sina.cn',
'sina.com.cn'
]
)
),
'appoint' => implode(
'-',
array_map(function() use($dict) {
return $dict[mt_rand(1000,360000)] . (mt_rand(0,100) > 60 ?
$dict[mt_rand(1000,360000)] : '') . (mt_rand(0,100) > 80 ?
$dict[mt_rand(1000,360000)] : '');},
array_pad([], mt_rand(2,5), '')
)
),
]
];
//创建索引
$response = $client->index($params);
$a = isset($a) ? ++$a : 1;
$a % 50 === 0 ? print(($b = isset($b) ? ++$b : 1).'%'.PHP_EOL) : '';
} while ($a <= 5000);
$end = "结束时间:" . time() . PHP_EOL;
echo $end;
echo "耗时:" . ($end - $start);
批量索引生成 elasticBulk.php
<?php
/**
* 500万用户模拟数据批量新增到 elasticsearch demo
*/
use Elasticsearch\ClientBuilder;
require 'vendor/autoload.php';
include('pinyin.php');
ini_set('memory_limit','10280M');
$connectionPool = '\Elasticsearch\ConnectionPool\StaticNoPingConnectionPool';
$selector = '\Elasticsearch\ConnectionPool\Selectors\StickyRoundRobinSelector';
$client = ClientBuilder::create()
->setRetries(10)//重试次数,默认重试次数为集群节点数
->setConnectionPool($connectionPool)
->setSelector($selector)
->build();
$dict = array_map(function($str){
return str_ireplace('.', '', $str);},
array_keys(json_decode(file_get_contents('./dict.txt.cache.json'),
true)
)
);
echo $start = "开始时间:" . time() . PHP_EOL;
createData($client, $dict);
echo $end = "结束时间:" . time() . PHP_EOL;
echo "耗时:" . $end - $start;
function createData($client, $dict){
$bulk = array('index'=>'my_index4','type'=>'my_type4');
//bulk批量生成
for($j = 0;$j <= 99; $j++) {
for($i = $j * 50000 + 1; $i <= $j * 50000 + 50000; $i ++) {
$bulk['body'][]=array(
'index' => array(
'_id'=>$i
),
'type' => 'blocking'
);
$bulk['body'][] = [
'user_id' => mt_rand(100000000, 2000000000).'@qq.com',
'name' => $tmpname = $dict[mt_rand(1000,360000)],
'py' => \Utils\Pinyin::conv($tmpname),
'phone' => array_rand(array_flip([13,15,17,18])) . mt_rand(100000000, 999999999),
'mail' => substr(
str_shuffle(
"0123456789abcdefghijklmnopqrstuvwxyz"),
0, mt_rand(5,20)
) . '@' . array_rand(
array_flip(
[
'163.com',
'126.com',
'yeah.net',
'qq.com',
'foxmail.com',
'gmail.com',
'yahoo.com',
'hotmail.com',
'sina.com',
'sina.cn',
'sina.com.cn'
]
)
),
'appoint' => implode(
'-',
array_map(function() use($dict) {
return $dict[mt_rand(1000,360000)] . (mt_rand(0,100) > 60 ?
$dict[mt_rand(1000,360000)] : '') . (mt_rand(0,100) > 80 ?
$dict[mt_rand(1000,360000)] : '');},
array_pad([], mt_rand(2,5), '')
)
),
];
}
$client->bulk($bulk);
//进度统计
print($j + 1).'%'.PHP_EOL;
}
}
单节点测试结论
1.单条性能约100条/s , 批量性能约5000条/s
2.单节点批量处理一次性bulk数据上限35万,否则报错:PHP Fatal error: Uncaught exception 'Elasticsearch\Common\Exceptions\NoNodesAvailableException' with message 'No alive nodes found in your cluster
3. 500万全真模拟数据占空间2G
4.查询性能:
第一次稍慢,但在1秒以内
第二次极速