可能你现在依次做这个.即获取数据1,进程数据1,提取数据2,进程数据2,…和瓶颈可能是数据传输.
您可以使用
curl_multi_exec()并行化一点.
要注册一个
CURLOPT_WRITEFUNCTION并处理每个数据块(因为md5()只在一个数据块上工作,这样很棘手).
或检查已经完成的卷曲句柄,然后处理该句柄的数据.
编辑:使用hash extension(提供增量散列函数)和php5.3+ closure:的quick& dirty示例:
$urls = array(
'https://stackoverflow.com/',
'http://sstatic.net/so/img/logo.png',
'http://www.gravatar.com/avatar/212151980ba7123c314251b185608b1d?s=128&d=identicon&r=PG',
'http://de.php.net/images/php.gif'
);
$data = array();
$fnWrite = function($ch, $chunk) use(&$data) {
foreach( $data as $d ) {
if ( $ch===$d['curlrc'] ) {
hash_update($d['hashrc'], $chunk);
}
}
};
$mh = curl_multi_init();
foreach($urls as $u) {
$current = curl_init();
curl_setopt($current, CURLOPT_URL, $u);
curl_setopt($current, CURLOPT_RETURNTRANSFER, 0);
curl_setopt($current, CURLOPT_HEADER, 0);
curl_setopt($current, CURLOPT_WRITEFUNCTION, $fnWrite);
curl_multi_add_handle($mh, $current);
$hash = hash_init('md5');
$data[] = array('url'=>$u, 'curlrc'=>$current, 'hashrc'=>$hash);
}
$active = null;
//execute the handles
do {
$mrc = curl_multi_exec($mh, $active);
} while ($mrc == CURLM_CALL_MULTI_PERFORM);
while ($active && $mrc == CURLM_OK) {
if (curl_multi_select($mh) != -1) {
do {
$mrc = curl_multi_exec($mh, $active);
} while ($mrc == CURLM_CALL_MULTI_PERFORM);
}
}
foreach($data as $d) {
curl_multi_remove_handle($mh, $d['curlrc']);
echo $d['url'], ': ', hash_final($d['hashrc'], false), "\n";
}
curl_multi_close($mh);
(没有检查结果,但它只是一个起点)