A simple PHP scraper with pagination support (home-made)


llx19861021366

Tags: php class file fp url rss

Article link: https://blog.csdn.net/llx19861021366/article/details/4008736


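<?php
@set_time_limit(0); // Remove the script execution time limit (0 = unlimited)
// URL of the listing (index) page to scrape
$url = "http://emotion.pclady.com.cn/skills/";
// Fetch the page source
$rs = file_get_contents($url);
// Pattern for each title link: <i class="titles"><a href="...">title</a></i>
$preg = '/<i\s+class="titles"><a\s+href="[^>]+">(.*)<\/a><\/i>/i';
// Run the pattern over the listing page
preg_match_all($preg, $rs, $title);
// Count the titles found
$count = count($title[0]);
// Scrape the content page behind each title
for ($i = 0; $i < $count; $i++) {

    // Work out the content page URL from the opening <a> tag
    $pr = '/<a\s+href="[^>]+">/isU';
    preg_match_all($pr, $title[0][$i], $jurl);
    // Strip the leading '<a href="' (9 characters)
    $substr = substr($jurl[0][0], 9);
    // Drop the last 18 characters, i.e. whatever follows the URL inside the tag
    // (this count is specific to the target site's markup)
    $curl = substr($substr, 0, -18);
    // Fetch the content page source
    $c = file_get_contents($curl);
    // Pattern for the content block: <div class="words">...</div>
    $pc = '/<div\s+class="words">(.*)<\/div>/isU';
    // Run the pattern over the content page
    // (the source listing is cut off here; the match call and everything after it are missing)
}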
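
The listing above recovers each content page URL by slicing a fixed number of characters off the matched `<a>` tag, which only works while the site's markup stays exactly the same. A more forgiving approach is to let the regex capture the `href` value directly. The following is only a minimal sketch of that idea, not the author's code: the sample HTML, the `example.com` URLs, and the `extractLinks()` helper are all assumptions made up for illustration, and it runs without any network access.

```php
<?php
// Hypothetical sample of the listing markup; the real page may differ.
$html = <<<'HTML'
<i class="titles"><a href="http://example.com/a/1.html" target="_blank">First title</a></i>
<i class="titles"><a href="http://example.com/a/2.html" target="_blank">Second title</a></i>
HTML;

/**
 * Return url/title pairs for every <i class="titles"> link in $html.
 * extractLinks() is an illustrative helper, not part of the original script.
 */
function extractLinks(string $html): array
{
    // Group 1 captures the href value, group 2 the link text.
    $pattern = '/<i\s+class="titles"><a\s+href="([^"]+)"[^>]*>(.*?)<\/a><\/i>/i';
    $links = [];
    // PREG_SET_ORDER groups the results per match instead of per capture group.
    if (preg_match_all($pattern, $html, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $m) {
            $links[] = [
                'url'   => $m[1],
                'title' => trim(strip_tags($m[2])),
            ];
        }
    }
    return $links;
}

$links = extractLinks($html);
foreach ($links as $link) {
    echo $link['title'], ' => ', $link['url'], PHP_EOL;
}
```

Because the URL comes from a capture group, extra attributes after `href` (or none at all) no longer change the result, so there are no character offsets to keep in sync with the page.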
