PHP Crawler: Auto-Liking Posts in QQ Zone (Cookie-Swap Version)

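A quick search turns up plenty of QQ Zone auto-like scripts, but the PHP ones still rely on the old 3G QQ login to obtain a sid before liking. 3G QQ no longer seems to work, and I could not find an up-to-date PHP version online.
I got auto-liking working a long time ago but never published it, for two reasons: the approach is borrowed from someone else and only the language differs, and my writing is not great, so I was afraid the explanation would come out muddled.
I have been rather bored these past few days, though, so I decided to write a blog post and practise my writing,
and to give people who are just starting out with PHP some fun code to play with. (My own PHP is not great either; I have only built a few half-finished projects with onethink.)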

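I based this on a blog post by someone who wrote a QQ Zone auto-like program in Python 3 (links at the end) and borrowed his approach; I just ran into different pitfalls while implementing it. (What matters most in a program is the approach, and only the language differs, just as the same feeling is "我爱你" in Chinese and "I Love You" in English.)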

OK, straight to the point:

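Whether we like a friend's post in QQ Zone from a mobile device or a PC, clicking the thumbs-up icon sends a request to the server, and the icon lighting up is the server's response telling us the like succeeded. Auto-liking therefore means simulating that click by sending the like request to the server directly. A request is just a data packet, so to see what to send you need to capture it. I use either Fiddler or the browser developer tools (F12); both give the same result. I will mainly use the developer tools here, because walking through Fiddler would just repeat the Python 3 write-up. If you want the Fiddler version, see his blog, which covers it in detail.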

Developer tools (F12):



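The two screenshots above show the information we can read from the developer tools. Next we simulate the like request in PHP, using PHP's cURL extension.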

Basic flow for sending a request with cURL

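Sending an HTTP or HTTPS request with PHP's cURL extension generally takes the following steps:

  1. Initialise the connection handle;
  2. Set the cURL options;
  3. Execute the request and fetch the result;
  4. Release the cURL handle.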
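
The four steps above can be sketched as a pair of small helpers. This is only an illustration: the function names `buildGetOptions` and `fetchPage` are hypothetical and not part of the original code, and the timeout value is an arbitrary assumption.

```php
<?php
// Hypothetical helper: collect the options for a simple GET request.
function buildGetOptions(string $url, array $headers = []): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_HTTPHEADER     => $headers,
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_ENCODING       => '',   // empty string: accept every encoding libcurl supports
        CURLOPT_TIMEOUT        => 10,   // assumed value, adjust as needed
    ];
}

// Hypothetical helper: the four steps in order.
function fetchPage(string $url, array $headers = []): string
{
    $ch = curl_init();                                       // 1. initialise the handle
    curl_setopt_array($ch, buildGetOptions($url, $headers)); // 2. set the options
    $body = curl_exec($ch);                                  // 3. execute and fetch the result
    $error = ($body === false) ? curl_error($ch) : null;
    curl_close($ch);                                         // 4. release the handle
    if ($body === false) {
        throw new RuntimeException("cURL request failed: $error");
    }
    return $body;
}
```

Calling `fetchPage('https://user.qzone.qq.com/<your QQ number>', $headers)` with your own request headers would then return the page body as a string, or throw if the transfer fails.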

With the preparation done, on to the code. First, crawl the home page of your own profile:

<?php
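$ch = curl_init();// initialise cURL
curl_setopt($ch, CURLOPT_URL, 'https://user.qzone.qq.com/2547433259');// address to crawl; replace it with your own profile home page
curl_setopt($ch, CURLOPT_HEADER, 0);// include response headers in the output? 0 = false, 1 = true (same convention below)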
curl_setopt($ch, CURLOPT_HTTPHEADER, array("Host: user.qzone.qq.com",
"User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:56.0) Gecko/20100101 Firefox/56.0",
"Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
"Accept-Language: zh-CN,zh;q=0.8,en-US;q=0.5,en;q=0.3",
"Accept-Encoding: gzip, deflate, br",
"Cookie:pt2gguin=o2547433259; RK=zN/HQPXftj; ptcz=285a50cec1343dd4f08d4a24b3e7e4c8deccfeb1a09b2fb295a6f4353ecd4f7f; pgv_pvid=8833794038; pgv_pvi=3674447872; o_cookie=1015296415; tvfe_boss_uuid=bb6e8e00328c657a; ue_ts=1510636814; ue_uk=c4acd2d047d08b8221f61612987f1df0; ue_uid=ad7f6eac0026d494d2b50e881db986f9; ue_skey=762d26492b215370cc78f8174e5941ba; eas_sid=e1z5m110f6N356x821A3H8p2j2; LW_uid=p1o5H1q0d6c3n6N8d1y3j9z2Z4; LW_pid=9a02e1dd158683bb27a4ccf1e90ba2d5; ptui_loginuin=1015296415@qq.com; uin=o2547433259; skey=@ikbsWKYcA; ptisp=cm; p_uin=o2547433259; pt4_token=Nt-DsHRP8GmyBxd81nFnkmVwKn41R6PpaCe*VXHlXtA_; p_skey=W-zbAdHfMneX-7450NfyG0Q*fyCD6uMN0cNN5yYZV*I_; Loading=Yes; pgv_info=ssid=s4774580249&pgvReferrer=; IED_LOG_INFO2=userUin%3D2547433259%26nickName%3Dwait%26userLoginTime%3D1529748728; pgv_si=s8756195328; __Q_w_s__QZN_TodoMsgCnt=1; qz_screen=1536x864; QZ_FE_WEBP_SUPPORT=0; cpu_performance_v8=30",
"Upgrade-Insecure-Requests: 1",
"If-Modified-Since: Sat, 04 Nov 2017 13:16:46 GMT",
"Cache-Control: max-age=0, no-cache",
"Pragma: no-cache"));//php发送请求的header,直接复制图片中的请求头即可(若用此代码,只需改变cookie的值,一般来说cookie能维持12小时左右)
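curl_setopt($ch, CURLOPT_ENCODING, "gzip");// key step: let cURL decompress the gzip-encoded response
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);// return the response as a string instead of printing it
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);// skip HTTPS certificate verification
$out = curl_exec($ch);// execute and fetch the result
curl_close($ch);// release the cURL handle
echo $out;// print the result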
?>

Remember to decompress the gzip encoding while crawling; otherwise all you get back is a pile of garbage characters.

The home page is now crawled. Next comes the main part: building the POST payload.

$data = array(
	"qzreferrer"	=>	"https://user.qzone.qq.com/".$qq,   // $qq is your own QQ number
	"opuin"		=>	$qq,
	"unikey"	=>	"http://user.qzone.qq.com/20050606/mood/aef23101e4a21b5b05a30900",
	"curkey"	=>	"http://user.qzone.qq.com/20050606/mood/aef23101e4a21b5b05a30900",
	"from"		=>	"1",
	"appid"		=>	"311",
	"typeid"	=>	"0",
	"abstime"	=>	time(),   // Unix timestamp
	// "fid"	=>	"aef23101e4a21b5b05a30900",  // fid turns out to be optional, so it is commented out
	"active"	=>	"0",
	"fupdate"	=>	"1"
);// POST payload
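
To see what such an array looks like on the wire, here is a small sketch that encodes it with `http_build_query` and decodes it again with `parse_str`. The QQ number `10001` is a placeholder rather than a real account, and the key URL is simply the sample value from the array above.

```php
<?php
$qq  = '10001'; // placeholder QQ number (assumption for illustration)
$key = 'http://user.qzone.qq.com/20050606/mood/aef23101e4a21b5b05a30900';

$data = [
    'qzreferrer' => 'https://user.qzone.qq.com/' . $qq,
    'opuin'      => $qq,
    'unikey'     => $key,
    'curkey'     => $key,
    'from'       => '1',
    'appid'      => '311',
    'typeid'     => '0',
    'abstime'    => time(),
    'active'     => '0',
    'fupdate'    => '1',
];

// http_build_query() URL-encodes every value and joins the pairs with "&",
// which is the application/x-www-form-urlencoded body format.
$body = http_build_query($data);
echo $body, PHP_EOL;

// parse_str() reverses the encoding, which is handy for checking the payload.
parse_str($body, $decoded);
echo $decoded['unikey'], PHP_EOL;
```

The colons and slashes in `unikey` and `curkey` are percent-encoded in `$body`, so the URLs survive the round trip unchanged.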

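Once the QQ number is fixed, the only values in the POST data that vary are unikey and curkey. The timestamp is simply time() in PHP, and fid can be omitted. So now we need to find unikey and curkey, and this is where the crawled profile home page comes in. Open the developer tools (F12) again; the inspector shows where they are located.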


As you can see, what we crawled is plain HTML, and that HTML contains unikey and curkey. To pull them out we need a regular expression.

I am no good at regular expressions myself, but there is a handy online tester: http://tool.oschina.net/regex/

$regex1 =  '#data-unikey="(http[^"]*)"[^d]*data-curkey="([^"]*)"[^d]*data-clicklog=("like")[^h]*href="javascript:;"#';

Note, however, that Internet Explorer is different: open the page in IE and compare where unikey and curkey sit with their positions in other browsers.
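
To show what this pattern captures, here is a minimal sketch that runs it against a made-up HTML fragment. The markup and the key URLs are invented for illustration and are only assumed to resemble the real page.

```php
<?php
// Hypothetical HTML fragment with two like buttons.
$html = <<<'HTML'
<div class="f-item">
  <a data-unikey="http://user.qzone.qq.com/10001/mood/aaaa1111" data-curkey="http://user.qzone.qq.com/10001/mood/aaaa1111" data-clicklog="like" href="javascript:;">Like</a>
</div>
<div class="f-item">
  <a data-unikey="http://user.qzone.qq.com/10002/mood/bbbb2222" data-curkey="http://user.qzone.qq.com/10002/mood/bbbb2222" data-clicklog="like" href="javascript:;">Like</a>
</div>
HTML;

$regex = '#data-unikey="(http[^"]*)"[^d]*data-curkey="([^"]*)"[^d]*data-clicklog=("like")[^h]*href="javascript:;"#';

// With the default PREG_PATTERN_ORDER flag, $matches[1] holds every unikey
// and $matches[2] holds every curkey, in page order.
preg_match_all($regex, $html, $matches);

foreach ($matches[1] as $i => $unikey) {
    echo $unikey, ' | ', $matches[2][$i], PHP_EOL;
}
```

The `[^d]*` parts only allow characters other than `d` between the attributes, so the pattern stops matching if the page ever puts an attribute or value containing a `d` between them. It is worth re-checking the pattern against the live HTML from time to time.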

<?php
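$ch = curl_init();// initialise cURL
curl_setopt($ch, CURLOPT_URL, 'https://user.qzone.qq.com/2547433259');// address to crawl; replace it with your own profile home page
curl_setopt($ch, CURLOPT_HEADER, 0);// include response headers in the output? 0 = false, 1 = true (same convention below)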
curl_setopt($ch, CURLOPT_HTTPHEADER, array("Host: user.qzone.qq.com",
"User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:56.0) Gecko/20100101 Firefox/56.0",
"Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
"Accept-Language: zh-CN,zh;q=0.8,en-US;q=0.5,en;q=0.3",
"Accept-Encoding: gzip, deflate, br",
"Cookie:pt2gguin=o2547433259; RK=zN/HQPXftj; ptcz=285a50cec1343dd4f08d4a24b3e7e4c8deccfeb1a09b2fb295a6f4353ecd4f7f; pgv_pvid=8833794038; pgv_pvi=3674447872; o_cookie=1015296415; tvfe_boss_uuid=bb6e8e00328c657a; ue_ts=1510636814; ue_uk=c4acd2d047d08b8221f61612987f1df0; ue_uid=ad7f6eac0026d494d2b50e881db986f9; ue_skey=762d26492b215370cc78f8174e5941ba; eas_sid=e1z5m110f6N356x821A3H8p2j2; LW_uid=p1o5H1q0d6c3n6N8d1y3j9z2Z4; LW_pid=9a02e1dd158683bb27a4ccf1e90ba2d5; ptui_loginuin=1015296415@qq.com; uin=o2547433259; skey=@ikbsWKYcA; ptisp=cm; p_uin=o2547433259; pt4_token=Nt-DsHRP8GmyBxd81nFnkmVwKn41R6PpaCe*VXHlXtA_; p_skey=W-zbAdHfMneX-7450NfyG0Q*fyCD6uMN0cNN5yYZV*I_; Loading=Yes; pgv_info=ssid=s4774580249&pgvReferrer=; IED_LOG_INFO2=userUin%3D2547433259%26nickName%3Dwait%26userLoginTime%3D1529748728; pgv_si=s8756195328; __Q_w_s__QZN_TodoMsgCnt=1; qz_screen=1536x864; QZ_FE_WEBP_SUPPORT=0; cpu_performance_v8=30",
"Upgrade-Insecure-Requests: 1",
"If-Modified-Since: Sat, 04 Nov 2017 13:16:46 GMT",
"Cache-Control: max-age=0, no-cache",
"Pragma: no-cache"));//php发送请求的header,直接复制图片中的请求头即可(注意换成自己的个人中心首页的请求头)
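curl_setopt($ch, CURLOPT_ENCODING, "gzip");// key step: let cURL decompress the gzip-encoded response
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);// return the response as a string instead of printing it
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);// skip HTTPS certificate verification
$out = curl_exec($ch);// execute and fetch the result
curl_close($ch);// release the cURL handle
if($out){
	// build the pattern (note the closing # delimiter)
	$regex1 = '#data-unikey="(http[^"]*)"[^d]*data-curkey="([^"]*)"[^d]*data-clicklog=("like")[^h]*href="javascript:;"#';
	preg_match_all($regex1, $out, $unikey);// run the match
	print_r($unikey);// print the result
}else{
	echo "Fetch failed";
}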
?>

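Running the code above gives us the curkey and unikey values, so next we send the request to the server. Wait, I forgot one important thing: the URL of that request carries a parameter called g_tk. Its value changes with every login, and it is computed from the p_skey value in the request's cookie. The g_tk algorithm can also be found with a Baidu search.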

gtk.php

<?php
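	// Compute g_tk from the p_skey cookie value.
	function getGTK($skey){
		$hash = 5381;
		for($i=0;$i<strlen($skey);++$i){
			// Mask on every step so $hash never leaves PHP's integer range;
			// the low 31 bits, which are all that is returned, are unaffected.
			$hash = ($hash + ($hash << 5) + utf8_uni($skey[$i])) & 0x7fffffff;
		}
		return $hash;
	}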
	// Decode one UTF-8 encoded character into its code point.
	// getGTK() passes single bytes, so only the first case is reached there.
	function utf8_uni($u){
		switch(strlen($u)){
			case 1: return ord($u);
			case 2: $n = (ord($u[0]) & 0x1f) << 6;
			$n += ord($u[1]) & 0x3f;
			return $n;
			case 3: $n = (ord($u[0]) & 0x0f) << 12;
			$n += (ord($u[1]) & 0x3f) << 6;
			$n += ord($u[2]) & 0x3f;
			return $n;
			case 4: $n = (ord($u[0]) & 0x07) << 18;
			$n += (ord($u[1]) & 0x3f) << 12;
			$n += (ord($u[2]) & 0x3f) << 6;
			$n += ord($u[3]) & 0x3f;
			return $n;
		}
	}
?>
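
To make the hash easier to follow, here is a compact restatement with hand-checkable inputs. The name `gtkSketch` is hypothetical, the inputs are not real p_skey values, and the sketch assumes an ASCII key and a 64-bit PHP build.

```php
<?php
// Hypothetical name; the same times-33 hash, masked to 31 bits on every step.
function gtkSketch(string $skey): int
{
    $hash = 5381;
    for ($i = 0, $n = strlen($skey); $i < $n; $i++) {
        // $hash + ($hash << 5) equals $hash * 33; then add the byte value.
        $hash = ($hash + ($hash << 5) + ord($skey[$i])) & 0x7fffffff;
    }
    return $hash;
}

echo gtkSketch(''), PHP_EOL;   // 5381: the loop never runs
echo gtkSketch('a'), PHP_EOL;  // 5381 * 33 + 97 = 177670
echo gtkSketch('ab'), PHP_EOL; // 177670 * 33 + 98 = 5863208
```

Each step multiplies the running value by 33 and adds the next character code, and only the low 31 bits of the result are kept as g_tk.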

Use a regular expression to pull the p_skey value out of the cookie, then feed it to the g_tk algorithm:

<?php
include "gtk.php";// gtk.php is the g_tk algorithm shown above
$header= array("Host: user.qzone.qq.com",
"User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:56.0) Gecko/20100101 Firefox/56.0",
"Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
"Accept-Language: zh-CN,zh;q=0.8,en-US;q=0.5,en;q=0.3",
"Accept-Encoding: gzip, deflate, br",
"Content-Type: application/x-www-form-urlencoded",
"Cookie:pt2gguin=o2547433259; RK=zN/HQPXftj; ptcz=285a50cec1343dd4f08d4a24b3e7e4c8deccfeb1a09b2fb295a6f4353ecd4f7f; pgv_pvid=8833794038; pgv_pvi=3674447872; o_cookie=1015296415; tvfe_boss_uuid=bb6e8e00328c657a; ue_ts=1510636814; ue_uk=c4acd2d047d08b8221f61612987f1df0; ue_uid=ad7f6eac0026d494d2b50e881db986f9; ue_skey=762d26492b215370cc78f8174e5941ba; eas_sid=e1z5m110f6N356x821A3H8p2j2; LW_uid=p1o5H1q0d6c3n6N8d1y3j9z2Z4; LW_pid=9a02e1dd158683bb27a4ccf1e90ba2d5; ptui_loginuin=1015296415@qq.com; uin=o2547433259; skey=@ikbsWKYcA; ptisp=cm; p_uin=o2547433259; pt4_token=Nt-DsHRP8GmyBxd81nFnkmVwKn41R6PpaCe*VXHlXtA_; p_skey=W-zbAdHfMneX-7450NfyG0Q*fyCD6uMN0cNN5yYZV*I_; Loading=Yes; pgv_info=ssid=s4774580249&pgvReferrer=; IED_LOG_INFO2=userUin%3D2547433259%26nickName%3Dwait%26userLoginTime%3D1529748728; pgv_si=s8756195328; __Q_w_s__QZN_TodoMsgCnt=1; qz_screen=1536x864; QZ_FE_WEBP_SUPPORT=0; cpu_performance_v8=30",
"Upgrade-Insecure-Requests: 1",
"Pragma: no-cache",
"Cache-Control: no-cache");//POST请求头
$regex =  '#p_skey=([^;^\']*)#';// pattern that captures the p_skey value
preg_match($regex, $header[6], $matches);// $header[6] is the Cookie line
$g_tk = getGTK($matches[1]);// the resulting g_tk
?>
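
The snippet above reads the cookie from the fixed index `$header[6]`, which breaks silently if a header is added or reordered. As a possible alternative, the sketch below finds the Cookie line by name first. The helper `findCookieValue` and the cookie values are hypothetical and only serve as an illustration.

```php
<?php
// Hypothetical helper: return the value of one cookie from a header list,
// or null if there is no Cookie line or no cookie with that name.
function findCookieValue(array $headers, string $name): ?string
{
    foreach ($headers as $line) {
        if (stripos($line, 'Cookie:') !== 0) {
            continue; // not the Cookie header
        }
        // The name must follow the start of the line, ":", ";" or whitespace,
        // so that "skey" is not found inside "p_skey".
        $pattern = '#(?:^|[;:\s])' . preg_quote($name, '#') . '=([^;]*)#';
        if (preg_match($pattern, $line, $m)) {
            return $m[1];
        }
    }
    return null;
}

// Made-up header list; these are not real credentials.
$headers = [
    'Host: user.qzone.qq.com',
    'Cookie: uin=o10001; skey=@abc; p_skey=EXAMPLE-pskey*value_; Loading=Yes',
];

var_dump(findCookieValue($headers, 'p_skey'));
var_dump(findCookieValue($headers, 'missing'));
```

Checking the return value for `null` before computing g_tk also makes an expired or malformed cookie fail loudly instead of producing a wrong g_tk.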

Now let's put the pieces together.

index.php

<?php
	/**
	 * @author   hdy
	 * @time 	2017/11/4
	 * desc QQ Zone auto-like
	 */
 include "gtk.php";
	$header= array("Host: user.qzone.qq.com",
"User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:56.0) Gecko/20100101 Firefox/56.0",
"Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
"Accept-Language: zh-CN,zh;q=0.8,en-US;q=0.5,en;q=0.3",
"Accept-Encoding: gzip, deflate, br",
"Content-Type: application/x-www-form-urlencoded",
"Cookie:pt2gguin=o2547433259; RK=zN/HQPXftj; ptcz=285a50cec1343dd4f08d4a24b3e7e4c8deccfeb1a09b2fb295a6f4353ecd4f7f; pgv_pvid=8833794038; pgv_pvi=3674447872; o_cookie=1015296415; tvfe_boss_uuid=bb6e8e00328c657a; ue_ts=1510636814; ue_uk=c4acd2d047d08b8221f61612987f1df0; ue_uid=ad7f6eac0026d494d2b50e881db986f9; ue_skey=762d26492b215370cc78f8174e5941ba; eas_sid=e1z5m110f6N356x821A3H8p2j2; LW_uid=p1o5H1q0d6c3n6N8d1y3j9z2Z4; LW_pid=9a02e1dd158683bb27a4ccf1e90ba2d5; ptui_loginuin=1015296415@qq.com; uin=o2547433259; skey=@ikbsWKYcA; ptisp=cm; p_uin=o2547433259; pt4_token=Nt-DsHRP8GmyBxd81nFnkmVwKn41R6PpaCe*VXHlXtA_; p_skey=W-zbAdHfMneX-7450NfyG0Q*fyCD6uMN0cNN5yYZV*I_; Loading=Yes; pgv_info=ssid=s4774580249&pgvReferrer=; IED_LOG_INFO2=userUin%3D2547433259%26nickName%3Dwait%26userLoginTime%3D1529748728; pgv_si=s8756195328; __Q_w_s__QZN_TodoMsgCnt=1; qz_screen=1536x864; QZ_FE_WEBP_SUPPORT=0; cpu_performance_v8=30",
"Upgrade-Insecure-Requests: 1",
"Pragma: no-cache",
"Cache-Control: no-cache");
	$regex =  '#p_skey=([^;^\']*)#';
	preg_match($regex, $header[6], $matches);// $header[6] is the Cookie line
	$g_tk = getGTK($matches[1]);
	$qq = '2547433259';// replace with your own QQ number
	$key = include 'cookie.php';// the array returned by cookie.php (unikey and curkey values) is assigned to $key
	if(is_array($key) && !empty($key[0])){
		for($i=0;$i<count($key[0]);$i++){
			$time = time();// current time
			$fid  = explode('/', $key[1][$i]);// split the unikey URL to get the fid
			$data = array(
				"qzreferrer"	=>	"https://user.qzone.qq.com/".$qq,
				"opuin"		=>	$qq,
				"unikey"	=>	$key[1][$i],
				"curkey"	=>	$key[2][$i],
				"from"		=>	"1",
				"appid"		=>	"311",
				"typeid"	=>	"0",
				"abstime"	=>	$time,
				// "fid"	=>	$fid[5],
				"active"	=>	"0",
				"fupdate"	=>	"1"
			);// POST payload
			$ch = curl_init();
			curl_setopt($ch, CURLOPT_URL, 'https://user.qzone.qq.com/proxy/domain/w.qzone.qq.com/cgi-bin/likes/internal_dolike_app?g_tk='.$g_tk);// address the POST is sent to
			curl_setopt($ch, CURLOPT_POST, 1);// send the request as a POST
			curl_setopt($ch, CURLOPT_HEADER, 0);
			curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
			curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($data));
			curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
			// the next two options skip SSL verification
			curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
			curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
			$output = curl_exec($ch);
			if($output){
				// echo $output;
				echo "success";// a non-empty response came back
			}
			curl_close($ch);
		}

		echo "<meta http-equiv='refresh' content='2'/>"; // wait a while before crawling the profile page again, otherwise the QQ account gets frozen for a period; in my experience two seconds works well and does not trigger a freeze
	}else{
		echo "Waiting for refresh";
		echo "<meta http-equiv='refresh' content='2'/>";
	}
?>

cookie.php

<?php
$rech = curl_init();
curl_setopt($rech, CURLOPT_URL, 'https://user.qzone.qq.com/2547433259');// replace with your own profile home page
curl_setopt($rech, CURLOPT_HEADER, 0);
curl_setopt($rech, CURLOPT_HTTPHEADER, array("Host: user.qzone.qq.com",
"User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:56.0) Gecko/20100101 Firefox/56.0",
"Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
"Accept-Language: zh-CN,zh;q=0.8,en-US;q=0.5,en;q=0.3",
"Accept-Encoding: gzip, deflate, br",
"Cookie:pt2gguin=o2547433259; RK=zN/HQPXftj; ptcz=285a50cec1343dd4f08d4a24b3e7e4c8deccfeb1a09b2fb295a6f4353ecd4f7f; pgv_pvid=8833794038; pgv_pvi=3674447872; o_cookie=1015296415; tvfe_boss_uuid=bb6e8e00328c657a; ue_ts=1510636814; ue_uk=c4acd2d047d08b8221f61612987f1df0; ue_uid=ad7f6eac0026d494d2b50e881db986f9; ue_skey=762d26492b215370cc78f8174e5941ba; eas_sid=e1z5m110f6N356x821A3H8p2j2; LW_uid=p1o5H1q0d6c3n6N8d1y3j9z2Z4; LW_pid=9a02e1dd158683bb27a4ccf1e90ba2d5; ptui_loginuin=1015296415@qq.com; uin=o2547433259; skey=@ikbsWKYcA; ptisp=cm; p_uin=o2547433259; pt4_token=Nt-DsHRP8GmyBxd81nFnkmVwKn41R6PpaCe*VXHlXtA_; p_skey=W-zbAdHfMneX-7450NfyG0Q*fyCD6uMN0cNN5yYZV*I_; Loading=Yes; pgv_info=ssid=s4774580249&pgvReferrer=; IED_LOG_INFO2=userUin%3D2547433259%26nickName%3Dwait%26userLoginTime%3D1529748728; pgv_si=s8756195328; __Q_w_s__QZN_TodoMsgCnt=1; qz_screen=1536x864; QZ_FE_WEBP_SUPPORT=0; cpu_performance_v8=30",
"Upgrade-Insecure-Requests: 1",
"If-Modified-Since: Sat, 04 Nov 2017 13:16:46 GMT",
"Cache-Control: max-age=0, no-cache",
"Pragma: no-cache"));
	curl_setopt($rech, CURLOPT_ENCODING, "gzip");// key step: let cURL decompress the gzip-encoded response
	curl_setopt($rech, CURLOPT_RETURNTRANSFER, 1);
	curl_setopt($rech, CURLOPT_SSL_VERIFYPEER, 0);
	$out = curl_exec($rech);
	curl_close($rech);
	if($out){
		$regex1 =  '#data-unikey="(http[^"]*)"[^d]*data-curkey="([^"]*)"[^d]*data-clicklog=("like")[^h]*href="javascript:;"#';
		preg_match_all($regex1, $out, $unikey);
		if(!empty($unikey[0])){
			return $unikey;// return the unikey and curkey values
		}else{
			return "";
		}
	}else{
		echo "Fetch failed";
		return "";// without an explicit return, include would hand back 1 to the caller
	}
?>

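I know I have not explained the approach very well. I am really not a good communicator (I am working on it), and this is the first time I have shared something I know.

If any part of the approach is unclear, leave a comment or read the Python 3 write-up linked below. To use my code, just swap in your own cookie, put the three PHP files in your web directory and run them, and auto-liking works. I did not wrap the code in a class, partly because doing that yourself will deepen your understanding, and partly because I am lazy. A version that does not need the cookie swapped would require a database, which I found too much trouble, so I never built it.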


References:

https://blog.csdn.net/qq_21882325/article/details/52889500

https://blog.csdn.net/qq_21882325/article/details/52985334
