Snoopy is a PHP class that imitates the functions of a web browser; it can fetch web page content and submit forms.
Here are some of its features:
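1. Easily fetches the content of a web page
2. Easily fetches the text of a web page (with the HTML stripped)
3. Easily fetches the links on a web page
4. Supports proxy hosts
5. Supports basic user/password authentication
6. Supports custom user agent, referer, cookies and header content
7. Supports browser redirects, with control over the redirect depth
8. Expands the links in a page into fully qualified URLs (default)
9. Easily submits data and retrieves the response
10. Supports following HTML frames (added in v0.92)
11. Supports passing cookies along on redirects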
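Most of the features above are switched on by setting public properties on the object before fetching. Here is a minimal sketch of that, assuming the property names used by the Snoopy class (proxy_host, agent, maxredirs and so on); the helper name configureSnoopy and every value in it are placeholders made up for illustration.

```php
<?php
// Hypothetical helper: applies options from the feature list to a Snoopy
// instance by setting its public properties. All values are placeholders.
function configureSnoopy($snoopy)
{
    // Proxy host (feature 4).
    $snoopy->proxy_host = "proxy.example.com";
    $snoopy->proxy_port = "8080";

    // Basic user/password authentication (feature 5).
    $snoopy->user = "demo";
    $snoopy->pass = "secret";

    // Custom user agent, referer, cookies and raw headers (feature 6).
    $snoopy->agent = "Mozilla/5.0 (compatible; ExampleBot/1.0)";
    $snoopy->referer = "http://www.example.com/";
    $snoopy->cookies["lang"] = "en";
    $snoopy->rawheaders["Pragma"] = "no-cache";

    // Redirect depth and frame depth (features 7 and 10).
    $snoopy->maxredirs = 2;
    $snoopy->maxframes = 1;

    // Expand links to full URLs and pass cookies on redirects (features 8 and 11).
    $snoopy->expandlinks = true;
    $snoopy->passcookies = true;

    return $snoopy;
}

// Only runs when Snoopy.class.php has already been included.
if (class_exists("Snoopy")) {
    $snoopy = configureSnoopy(new Snoopy);
}
```

Call the helper before fetchtext, fetchlinks or submit so that the settings apply to the request.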
Here is a simple example: fetching the text of my blog.
include "Snoopy.class.php";
$snoopy = new Snoopy;
$snoopy->fetchtext("http://www.phpobject.net/blog");
echo $snoopy->results;
^_^ Not bad, right? Next, fetching links:
include "Snoopy.class.php";
$snoopy = new Snoopy;
$snoopy->fetchlinks("http://www.phpobject.net/blog");
print_r($snoopy->results);
Using Snoopy to submit data and log in
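A simulated login can be implemented with curl or with sockets. However, curl requires the curl module to be enabled on the server, and writing it yourself with sockets is relatively tedious; with Snoopy it becomes much simpler.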
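Whether the curl route is open at all can be checked from PHP itself. This is only a small sketch: pickHttpBackend is a made-up helper name, and it relies on the assumption that Snoopy, being written in PHP on top of socket calls, does not need the curl extension for plain HTTP.

```php
<?php
// Hypothetical helper: reports which approach is usable on this server.
function pickHttpBackend()
{
    // The curl_* functions exist only when PHP's curl extension is loaded.
    if (extension_loaded("curl")) {
        return "curl";
    }
    // Assumption: Snoopy needs no extra module for plain HTTP requests.
    return "snoopy";
}

echo pickHttpBackend(), "\n";
```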
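Here we use 喜悦国际村 (phpx.com) as the example. (^_^ purely for research.)
First, we need to find out which fields the login has to send and what the target URL is. Here we use Snoopy's fetchform for that.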
include "Snoopy.class.php";
$snoopy = new Snoopy;
$snoopy->fetchform("http://www.phpx.com/happy/logging.php?action=login");
print $snoopy->results;
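Of course, you could also read the source of http://www.phpx.com/happy/… directly, but this way is more convenient. Now that we have the target and the data to submit, the next step is to simulate the login.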
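If you would rather not read the form markup by eye, the target and the field names can be pulled out of it with PHP's DOM extension. The sketch below is not part of Snoopy: extractFormFields is a hypothetical helper, the sample markup is made up rather than copied from the real login page, and only input elements are handled (select and textarea are ignored).

```php
<?php
// Hypothetical helper: pulls the form target and the named input fields out of
// a chunk of form HTML, such as the string fetchform leaves in $snoopy->results.
function extractFormFields($html)
{
    $result = array("action" => "", "fields" => array());
    if (trim($html) === "") {
        return $result;
    }

    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    // Use the first form's action attribute as the submit target.
    $forms = $doc->getElementsByTagName("form");
    if ($forms->length > 0) {
        $result["action"] = $forms->item(0)->getAttribute("action");
    }

    // Record every named input together with its default value.
    foreach ($doc->getElementsByTagName("input") as $input) {
        $name = $input->getAttribute("name");
        if ($name !== "") {
            $result["fields"][$name] = $input->getAttribute("value");
        }
    }
    return $result;
}

// Made-up sample markup, not the real login form.
$sample = '<form method="post" action="logging.php?action=login">'
        . '<input type="hidden" name="loginmode" value="normal">'
        . '<input type="text" name="username" value="">'
        . '<input type="password" name="password" value="">'
        . '</form>';
print_r(extractFormFields($sample));
```

The returned fields array has the same shape as the $submit_vars array that submit expects, so it can serve as a starting point once the username and password are filled in. Non-ASCII default values may need extra charset handling, which this sketch leaves out.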
The login code is as follows:
include "Snoopy.class.php";
$snoopy = new Snoopy;
$submit_url = "http://www.phpx.com/happy/logging.php?action=login";
$submit_vars["loginmode"] = "normal";
$submit_vars["styleid"] = "1";
$submit_vars["cookietime"] = "315360000";
$submit_vars["loginfield"] = "username";
$submit_vars["username"] = "********"; // your username
$submit_vars["password"] = "*******"; // your password
$submit_vars["questionid"] = "0";
$submit_vars["answer"] = "";
$submit_vars["loginsubmit"] = "提 交"; // label of the form's submit button, sent as-is
$snoopy->submit($submit_url, $submit_vars);
print $snoopy->results;
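Printing the results is enough for an experiment, but usually you want to know whether the request worked and then open another page while still logged in. Here is a rough sketch of that. loginAndFetch is a made-up helper name and the URLs in the usage part are placeholders. It assumes that Snoopy's submit and fetch return false on failure, that error holds the failure message, and that setcookies copies the cookies from the last response into the cookies property.

```php
<?php
// Hypothetical helper: submits the login form, then reuses the session cookies
// to fetch a second page. $snoopy is expected to be a Snoopy instance.
function loginAndFetch($snoopy, $submit_url, $submit_vars, $next_url)
{
    // Assumption: submit() returns false when the request could not be made.
    if (!$snoopy->submit($submit_url, $submit_vars)) {
        return "login request failed: " . $snoopy->error;
    }

    // Assumption: setcookies() stores the response's Set-Cookie values in
    // $snoopy->cookies, so the next request is sent with the session cookie.
    $snoopy->setcookies();

    if (!$snoopy->fetch($next_url)) {
        return "follow-up request failed: " . $snoopy->error;
    }
    return $snoopy->results;
}

// Only runs when Snoopy.class.php has already been included.
// Both URLs are placeholders.
if (class_exists("Snoopy")) {
    $vars = array("username" => "********", "password" => "*******");
    echo loginAndFetch(
        new Snoopy,
        "http://www.example.com/login.php",
        $vars,
        "http://www.example.com/members.php"
    );
}
```

A request that succeeds at the HTTP level does not prove the login was accepted, so it is still worth looking in the returned page for something that only logged-in users see.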