Just a Crawler


 
 
use strict;
use warnings;
use WWW::Mechanize;
use HTTP::Cookies;
use URI::Escape qw(uri_escape);

### Go to the login page and log in.
#my $url = 'https://www.google.com/accounts/ServiceLogin?hl=en&service=finance&nui=1&continue=http%3A%2F%2Ffinance.google.com%2Ffinance';
my $url = 'https://accounts.google.com/ServiceLogin';

# All four arguments are required.
die "Usage: $0 <username> <password> <keyword> <outputfile>\n" unless @ARGV == 4;
my ($username, $password, $keyword, $outputfile) = @ARGV;

print "usr: $username\n";
print "psw: $password\n";
print "keyword: $keyword\n";
print "output: $outputfile\n";
print "Searching ......\n";

my $mech = WWW::Mechanize->new();
$mech->cookie_jar(HTTP::Cookies->new());
$mech->get($url);
$mech->form_number(1);
$mech->field(Email => $username);
$mech->field(Passwd => $password);
$mech->click();

# Go to the next link, now that we are logged in.
#$url = 'http://www.google.com/trends/viz?q=alan+kay&graph=all_csv&sa=N';
# Percent-encode the keyword so spaces and non-ASCII characters survive in the query string.
$url = 'http://www.google.com/trends/viz?q=' . uri_escape($keyword) . '&date=all&geo=cn&graph=all_csv&scale=1&sa=N';
#$url = 'http://finance.google.com/finance/portfolio?action=view&pid=1&pview=pview&output=csv';

$mech->get($url);
my $output_page = $mech->content();

# Save the downloaded page; fail loudly if the file cannot be written.
open my $fh, '>:encoding(UTF-8)', $outputfile or die "Cannot open $outputfile: $!";
print {$fh} $output_page;
close $fh or die "Cannot close $outputfile: $!";
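
The script above saves whatever the last request returns. If the login did not go through, that may well be an HTML page rather than CSV data. The sketch below shows one way to catch that before writing the file. The helper name `looks_like_html` is hypothetical, and the check is only a rough heuristic, assuming the CSV export never starts with an HTML tag.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script): guess whether a
# response body is an HTML page instead of CSV data by looking at how it starts.
sub looks_like_html {
    my ($content) = @_;
    return $content =~ /\A\s*<(?:!DOCTYPE|html|head|body)\b/i ? 1 : 0;
}

# Possible use in the script, right after $mech->content():
#   die "Got an HTML page instead of CSV; login may have failed\n"
#       if looks_like_html($output_page);
```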


12/4/2011 Update

This script sometimes fails because of Google's ban.
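
Since the failures are intermittent, one possible mitigation is to retry the download a few times with a growing pause between attempts. The sketch below is an assumption rather than part of the original script: the helper `retry`, its parameters, and the delays are made up for illustration, and retrying will not help if the block is permanent.

```perl
use strict;
use warnings;

# Hypothetical helper: call $action up to $max_tries times. An attempt counts
# as failed when it dies or returns undef. The pause doubles after each
# failure, starting at $base_delay seconds.
sub retry {
    my ($action, $max_tries, $base_delay) = @_;
    my $error = 'no attempts made';
    for my $try (1 .. $max_tries) {
        my $result = eval { $action->() };
        return $result if defined $result;
        $error = $@ || 'action returned undef';
        sleep($base_delay * 2 ** ($try - 1)) if $try < $max_tries;
    }
    die "Giving up after $max_tries tries: $error";
}

# Possible use with the script's own objects ($mech and $url as defined there):
#   my $output_page = retry(sub { $mech->get($url); $mech->content() }, 3, 5);
```

WWW::Mechanize dies on a failed request when its `autocheck` option is enabled, so the `eval` inside `retry` would catch that and move on to the next attempt.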

