Ruby Study Notes, Part 2: Requesting Web Pages Through a Proxy in Ruby

杠杠PP

Tags: ruby, url, query, server, apache, string

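Many websites now limit the votes in their campaigns to one per IP address, but sometimes, under pressure, you have to find a way to cast a few more. I used to do this with Apache's HttpClient; since I have been reading up on Ruby lately, I tried it in Ruby as well: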

require 'net/http'
require 'uri'

# Fetch the content of a web page
def query_url(url)
  Net::HTTP.get(URI.parse(url))
end

# Scrape all the proxy lists on cnproxy and save the results to proxy.txt.
# You can modify this code or use a different proxy server list.
def find_all_proxy
  # Letter-to-digit variables referenced by the port expressions that eval evaluates below
  z = "3"; j = "4"; r = "2"; l = "9"; c = "0"; x = "5"; i = "7"; a = "6"; p = "8"; s = "1"
  File.open("proxy.txt", "w+") do |pf|
    (1..10).each do |page_no|
      url = "http://www.cnproxy.com/proxy#{page_no}.html"
      content = query_url(url)
      # print content
      # Regex metacharacters that must be escaped: ^$?./\[]{}()+*
      pattern = /<td>(.*?)<SCRIPT type=text\/javascript>document\.write\(":"\+(.*?)\)<\/SCRIPT><\/td>/
      content.scan(pattern).each do |array|
        # Caution: eval runs text taken from the remote page, so only use it on a source you trust
        pf.write("#{array[0]}:#{eval(array[1])}\n") if array.length == 2
      end
    end
  end
end

# Request the URL through every proxy listed in proxy.txt
def open_url_with_proxy(url)
  File.readlines("proxy.txt").each do |line|
    host, port = line.chomp.split(":")
    puts "Using proxy #{host}:#{port}"
    begin
      # The port has to be an Integer, hence to_i
      proxy = Net::HTTP::Proxy(host, port.to_i)
      print proxy.get(URI.parse(url))
      # proxy.start("www.google.com", 80) do |http|
      #   response = http.get('/index.html')
      #   puts response.body
      # end
    rescue StandardError
      # Swallow the exception and move on to the next proxy
    end
  end
end

# Main program: build proxy.txt on the first run, then request the page
find_all_proxy unless File.exist?("proxy.txt")
open_url_with_proxy('http://www.google.com/index.html')
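
The script decodes each port by calling eval on text scraped from the page, which would execute whatever the page happens to contain. Below is a minimal sketch of an eval-free alternative. It assumes the port expression is nothing more than one-letter names joined by "+" (as the regex and the variable assignments in find_all_proxy suggest); PORT_DIGITS and decode_port are hypothetical names that are not part of the original script.

```ruby
# Letter-to-digit table copied from the variables in find_all_proxy.
PORT_DIGITS = {
  "z" => "3", "j" => "4", "r" => "2", "l" => "9", "c" => "0",
  "x" => "5", "i" => "7", "a" => "6", "p" => "8", "s" => "1"
}.freeze

# Hypothetical helper, not part of the original script.
# Turns an expression such as "z+j+r" into the string "342" and
# raises KeyError for any name that is not in the table.
def decode_port(expr)
  expr.split("+").map { |name| PORT_DIGITS.fetch(name.strip) }.join
end

puts decode_port("p+c+p+c") # prints 8080
```

With a helper like this, the write in find_all_proxy could call decode_port(array[1]) instead of eval(array[1]), and an unexpected expression would raise an error rather than run as code.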

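One thing to watch out for: the proxy server port must not be a String. To my surprise, Ruby does not convert it automatically, and that cost me a lot of time.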
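
To make the port conversion explicit, here is a small sketch of parsing one "host:port" line before handing it to Net::HTTP. The helper name parse_proxy_line and the address 10.0.0.1:8080 are placeholders for illustration only. It uses Integer() rather than to_i, so a malformed port raises an error instead of quietly turning into 0.

```ruby
require "net/http"
require "uri"

# Hypothetical helper: splits "host:port" and converts the port with
# Integer(), which raises ArgumentError on non-numeric text such as "80a".
def parse_proxy_line(line)
  host, port = line.strip.split(":", 2)
  [host, Integer(port, 10)]
end

host, port = parse_proxy_line("10.0.0.1:8080\n") # placeholder proxy address
uri = URI.parse("http://www.google.com/index.html")

# Net::HTTP.new accepts the proxy host and port as its third and fourth
# arguments; building the object does not open a connection yet.
http = Net::HTTP.new(uri.host, uri.port, host, port)
puts http.proxy_port.inspect
```

Whether to fail loudly or skip a bad line is a design choice; for a scraped list, rescuing ArgumentError and moving on to the next line is probably enough.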