In the previous chapters we learned to send network requests with HttpURLConnection and HttpClient and read their responses. In effect, those examples already form a simple crawler: they download the full content of a URL.
Today I want to use that knowledge for something different: inflating a page's view count, with my own blog home page as the example.
If we keep refreshing the same page, the view count will not increase within a given time window, because every request comes from the same IP. To inflate the count we therefore need a pool of different IPs to visit the page from. Proxy-IP vendors offer API endpoints that return a batch of working IPs, but those are paid services, so here I will just grab a few IPs found on the web to test with. What matters is understanding the principle.
Now let's write the code.
The IP info class:
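The core idea of rotating through a pool of proxy addresses can be sketched as a simple round-robin picker. This is a minimal illustrative sketch, not code from this project; the `ProxyRotator` class name and `next()` method are my own invention:

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// A minimal round-robin picker over a pool of "host:port" proxy entries:
// each call to next() returns the following entry, wrapping around, so
// consecutive requests can go out through different addresses.
public class ProxyRotator {
    private final List<String> pool;
    private final AtomicInteger cursor = new AtomicInteger(0);

    public ProxyRotator(List<String> pool) {
        this.pool = pool;
    }

    public String next() {
        // floorMod keeps the index non-negative even after int overflow
        int i = Math.floorMod(cursor.getAndIncrement(), pool.size());
        return pool.get(i);
    }

    public static void main(String[] args) {
        ProxyRotator r = new ProxyRotator(Arrays.asList("1.2.3.4:80", "5.6.7.8:8080"));
        System.out.println(r.next()); // 1.2.3.4:80
        System.out.println(r.next()); // 5.6.7.8:8080
        System.out.println(r.next()); // wraps back to 1.2.3.4:80
    }
}
```

The `AtomicInteger` cursor makes the picker safe to share between threads, which matters once several worker threads issue requests concurrently.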
package com.blog.visit;

public class IpInfo {
    private String address;
    private String port;

    public String getAddress() {
        return address;
    }

    public void setAddress(String address) {
        this.address = address;
    }

    public String getPort() {
        return port;
    }

    public void setPort(String port) {
        this.port = port;
    }
}
The IP proxy class:
package com.blog.visit;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;
import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.HttpException;
import org.apache.commons.httpclient.cookie.CookiePolicy;
import org.apache.commons.httpclient.methods.GetMethod;

public class IpProxy {
    private IpInfo ip;
    // number of successful visits
    private int success = 0;
    // a few proxy IPs found on the web
    private String[] ips = {
            "182.88.205.151:8123",
            "110.73.6.189:8123",
            "218.14.121.237:9000",
            "121.232.147.110:9000",
            "125.117.123.99:9000",
            "117.90.3.181:9000",
            "61.128.237.77:8998",
            "123.7.31.205:808",
            "119.163.121.122:8080",
            "124.42.7.103:80",
            "210.22.85.34:8080"
    };
    // proxy API endpoint; it has to be purchased, which I haven't done, so I can't test it
    private String API_URL = "";

    // fall back to the few default IPs
    private List<IpInfo> getDefalutIpProxy() {
        List<IpInfo> ipProxies = new ArrayList<IpInfo>();
        for (String ip : ips) {
            // a new IpInfo must be created per iteration; reusing a single
            // instance would make every list entry point at the same object
            IpInfo ipInfo = new IpInfo();
            String[] temp = ip.split(":");
            ipInfo.setAddress(temp[0]);
            ipInfo.setPort(temp[1]);
            ipProxies.add(ipInfo);
        }
        return ipProxies;
    }
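A Java pitfall worth flagging for loops like the one in `getDefalutIpProxy`: if one `IpInfo` instance were reused across iterations instead of created fresh each time, every element of the returned list would reference the same object and hold only the last address. A standalone sketch of the effect, with a hypothetical `Holder` class standing in for `IpInfo`:

```java
import java.util.ArrayList;
import java.util.List;

// Demonstrates why a new object must be created per loop iteration:
// a single shared mutable object means every list slot references it,
// so all slots end up showing the last value written.
public class SharedInstancePitfall {
    public static class Holder {
        public String value;
    }

    // buggy version: one Holder shared by every list slot
    public static List<Holder> fillShared(String[] items) {
        List<Holder> list = new ArrayList<Holder>();
        Holder shared = new Holder();
        for (String s : items) {
            shared.value = s;
            list.add(shared);
        }
        return list;
    }

    // correct version: a fresh Holder per iteration
    public static List<Holder> fillFresh(String[] items) {
        List<Holder> list = new ArrayList<Holder>();
        for (String s : items) {
            Holder h = new Holder();
            h.value = s;
            list.add(h);
        }
        return list;
    }

    public static void main(String[] args) {
        String[] items = {"a", "b"};
        System.out.println(fillShared(items).get(0).value); // b -- overwritten
        System.out.println(fillFresh(items).get(0).value);  // a -- preserved
    }
}
```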
    // fetch IPs from the proxy API endpoint
    private List<IpInfo> getIpProxyByAPI() {
        List<IpInfo> ipProxies = new ArrayList<IpInfo>();
        StringBuffer sb = new StringBuffer();
        HttpClient httpClient = new HttpClient();
        GetMethod getMethod = new GetMethod(API_URL);
        try {
            int code = httpClient.executeMethod(getMethod);
            if (code == 200) {
                InputStream is = getMethod.getResponseBodyAsStream();
                BufferedReader dis = new BufferedReader(new InputStreamReader(is, "utf-8"));
                String str;
                while ((str = dis.readLine()) != null) {
                    sb.append(str);
                }
                dis.close();
            }
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            // return the connection to the pool
            getMethod.releaseConnection();
        }
        // parse the response held in sb here -- it may be XML or JSON depending
        // on the vendor -- and fill ipProxies with the results before returning.
        // The parsing step is omitted because I didn't buy a real proxy API to test against.
        return ipProxies;
    }
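The parsing step is vendor-specific, so it is left out above. Purely as an illustration, assuming a plain-text response with one `host:port` entry per line, a parser might look like this (the `IpResponseParser` class and the response format are both assumptions, not a real vendor's API):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical parser for an assumed plain-text API response containing
// one "host:port" entry per line; real vendors may return JSON or XML.
public class IpResponseParser {
    public static List<String[]> parse(String body) {
        List<String[]> result = new ArrayList<String[]>();
        for (String line : body.split("\\r?\\n")) {
            line = line.trim();
            if (line.isEmpty() || !line.contains(":")) {
                continue; // skip blank or malformed lines
            }
            // element [0] = address, [1] = port, mirroring IpInfo's two fields
            result.add(line.split(":", 2));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String[]> ips = parse("1.2.3.4:80\n\n5.6.7.8:8080\n");
        System.out.println(ips.size() + " entries");              // 2 entries
        System.out.println(ips.get(0)[0] + " / " + ips.get(0)[1]); // 1.2.3.4 / 80
    }
}
```

Each `String[]` pair could then be copied into an `IpInfo` via `setAddress` and `setPort`.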
    public void addVisit(String blogUrl) {
        // use the default hand-picked IPs, since I didn't buy an API
        List<IpInfo> ipProxies = this.getDefalutIpProxy();
        for (IpInfo ip : ipProxies) {
            HttpClient httpClient = new HttpClient();
            // Commons HttpClient 3.x does not read the http.proxyHost/http.proxyPort
            // system properties; the proxy has to be set on the client itself
            httpClient.getHostConfiguration().setProxy(ip.getAddress(), Integer.parseInt(ip.getPort()));
            httpClient.getParams().setCookiePolicy(CookiePolicy.BROWSER_COMPATIBILITY);
            GetMethod getMethod = new GetMethod(blogUrl);
            getMethod.setRequestHeader("User-Agent", "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Firefox/52.0");
            try {
                int stateCode = httpClient.executeMethod(getMethod);
                if (stateCode == 200) {
                    synchronized (this) {
                        success++;
                        System.out.println("Successful visits so far: " + success);
                    }
                } else {
                    System.out.println("Visit failed");
                }
            } catch (HttpException e) {
                e.printStackTrace();
            } catch (IOException e) {
                e.printStackTrace();
            } finally {
                getMethod.releaseConnection();
            }
        }
    }
    public IpInfo getIp() {
        return ip;
    }

    public void setIp(IpInfo ip) {
        this.ip = ip;
    }
}
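As an aside, the `http.proxyHost`/`http.proxyPort` system properties are process-global and only influence `java.net.HttpURLConnection`, not Commons HttpClient. With the `HttpURLConnection` API from the earlier chapters, a proxy can instead be attached to a single connection via `java.net.Proxy`. A small sketch (the proxy address is just one of the sample IPs above and may well be dead; nothing is sent until a response method is called):

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URL;

public class PerConnectionProxy {
    // Builds a connection routed through the given HTTP proxy. The request
    // only goes over the network once e.g. getResponseCode() is called.
    public static HttpURLConnection open(String proxyHost, int proxyPort, String target)
            throws IOException {
        Proxy proxy = new Proxy(Proxy.Type.HTTP, new InetSocketAddress(proxyHost, proxyPort));
        HttpURLConnection conn = (HttpURLConnection) new URL(target).openConnection(proxy);
        conn.setRequestProperty("User-Agent", "Mozilla/5.0");
        return conn;
    }

    public static void main(String[] args) throws IOException {
        HttpURLConnection conn = open("182.88.205.151", 8123,
                "http://blog.csdn.net/u010248330/article/details/68925613");
        System.out.println(conn.getURL().getHost()); // blog.csdn.net
        // conn.getResponseCode() here would actually send the request through the proxy
    }
}
```

This keeps the proxy choice local to one connection instead of mutating shared state, which is safer once multiple threads use different proxies at the same time.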
The main class:
package com.blog.visit;

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class Main {
    public static void main(String[] args) {
        final String blogUrl = "http://blog.csdn.net/u010248330/article/details/68925613";
        final IpProxy ipProxies = new IpProxy();
        ExecutorService threadPool = Executors.newFixedThreadPool(1);
        for (int i = 0; i < 1; i++) {
            threadPool.execute(new Runnable() {
                @Override
                public void run() {
                    ipProxies.addVisit(blogUrl);
                }
            });
        }
        // let queued tasks finish, then stop the pool so the JVM can exit
        threadPool.shutdown();
    }
}
Experiment results:
The original view count:
Now run the code and look at the output:
It shows 11 successful visits, so the view count should increase by 11: 1765 + 11 = 1776.
Let's visit the page again and check the count:
So the approach works.
With a paid proxy-IP API supplying many more addresses, and several threads issuing requests, one could easily inflate the view count much further.
This article is purely for learning and fun!