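A veteran programmer demos 68 lines of code that scrape video danmaku, so reading them is no longer tiring
Hi everyone, I'm the Bilibili uploader 我是程序汪.
Here is a demo of a small program that scrapes danmaku (bullet comments).
There are so many danmaku that my eyes glaze over, so I want to pull them down into a file.
Programmers should practice a lot and write code that does something meaningful.
For example, if someone insults me in the danmaku, I can pick them out of 1,000 danmaku in seconds
and lock them in the "little black room" (mute them), hahaha.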
Maven configuration
An HTTP library and a JSON library
fastjson
httpclient
<!-- https://mvnrepository.com/artifact/com.alibaba/fastjson -->
<dependency>
    <groupId>com.alibaba</groupId>
    <artifactId>fastjson</artifactId>
    <version>1.2.47</version>
</dependency>
<!-- HTTP client library: begin -->
<dependency>
    <groupId>commons-httpclient</groupId>
    <artifactId>commons-httpclient</artifactId>
    <version>3.0</version>
</dependency>
<!-- HTTP client library: end -->
Code
This is Java, but PHP, C#, or Python would work just as well.
package http.b;
import com.alibaba.fastjson.JSONArray;
import com.alibaba.fastjson.JSONObject;
import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.cookie.CookiePolicy;
import org.apache.commons.httpclient.methods.GetMethod;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
/**
 * 程序汪's danmaku grabber
 */
public class BilibiliGrabData {
    // Paste the cookie string from your own logged-in browser session here.
    public static String TMP_COOKIES = "your own browser cookie";
    public static void main(String[] args) {
        HttpClient httpClient = new HttpClient();
        try {
            httpClient.getParams().setCookiePolicy(CookiePolicy.BROWSER_COMPATIBILITY);
            // 20 pages of 50 danmaku each, i.e. up to 1,000 danmaku.
            for (int i = 0; i < 20; i++) {
                Thread.sleep(3000); // pause before each request
                // (i + 1) is parenthesized so it is added as a number, not concatenated as text.
                System.out.println("Fetching page " + (i + 1) + ", page size " + 50);
                getMessage(httpClient, i + 1, 50);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
    private static void getMessage(HttpClient httpClient, int cursor, int limit) throws IOException {
        String dataUrl = "https://api.bilibili.com/x/v2/dm/recent?pn=" + cursor + "&ps=" + limit;
        GetMethod getMethod = new GetMethod(dataUrl);
        // Both headers go on the GET request that is actually executed.
        getMethod.setRequestHeader("cookie", TMP_COOKIES);
        getMethod.setRequestHeader("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36");
        String text;
        try {
            httpClient.executeMethod(getMethod);
            text = getMethod.getResponseBodyAsString();
        } finally {
            getMethod.releaseConnection(); // hand the connection back to the client
        }
        JSONObject jsonObject = JSONObject.parseObject(text);
        JSONObject data = jsonObject == null ? null : jsonObject.getJSONObject("data");
        JSONArray result = data == null ? null : data.getJSONArray("result");
        if (result == null) {
            System.out.println("No danmaku in response: " + text);
            return;
        }
        System.out.println("Records on this page: " + result.size());
        // Iterate over what the server actually returned rather than a fixed 50.
        for (int i = 0; i < result.size(); i++) {
            try {
                String content = result.getJSONObject(i).getString("msg");
                writeContent(content);
                System.out.println("Danmaku: " + content);
            } catch (Exception e) {
                System.out.println(e.getMessage());
            }
        }
    }
    public static void writeContent(String data) {
        File file = new File("javaio-appendfile555.txt");
        // Append mode creates the file if it does not exist yet;
        // try-with-resources flushes and closes the writer even if a write fails.
        try (BufferedWriter bw = new BufferedWriter(new FileWriter(file, true))) {
            bw.write(data);
            bw.newLine();
            System.out.println("Done");
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
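Once the danmaku are saved, the search from the introduction (picking one offender out of about 1,000 lines) is just a filter over the output file. Here is a minimal sketch that is not part of the original program: the class name `DanmakuSearch`, the method `findMatches`, and the keywords are hypothetical, and only the file name `javaio-appendfile555.txt` comes from the code above. It assumes the file is UTF-8; `writeContent` writes with the platform default charset, so adjust the charset if yours differs.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DanmakuSearch {
    /** Returns "lineNumber: text" for every saved danmaku that contains any keyword. */
    public static List<String> findMatches(Path file, List<String> keywords) throws IOException {
        List<String> lines = Files.readAllLines(file, StandardCharsets.UTF_8);
        List<String> matches = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            String line = lines.get(i);
            for (String keyword : keywords) {
                if (line.contains(keyword)) {
                    matches.add((i + 1) + ": " + line);
                    break; // report each line only once
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) throws IOException {
        Path file = Paths.get("javaio-appendfile555.txt");
        // Hypothetical keywords; replace with the words you want to look for.
        List<String> keywords = Arrays.asList("keyword1", "keyword2");
        for (String match : findMatches(file, keywords)) {
            System.out.println(match);
        }
    }
}
```

Note that the file holds only the danmaku text, so this finds the offending message but not who sent it; identifying the sender would need additional fields from the response, which the program above does not save.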
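Below is the JSON returned by the danmaku endpoint.
There are other endpoints too, such as one for deleting danmaku and one for comment messages; I won't demo them all here.
If you like this, I'll build a comment scraper next.
It could grab 40,000 comments, though I estimate my own total is under 10,000.
The JSON as displayed in Google Chrome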
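The loop in `main` fetches 20 pages of 50 danmaku, up to 1,000 in total, pausing 3 seconds before each request. A bigger job, such as the 40,000 comments mentioned above, follows the same arithmetic. The helper below is a hypothetical planning sketch: `PagingPlan` and its methods are made-up names, and it assumes a comment endpoint would also take 50 items per page, which the article does not confirm.

```java
public class PagingPlan {
    /** Pages needed to cover totalItems at pageSize items per page (rounds up). */
    public static int pagesNeeded(int totalItems, int pageSize) {
        if (totalItems < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("totalItems must be >= 0 and pageSize > 0");
        }
        return (totalItems + pageSize - 1) / pageSize;
    }

    /** Total seconds spent sleeping when one fixed delay precedes each request. */
    public static long totalDelaySeconds(int pages, long delayMillis) {
        return pages * delayMillis / 1000;
    }

    public static void main(String[] args) {
        int pages = pagesNeeded(40_000, 50);
        long minutes = totalDelaySeconds(pages, 3000) / 60;
        System.out.println(pages + " pages, about " + minutes + " minutes of waiting");
    }
}
```

For 40,000 items at 50 per page this works out to 800 requests and roughly 40 minutes of pauses at 3 seconds each, not counting the time the requests themselves take.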
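Summary of the approach
- Find the message endpoint (Chrome's F12 developer tools are great for this)
- Parse the JSON the endpoint returns (there are plenty of JSON libraries)
- Simulate a login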
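The steps above can also be sketched with nothing but the JDK. This is an illustration under stated assumptions, not the article's code: it swaps commons-httpclient for the `java.net.http` client built into Java 11 and later, the class `DanmakuRequest` and its method names are hypothetical, and the short User-Agent value is a placeholder. The URL pattern, and the idea of "simulating a login" by sending the cookie copied from your browser, come from the program above. Parsing the body is left to a JSON library such as fastjson.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class DanmakuRequest {
    /** Step 1: the endpoint found with the browser's developer tools, plus paging parameters. */
    public static String buildUrl(int page, int pageSize) {
        return "https://api.bilibili.com/x/v2/dm/recent?pn=" + page + "&ps=" + pageSize;
    }

    /** Step 3: "simulate a login" by attaching the cookie copied from a logged-in browser. */
    public static HttpRequest buildRequest(int page, int pageSize, String browserCookie) {
        return HttpRequest.newBuilder(URI.create(buildUrl(page, pageSize)))
                .header("Cookie", browserCookie)
                .header("User-Agent", "Mozilla/5.0") // placeholder value
                .GET()
                .build();
    }

    /** Sends the request and returns the raw JSON text, ready for step 2 (parsing). */
    public static String fetch(HttpClient client, HttpRequest request)
            throws IOException, InterruptedException {
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }

    public static void main(String[] args) throws Exception {
        HttpRequest request = buildRequest(1, 50, "your own browser cookie");
        System.out.println(fetch(HttpClient.newHttpClient(), request));
    }
}
```

Keeping request construction separate from sending makes it possible to inspect the URL and headers without touching the network.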
Feel free to add 程序汪 on WeChat (VX): itwang007