Crawling unfinished homework and exams from my school's Chaoxing (超星) site and sending them to a QQ mailbox


The project structure follows the crawler template (爬虫模板) I wrote in an earlier post.

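Related endpoints

Several days passed between finishing the code and writing this post, so I did not bother taking screenshots of the endpoint details. Instead, here are the notes I took while crawling; capturing the requests yourself is good practice.

Parameters such as courseid, classid, and cpi are ids. They do not change, and the links the site provides later do not include them, so they can be stored.
Every link also needs an enc parameter, which is generated randomly and differs for each link. That rules out fetching everything in one step: each page has to be crawled in turn, and the next link extracted from its HTML.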

/**
 * Name, student number, and phone endpoint:
 *  This request takes parameters, but I forgot to record them at the time
 *  @returns: html
 * http://passport2.chaoxing.com/mooc/accountManage
 * 
 * Course data endpoint:
 *  This request takes parameters, but I forgot to record them at the time
 *  @returns: html
 * http://mooc1-1.chaoxing.com/visit/courselistdata
 * 
 * Course home page endpoint:
 *  @params: courseid, classId, cpi, all available from the course data endpoint
 *  @returns: html
 * https://mooc2-ans.chaoxing.com/mycourse/stu?courseid=???&classId=???&cpi=???&enc=cda787016ce2c1f162fe6230b02bf948&t=1630417924301&pageHeader=9
 * 
 * Course homework endpoint:
 *  @params: from the course home page
 *  @returns: html
 * https://mooc1.chaoxing.com/mooc2/work/list?courseId=?&classId=?&cpi=?&enc=27719c62b0263f5249aa52edea81c125&
 * 
 * Course exam endpoint:
 *  @params: from the course home page
 *  @returns: html
 * https://mooc1.chaoxing.com/mooc2/exam/exam-list?enc=a13c03b8366c587872c31e0a5d16eac6&openc=5a474025b333b5b147076b1eb3c38ab3&courseid=???&clazzid=???031&cpi=???&ut=s
 * 
 * Course task endpoint:
 *  @params: from the course home page
 *  @returns: json
 * https://mobilelearn.chaoxing.com/v2/apis/active/student/activelist?fid=1971&courseId=???&classId=???
 */
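
To make the point about enc concrete, here is a rough sketch that splits a link's query string into parameters and prints the enc value of two links. It is only an illustration: the class name EncDemo, the helper queryParams, and the ids 111/222/333 are made up for this example, and only the two enc values are copied from the notes above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class EncDemo {
    // Split the query string of a URL into name -> value pairs.
    // Sketch only: no URL decoding, and repeated names overwrite each other.
    static Map<String, String> queryParams(String url) {
        Map<String, String> params = new LinkedHashMap<>();
        int q = url.indexOf('?');
        if (q < 0 || q == url.length() - 1) {
            return params;
        }
        for (String pair : url.substring(q + 1).split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.put(name, value);
        }
        return params;
    }

    public static void main(String[] args) {
        // The ids 111/222/333 are placeholders, not real course data.
        String work = "https://mooc1.chaoxing.com/mooc2/work/list"
                + "?courseId=111&classId=222&cpi=333&enc=27719c62b0263f5249aa52edea81c125&";
        String exam = "https://mooc1.chaoxing.com/mooc2/exam/exam-list"
                + "?enc=a13c03b8366c587872c31e0a5d16eac6&courseid=111&clazzid=222&cpi=333&ut=s";
        // Each link carries its own enc, so one link's value cannot be reused for another.
        System.out.println(queryParams(work).get("enc"));
        System.out.println(queryParams(exam).get("enc"));
    }
}
```

Since the two values differ, an enc taken from one link cannot be reused for another, which is why the links have to be read from the pages themselves.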

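Approach

Fetch the course info -> build the home page link -> visit the exam and homework endpoints -> parse out the unfinished items and add them to the JSON.

With the endpoints in hand, the analysis can start. The exam and homework pages need an enc parameter that is not fixed; the server generates it randomly, so those pages cannot be requested directly. The links on the course home page, in turn, do not include the course's courseid, classid, and cpi.
So the course home page is fetched first to obtain the exam and homework links, then courseid, classid, and cpi are appended and the links are requested to get the exam and homework data.
Finally the data is parsed and output.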
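
The link-building step can be sketched with plain string handling. This is a minimal sketch under assumptions, not the code from this project: CourseLinks, courseHomeUrl, and withIds are hypothetical names, the parameter names mirror the URLs in my notes, and the ids in main are placeholders.

```java
final class CourseLinks {
    // Build the course home page URL from the ids stored for one course.
    static String courseHomeUrl(String courseId, String classId, String cpi) {
        return "https://mooc2-ans.chaoxing.com/mycourse/stu?courseid=" + courseId
                + "&clazzid=" + classId + "&cpi=" + cpi + "&pageHeader=8";
    }

    // Append the stored ids to a link read from the home page.
    // The link is expected to carry its own enc already; this method never invents one.
    static String withIds(String dataUrl, String courseId, String classId, String cpi) {
        String sep;
        if (dataUrl.endsWith("?") || dataUrl.endsWith("&")) {
            sep = "";
        } else if (dataUrl.contains("?")) {
            sep = "&";
        } else {
            sep = "?";
        }
        return dataUrl + sep + "courseid=" + courseId + "&clazzid=" + classId + "&cpi=" + cpi;
    }

    public static void main(String[] args) {
        // Placeholder ids and a placeholder enc, for illustration only.
        System.out.println(courseHomeUrl("111", "222", "333"));
        System.out.println(withIds("https://mooc1.chaoxing.com/mooc2/exam/exam-list?enc=abc",
                "111", "222", "333"));
    }
}
```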

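Login

There is not much to say about login; what matters is the address and the parameters.
The address comes from packet capture.
For the parameters:
work out which are common (the same for everyone)
and which are per-user (everyone has them, but the values differ),
then work out the encryption (here the password is only Base64-encoded).
Done.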

	@Override
	public boolean login() {
		boolean result = false;
		CloseableHttpClient httpclient = HttpClients.createDefault();
		HttpClientContext httpClientContext = HttpClientContext.create();
		HttpPost post = new HttpPost("http://passport2.chaoxing.com/fanyalogin");
		// Base64-encode the password
		String upwbase64password = base64(password);
		List<NameValuePair> paramslist = new ArrayList<NameValuePair>();
		paramslist.add(new BasicNameValuePair("fid", "1971"));// constant
		paramslist.add(new BasicNameValuePair("uname", uname));
		paramslist.add(new BasicNameValuePair("password", upwbase64password));
		paramslist.add(new BasicNameValuePair("refer", "http://i.mooc.chaoxing.com"));// constant
		paramslist.add(new BasicNameValuePair("t", "true"));// constant
		try {
			UrlEncodedFormEntity urlEncodedFormEntity = new UrlEncodedFormEntity(paramslist, "UTF-8");
			post.setEntity(urlEncodedFormEntity);
		} catch (UnsupportedEncodingException e1) {
			e1.printStackTrace();
		}
		CloseableHttpResponse response;
		try {
			response = httpclient.execute(post, httpClientContext);
			if (response.getStatusLine().getStatusCode() == 200) {
				// keep the cookies returned by the server in cookieStore for later requests
				cookieStore = httpClientContext.getCookieStore();
				result = true;
			}
		} catch (ClientProtocolException e) {
			e.printStackTrace();
		} catch (IOException e) {
			e.printStackTrace();
		}
		return result;
	}
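
The parameter analysis can also be tried out without HttpClient. Below is a small standard-library sketch of how the form body for the login request could be put together. LoginForm, encodePassword, and formBody are names made up for this example, the account in main is fake, and URLEncoder.encode(String, Charset) needs Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.Map;

final class LoginForm {
    // Base64 is an encoding, not encryption; the login form just expects the password in this shape.
    static String encodePassword(String password) {
        return Base64.getEncoder().encodeToString(password.getBytes(StandardCharsets.UTF_8));
    }

    // Assemble an application/x-www-form-urlencoded body with the five fields seen in the capture.
    static String formBody(String uname, String password) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("fid", "1971");                          // common parameter (constant)
        fields.put("uname", uname);                         // per-user parameter
        fields.put("password", encodePassword(password));   // per-user parameter, Base64-encoded
        fields.put("refer", "http://i.mooc.chaoxing.com");  // common parameter (constant)
        fields.put("t", "true");                            // common parameter (constant)

        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> e : fields.entrySet()) {
            if (body.length() > 0) {
                body.append('&');
            }
            body.append(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8))
                .append('=')
                .append(URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return body.toString();
    }

    public static void main(String[] args) {
        // Fake credentials, for illustration only.
        System.out.println(formBody("user", "abc123"));
    }
}
```

Printing the body this way makes it easy to compare against a captured request before wiring it into the real client.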

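Crawling the course info

Parameters such as courseid, classid, and cpi are ids that identify the course and the student. They do not change, and the links the site provides later do not contain them (those links are generated dynamically by JS), so they can be stored.
This method crawls the user's home page, extracts the course info, and stores it in the field courselist.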

public void getcourse() {
		courselist = new ArrayList<schoolclass>();
		HashMap<String, String> params = new HashMap<String, String>();
		params.put("courseType", "1");
		params.put("courseFolderId", "0");
		params.put("courseFolderSize", "0");
		HttpResponse response = post("http://mooc1-1.chaoxing.com/visit/courselistdata", params);
		Document document = null;
		try {
			document = Jsoup.parse(EntityUtils.toString(response.getEntity()));
		} catch (ParseException e) {
			e.printStackTrace();
		} catch (IOException e) {
			e.printStackTrace();
		}
		Elements elements = document.getElementsByAttributeValue("class", "course clearfix");
		elements.forEach(element -> {
			String courseid = element.attr("courseId");
			String clazzId = element.attr("clazzId");
			String cpi = element.attr("personId");
			String name = element.child(1).child(0).child(0).child(0).text();
			courselist.add(new schoolclass(name, courseid, clazzId, cpi));
		});
	}
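
The schoolclass type used above is not shown in this post. The following is only a guess at its shape, inferred from the constructor call in getcourse() and the getters called elsewhere in the code; the real class may hold more than this.

```java
// Hypothetical reconstruction of the value holder for one course.
final class schoolclass {
    private final String name;     // course title shown on the page
    private final String courseid; // id of the course
    private final String classid;  // id of the class (the page attribute is called clazzId)
    private final String cpi;      // the page attribute is called personId

    public schoolclass(String name, String courseid, String classid, String cpi) {
        this.name = name;
        this.courseid = courseid;
        this.classid = classid;
        this.cpi = cpi;
    }

    public String getName() { return name; }
    public String getCourseid() { return courseid; }
    public String getClassid() { return classid; }
    public String getCpi() { return cpi; }

    @Override
    public String toString() {
        return name + " [courseid=" + courseid + ", classid=" + classid + ", cpi=" + cpi + "]";
    }

    public static void main(String[] args) {
        // Placeholder values, for illustration only.
        System.out.println(new schoolclass("Example course", "111", "222", "333"));
    }
}
```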

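Building links from the course info to crawl homework and exam data

This method fetches the data for a single course. A second method wraps it to iterate over all courses; see the full code.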

public JSONObject getUncompleteHomeworkAndexame(schoolclass sc) throws ParseException, IOException {
		JSONObject jclass = new JSONObject();
		JSONArray ja = new JSONArray();
		String url = "https://mooc2-ans.chaoxing.com/mycourse/stu?courseid=" + sc.getCourseid() + "&cpi=" + sc.getCpi()
				+ "&clazzid=" + sc.getClassid() + "&pageHeader=8";
		HttpResponse httpResponse = get(url, null);
		Document document = null;
		document = Jsoup.parse(EntityUtils.toString(httpResponse.getEntity()));
		Elements homework = document.getElementsByAttributeValue("title", "作业");
		Elements exame = document.getElementsByAttributeValue("title", "考试");
		String homeworkurl = homework.attr("data-url");
		String exameurl = exame.attr("data-url") + "&courseid=" + sc.getCourseid() + "&cpi=" + sc.getCpi() + "&clazzid="
				+ sc.getClassid();
		HttpResponse homeworkresponse = get(homeworkurl, null);
		Document homeworkdocument = Jsoup.parse(EntityUtils.toString(homeworkresponse.getEntity()));
		Elements homeworks = homeworkdocument.getElementsByTag("li");
		homeworks.forEach(li -> {
			if (li.getElementsByClass("status").text().equals("未交")) {
				String hwname = li.getElementsByClass("overHidden2 fl").text();
				Elements time = li.getElementsByClass("time notOver");
				if(!time.isEmpty()) {
					String timestr=time.text();
					JSONObject jo = new JSONObject();
			        jo.put("名称", hwname);
			        jo.put("类型","作业");
			        jo.put("时间",timestr);
			        ja.add(jo);
				}
			}
		});
		HttpResponse exameresponse = get(exameurl, null);
		Document examedocument = Jsoup.parse(EntityUtils.toString(exameresponse.getEntity()));
		Elements exames = examedocument.getElementsByTag("li");
		exames.forEach(li -> {
			if (li.getElementsByClass("status").text().equals("未完成")) {
				String hwname = li.getElementsByClass("overHidden2 fl").text();
				Elements time = li.getElementsByClass("time notOver");
				if(!time.isEmpty()) {
					String timestr=time.text();
					JSONObject jo = new JSONObject();
			        jo.put("名称", hwname);
			        jo.put("类型","考试");
			        jo.put("时间",timestr);
			        ja.add(jo);
				}
			}
		});
		if(!ja.isEmpty()) {
			jclass.put("name",sc.getName());
			jclass.put("uncompleted", ja);
		}
		return jclass;
	}
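
The homework loop and the exam loop above differ only in the status text they look for and the type label they write. One possible way to share that logic is sketched below on already-parsed rows, so it runs without jsoup. PendingFilter, Row, and pending are hypothetical names; the JSON keys are the same ones used in the method above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class PendingFilter {
    // One parsed <li> row: status text, title, and the "time notOver" text ("" when absent).
    static final class Row {
        final String status;
        final String name;
        final String time;

        Row(String status, String name, String time) {
            this.status = status;
            this.name = name;
            this.time = time;
        }
    }

    // Keep rows whose status equals pendingStatus and that still show a remaining time.
    static List<Map<String, String>> pending(List<Row> rows, String pendingStatus, String type) {
        List<Map<String, String>> out = new ArrayList<>();
        for (Row r : rows) {
            if (pendingStatus.equals(r.status) && !r.time.isEmpty()) {
                Map<String, String> item = new LinkedHashMap<>();
                item.put("名称", r.name); // name
                item.put("类型", type);   // type
                item.put("时间", r.time); // time
                out.add(item);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up rows; "未交" is the status text the homework page uses for "not submitted".
        List<Row> rows = Arrays.asList(
                new Row("未交", "Homework 1", "2 days left"),
                new Row("已完成", "Homework 2", ""));
        System.out.println(pending(rows, "未交", "作业"));
    }
}
```

With a helper like this, the homework pass would use "未交" and "作业", and the exam pass "未完成" and "考试".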
	

Full code

public class caoxing extends mypachongimpl {
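	String password;// password; Base64-encoded before it is sent
	String uname;// account name
	String email;// address that receives the report
	// schoolclass wraps courseid, classid, cpi and related parameters; I no longer remember why there are two lists
	List<schoolclass> classlist = null; 
	List<schoolclass> courselist = null;
	// constructors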
	public caoxing(String uname, String password,String email) {
		this.password = password;
		this.uname = uname;
		this.email = email;
	}
	public caoxing() {
	}
	// Acts as the entry point: turns the JSON built by the other methods into a message and sends it. Call this method to run the crawler.
	public void domain() {
			JSONArray json=null;
			try {
				json = getUncomplete();// returns every uncompleted item as JSON; the main crawling code lives in this method
			} catch (ParseException e1) {
				e1.printStackTrace();
			} catch (IOException e1) {
				e1.printStackTrace();
			}
			// json is still null if getUncomplete() threw, so check for that before using it
			if(json!=null && !json.isEmpty()) {
				Iterator<JSONObject> iter = json.iterator();
				String str="";
				while(iter.hasNext()) {
					JSONObject next = iter.next();
					String classname = (String) next.get("name");
					JSONArray uncompletedlist = (JSONArray) next.get("uncompleted");
					Iterator<JSONObject> listiter = uncompletedlist.iterator();
					String strlist="";
					while(listiter.hasNext()) {
						JSONObject next2 = listiter.next();
						String taskname = (String) next2.get("名称");
						String type = (String) next2.get("类型");
						String time = (String) next2.get("时间");
						// append with +=; plain assignment would keep only the last item of the course
						strlist+="<br>您的: "+taskname+"存在未完成:"+type+" "+time;
					}
					// one block per course: a heading line followed by its unfinished items
					str+="您的 "+classname+" "+"存在未完成作业或考试,名单如下:"+strlist+"<br>";
				}
				mail.sendmail(email,str);// send the mail; mail is my own wrapper class around one of the many mail snippets on CSDN, copied wherever it is needed
			}
	}
	/**
	 * Get all uncompleted homework and exams (including the remaining time)
	 * @return 
	 * @throws ParseException
	 * @throws IOException
	 */
	public JSONArray getUncomplete() throws ParseException, IOException {
		if(courselist==null) getcourse();
		Iterator<schoolclass> iter =courselist.iterator();
		JSONArray ja = new JSONArray();
		while (iter.hasNext()) {
			schoolclass next = iter.next();
			JSONObject jo = getUncompleteHomeworkAndexame(next);
			if(!jo.isEmpty()) {
				ja.add(jo);
			}
		}
		return ja;
	}
	/**
	 * Get the uncompleted homework and exams of a single course (including the remaining time)
	 * @param sc the schoolclass object to query
	 * @return a JSONObject
	 * @throws ParseException
	 * @throws IOException
	 */
	public JSONObject getUncompleteHomeworkAndexame(schoolclass sc) throws ParseException, IOException {
		JSONObject jclass = new JSONObject();
		JSONArray ja = new JSONArray();
		String url = "https://mooc2-ans.chaoxing.com/mycourse/stu?courseid=" + sc.getCourseid() + "&cpi=" + sc.getCpi()
				+ "&clazzid=" + sc.getClassid() + "&pageHeader=8";
		HttpResponse httpResponse = get(url, null);
		Document document = null;
		document = Jsoup.parse(EntityUtils.toString(httpResponse.getEntity()));
		Elements homework = document.getElementsByAttributeValue("title", "作业");
		Elements exame = document.getElementsByAttributeValue("title", "考试");
		String homeworkurl = homework.attr("data-url");
		String exameurl = exame.attr("data-url") + "&courseid=" + sc.getCourseid() + "&cpi=" + sc.getCpi() + "&clazzid="
				+ sc.getClassid();
		HttpResponse homeworkresponse = get(homeworkurl, null);
		Document homeworkdocument = Jsoup.parse(EntityUtils.toString(homeworkresponse.getEntity()));
		Elements homeworks = homeworkdocument.getElementsByTag("li");
		homeworks.forEach(li -> {
			if (li.getElementsByClass("status").text().equals("未交")) {
				String hwname = li.getElementsByClass("overHidden2 fl").text();
				Elements time = li.getElementsByClass("time notOver");
				if(!time.isEmpty()) {
					String timestr=time.text();
					JSONObject jo = new JSONObject();
			        jo.put("名称", hwname);
			        jo.put("类型","作业");
			        jo.put("时间",timestr);
			        ja.add(jo);
				}
			}
		});
		HttpResponse exameresponse = get(exameurl, null);
		Document examedocument = Jsoup.parse(EntityUtils.toString(exameresponse.getEntity()));
		Elements exames = examedocument.getElementsByTag("li");
		exames.forEach(li -> {
			if (li.getElementsByClass("status").text().equals("未完成")) {
				String hwname = li.getElementsByClass("overHidden2 fl").text();
				Elements time = li.getElementsByClass("time notOver");
				if(!time.isEmpty()) {
					String timestr=time.text();
					JSONObject jo = new JSONObject();
			        jo.put("名称", hwname);
			        jo.put("类型","考试");
			        jo.put("时间",timestr);
			        ja.add(jo);
				}
			}
		});
		if(!ja.isEmpty()) {
			jclass.put("name",sc.getName());
			jclass.put("uncompleted", ja);
		}
		return jclass;
	}

	/**
	 * Fetch the course info after login and store it in a member field of this object
	 */
	public void getcourse() {
		courselist = new ArrayList<schoolclass>();
		HashMap<String, String> params = new HashMap<String, String>();
		params.put("courseType", "1");
		params.put("courseFolderId", "0");
		params.put("courseFolderSize", "0");
		HttpResponse response = post("http://mooc1-1.chaoxing.com/visit/courselistdata", params);
		Document document = null;
		try {
			document = Jsoup.parse(EntityUtils.toString(response.getEntity()));
		} catch (ParseException e) {
			e.printStackTrace();
		} catch (IOException e) {
			e.printStackTrace();
		}
		Elements elements = document.getElementsByAttributeValue("class", "course clearfix");
		elements.forEach(element -> {
			String courseid = element.attr("courseId");
			String clazzId = element.attr("clazzId");
			String cpi = element.attr("personId");
			String name = element.child(1).child(0).child(0).child(0).text();
			courselist.add(new schoolclass(name, courseid, clazzId, cpi));
		});
	}

	/**
	 * Must be implemented for get and post to work, because get and post call this method to initialize cookieStore
	 */
	@Override
	public boolean login() {
		boolean result = false;
		CloseableHttpClient httpclient = HttpClients.createDefault();
		HttpClientContext httpClientContext = HttpClientContext.create();
		HttpPost post = new HttpPost("http://passport2.chaoxing.com/fanyalogin");
		String upwbase64password = base64(password);
		List<NameValuePair> paramslist = new ArrayList<NameValuePair>();
		paramslist.add(new BasicNameValuePair("fid", "1971"));
		paramslist.add(new BasicNameValuePair("uname", uname));
		paramslist.add(new BasicNameValuePair("password", upwbase64password));
		paramslist.add(new BasicNameValuePair("refer", "http://i.mooc.chaoxing.com"));
		paramslist.add(new BasicNameValuePair("t", "true"));
		try {
			UrlEncodedFormEntity urlEncodedFormEntity = new UrlEncodedFormEntity(paramslist, "UTF-8");
			post.setEntity(urlEncodedFormEntity);
		} catch (UnsupportedEncodingException e1) {
			e1.printStackTrace();
		}
		CloseableHttpResponse response;
		try {
			response = httpclient.execute(post, httpClientContext);
			if (response.getStatusLine().getStatusCode() == 200) {
				cookieStore = httpClientContext.getCookieStore();
				result = true;
			}

		} catch (ClientProtocolException e) {
			e.printStackTrace();
		} catch (IOException e) {
			e.printStackTrace();
		}
		return result;
	}

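	/**
	 * Base64-encode a string (this is an encoding, not real encryption)
	 * @param str the string to encode
	 * @return the encoded string
	 */
	public static String base64(String str) {
		String result = null;
		// return null directly for null or empty input; the null check has to come first
		if (str == null || str.equals("")) {
			return null;
		}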
		byte[] bytes = str.getBytes();
		result = Base64.getEncoder().encodeToString(bytes);
		return result;
	}
}
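
The message assembly in domain() can also be looked at on its own. Here is a small sketch, assuming the pending items have already been grouped per course as plain strings; MailBody and build are hypothetical names, the sample data is invented, and the English wording stands in for the Chinese text used in the real mail.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class MailBody {
    // Build one HTML fragment: a heading line per course, then one <br>-separated line per item.
    static String build(Map<String, List<String>> pendingByCourse) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, List<String>> course : pendingByCourse.entrySet()) {
            body.append(course.getKey()).append(" has unfinished homework or exams:");
            for (String item : course.getValue()) {
                // Append each item; assigning instead of appending would keep only the last one.
                body.append("<br>").append(item);
            }
            body.append("<br>");
        }
        return body.toString();
    }

    public static void main(String[] args) {
        // Invented sample data, for illustration only.
        Map<String, List<String>> pending = new LinkedHashMap<>();
        pending.put("Math", Arrays.asList("Homework 1 (2 days left)", "Quiz 2 (5 days left)"));
        System.out.println(build(pending));
    }
}
```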