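An introduction to simulating login to the 华软 (SISE) mysise academic affairs system, plus fetching some simple data
Quite a few friends have asked me about this recently, and a lot of it is hard to explain verbally. The old saying still holds: "Talk is cheap. Show me the code." If you finish this article and still cannot follow how the mysise simulated login works, blame my limited powers of explanation!
The simulated login process:
First, let's look at the form on the login page http://class.sise.com.cn:7001/sise/login.jsp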
<form name="form1" method="post" action="login_check_login.jsp">
  <input type="hidden" name="888af9dd5a2c154b78e9bd65747b2a7b" value="8c4251f32f3bd68c2aa22f538b0b533e">
  <input id="random" type="hidden" value="1514133455194" name="random" />
  <input id="token" type="hidden" name="token" />
  <div> <a href="/sise/coursetemp/courseInfo.html" target="_blank"><b>2017-2018学年2学期 排课信息查看</b></a></div>
  <div><font size="2" color="#006666">学号:</font><input name="username" id="username" type="text" size="15" class="notnull" onkeypress="goNext()" ></div>
  <div><font size="2" color="#006666">密码:</font><input name="password" id="password" type="password" size="15" class="notnull" onkeypress="Check_Nums()" ></div>
  <div><input type="button" id="Submit" name="Submit" value=" 登 录 " class="button" onclick="loginwithpwd();" onmouseover="this.style.color='red'" onmouseout="this.style.color='#1e7977'"><input type="button" id="Submit2" name="Submit2" value=" 重 写 " class="button" onclick="resetWin();" onmouseover="this.style.color='red'" onmouseout="this.style.color='#1e7977'"></div>
</form>
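- This article was written during the course selection period, so the form has an extra 排课信息查看 ("view course scheduling information") link. The current mysise login form also has two extra fields to POST, token and random. After some digging I worked out that token is built from random and the JSESSIONID value, MD5-hashed and then mixed together. However, the old data format I used for simulated login in my first year still works, so I will not go into detail. Let's call this one the new login method; the code below includes the token generation.
- Now let's analyze the login itself: what data do we actually send to mysise? Let's capture a request and see: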
POST /sise/login_check_login.jsp HTTP/1.1
Host: class.sise.com.cn:7001
Content-Length: 182
Cache-Control: max-age=0
Origin: http://class.sise.com.cn:7001
Upgrade-Insecure-Requests: 1
Content-Type: application/x-www-form-urlencoded
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.84 Safari/537.36
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8
Referer: http://class.sise.com.cn:7001/sise/login.jsp
Accept-Language: zh-CN,zh;q=0.9
Cookie: JSESSIONID=y28vh1XPx0mplcym62G5nGCBFNSfwrJf1yvS1PdhG1dPjQ7GJQ1r!-2087883253
Connection: close

888af9dd5a2c154b78e9bd65747b2a7b=8c4251f32f3bd68c2aa22f538b0b533e&random=1514133455194&token=5155F144B17333B4852571992468DC49247C2ED10A6EC&username=<student ID>&password=<password>
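Let's go through it briefly. Does 888af9dd5a2c154b78e9bd65747b2a7b=8c4251f32f3bd68c2aa22f538b0b533e look familiar? Right, it is the hidden input from the login form. Both 888af9dd5a2c154b78e9bd65747b2a7b and 8c4251f32f3bd68c2aa22f538b0b533e are 32 characters long. After a bit of frowning I found that 888af9dd5a2c154b78e9bd65747b2a7b is the MD5 of my local IP, and 8c4251f32f3bd68c2aa22f538b0b533e is an MD5 built from my local IP plus "sise" (the code below computes it as md5(md5(ip) . "sise")). The random=1514133455194&token=5155F144B17333B4852571992468DC49247C2ED10A6EC part did not exist when I first arrived as a freshman, and login still works without it, so it can be dropped. The username=...&password=... part (student ID and password) needs no explanation.
The request line POST /sise/login_check_login.jsp shows that the data is sent by POST to the page /sise/login_check_login.jsp. Now that the principle is clear, let's simulate the login in code. I use PHP here (Java and others work the same way; no language wars!).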
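Before the full class, the hidden field pair from the analysis above can be sketched in isolation. This is only a minimal sketch under stated assumptions: `hidden_pair()` and `build_login_body()` are hypothetical helper names, the IP and credentials are placeholders, and the value formula follows the class below (`md5(md5(ip) . "sise")`). If the pair in your own login form does not match, the formula is the first thing to re-check.

```php
<?php
// Hypothetical helpers for illustration; these names are not from the original code.

// Build the hidden field: name = md5(ip), value = md5(md5(ip) . "sise").
function hidden_pair(string $ip): array {
    $name = md5($ip);
    return ['name' => $name, 'value' => md5($name . 'sise')];
}

// Assemble the old-style (mode 1) POST body.
// http_build_query URL-encodes every value, so "&" or "=" in a password is safe.
function build_login_body(string $ip, string $user, string $pass): string {
    $pair = hidden_pair($ip);
    return http_build_query([
        $pair['name'] => $pair['value'],
        'username'    => $user,
        'password'    => $pass,
    ]);
}

// Placeholder IP and credentials, not real data.
echo build_login_body('192.0.2.10', 'student-id', 'p@ss&word'), "\n";
```

Comparing `hidden_pair()` for your own IP with the hidden input in the login page source is a quick way to check that the IP you pass in is the one the server sees.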
<?php
// mysiselogin.php
class Mysise_Login {
    private $username;
    private $password;
    private $myIp;
    private $normalMode; // mode: 1 builds the old login payload, 2 builds the new one
    private $url = "http://class.sise.com.cn:7001/sise/";
    private $cookie = '';
    private $random = '';

    public function __construct($user, $passwd, $ip, $mode = 1) {
        $this->username = $user;
        $this->password = $passwd;
        $this->myIp = $ip;
        $this->normalMode = $mode;
    }

    // Build the POST body for the selected mode
    public function get_post_data() {
        $ipmd5 = md5($this->myIp);
        $ipsisemd5 = md5(md5($this->myIp) . "sise");
        $datas = $ipmd5 . "=" . $ipsisemd5;
        // URL-encode the credentials so characters such as & or = cannot break the form body
        $credentials = "&username=" . urlencode($this->username) . "&password=" . urlencode($this->password);
        if ($this->normalMode == 1) {
            $datas .= $credentials;
        } else if ($this->normalMode == 2) {
            $token = $this->getToken(); // also fills $this->random
            $datas .= "&random=" . $this->random . "&token=" . $token . $credentials;
        }
        return $datas;
    }

    // Fetch the login page, read JSESSIONID and random from it, and build the token
    private function getToken() {
        $content = $this->getResponse();
        $preg_name = "/JSESSIONID=(.*?)!/"; // the cookie value up to the "!"
        preg_match_all($preg_name, $content, $cookie_info);
        $this->cookie = $cookie_info[1][0];
        $preg_name = "/<input id=\"random\" type=\"hidden\" value=\"(.*?)\" name=\"random\" \/>/";
        preg_match_all($preg_name, $content, $random_info);
        $this->random = $random_info[1][0];
        $value = strtoupper(md5($this->url . $this->cookie . $this->random));
        $len = strlen($value);
        $randomlen = strlen($this->random);
        $token = '';
        // Interleave: after each of the first $randomlen hash characters, insert the matching character of random
        for ($index = 0; $index < $len; $index++) {
            $token .= $value[$index];
            if ($index < $randomlen) $token .= $this->random[$index];
        }
        return $token;
    }

    // Request the site and return the raw response, headers included
    private function getResponse() {
        $ch = curl_init($this->url);
        curl_setopt($ch, CURLOPT_HEADER, true); // include response headers in the result
        curl_setopt($ch, CURLOPT_NOBODY, false); // keep the body as well
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the response instead of printing it
        $content = curl_exec($ch); // execute and store the result
        curl_close($ch);
        return $content;
    }
}
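The interleaving in getToken() can be checked against the token from the captured request. The sketch below is an illustration, not part of the original code: `interleave_token()` mirrors the loop in getToken(), and `split_token()` is a hypothetical inverse that separates a token back into its hash and random parts. It assumes the hash is at least as long as random, which holds for a 32-character MD5 and a 13-digit random.

```php
<?php
// Hypothetical helpers for illustration; only the interleaving rule comes from getToken().

// Same rule as getToken(): after each of the first strlen($random) hash
// characters, append the matching character of $random.
function interleave_token(string $hash, string $random): string {
    $token = '';
    $randomLen = strlen($random);
    for ($i = 0, $len = strlen($hash); $i < $len; $i++) {
        $token .= $hash[$i];
        if ($i < $randomLen) {
            $token .= $random[$i];
        }
    }
    return $token;
}

// Inverse: within the first 2 * $randomLen characters, odd positions belong to random.
function split_token(string $token, int $randomLen): array {
    $hash = '';
    $random = '';
    for ($i = 0, $len = strlen($token); $i < $len; $i++) {
        if ($i < 2 * $randomLen && $i % 2 === 1) {
            $random .= $token[$i];
        } else {
            $hash .= $token[$i];
        }
    }
    return [$hash, $random];
}

// Token and random length taken from the captured request above.
[$hash, $random] = split_token('5155F144B17333B4852571992468DC49247C2ED10A6EC', 13);
echo $random, "\n";       // 1514133455194
echo strlen($hash), "\n"; // 32
```

The recovered random equals the random field posted in the same request, which is consistent with the token being a 32-character uppercase MD5 with random woven into its first 13 positions.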
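- When get_post_data is called, it builds and returns the login payload in the format selected by the mode argument that was passed in when the Mysise_Login class was instantiated.
- Now let's use the generated payload to simulate a login. The login functions come first, followed by the code that calls them.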
<?php
// utils.php
class Utils {
    public static $loginUrl = "http://class.sise.com.cn:7001/sise/login_check_login.jsp"; // login URL
    public static $schedularUrl = "http://class.sise.com.cn:7001/sise/module/student_schedular/student_schedular.jsp"; // timetable URL

    public static function login_post($url, $cookie, $post) {
        $curl = curl_init(); // initialize a cURL handle
        curl_setopt($curl, CURLOPT_URL, $url); // address the login form is submitted to
        curl_setopt($curl, CURLOPT_HEADER, false); // do not include response headers in the result
        curl_setopt($curl, CURLOPT_RETURNTRANSFER, true); // return the response as a string instead of printing it
        curl_setopt($curl, CURLOPT_COOKIEJAR, $cookie); // file that received cookies are saved to
        curl_setopt($curl, CURLOPT_POST, true); // submit with POST
        curl_setopt($curl, CURLOPT_POSTFIELDS, $post); // data to submit
        $rs = curl_exec($curl); // execute the request
        curl_close($curl); // close the handle and free its resources
        return $rs;
    }

    public static function get_content($url, $cookie) {
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_URL, $url); // address to fetch
        curl_setopt($ch, CURLOPT_HEADER, false); // do not include response headers in the result
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); // follow redirects
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the response as a string
        curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie); // read cookies from the saved file
        // curl_setopt($ch, CURLOPT_COOKIE, $cookie); // alternative: set the cookie string directly
        $rs = curl_exec($ch); // fetch the page content
        curl_close($ch);
        return $rs;
    }
}
- Below is the code for the whole simulated login process
<?php
// index.php
require_once('mysiselogin.php');
require_once('utils.php');
$cookie = dirname(__FILE__) . '/cookie.txt';
$login = new Mysise_Login('student ID', 'password', 'your computer IP');
// $login = new Mysise_Login('student ID', 'password', 'your computer IP', 2); // new login method
$datas = $login->get_post_data();
Utils::login_post(Utils::$loginUrl, $cookie, $datas);
header("Content-type: text/html; charset=gb2312");
// Mysise pages are GBK-encoded, so the charset has to be declared when echoing them directly; otherwise the browser shows garbled text
echo Utils::get_content(Utils::$schedularUrl, $cookie);
If everything was done correctly, visiting index.php now shows the timetable data.
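As an alternative to declaring the page's charset in the response header, the fetched body can be converted to UTF-8 first. This is a small sketch rather than part of the original code: `gbk_to_utf8()` is a hypothetical helper name, and it assumes PHP's iconv extension is available.

```php
<?php
// Hypothetical helper: convert a GBK-encoded page body to UTF-8.
function gbk_to_utf8(string $body): string {
    $converted = iconv('GBK', 'UTF-8', $body);
    // iconv returns false on failure; fall back to the raw bytes in that case.
    return $converted === false ? $body : $converted;
}

// "\xD6\xD0\xCE\xC4" is the GBK byte sequence for the two characters 中文.
echo gbk_to_utf8("\xD6\xD0\xCE\xC4"), "\n"; // 中文
```

With the body converted this way, the header would declare `charset=utf-8` instead, and any later string matching can work on UTF-8 text.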
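But a timetable like this looks just like the one we always see on mysise, with no style of its own, doesn't it? So here is some quickly written regex code that pulls out the individual pieces of data. With those in hand you could send yourself email reminders, WeChat bot reminders, and so on (use your imagination).
- And that is how the timetable data gets extracted (writing these regexes nearly killed me; had I known, I would not have used regex at all).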
<?php
function get_schedular($content) {
    $content = iconv('GBK', 'UTF-8', $content);
    // Student header row: student number, 姓名 (name), 年级 (grade), 专业 (major)
    $preg_name = "/<td width=\"70%\" nowrap>\<span class=\"style15\"> \<span class=\"style16\">(.*?) 姓名:(.*?) 年级:(.*?) 专业:(.*?)<\/span> <\/span><\/td>/";
    preg_match_all($preg_name, $content, $name_info);
    $stu = array();
    $stu['stuNumber'] = $name_info[1][0];
    $stu['stuName'] = $name_info[2][0];
    $stu['stugrade'] = $name_info[3][0];
    $stu['stuMajor'] = $name_info[4][0];
    echo "<br/>Student info:";
    print_r($stu);
    echo "<br/>Teaching week:";
    // 教学周: 第N周 = "teaching week: week N"
    preg_match_all("/(教学周: 第(.*?)周)/", $content, $teach_time);
    $stu_now_week = $teach_time[2][0];
    echo $stu_now_week;
    echo "<br/>Class time slots:<br/>";
    $preg_time = "/<td width='10%' align='center' valign='top' class='font12'>(.*?)节<br>(.*?)<\/td><td width='10%' align='left' valign='top' class='font12'>/";
    preg_match_all($preg_time, $content, $schooltime);
    $class_time = array();
    // Number of time slots. The commented-out code dates from when I assumed 12:30-13:50 was a break
    // that had to be skipped; many online classes now run at 12:30-13:50, so it is read like any other slot.
    $class_time['time_num'] = count($schooltime[2]);
    for ($num = 0; $num < $class_time['time_num']; $num++) {
        /**
         * if ($num == 2) {
         * continue;
         * }
         */
        $class_time[] = $schooltime[2][$num];
        echo $schooltime[2][$num];
        /**
         * if ($num > 2) {
         * $num1 = $num;
         * } else {
         * $num1 = $num + 1;
         * }
         */
        $num1 = $num + 1;
        echo " period " . $num1 . "<br/>";
    }
    echo "<br/><pre>";
    $preg = "/<td width='10%' align='left' valign='top' class='font12'>(.*?)<\/td>/si";
    preg_match_all($preg, $content, $arr);
    $subject = array();
    // Plain locals instead of static, so a second call starts from the first cell again
    $vline = 0; // row index (time slot)
    $hline = 1; // column index (day of week, 1-7)
    $arr_size = count($arr[1]);
    for ($subject_count = 0; $subject_count != $arr_size; $subject_count++) {
        if ($hline > 7) {
            $hline = 1;
            $vline += 1;
        }
        $subject[$hline][$vline] = $arr[1][$subject_count];
        if ($arr[1][$subject_count] != " ") {
            $class_content = $subject[$hline][$vline];
            $preg_hz = "/[\x{4e00}-\x{9fa5}a-zA-Z 0-9]{2,}\(/u";
            preg_match_all($preg_hz, $class_content, $class_name_info);
            $class_name = substr($class_name_info[0][0], 0, strlen($class_name_info[0][0]) - 1); // course name
            $preg_name = "/\((.*?)\)/";
            preg_match_all($preg_name, $class_content, $class_details);
            $preg_hz = "/[\x{4e00}-\x{9fa5}a-zA-Z0-9]{2,}/u";
            preg_match_all($preg_hz, $class_details[1][0], $hz);
            echo "<br/>";
            $class_learn_class = $hz[0][0]; // teaching class
            $class_teacher = $hz[0][1]; // teacher
            echo "Teaching class: " . $class_learn_class . "<br/>";
            // echo $class_name . " " . $class_teacher . "<br/>";
            echo "Course name: " . $class_name . "<br/>";
            echo "Teacher: " . $class_teacher . "<br/>";
            $preg = "/(\ \d+){1,}/";
            preg_match_all($preg, $class_details[1][0], $src_arr);
            $preg_name = "/(\d+){1,}/";
            preg_match_all($preg_name, $src_arr[0][0], $arr1);
            $WeekNum = count($arr1[0]) - 1;
            $class_weeks = ''; // comma-separated list of the weeks this class runs
            for ($num = 0; $num <= $WeekNum; $num++) {
                $class_weeks .= $arr1[0][$num] . ",";
            }
            $preg_name = "/\[(.*?)\]/";
            preg_match_all($preg_name, $class_details[1][0], $class_room_info);
            $class_room = $class_room_info[1][0];
            $which_class = $vline + 1;
            echo "<br/>Day " . $hline . ", period " . $which_class . "<br/>";
            echo "<br/>Classroom: " . $class_room . "<br/>";
            $all_WeekNum = $WeekNum + 1;
            echo "<br/>" . $all_WeekNum . " weeks in total<br/>";
            print_r($arr1[0]);
            echo "<br/>";
        }
        $hline += 1;
    }
}
?>
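Since the regex route turned out to be painful, here is a hedged sketch of one alternative: collecting the same timetable cells with PHP's DOM extension. It is not the author's method. `timetable_cells()` is a hypothetical helper, the sample HTML is made up to imitate the `td` attributes matched by the regexes above, and the real page may need further handling.

```php
<?php
// Hypothetical alternative to the regex approach: collect timetable cells with DOMXPath.
function timetable_cells(string $utf8Html): array {
    $doc = new DOMDocument();
    $prev = libxml_use_internal_errors(true); // collect warnings from sloppy markup instead of printing them
    // Prefix a charset hint so libxml treats the input as UTF-8.
    $doc->loadHTML('<meta http-equiv="Content-Type" content="text/html; charset=utf-8">' . $utf8Html);
    libxml_clear_errors();
    libxml_use_internal_errors($prev);

    $xpath = new DOMXPath($doc);
    $cells = [];
    // Same cells the regex targets: left-aligned td elements with class font12.
    foreach ($xpath->query("//td[@class='font12' and @align='left']") as $td) {
        $cells[] = trim($td->textContent);
    }
    return $cells;
}

// Made-up fragment imitating the cell markup; not real mysise output.
$sample = "<table><tr>"
    . "<td width='10%' align='center' valign='top' class='font12'>1 - 2</td>"
    . "<td width='10%' align='left' valign='top' class='font12'>Math(CS01 Zhang 1 2 3 [A101])</td>"
    . "<td width='10%' align='left' valign='top' class='font12'></td>"
    . "</tr></table>";
print_r(timetable_cells($sample));
```

The cells come back in document order, so the same day and period bookkeeping (seven cells per row) can be applied to the returned array.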
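A few closing words:
- Combined with a WeChat official account or the like, you could build plenty of other handy things, but I am too lazy to do it. If you want the exam timetable, attendance records, and so on, take the links from the page http://class.sise.com.cn:7001/sise/module/student_schedular/student_schedular.jsp yourself and extract the data from there. (runs away)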
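As a starting point for collecting those links, here is a rough sketch. It assumes the links appear as ordinary `<a href>` anchors, which may not hold for the real page (they could be built by JavaScript). `extract_links()` is a hypothetical helper and the sample HTML is made up.

```php
<?php
// Hypothetical helper: list absolute URLs of <a href> links in a page, resolved against $baseUrl.
// Relative paths are joined to the base directory as-is; "../" segments are not normalized.
function extract_links(string $html, string $baseUrl): array {
    if ($html === '') {
        return [];
    }
    $doc = new DOMDocument();
    $prev = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($prev);

    $parts = parse_url($baseUrl);
    $origin = $parts['scheme'] . '://' . $parts['host']
        . (isset($parts['port']) ? ':' . $parts['port'] : '');
    $dir = $origin . rtrim(dirname($parts['path']), '/') . '/';

    $links = [];
    foreach ($doc->getElementsByTagName('a') as $a) {
        $href = trim($a->getAttribute('href'));
        if ($href === '' || $href[0] === '#') {
            continue; // skip empty links and in-page anchors
        }
        if (preg_match('#^https?://#i', $href)) {
            $links[] = $href;           // already absolute
        } elseif ($href[0] === '/') {
            $links[] = $origin . $href; // root-relative
        } else {
            $links[] = $dir . $href;    // relative to the page's directory
        }
    }
    return $links;
}

// Made-up sample; the link targets are placeholders.
$base = 'http://class.sise.com.cn:7001/sise/module/student_schedular/student_schedular.jsp';
print_r(extract_links('<a href="/sise/a.jsp">A</a><a href="b.jsp">B</a><a href="#top">Top</a>', $base));
```

Each resulting URL could then be fetched with Utils::get_content and the saved cookie file.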