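Background:
Right after graduating, before I had even paid off my student loan, a pretty girl talked me into a year of EF English, which cost me 6K+ RMB. I was lazy and learned next to nothing that year (I want to kick myself when I think about it now). Later I wanted to study again, but the account had expired. Later still I found a way to get an EF study account, but without the password. No problem: I figured a small crawler-style script could take care of that.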
The EF login link is:
http://www.ef.com.cn/partner/englishcenters/cn
Following this example:
http://www.2cto.com/kf/201401/275152.html
I wrote a script that logs in to EF (English First). Enough talk, here is the code:
#! /usr/bin/env python
#coding:utf-8
import sys
import re
import urllib2
import urllib
import cookielib
## This block works around encoding errors with Chinese text (Python 2 only)
reload(sys)
sys.setdefaultencoding("utf8")
#####################################################
loginurl = 'https://securecn1.englishtown.com/login/handler.ashx'
logindomain = 'securecn1.englishtown.com'
cj = cookielib.LWPCookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj) , urllib2.HTTPHandler)
urllib2.install_opener(opener)
loginparams = {'domain':logindomain,'username':'xxx', 'password':'asdfg'}
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/21.0.1180.89 Safari/537.1',
    'Content-Type': 'application/x-www-form-urlencoded',
    'Connection': 'keep-alive',
    'Referer': 'http://www.ef.com.cn/partner/englishcenters/cn',
    'Origin': 'http://www.ef.com.cn',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
}
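req = urllib2.Request(loginurl, urllib.urlencode(loginparams), headers=headers)
# Send the POST once; the installed opener already carries the cookie jar.
# (Calling both urllib2.urlopen(req) and opener.open(req) submitted the login twice.)
response = urllib2.urlopen(req)
thePage = response.read()
# Write the response body to a file instead of redirecting sys.stdout
with open('1.txt', 'w') as f:
    f.write(thePage)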
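For anyone reading this on Python 3, here is a rough sketch of how the same request could be put together. This is my own port, not something tested against the EF server: the function names `build_login_request` and `build_opener_with_cookies` are made up for illustration, and the only firm facts are that `urllib2` and `cookielib` became `urllib.request` and `http.cookiejar`, and that the POST body has to be bytes.

```python
# Python 3 sketch of the request built in the script above.
# Assumption: function names are illustrative; URL, domain and form fields
# are copied from the original script.
import http.cookiejar
import urllib.parse
import urllib.request

LOGIN_URL = 'https://securecn1.englishtown.com/login/handler.ashx'
LOGIN_DOMAIN = 'securecn1.englishtown.com'


def build_login_request(username, password):
    """Build (but do not send) the login POST request."""
    params = {'domain': LOGIN_DOMAIN, 'username': username, 'password': password}
    headers = {
        'Content-Type': 'application/x-www-form-urlencoded',
        'Referer': 'http://www.ef.com.cn/partner/englishcenters/cn',
        'Origin': 'http://www.ef.com.cn',
    }
    # In Python 3 the POST body must be bytes, not str
    data = urllib.parse.urlencode(params).encode('ascii')
    return urllib.request.Request(LOGIN_URL, data=data, headers=headers)


def build_opener_with_cookies():
    """Return an opener that keeps cookies between requests, plus its jar."""
    jar = http.cookiejar.LWPCookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


if __name__ == '__main__':
    # Only builds and inspects the request; nothing is sent over the network
    req = build_login_request('xxx', 'asdfg')
    print(req.get_method(), req.full_url)
    print(req.data)
```

Sending it would then be `opener.open(req)` on the opener returned by `build_opener_with_cookies()`, called once.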
Capturing the traffic with Fiddler and reading the page source showed that the EF login goes over HTTPS. So why does the code above work at all? I don't get it!
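A likely explanation for the puzzle: `build_opener` adds a set of default handlers on top of whatever you pass in, and when Python is built with SSL support that set includes an HTTPS handler, so `https://` URLs work without extra code. The check below is a small Python 3 sketch of that idea (in Python 2 the module is `urllib2`); the function name `default_opener_handles_https` is my own.

```python
# Sketch: check whether the default opener can already speak HTTPS.
import urllib.request


def default_opener_handles_https():
    """Return True if build_opener() installs an HTTPSHandler by default."""
    # HTTPSHandler is only defined when Python was built with SSL support
    https_cls = getattr(urllib.request, 'HTTPSHandler', None)
    if https_cls is None:
        return False
    opener = urllib.request.build_opener()
    return any(isinstance(h, https_cls) for h in opener.handlers)


if __name__ == '__main__':
    print('HTTPS handled by default:', default_opener_handles_https())
```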
Searching for more tutorials, I found an approach for HTTPS login:
http://bbs.acehat.com/thread-3894-1-1.html
So I wrote another script:
#! /usr/bin/env python
#coding:utf-8
import sys
import re
import urllib2
import urllib
import cookielib
import httplib
host = 'securecn1.englishtown.com'
url = '/login/handler.ashx'
# Note: the original built a 'Cookie: csrftoken=...' header from an empty
# CookieJar object, which only sent the object's repr; it is dropped here.
loginparams = {
    'username': 'xxxx',
    'password': 'asdfg',
}
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/21.0.1180.89 Safari/537.1',
    'Content-Type': 'application/x-www-form-urlencoded',
    'Connection': 'keep-alive',
    'Referer': 'http://www.ef.com.cn/partner/englishcenters/cn',
    'Origin': 'http://www.ef.com.cn',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
}
conn = httplib.HTTPSConnection(host, 443)
params = urllib.urlencode(loginparams)
conn.request("POST", url, params, headers)
response = conn.getresponse()
hdata = response.getheaders()
# Write the headers and status to a file instead of redirecting sys.stdout
with open('1.txt', 'w') as f:
    f.write('%s\n' % (hdata,))
    f.write('Login: %s %s\n' % (response.status, response.reason))
conn.close()
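With the code above, I found that hdata prints different results depending on whether the password is right or wrong.
If the password is correct, the result is: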
'location', 'http://www.ef.com.cn/login/handler.ashx?sGoLVFJqFaPhTdqxYczULL237yUIbiadW++Ny78xbjm3qTB+BI428SsGmOXv5OcKtDTD4h7KYnphLyTUf1b2gtRA6ut4b2sS7Ci+aIb8LTAXOFBwOylPOxr/AAGZdCDpmC3mUUzhpxaqU+qyb2L32S1TjXLvMo/eQi22CMs5vcD4AnlBu8XAVJd8y+djZIL3E96gZjyggj1vEQRJ2NVpdvYSC+vMhxXxSWSyxd9vrPntalVUV+y0bVAoJSbO7fhySFZMUVRXoLMOoW+1VaZ2sfZZPLZWjgWeMHBEphCpjR3WQFNYxR6pBxGo7+TnaNpAnwgwe1jNdCj2mjF6X2VSIb576RHgs6UK8pgjJmX0XeAa5uRSOk6YW3zth28Dy1/LCCiMGXLl+L2xRE1bV9NVzfWz0XWwHVaKQT/M78myLHCHqu351GHBbLlG9xqJglZfAK91fjLdy0+MCIoH2JW7P11HbWCxYf2Qs3QsSB+QN84psFKs/p7vf26zXg2u3YtIlHZaLWqYf9af7tZrUStcXCoWSgBD2SNNP6fJvEtRKz0hi2TxTcvdIR/lMV7BdFMP/lUXOHDO7/wT/It6HVqdZd20PNLITFjrxpBpQbfiw0U=')
If it is wrong, the result is:
'location', 'http://www.ef.com.cn/partner/englishcenters/cn?auth=false'
That is already enough information for brute force cracking~
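The reason the second script sees the `location` header at all is that `httplib` (`http.client` in Python 3) does not follow redirects on its own, so the raw redirect is there to inspect. Telling the two cases apart could look roughly like the Python 3 sketch below. The function name `login_succeeded` is made up, and treating every redirect without `auth=false` as a success is my assumption, since only the two cases above were observed.

```python
# Sketch only: classify a login attempt from the redirect header.
from urllib.parse import urlsplit, parse_qs


def login_succeeded(headers):
    """Return True if the redirect does not carry auth=false.

    headers: list of (name, value) tuples, the shape returned by
    http.client.HTTPResponse.getheaders().
    Assumption: any redirect without auth=false counts as a success.
    """
    for name, value in headers:
        if name.lower() == 'location':
            query = parse_qs(urlsplit(value).query)
            return query.get('auth') != ['false']
    # No redirect at all: treat as failure (assumption)
    return False


if __name__ == '__main__':
    failed = [('location', 'http://www.ef.com.cn/partner/englishcenters/cn?auth=false')]
    # 'TOKEN' is a made-up stand-in for the long opaque value shown above
    passed = [('location', 'http://www.ef.com.cn/login/handler.ashx?TOKEN')]
    print(login_succeeded(failed), login_succeeded(passed))
```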