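Weibo's login JavaScript gets updated really quickly. I tried to get login working once before and didn't manage it; now that I finally have some time, I'm putting this little tool together, partly as practice.
I referred to a few earlier articles: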
http://www.douban.com/note/201767245/
http://blog.csdn.net/monsion/article/details/8656690
http://blog.csdn.net/huyoo/article/details/11952603
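With these articles I basically figured out the login process.
Opening http://login.sina.com.cn/ and looking at the source shows that the JS script used for login is requested with Request URL: https://login.sina.com.cn/sso/login.php?client=ssologin.js(v1.4.15)&_=1399711215289
The number at the very end is the request time. The older version (v1.4.4) generated a time value a few digits shorter than the current one, so the time needs a small change; see the code for details.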
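As a minimal sketch of the timestamp point: the `_` value captured above (1399711215289) has 13 digits, which is consistent with a Unix time in milliseconds, while a plain seconds value has 10, which may be what the shorter older form looked like. The helper name `request_timestamp_ms` is made up for illustration, and the snippet is written for Python 3 rather than the Python 2 used in the listings.

```python
import time

def request_timestamp_ms(now=None):
    """Return a Unix timestamp in milliseconds, the form of the `_` parameter."""
    if now is None:
        now = time.time()
    return int(now * 1000)

# The value captured from the browser request quoted above
captured = 1399711215289
print(len(str(captured)))            # digit count of the millisecond form
print(len(str(captured // 1000)))    # digit count of the same instant in seconds
print(request_timestamp_ms())
```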
weiboLogin.py
#! /usr/bin/env python
# -*- coding: utf-8 -*-
import sys
import urllib
import urllib2
import cookielib
import base64
import re
import json
import hashlib
import rsa
import binascii
import time
class weiboLogin:
    cj = cookielib.LWPCookieJar()
    cookie_support = urllib2.HTTPCookieProcessor(cj)
    opener = urllib2.build_opener(cookie_support, urllib2.HTTPHandler)
    urllib2.install_opener(opener)
    postdata = {
        'entry': 'weibo',
        'gateway': '1',
        'from': '',
        'savestate': '7',
        'userticket': '1',
        'ssosimplelogin': '1',
        'vsnf': '1',
        'vsnval': '',
        'su': '',
        'service': 'miniblog',
        'servertime': '',
        'nonce': '',
        'pwencode': 'rsa2',
        'sp': '',
        'encoding': 'UTF-8',
        'prelt': '115',
        'rsakv': '',
        'url': 'http://weibo.com/ajaxlogin.php?framelogin=1&callback=parent.sinaSSOController.feedBackUrlCallBack',
        'returntype': 'META'
    }
    def get_servertime(self, username):
        # Millisecond timestamp, matching the longer value used by v1.4.15
        curtime = int(time.time() * 1000)
        url = r'http://login.sina.com.cn/sso/prelogin.php?entry=weibo&callback=sinaSSOController.preloginCallBack&su=%s&rsakt=mod&checkpin=1&client=ssologin.js(v1.4.15)&_=' % username + str(curtime)
        # print url
        data = urllib2.urlopen(url).read()
        # The reply is JSONP: capture the JSON between the parentheses
        p = re.compile(r'\((.*)\)')
        # print data
        try:
            json_data = p.search(data).group(1)
            data = json.loads(json_data)
            servertime = str(data['servertime'])
            nonce = data['nonce']
            pubkey = data['pubkey']
            rsakv = data['rsakv']
            return servertime, nonce, pubkey, rsakv
        except (AttributeError, ValueError, KeyError):
            print 'Get servertime error!'
            return None
    def get_pwd(self, password, servertime, nonce, pubkey):
        rsaPublickey = int(pubkey, 16)
        key = rsa.PublicKey(rsaPublickey, 65537)  # build the public key
        # Plaintext layout, taken from the JS encryption file
        message = str(servertime) + '\t' + str(nonce) + '\n' + str(password)
        passwd = rsa.encrypt(message, key)  # encrypt
        passwd = binascii.b2a_hex(passwd)  # convert the ciphertext to hex
        return passwd
    def get_user(self, username):
        username_ = urllib.quote(username)
        username = base64.encodestring(username_)[:-1]
        return username

    def get_account(self, filename):
        f = file(filename)
        flag = 0
        for line in f:
            if flag == 0:
                username = line.strip()
                flag += 1
            else:
                pwd = line.strip()
        f.close()
        # print username,' ',pwd
        return username, pwd
    def login(self, filename):
        username, pwd = self.get_account(filename)
        url = 'http://login.sina.com.cn/sso/login.php?client=ssologin.js(v1.4.15)'
        prelogin = self.get_servertime(username)
        if prelogin is None:
            print 'get servertime error!'
            return 0
        servertime, nonce, pubkey, rsakv = prelogin
        print servertime
        print nonce
        print pubkey
        print rsakv
        # Fill in a copy so the class-level template dict stays reusable
        postdata = dict(weiboLogin.postdata)
        postdata['servertime'] = servertime
        postdata['nonce'] = nonce
        postdata['rsakv'] = rsakv
        postdata['su'] = self.get_user(username)
        postdata['sp'] = self.get_pwd(pwd, servertime, nonce, pubkey)
        print postdata['su'], postdata['sp']
        postdata = urllib.urlencode(postdata)
        headers = {'User-Agent':'Mozilla/5.0 (X11; Linux i686; rv:8.0) Gecko/20100101 Firefox/8.0 Chrome/20.0.1132.57 Safari/536.11'}
        req = urllib2.Request(
            url = url,
            data = postdata,
            headers = headers
        )
        result = urllib2.urlopen(req)
        text = result.read()
        self.writefile('./output/textlogin', text)
        # Decode \uXXXX escapes without calling eval() on text from the network
        self.writefile('./output/resultlogin', text.decode('unicode_escape'))
        # The referenced blog post used double quotes here, which is wrong; single quotes work
        p = re.compile(r'location\.replace\(\'(.*)\'\)')
        try:
            login_url = p.search(text).group(1)
            print login_url
            urllib2.urlopen(login_url)
            print "Login success!"
            return 1
        except (AttributeError, urllib2.URLError):
            print 'Login error!'
            return 0
    def writefile(self, filename, content):
        fw = file(filename, 'w')
        fw.write(content)
        fw.close()
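The string-handling steps in weiboLogin.py can be tried offline, without touching the network or the `rsa` package. The following is only a rough sketch under a few assumptions: it is written for Python 3 (the listing above is Python 2), the function names are made up for illustration, the sample reply and credentials are invented, and `base64.b64encode` stands in for `encodestring(...)[:-1]`, which gives the same result for short usernames. The redirect pattern accepts either quote style, since the quoting differed between the referenced post and what the server returned here.

```python
import base64
import json
import re
from urllib.parse import quote

def parse_prelogin(body):
    """Extract the JSON object from a JSONP reply such as callback({...})."""
    match = re.search(r'\((.*)\)', body, re.S)
    if match is None:
        raise ValueError('no JSONP payload found')
    return json.loads(match.group(1))

def encode_username(username):
    """URL-quote the username, then base64-encode it (the `su` field)."""
    return base64.b64encode(quote(username).encode('ascii')).decode('ascii')

def build_plaintext(servertime, nonce, password):
    """Plaintext that get_pwd encrypts: servertime TAB nonce NEWLINE password."""
    return '%s\t%s\n%s' % (servertime, nonce, password)

def extract_redirect(html):
    """Find the URL inside location.replace(...), with either quote style."""
    match = re.search(r'location\.replace\(([\'"])(.*?)\1\)', html)
    return match.group(2) if match else None

# Invented sample data, for illustration only
sample_reply = ('sinaSSOController.preloginCallBack({"servertime": 1399711215, '
                '"nonce": "ABC123", "pubkey": "EB2A", "rsakv": "1330428213"})')
info = parse_prelogin(sample_reply)
print(info['nonce'])
print(encode_username('user@example.com'))
print(repr(build_plaintext(info['servertime'], info['nonce'], 'secret')))
print(extract_redirect("location.replace('http://example.com/next');"))
```

Keeping these pieces as plain functions makes it easier to see which step breaks the next time ssologin.js changes.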
Main program, main.py
# -*- coding: utf-8 -*-
import weiboLogin
import urllib
import urllib2
import time
import getWeiboPage
filename = './config/account'  # holds the Weibo account: username on the first line, password on the second, no blank lines
WBLogin = weiboLogin.weiboLogin()
if WBLogin.login(filename) == 1:
    print 'Login success!'
else:
    print 'Login error!'
    exit()
WBmsg = getWeiboPage.getWeiboPage()
url = 'http://weibo.com/p/1005051447378675/weibo?from=page_100505&mod=TAB#place'
# 'http://weibo.com/274891787?from=otherprofile&wvr=3.6&loc=tagweibo'
WBmsg.get_firstpage(url)
WBmsg.get_secondpage(url)
WBmsg.get_thirdpage(url)
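For the account file that main.py points at (username on the first line, password on the second), a stricter reader might look like the sketch below. It is an assumption-laden variant, not the code used above: it is written for Python 3, the name `read_account` is invented, and unlike `get_account` it rejects a file with fewer than two non-empty lines instead of failing later. The demo credentials are made up.

```python
import os
import tempfile

def read_account(path):
    """Return (username, password) from the first two lines of the file."""
    with open(path, encoding='utf-8') as handle:
        lines = [line.strip() for line in handle]
    if len(lines) < 2 or not lines[0] or not lines[1]:
        raise ValueError('expected username on line 1 and password on line 2')
    return lines[0], lines[1]

# Demo with a throwaway file and made-up credentials
with tempfile.NamedTemporaryFile('w', suffix='.account', delete=False,
                                 encoding='utf-8') as tmp:
    tmp.write('user@example.com\nsecret\n')
account_path = tmp.name
print(read_account(account_path))
os.remove(account_path)
```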
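This module saves the pages after login. Because the home page uses lazy loading, each page has to be saved in three separate requests.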
getWeiboPage.py
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import urllib
import urllib2
import sys
import time
reload(sys)
sys.setdefaultencoding('utf-8')
class getWeiboPage:
    body = {
        '__rnd':'',
        '_k':'',
        '_t':'0',
        'count':'50',
        'end_id':'',
        'max_id':'',
        'page':1,
        'pagebar':'',
        'pre_page':'0',
        'uid':''
    }
    uid_list = []
    charset = 'utf8'

    def get_msg(self, uid):
        getWeiboPage.body['uid'] = uid
        url = self.get_url(uid)
        self.get_firstpage(url)
        self.get_secondpage(url)
        self.get_thirdpage(url)
    def get_firstpage(self, url):
        getWeiboPage.body['pre_page'] = getWeiboPage.body['page'] - 1
        # The base url already has a query string, so join with '&'
        url = url + '&' + urllib.urlencode(getWeiboPage.body)
        req = urllib2.Request(url)
        result = urllib2.urlopen(req)
        text = result.read()
        self.writefile('./output/text1', text)
        # Decode \uXXXX escapes without calling eval() on text from the network
        self.writefile('./output/result1', text.decode('unicode_escape'))

    def get_secondpage(self, url):
        getWeiboPage.body['count'] = '15'
        # getWeiboPage.body['end_id'] = '3490160379905732'
        # getWeiboPage.body['max_id'] = '3487344294660278'
        getWeiboPage.body['pagebar'] = '0'
        getWeiboPage.body['pre_page'] = getWeiboPage.body['page']
        url = url + '&' + urllib.urlencode(getWeiboPage.body)
        req = urllib2.Request(url)
        result = urllib2.urlopen(req)
        text = result.read()
        self.writefile('./output/text2', text)
        self.writefile('./output/result2', text.decode('unicode_escape'))

    def get_thirdpage(self, url):
        getWeiboPage.body['count'] = '15'
        getWeiboPage.body['pagebar'] = '1'
        getWeiboPage.body['pre_page'] = getWeiboPage.body['page']
        url = url + '&' + urllib.urlencode(getWeiboPage.body)
        req = urllib2.Request(url)
        result = urllib2.urlopen(req)
        text = result.read()
        self.writefile('./output/text3', text)
        self.writefile('./output/result3', text.decode('unicode_escape'))
    def get_url(self, uid):
        url = 'http://weibo.com/' + uid + '?from=otherprofile&wvr=3.6&loc=tagweibo'
        return url

    def get_uid(self, filename):
        fread = file(filename)
        for line in fread:
            getWeiboPage.uid_list.append(line)
            print line
            time.sleep(1)

    def writefile(self, filename, content):
        fw = file(filename, 'w')
        fw.write(content)
        fw.close()
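The three requests in getWeiboPage.py differ only in `count`, `pagebar` and `pre_page`. A small sketch of those three parameter sets, built without mutating shared class state, could look like this. It assumes Python 3, and `BASE_BODY` and `lazy_load_queries` are names invented for illustration; the values themselves are copied from the listing above.

```python
from urllib.parse import urlencode

# Same defaults as the `body` dict in getWeiboPage
BASE_BODY = {
    '__rnd': '', '_k': '', '_t': '0', 'count': '50', 'end_id': '',
    'max_id': '', 'page': 1, 'pagebar': '', 'pre_page': '0', 'uid': '',
}

def lazy_load_queries(page=1, base=BASE_BODY):
    """Return the three parameter dicts used to fetch one page of the feed."""
    first = dict(base, page=page, pre_page=page - 1)
    second = dict(base, page=page, count='15', pagebar='0', pre_page=page)
    third = dict(base, page=page, count='15', pagebar='1', pre_page=page)
    return [first, second, third]

for query in lazy_load_queries():
    print(urlencode(query))
```

Because each dict is a fresh copy, calling the function for page 2 does not inherit `count` or `pagebar` left over from page 1, which the class-level `body` dict would.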
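I also tested deleting posts, but the request comes back with an error page. Any help would be appreciated, thanks!
Replace the corresponding function in getWeiboPage.py with the one below (it also needs `import re` at the top of the file):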
    def get_firstpage(self, url):
        getWeiboPage.body['pre_page'] = getWeiboPage.body['page'] - 1
        # The base url already has a query string, so join with '&'
        url = url + '&' + urllib.urlencode(getWeiboPage.body)
        req = urllib2.Request(url)
        result = urllib2.urlopen(req)
        text = result.read()
        self.writefile('./output/text1.html', text)
        # Pull the escaped "html" payload out of the homeFeed JSON block
        p = re.compile(r'{"ns":"pl\.content\.homeFeed\.index"(.*)"html":"(.*)}')
        try:
            feeds = p.search(text).group(2)
            self.writefile('./output/result1.html', feeds)
            # print feeds,'FEEDS ok'
            # Each post carries action-data=\"mid=<id>\" in the escaped HTML
            eachFeed = re.compile(r'action-data=\\"mid=(\d+)')
            # pp = re.compile(r'mid=(\d+)')
            nodes = eachFeed.findall(feeds)
            middict = {}
            print nodes
            for node in nodes:
                middict[node] = 1  # dict keys drop duplicate mids
            for (key, x) in middict.items():
                print "deleting key:" + key
                urldel = 'http://weibo.com/aj/mblog/del?_wv=5'
                postdata = {'mid': key}
                postdata = urllib.urlencode(postdata)
                # POST the mid to the delete endpoint
                req = urllib2.Request(urldel, postdata)
                result = urllib2.urlopen(req)
                time.sleep(4)
                delresult = result.read()
                print delresult
        except:
            print 'get feed error'
It still fails. Any help is appreciated!
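One part of the delete function that can be checked without logging in is the mid extraction. The sketch below runs the same `action-data` pattern over an invented fragment of JSON-escaped HTML and removes duplicates while keeping first-seen order. It assumes Python 3, `extract_mids` is a made-up name, and the sample markup is only a guess at the shape of the real page.

```python
import re

# Same pattern as in get_firstpage: a literal backslash, a quote, then mid=<digits>
MID_PATTERN = re.compile(r'action-data=\\"mid=(\d+)')

def extract_mids(escaped_html):
    """Return post ids in first-seen order, without duplicates."""
    seen = []
    for mid in MID_PATTERN.findall(escaped_html):
        if mid not in seen:
            seen.append(mid)
    return seen

# Invented fragment in the escaped form embedded in the page script
sample = (r'<div action-data=\"mid=3490160379905732&src=1\">'
          r'<a action-data=\"mid=3490160379905732\">'
          r'<div action-data=\"mid=3487344294660278\">')
print(extract_mids(sample))  # ['3490160379905732', '3487344294660278']
```

If this step returns the expected ids on a saved copy of `./output/text1.html`, the remaining problem is more likely in the delete request itself than in the parsing.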