Python Web WeChat Application (1): WeChat Protocol Analysis


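Preface

I previously implemented a WeChat client in Python. It supports QR-code login, automatic login from a cached session, receiving messages from contacts, groups, and official accounts, and parsing message types including text, image, voice, video, location, sticker, file, and recall.

As the first article in the series, this post analyzes the protocol used at each step of the implementation, together with the core of the corresponding code.

Other articles in the series:
Python Web WeChat Application (2): webwx module source code

Python Web WeChat Application (3): WeChat intelligent chatbot

Python Web WeChat Application (4): Detecting when you are @-mentioned in a group message

Python Web WeChat Application (5): Automatically downloading received images/voice/video

Python Web WeChat Application (6): Monitoring recalled WeChat messages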

1. __get_uuid: obtain the uuid

request:

url: https://login.wx.qq.com/jslogin
method: GET
params:
    appid: wx782c26e4c19acffb # fixed value
    redirect_uri: https://wx.qq.com/cgi-bin/mmwebwx-bin/webwxnewloginpage # fixed value
    fun: new
    lang: en_US
    _: 1600072411492 # timestamp, int(time.time() * 1000)

response:

window.QRLogin.code = 200; window.QRLogin.uuid = "QbKzIfgT3w==";

Extract the uuid value with a regular expression:

r = self.session.get(url, params=params, headers=self.headers, proxies=self.proxies)
regx = r'window\.QRLogin\.code = (\d+); window\.QRLogin\.uuid = "(\S+?)";' # dots escaped to match literally
data = re.search(regx, r.text)
if data and data.group(1) == '200': # guard against an unexpected response body
    self.uuid = data.group(2)
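
As an illustration, the extraction step can be isolated into a small helper that works on the response text alone. This is a minimal sketch rather than the module's actual code: the name parse_jslogin_response is hypothetical, and the sample string is the response shown above.

```python
import re

# Pattern for the jslogin response body; dots are escaped so they match literally.
QRLOGIN_RE = re.compile(
    r'window\.QRLogin\.code = (\d+); window\.QRLogin\.uuid = "(\S+?)";'
)


def parse_jslogin_response(text):
    """Return the uuid if the response reports code 200, otherwise None."""
    match = QRLOGIN_RE.search(text)
    if match is None or match.group(1) != '200':
        return None
    return match.group(2)


sample = 'window.QRLogin.code = 200; window.QRLogin.uuid = "QbKzIfgT3w==";'
print(parse_jslogin_response(sample))
```

Returning None instead of raising keeps the caller free to retry the request when the server answers with something unexpected.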

2. __gen_qrcode: generate the QR code

Print the QR code for the URL directly to the terminal and wait for the phone to scan it and log in:

url = 'https://login.weixin.qq.com/l/' + self.uuid
qr = qrcode.QRCode()
qr.add_data(url)
qr.print_ascii(invert=False)

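3. __login: log in by scanning with the phone

request:

url: https://login.wx.qq.com/jslogin
method: GET
params:
    loginicon: true
    uuid: QbKzIfgT3w==
    tip: 1 # 0 - already scanned, 1 - not yet scanned
    r: -1950363289 # timestamp, -int(time.time())
    _: 1600072411493 # timestamp, int(time.time() * 1000)

The login process requires two GET requests:

The first is the scan: with tip=1, return code 201 means the scan succeeded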
response:

window.code=201;window.userAvatar = 'xxx';

The second is the login: with tip=0, return code 200 means the login succeeded
response:

window.code=200;
window.redirect_uri="https://wx.qq.com/cgi-bin/mmwebwx-bin/webwxnewloginpage?ticket=AS7a6pAQk9lfSpMhdCh_gfhs@qrticket_0&uuid=QbKzIfgT3w==&lang=en_US&scan=1600072446";

The redirect_uri can be extracted with a regular expression and is used by subsequent requests:

tip = 1
while not self.isLogin:
    params['tip'] = tip # send the current scan state with every poll
    r = self.session.get(url, params=params, headers=self.headers, proxies=self.proxies)
    data = re.search(r'window.code=(\d+)', r.text)
    if data.group(1) == '408': # timeout
        tip = 1
    elif data.group(1) == '201': # scanned
        tip = 0
        print("scan success")
    elif data.group(1) == '200': # success
        param = re.search(r'window.redirect_uri="(\S+?)";', r.text)
        self.redirect_uri = param.group(1)
        self.isLogin = True
        print("login success")
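
The response handling in this loop can also be sketched as two pure functions, which makes the 408/201/200 transitions easy to check without a network connection. The names parse_login_response and next_tip are hypothetical and not part of the webwx module.

```python
import re


def parse_login_response(text):
    """Return (code, redirect_uri); redirect_uri is None unless code is '200'."""
    code_match = re.search(r'window\.code=(\d+)', text)
    if code_match is None:
        return None, None
    code = code_match.group(1)
    uri_match = re.search(r'window\.redirect_uri="(\S+?)";', text)
    redirect_uri = uri_match.group(1) if (code == '200' and uri_match) else None
    return code, redirect_uri


def next_tip(code, current_tip):
    """Mirror the loop above: 408 (timeout) -> tip=1, 201 (scanned) -> tip=0."""
    if code == '408':
        return 1
    if code == '201':
        return 0
    return current_tip


scanned = "window.code=201;window.userAvatar = 'xxx';"
print(parse_login_response(scanned))
```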

4. __get_params: obtain the login parameters

request:

url: https://wx.qq.com/cgi-bin/mmwebwx-bin/webwxnewloginpage
method: GET
params:
    ticket: AS7a6pAQk9lfSpMhdCh_gfhs@qrticket_0
    uuid: QbKzIfgT3w==
    lang: en_US
    scan: 1600072446
    fun: new
    version: v2
The ticket, uuid, lang, and scan parameters are already contained in the redirect_uri stored in the previous step

response:

<error>
    <ret>0</ret>
    <message></message>
    <skey>@crypt_af16f3b1_e1df9a5ea92377384a3b2907b0ecf084</skey>
    <wxsid>nEdiHXamlIgS+eRL</wxsid>
    <wxuin>2137425061</wxuin>
    <pass_ticket>AAwe7cZ4AQF3Wy%2BVRL30nJy8WWePRwl9BWViTSP7gu8pyV61n70B96v4x%2FS8WY3s</pass_ticket>
    <isgrayscale>1</isgrayscale>
</error>

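Parsing the XML response yields skey, sid, uin, pass_ticket, and other values, which are stored and used to build base_request for subsequent requests. The device_id consists of the letter 'e' followed by 15 digits; in code: 'e' + repr(random.random())[2:17]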

url = self.redirect_uri + '&fun=new&version=v2'
r = self.session.get(url, headers=self.headers, allow_redirects=False, proxies=self.proxies)
nodes = xml.dom.minidom.parseString(r.text).documentElement.childNodes
for node in nodes:
    if node.nodeName == 'skey':
        self.skey = node.childNodes[0].data
        self.base_request['Skey'] = self.skey
    elif node.nodeName == 'wxsid':
        self.sid = node.childNodes[0].data
        self.base_request['Sid'] = self.sid
    elif node.nodeName == 'wxuin':
        self.uin = node.childNodes[0].data
        self.base_request['Uin'] = self.uin
    elif node.nodeName == 'pass_ticket':
        self.pass_ticket = node.childNodes[0].data
self.base_request['DeviceID'] = self.device_id
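
The same step can be sketched with xml.etree.ElementTree on the sample response above. The helper names here are hypothetical. One deliberate difference is the device id: repr(random.random()) can occasionally produce fewer than 15 digits (or scientific notation for very small values), so this sketch draws 15 random digits explicitly instead.

```python
import random
import string
import xml.etree.ElementTree as ET

SAMPLE_XML = (
    '<error><ret>0</ret><message></message>'
    '<skey>@crypt_af16f3b1_e1df9a5ea92377384a3b2907b0ecf084</skey>'
    '<wxsid>nEdiHXamlIgS+eRL</wxsid>'
    '<wxuin>2137425061</wxuin>'
    '<pass_ticket>AAwe7cZ4AQF3Wy%2BVRL30nJy8WWePRwl9BWViTSP7gu8pyV61n70B96v4x%2FS8WY3s</pass_ticket>'
    '<isgrayscale>1</isgrayscale></error>'
)


def parse_login_params(xml_text):
    """Extract skey, wxsid, wxuin and pass_ticket from the <error> document."""
    root = ET.fromstring(xml_text)
    wanted = ('skey', 'wxsid', 'wxuin', 'pass_ticket')
    return {tag: (root.findtext(tag) or '') for tag in wanted}


def make_device_id():
    """Return 'e' followed by exactly 15 random digits."""
    return 'e' + ''.join(random.choice(string.digits) for _ in range(15))


def build_base_request(params, device_id):
    """Assemble the BaseRequest dict sent with later POST requests."""
    return {
        'Skey': params['skey'],
        'Sid': params['wxsid'],
        'Uin': params['wxuin'],
        'DeviceID': device_id,
    }


login_params = parse_login_params(SAMPLE_XML)
print(build_base_request(login_params, make_device_id()))
```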

5. __initinate: obtain recent contacts

request:

url: https://wx.qq.com/cgi-bin/mmwebwx-bin/webwxinit
method: POST
params:
    r: -1950363289 # timestamp, -int(time.time())
    pass_ticket: AAwe7cZ4AQF3Wy%252BVRL30nJy8WWePRwl9BWViTSP7gu8pyV61n70B96v4x%252FS8WY3s
data:
    BaseRequest: {
        DeviceID: e911771485005848
        Sid: nEdiHXamlIgS+eRL
        Skey: @crypt_af16f3b1_e1df9a5ea92377384a3b2907b0ecf084
        Uin: 2137425061
    }

The response contains the most recently contacted contacts and official accounts, as well as your own account information. Note that SyncKey changes constantly: every request must use the SyncKey returned by the previous response:

r = self.session.post(url, params=params, data=json.dumps(data), headers=headers, proxies=self.proxies)
r.encoding = 'utf-8'
dic = json.loads(r.text)
self.sync_key = dic['SyncKey']
self.sync_key_str = '|'.join([str(keyVal['Key']) + '_' + str(keyVal['Val']) for keyVal in self.sync_key['List']])
self.account_me = dic['User']
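
To make the SyncKey handling concrete, the sketch below flattens a SyncKey dict into the key_val|key_val string that the synccheck request expects. The function name format_sync_key is hypothetical; the sample values are the ones used in the requests shown in this article.

```python
def format_sync_key(sync_key):
    """Join every Key/Val pair as 'Key_Val', separated by '|'."""
    return '|'.join(
        '%s_%s' % (item['Key'], item['Val']) for item in sync_key['List']
    )


sync_key = {
    'Count': 4,
    'List': [
        {'Key': 1, 'Val': 724394418},
        {'Key': 2, 'Val': 724396065},
        {'Key': 3, 'Val': 724395965},
        {'Key': 1000, 'Val': 1600068197},
    ],
}
print(format_sync_key(sync_key))
```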

6. __status_notify: report the login status

request:

url: https://wx.qq.com/cgi-bin/mmwebwx-bin/webwxstatusnotify
method: POST
params:
    pass_ticket: AAwe7cZ4AQF3Wy%252BVRL30nJy8WWePRwl9BWViTSP7gu8pyV61n70B96v4x%252FS8WY3s
data:
    BaseRequest: {
        DeviceID: e911771485005848
        Sid: nEdiHXamlIgS+eRL
        Skey: @crypt_af16f3b1_e1df9a5ea92377384a3b2907b0ecf084
        Uin: 2137425061
    }
    ClientMsgId: 1600072449311 # timestamp, int(time.time() * 1000)
    Code: 3
    FromUserName: @b41046f864104defff92de8b4410bd301f792e1cecf55e1abcc7c607184a58d7 # self.account_me['UserName']
    ToUserName: @b41046f864104defff92de8b4410bd301f792e1cecf55e1abcc7c607184a58d7 # self.account_me['UserName']

response:

BaseResponse: {
	ErrMsg: ""
	Ret: 0
}
MsgID: "662549729070653462"

Report the login status to the phone with a POST request:

self.session.post(url, params=params, data=json.dumps(data), headers=headers, proxies=self.proxies)

7. __get_contact: obtain the contact list

request:

url: https://wx.qq.com/cgi-bin/mmwebwx-bin/webwxgetcontact
method: POST
params:
    pass_ticket: AAwe7cZ4AQF3Wy%2BVRL30nJy8WWePRwl9BWViTSP7gu8pyV61n70B96v4x%2FS8WY3s
    r: 1950363289 # timestamp, int(time.time() * 1000)
    seq: 0
    skey: @crypt_af16f3b1_e1df9a5ea92377384a3b2907b0ecf084

response:

BaseResponse: {
	ErrMsg: ""
	Ret: 0
}
MemberCount: 262
MemberList: [
	{
		"Uin": 0,
		"UserName": "@c2faf944f9d0655aa23f29e3cf3e2eda", # "@" prefix means a contact or official account, "@@" prefix means a group
		"NickName": "xxx", # nickname
		"HeadImgUrl": "/cgi-bin/mmwebwx-bin/webwxgeticon?seq=633328356&username=@c2faf944f9d0655aa23f29e3cf3e2eda&skey=@crypt_af16f3b1_9ce63c8962ee0c50e999f828d023b8ea", # avatar image URL
		"ContactFlag": 1,
		"MemberCount": 0,
		"MemberList": [],
		"RemarkName": "", # remark name
		"HideInputBarFlag": 0,
		"Sex": 0, # gender, 0 - unknown, 1 - male, 2 - female
		"Signature": "", # signature
		"VerifyFlag": 8, # 8, 24, 56 mean an official account or an official WeChat account, 0 means a contact
		"OwnerUin": 0,
		"PYInitial": "YSXT",
		"PYQuanPin": "yinshuixintang",
		"RemarkPYInitial": "",
		"RemarkPYQuanPin": "",
		"StarFriend": 0, # whether the contact is a starred friend
		"AppAccountFlag": 0,
		"Statues": 0,
		"AttrStatus": 0,
		"Province": "Liaoning",
		"City": "Panjin",
		"Alias": "",
		"SnsFlag": 0,
		"UniFriend": 0,
		"DisplayName": "", # used for groups: the display name within the group
		"ChatRoomId": 0,
		"KeyWord": "gh_",
		"EncryChatRoomId": "",
		"IsOwner": 0 # used for groups: whether the user is the group owner
	},
	...
]
Seq: 0

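The response returns all contacts, official accounts, and groups. The group entries do not include member details, which must be fetched with a separate request. If the returned Seq field is not 0, more data remains to be read: keep sending requests, each with Seq set to the value returned by the previous response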

r = self.session.post(url, params=params, headers=headers, timeout=180, proxies=self.proxies)
r.encoding = 'utf-8'
dic = json.loads(r.text)
member_list.extend(dic['MemberList'])

while dic["Seq"] != 0:
    params['seq'] = dic["Seq"]
    r = self.session.post(url, params=params, headers=headers, timeout=180, proxies=self.proxies)
    r.encoding = 'utf-8'
    dic = json.loads(r.text)
    member_list.extend(dic['MemberList'])

for member in member_list:
    if member['UserName'].find('@@') != -1:
        self.account_groups[member['UserName']] = member   # not include detail members info
    elif member['VerifyFlag'] & 8 != 0:
        self.account_subscriptions[member['UserName']] = member  # include weixin,weixinzhifu
    else:
        self.account_contacts[member['UserName']] = member # include filehelper
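
The pagination and the three-way classification can be sketched independently of the HTTP layer. In this sketch fetch_page is a hypothetical callable standing in for the POST request, and the helper names are illustrative. Groups are detected with startswith('@@') here, a slightly stricter test than the find('@@') check used above.

```python
def fetch_all_members(fetch_page):
    """Collect MemberList entries across pages until Seq comes back as 0.

    fetch_page(seq) must return one decoded webwxgetcontact response.
    """
    members, seq = [], 0
    while True:
        page = fetch_page(seq)
        members.extend(page['MemberList'])
        seq = page['Seq']
        if seq == 0:
            return members


def classify_contacts(member_list):
    """Split members into groups, official accounts and ordinary contacts."""
    groups, subscriptions, contacts = {}, {}, {}
    for member in member_list:
        name = member['UserName']
        if name.startswith('@@'):
            groups[name] = member
        elif member['VerifyFlag'] & 8:
            subscriptions[name] = member
        else:
            contacts[name] = member
    return groups, subscriptions, contacts


# Two fake pages keyed by the seq value that requests them.
pages = {
    0: {'MemberList': [{'UserName': '@@g1', 'VerifyFlag': 0}], 'Seq': 7},
    7: {'MemberList': [{'UserName': '@s1', 'VerifyFlag': 24},
                       {'UserName': 'filehelper', 'VerifyFlag': 0}], 'Seq': 0},
}
all_members = fetch_all_members(pages.__getitem__)
print(classify_contacts(all_members))
```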

8. __get_group_members: obtain group member information

request:

url: https://wx.qq.com/cgi-bin/mmwebwx-bin/webwxbatchgetcontact
method: POST
params:
    type: ex
    r: 1950363289 # timestamp, int(time.time() * 1000)
    lang: en_US
data:
    BaseRequest: {
        DeviceID: e911771485005848
        Sid: nEdiHXamlIgS+eRL
        Skey: @crypt_af16f3b1_e1df9a5ea92377384a3b2907b0ecf084
        Uin: 2137425061
    }
    Count: 15 # len(self.account_groups)
    List: grouplist # grouplist.append({'UserName':group['UserName'], 'ChatRoomId':group['ChatRoomId']}) for group in self.account_groups.values()

response:

BaseResponse: {
	ErrMsg: ""
	Ret: 0
}
Count: 15
ContactList: [
	{
		"Uin": 0,
		"UserName": "@@8cc2e21be6342b93a9d179f8d9465838f38989f3afb4bd17e3a8db42028c7a32", # "@@" prefix means a group
		"NickName": "xxx", # group nickname
		"HeadImgUrl": "/cgi-bin/mmwebwx-bin/webwxgeticon?seq=633328356&username=@c2faf944f9d0655aa23f29e3cf3e2eda&skey=@crypt_af16f3b1_9ce63c8962ee0c50e999f828d023b8ea", # avatar image URL
		"ContactFlag": 1,
		"MemberCount": 5, # number of group members
		"MemberList": [ # list of group members
			{
				"Uin": 0,
				"UserName": "@31d8eccdefdd22d15297a98c0d7e6387",
				"NickName": "xxx", # nickname
				"AttrStatus": 33626215,
				"PYInitial": "",
				"PYQuanPin": "",
				"RemarkPYInitial": "",
				"RemarkPYQuanPin": "",
				"MemberStatus": 0,
				"DisplayName": "xxx", # the display name this member set for themselves in the group, "" if not set
				"KeyWord": "Smo"
			},
			{
				"Uin": 0,
				"UserName": "@24503c11a83b31b286a8441d700bc52c",
				"NickName": "xxx",
				"AttrStatus": 231463,
				"PYInitial": "",
				"PYQuanPin": "",
				"RemarkPYInitial": "",
				"RemarkPYQuanPin": "",
				"MemberStatus": 0,
				"DisplayName": "",
				"KeyWord": "zha"
			}
			...
		],
		"RemarkName": "", # remark name
		"HideInputBarFlag": 0,
		"Sex": 0,
		"Signature": "", # signature
		"VerifyFlag": 0,
		"OwnerUin": 0,
		"PYInitial": "YSXT",
		"PYQuanPin": "yinshuixintang",
		"RemarkPYInitial": "",
		"RemarkPYQuanPin": "",
		"StarFriend": 0,
		"AppAccountFlag": 0,
		"Statues": 0,
		"AttrStatus": 0,
		"Province": "Liaoning",
		"City": "Panjin",
		"Alias": "",
		"SnsFlag": 0,
		"UniFriend": 0,
		"DisplayName": "",
		"ChatRoomId": 0,
		"KeyWord": "gh_",
		"EncryChatRoomId": "",
		"IsOwner": 0 # used for groups: whether the user is the group owner
	},
	...
]

The response returns the groups together with their members:

resp = self.session.post(url, params=params, data=json.dumps(data), headers=headers, proxies=self.proxies)
resp.encoding = 'utf-8'
dic = json.loads(resp.text)
for member in dic['ContactList']:
    self.account_groups_members[member['UserName']] = member

9. __sync_check: check whether new messages have arrived

request:

url: https://webpush.wx.qq.com/cgi-bin/mmwebwx-bin/synccheck
method: GET
params:
    r: 1950363289 # timestamp, int(time.time() * 1000)
    skey: @crypt_af16f3b1_e1df9a5ea92377384a3b2907b0ecf084
    sid: nEdiHXamlIgS+eRL
    uin: 2137425061
    deviceid: e628001908664392
    synckey: 1_724394418|2_724396065|3_724395965|1000_1600068197
    _: 1600072411495 # timestamp, int(time.time() * 1000)

response:

window.synccheck={retcode:"0",selector:"2"}

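retcode:
	0: normal
	1101: logged out of Web WeChat from the phone
	x: xxx # not yet supported
selector:
	0: normal
	2: new message received
	x: xxx # not yet supported

Extract these return values with a regular expression: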

r = self.session.get(url, params=params, headers=self.headers, proxies=self.proxies)
data = re.search(r'window.synccheck=\{retcode:"(\d+)",selector:"(\d+)"\}', r.text)
retcode = data.group(1)
selector = data.group(2)
return retcode, selector
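
A standalone version of this parsing step might look like the sketch below. The name parse_sync_check is hypothetical; unlike the excerpt above, it returns (None, None) rather than raising when the body does not match, which is a design choice of this sketch.

```python
import re

SYNCCHECK_RE = re.compile(
    r'window\.synccheck=\{retcode:"(\d+)",selector:"(\d+)"\}'
)


def parse_sync_check(text):
    """Return (retcode, selector) as strings, or (None, None) if unparseable."""
    match = SYNCCHECK_RE.search(text)
    if match is None:
        return None, None
    return match.group(1), match.group(2)


retcode, selector = parse_sync_check('window.synccheck={retcode:"0",selector:"2"}')
if retcode == '0' and selector == '2':
    print('new message available')
```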

10. __webwx_sync: fetch the newly received messages

request:

url: https://wx.qq.com/cgi-bin/mmwebwx-bin/webwxsync
method: POST
params:
    sid: nEdiHXamlIgS+eRL
    skey: @crypt_af16f3b1_e1df9a5ea92377384a3b2907b0ecf084
    pass_ticket: AAwe7cZ4AQF3Wy%2BVRL30nJy8WWePRwl9BWViTSP7gu8pyV61n70B96v4x%2FS8WY3s
data:
    BaseRequest: {
        DeviceID: e911771485005848
        Sid: nEdiHXamlIgS+eRL
        Skey: @crypt_af16f3b1_e1df9a5ea92377384a3b2907b0ecf084
        Uin: 2137425061
    }
    SyncKey: {
        Count: 4
        List: [
            { Key: 1, Val: 724394418 },
            { Key: 2, Val: 724396065 },
            { Key: 3, Val: 724395965 },
            { Key: 1000, Val: 1600068197 }
        ]
    }
    rr: -1950351786 # timestamp, -int(time.time())

response:

BaseResponse: {
	ErrMsg: ""
	Ret: 0
}
AddMsgCount: 3
AddMsgList: [
	{
		"MsgId": "2403776930812656442", # media resource id, used to download received images, voice, video, and other resources
		"FromUserName": "@da500aa9163f99fe32bfe2439e2467bcde8bc61c619f696a92e9f98c9d14c7ce",
		"ToUserName": "filehelper",
		"MsgType": 1, # message type: 1 - text, 3 - image, 34 - voice, 42 - contact card, 43 - video, 47 - sticker, 49 - file
		"Content": "hello world", # message content
		"Status": 3,
		"ImgStatus": 1,
		"CreateTime": 1599529496,
		"VoiceLength": 0, # voice length in milliseconds
		"PlayLength": 0,  # video length in seconds
		"FileName": "",   # file name
		"FileSize": "",   # file size in bytes
		"MediaId": "",
		"Url": "",
		"AppMsgType": 0,
		"StatusNotifyCode": 0,
		"StatusNotifyUserName": "",
		"RecommendInfo": {
			"UserName": "",
			"NickName": "",
			"QQNum": 0,
			"Province": "",
			"City": "",
			"Content": "",
			"Signature": "",
			"Alias": "",
			"Scene": 0,
			"VerifyFlag": 0,
			"AttrStatus": 0,
			"Sex": 0,
			"Ticket": "",
			"OpCode": 0
		},
		"ForwardFlag": 0,
		"AppInfo": {
			"AppID": "",
			"Type": 0
		},
		"HasProductId": 0,
		"Ticket": "",
		"ImgHeight": 0,
		"ImgWidth": 0,
		"SubMsgType": 0, # message subtype; for example, text messages are subdivided into plain text, link, and location
		"NewMsgId": 2403776930812656442,
		"OriContent": "",
		"EncryFileName": ""
	},
	...
]
DelContactCount: 0
DelContactList: []
ModContactCount: 0
ModContactList: []
ModChatRoomMemberCount: 0
ModChatRoomMemberList: []
SyncKey: { # updated SyncKey, to be used in the next request
	Count: 5,
	List: [
		{ Key: 1, Val: 724394418 },
		{ Key: 2, Val: 724396065},
		{ Key: 3, Val: 724395965},
		{ Key: 1000, Val: 1600068197 },
		{ Key: 1001, Val: 1600063457 }
	]
}

AddMsgList is the list of received messages:

r = self.session.post(url, params=params, data=json.dumps(data), headers=headers, proxies=self.proxies)
r.encoding = 'utf-8'
dic = json.loads(r.text)
if dic['BaseResponse']['Ret'] == 0:
    self.sync_key = dic['SyncKey']
    self.sync_key_str = '|'.join([str(keyVal['Key']) + '_' + str(keyVal['Val']) for keyVal in self.sync_key['List']])
    return dic['AddMsgList']

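11. __parse_msg: parse messages

Currently supported message sources: group, official account, contact, yourself
Currently supported message types: text, location, link, image, voice, video, contact card, sticker, file, recall
Different message types carry different fields, as listed below:

Type: fields

Mandatory fields:
    'senderType': string, one of "GROUP/SUBSCRIPTION/CONTACT/MYSELF/UNSUPPORTED", meaning the message comes from a group / official account / contact / yourself / an unsupported source
    'senderName': string, the sender's identifier, assigned by the system; an "@@" prefix means a group, an "@" prefix means a contact or official account
    'msgType': string, one of "TEXT/POSITION/IMAGE/VOICE/VIDEO/CARD/ANIMATION/FILE/REVOKE/UNSUPPORTED", meaning the message type is text / location / image / voice / video / contact card / sticker / file / recall / unsupported
    'msgId': string, the unique id of the message, assigned by the system

senderType:
    GROUP:
        'groupNickName': string, the nickname of the group the sender belongs to
        'userNickName': string, the sender's nickname
        'userDisplayName': string, the display name the sender set for themselves in the group, '' if not set
        'meIsAt': boolean, whether you were @-mentioned
    SUBSCRIPTION:
        'subscriptionNickName': string, the nickname of the sending official account
    CONTACT:
        'contactNickName': string, the sender's nickname
        'contactRemarkName': string, the sender's remark name
    MYSELF:
        'myNickName': string, your own nickname

msgType:
    TEXT:
        'content': string, the content of the received message
    POSITION:
        'x': string holding a float, the latitude
        'y': string holding a float, the longitude
        'scale': string holding an integer, the zoom level
        'label': string, the label of the location
        'poiname': string, the specific name of the location
    IMAGE:
        'imgHeight': integer, the image height
        'imgWidth': integer, the image width
        'mediaId': string, the resource id of the image on the server, assigned by the system and used for downloading
        'downloadFunc': function, downloads the image
        Calling msg['downloadFunc'](msg) downloads the image to the current directory as img_mediaId.jpg
    VOICE:
        'voiceLength': integer, the voice duration in milliseconds
        'mediaId': string, the resource id of the voice clip on the server, assigned by the system and used for downloading
        'downloadFunc': function, downloads the voice clip
        Calling msg['downloadFunc'](msg) downloads the voice clip to the current directory as voice_mediaId.mp3
    VIDEO:
        'imgHeight': integer, the video height
        'imgWidth': integer, the video width
        'playLength': integer, the video duration in seconds
        'mediaId': string, the resource id of the video on the server, assigned by the system and used for downloading
        'downloadFunc': function, downloads the video
        Calling msg['downloadFunc'](msg) downloads the video to the current directory as video_mediaId.mp4
    CARD:
        'username': string, the WeChat ID
        'nickname': string, the nickname
        'alias': string, the alias
        'province': string, the province
        'city': string, the city
        'sex': string, the gender, 0 - unknown, 1 - male, 2 - female
        'regionCode': string, the registered region
    ANIMATION:
        'imgHeight': integer, the sticker height
        'imgWidth': integer, the sticker width
    FILE:
        'fileName': string, the file name
        'encryFileName': string, the encrypted file name
        'fileSize': string, the file size in bytes
        'mediaId': string, the media id of the file, assigned by the system and used for downloading
        'downloadFunc': function, downloads the file
        Calling msg['downloadFunc'](msg) downloads the file to the current directory, saved under the value of the 'fileName' field
    REVOKE:
        'revokedMsgId': string, the id of the message that was recalled
    UNSUPPORTED:
        no optional fields

The implementation is as follows: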

def __parse_group_msg(self, msg, parsed_msg):
    parsed_msg['userNickName'] = ''
    parsed_msg['userDisplayName'] = ''
    parsed_msg['meIsAt'] = False

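    found_flag1 = False
    found_flag2 = False
    my_displayname_in_group = '' # stays empty if I am not found in the member list
    ret = re.match('(@[0-9a-z]*?):<br/>(.*)$', msg['Content']) # username:<br>text
    if ret is None: # content carries no sender prefix, nothing to parse
        return
    user_name, text = ret.groups()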
    for item in self.account_groups_members[msg['FromUserName']]['MemberList']:
        if item['UserName'] == user_name:
            parsed_msg['userNickName'] = item['NickName']
            parsed_msg['userDisplayName'] = item['DisplayName']
            found_flag1 = True

        if item['UserName'] == self.account_me['UserName']:
            my_displayname_in_group = item['DisplayName']
            found_flag2 = True

        if found_flag1 and found_flag2:
            break

    str_at = "@" + self.account_me['NickName'] + '\u2005'
    if my_displayname_in_group:
        str_at = "@" + my_displayname_in_group + '\u2005'

    if text.find(str_at) != -1:
        parsed_msg['meIsAt'] = True
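
The two ideas in __parse_group_msg, splitting off the sender prefix and detecting an @-mention, can be sketched as pure functions. The names split_group_content and is_me_at are hypothetical, and the re.S flag is an addition of this sketch so that a message body containing raw newlines still matches.

```python
import re

# Group messages arrive as '<sender username>:<br/><text>'.
GROUP_MSG_RE = re.compile(r'(@[0-9a-z]*?):<br/>(.*)$', re.S)


def split_group_content(content):
    """Return (sender_user_name, text); sender is None if no prefix is found."""
    match = GROUP_MSG_RE.match(content)
    if match is None:
        return None, content
    return match.group(1), match.group(2)


def is_me_at(text, my_nick_name, my_display_name=''):
    """An @-mention is '@' + name + U+2005; the group display name wins if set."""
    name = my_display_name or my_nick_name
    return ('@' + name + '\u2005') in text


content = '@31d8eccdefdd22d15297a98c0d7e6387:<br/>@Bob\u2005hello'
sender, text = split_group_content(content)
print(sender, is_me_at(text, 'Bob'))
```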

def __parse_msg(self, msg):
    parsed_msg = {}
    parsed_msg['senderType'] = 'UNSUPPORTED'
    parsed_msg['senderName'] = msg['FromUserName']
    parsed_msg['msgType'] = 'UNSUPPORTED'
    parsed_msg['msgId'] = msg['MsgId']

    if msg['FromUserName'] in self.account_groups:
        parsed_msg['senderType'] = 'GROUP'
        parsed_msg['groupNickName'] = self.account_groups[msg['FromUserName']]['NickName']
        self.__parse_group_msg(msg, parsed_msg) # parse userNickName/userDisplayName/meIsAt
    elif msg['FromUserName'] in self.account_subscriptions:
        parsed_msg['senderType'] = 'SUBSCRIPTION'
        parsed_msg['subscriptionNickName'] = self.account_subscriptions[msg['FromUserName']]['NickName']
    elif msg['FromUserName'] in self.account_contacts:
        parsed_msg['senderType'] = 'CONTACT'
        parsed_msg['contactNickName'] = self.account_contacts[msg['FromUserName']]['NickName']
        parsed_msg['contactRemarkName'] = self.account_contacts[msg['FromUserName']]['RemarkName']
    elif self.account_me['UserName'] == msg['FromUserName']:
        parsed_msg['senderType'] = 'MYSELF'
        parsed_msg['myNickName'] = self.account_me['NickName']

    msg_type = msg['MsgType']
    if msg_type == 1: # text/link/position
        sub_msg_type = msg['SubMsgType']
        if sub_msg_type == 0: # text/link
            parsed_msg['msgType'] = 'TEXT'
            parsed_msg['content'] = msg['Content']
            if parsed_msg['senderType'] == 'GROUP':
                ret = re.match('(@[0-9a-z]*?):<br/>(.*)$', msg['Content'])
                if ret: # keep the raw content when no sender prefix is present
                    parsed_msg['content'] = ret.groups()[1] # delete sender username info
        elif sub_msg_type == 48: # position
            parsed_msg['msgType'] = 'POSITION'
            doc = xml.dom.minidom.parseString(msg['OriContent']).documentElement
            node = doc.getElementsByTagName('location')[0]
            parsed_msg['x'] = node.getAttribute('x')
            parsed_msg['y'] = node.getAttribute('y')
            parsed_msg['scale'] = node.getAttribute('scale')
            parsed_msg['label'] = node.getAttribute('label')
            parsed_msg['poiname'] = node.getAttribute('poiname')
    elif msg_type == 3: # image
        parsed_msg['msgType'] = 'IMAGE'
        parsed_msg['mediaId'] = msg['MsgId']
        parsed_msg['imgHeight'] = msg['ImgHeight']
        parsed_msg['imgWidth'] = msg['ImgWidth']
        parsed_msg['downloadFunc'] = self.__img_download
    elif msg_type == 34: # voice
        parsed_msg['msgType'] = 'VOICE'
        parsed_msg['mediaId'] = msg['MsgId']
        parsed_msg['voiceLength'] = msg['VoiceLength']
        parsed_msg['downloadFunc'] = self.__voice_download
    elif msg_type == 42: # card
        parsed_msg['msgType'] = 'CARD'
        content = html.unescape(msg['Content']) # TODO: delete emoji info
        content = content.replace('<br/>', '\n')
        doc = xml.dom.minidom.parseString(content).documentElement
        parsed_msg['username'] = doc.getAttribute('username')
        parsed_msg['nickname'] = doc.getAttribute('nickname')
        parsed_msg['alias'] = doc.getAttribute('alias')
        parsed_msg['province'] = doc.getAttribute('province')
        parsed_msg['city'] = doc.getAttribute('city')
        parsed_msg['sex'] = doc.getAttribute('sex')
        parsed_msg['regionCode'] = doc.getAttribute('regionCode')
    elif msg_type == 43: # video
        parsed_msg['msgType'] = 'VIDEO'
        parsed_msg['mediaId'] = msg['MsgId']
        parsed_msg['playLength'] = msg['PlayLength']
        parsed_msg['imgHeight'] = msg['ImgHeight']
        parsed_msg['imgWidth'] = msg['ImgWidth']
        parsed_msg['downloadFunc'] = self.__video_download
    elif msg_type == 47: # animation
        parsed_msg['msgType'] = 'ANIMATION'
        parsed_msg['imgHeight'] = msg['ImgHeight']
        parsed_msg['imgWidth'] = msg['ImgWidth']
    elif msg_type == 49: # attachment
        app_msg_type = msg['AppMsgType']
        if app_msg_type == 6: # file
            parsed_msg['msgType'] = 'FILE'
            parsed_msg['mediaId'] = msg['MediaId']
            parsed_msg['fileName'] = msg['FileName']
            parsed_msg['encryFileName'] = msg['EncryFileName']
            parsed_msg['fileSize'] = msg['FileSize']
            parsed_msg['downloadFunc'] = self.__file_download
    elif msg_type == 10002: # revoke
        parsed_msg['msgType'] = 'REVOKE'
        parsed_msg['revokedMsgId'] = re.search('&lt;msgid&gt;(.*?)&lt;', msg['Content']).group(1)

    return parsed_msg
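
As a small illustration of the REVOKE branch, the sketch below unescapes the HTML-escaped content first and then searches for the msgid element, instead of matching the escaped text directly. The function name is hypothetical, and the sample content is a made-up payload shaped only after the '&lt;msgid&gt;' pattern used above; the real payload may contain additional elements.

```python
import html
import re


def extract_revoked_msg_id(content):
    """Return the id inside <msgid>...</msgid>, or None if it is absent."""
    unescaped = html.unescape(content)
    match = re.search(r'<msgid>(.*?)</msgid>', unescaped)
    return match.group(1) if match else None


# Hypothetical sample, for illustration only.
sample_content = (
    '&lt;sysmsg type="revokemsg"&gt;&lt;revokemsg&gt;'
    '&lt;msgid&gt;2403776930812656442&lt;/msgid&gt;'
    '&lt;/revokemsg&gt;&lt;/sysmsg&gt;'
)
print(extract_revoked_msg_id(sample_content))
```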

12. __img_download: download a received image to local disk

request:

url: https://wx.qq.com/cgi-bin/mmwebwx-bin/webwxgetmsgimg
method: GET
params:
    MsgID: 2526013959745980362
    skey: @crypt_af16f3b1_e1df9a5ea92377384a3b2907b0ecf084

Fetch the image content with a GET request, then write it to a local file:

url = 'https://wx.qq.com/cgi-bin/mmwebwx-bin/webwxgetmsgimg?MsgID=%s&skey=%s'%(msg_id, self.skey)
resp = self.session.get(url, stream=True, headers=self.headers, proxies=self.proxies)
file_name = 'img_' + msg_id + '.jpg'
with open(file_name, 'wb') as fptr:
    fptr.write(resp.content)

13. __voice_download: download a received voice clip to local disk

request:

url: https://wx.qq.com/cgi-bin/mmwebwx-bin/webwxgetvoice
method: GET
params:
    msgID: 621497633495131085
    skey: @crypt_af16f3b1_e1df9a5ea92377384a3b2907b0ecf084

Fetch the voice content with a GET request, then write it to a local file:

url = 'https://wx.qq.com/cgi-bin/mmwebwx-bin/webwxgetvoice?msgID=%s&skey=%s'%(msg_id, self.skey)
resp = self.session.get(url, stream=True, headers=self.headers, proxies=self.proxies)
file_name = 'voice_' + msg_id + '.mp3'
with open(file_name, 'wb') as fptr:
    fptr.write(resp.content)

14. __video_download: download a received video to local disk

request:

url: https://wx.qq.com/cgi-bin/mmwebwx-bin/webwxgetvideo
method: GET
params:
    msgID: 6534098630059551744
    skey: @crypt_af16f3b1_e1df9a5ea92377384a3b2907b0ecf084

Fetch the video content with a GET request, then write it to a local file:

url = 'https://wx.qq.com/cgi-bin/mmwebwx-bin/webwxgetvideo?msgID=%s&skey=%s'%(msg_id, self.skey)
headers = {
    'Range': 'bytes=0-',
    'User-Agent' : self.headers['User-Agent']
}
resp = self.session.get(url, stream=True, headers=headers, proxies=self.proxies)
file_name = 'video_' + msg_id + '.mp4'
with open(file_name, 'wb') as fptr:
    fptr.write(resp.content)

15. __file_download: download a received regular file to local disk

request:

url: https://file.wx.qq.com/cgi-bin/mmwebwx-bin/webwxgetmedia
method: GET
params:
    sender: @b41046f864104defff92de8b4410bd301f792e1cecf55e1abcc7c607184a58d7 # msg['senderName']
    mediaid: crypt_7ffae8d8_63f76c1db9efd9e1fcb412a4bb9188b005dfb172e34b6172da265300f4d09b4f0f18c0aef911afdf6c731a8a7d574698bec55c453e2038fbc44a9c80f6392346e5d3c59f105179215ea1ea860e8a3b9ccd0782f6e7a646a67374ccaa010cd7369f1a896c3f395842675d5e02c7b3890c6f4dc4f88d97b7c5fc5bda15eb0cff5c132174484cc6ee133390b8fec97dbeb8db14d7b296de4621057a347651ec06cf5a0467182061b59814275b29b86fa81631e7b3f0eb2f295ebf9184fa1f429227aba5608bc2b6ed0caed50892861f77fcb5e24adcb67ec28fd0cbe887187b904c # msg['mediaId']
    encryfilename: configmanager%2Ejson # msg['encryFileName']
    fromuser: 2137425061 # self.uin
    pass_ticket: AAwe7cZ4AQF3Wy%2BVRL30nJy8WWePRwl9BWViTSP7gu8pyV61n70B96v4x%2FS8WY3s
    webwx_data_ticket: gSdr8RGWron61xkus54F8BE8

Fetch the file content with a GET request, then write it to a local file:

resp = self.session.get(url, params=params, stream=True, headers=self.headers, proxies=self.proxies)
with open(msg['fileName'], 'wb') as fptr:
    fptr.write(resp.content)

16. __upload_media: upload a media resource

request:

url: https://file.wx.qq.com/cgi-bin/mmwebwx-bin/webwxuploadmedia
method: POST
params:
    f: json
files:
    id: WU_FILE_0 # the trailing index increases by 1 for every uploaded media file
    name: test.mp4
    type: video/mp4
    lastModifiedDate: Thu Jan 16 14:07:27 2020
    size: 959662
    chunks: 2 # total number of upload requests; each request uploads 2^19 bytes
    chunk: 0 # index of the current upload; when a single upload suffices, the chunk/chunks fields are omitted
    mediatype: video # one of pic, video, doc
    uploadmediarequest: {
        UploadType: 2
        BaseRequest: {
            DeviceID: e911771485005848
            Sid: nEdiHXamlIgS+eRL
            Skey: @crypt_af16f3b1_e1df9a5ea92377384a3b2907b0ecf084
            Uin: 2137425061
        }
        ClientMediaId: 1600313075093 # str(get_timestamp()) + str(random.random())[2:6]
        TotalLen: 959662
        StartPos: 0
        DataLen: 959662
        MediaType: 4
        FromUserName: @b41046f864104defff92de8b4410bd301f792e1cecf55e1abcc7c607184a58d7 # self.account_me['UserName']
        ToUserName: filehelper
        FileMd5: e29a55fe9213035ad92d8a40c0adc19
    }
    webwx_data_ticket: gSdr8RGWron61xkus54F8BE8
    pass_ticket: AAwe7cZ4AQF3Wy%2BVRL30nJy8WWePRwl9BWViTSP7gu8pyV61n70B96v4x%2FS8WY3s
    filename: (binary) # (os.path.basename(file_name), fptr.read(1 << 19), file_type.split('/')[1])

response:

BaseResponse: {
	ErrMsg: ""
	Ret: 0
}
MediaId: "@crypt_xxxxx", # MediaId is returned only after the last upload completes; earlier uploads return ''
StartPos: 0

Upload the media resource with a POST request whose files field is populated as follows:

files = {
    'id': (None, 'WU_FILE_%s' % str(self.file_index)),
    'name': (None, os.path.basename(file_name)),
    'type': (None, file_type),
    'lastModifiedDate': (None, '%s' % time.ctime(os.path.getmtime(file_name))),
    'size': (None, str(file_len)),
    'mediatype': (None, media_type),
    'uploadmediarequest': (None, json.dumps({
        'UploadType': 2,
        'BaseRequest': self.base_request,
        'ClientMediaId': get_msg_id(),
        'TotalLen': str(file_len),
        'StartPos': 0,
        'DataLen': str(file_len),
        'MediaType': 4,
        'FromUserName': self.account_me['UserName'],
        'ToUserName': to_user_name,
        'FileMd5': md5
    })),
    'webwx_data_ticket': (None, self.session.cookies['webwx_data_ticket']),
    'pass_ticket': (None, self.pass_ticket),
}

chunk_size = 1 << 19 # upload 524288 bytes per request
chunks = max(1, (file_len + chunk_size - 1) // chunk_size) # integer ceiling division
with open(file_name, 'rb') as fptr: # the file is closed even if a request raises
    for chunk in range(chunks):
        f_bytes = fptr.read(chunk_size)
        if chunks > 1: # chunk/chunks fields are only sent for multi-part uploads
            files['chunks'] = (None, str(chunks))
            files['chunk'] = (None, str(chunk))
        files['filename'] = (os.path.basename(file_name), f_bytes, file_type.split('/')[1])
        resp = self.session.post(url, files=files, headers=self.headers, proxies=self.proxies)
dic = json.loads(resp.text) # only the last response carries the MediaId
self.file_index += 1

return dic['MediaId']
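
The chunk arithmetic can be checked in isolation. The sketch below, with the hypothetical helper plan_chunks, lists the index, start offset, and length of each 2^19-byte piece for a given file size; the 959662-byte size is the one from the sample request above.

```python
CHUNK_SIZE = 1 << 19  # 524288 bytes per upload request


def plan_chunks(file_len, chunk_size=CHUNK_SIZE):
    """Return a list of (chunk_index, start_offset, length) tuples."""
    if file_len <= 0:
        return [(0, 0, 0)]  # an empty file is still sent as a single request
    chunks = (file_len + chunk_size - 1) // chunk_size  # ceiling division
    return [
        (i, i * chunk_size, min(chunk_size, file_len - i * chunk_size))
        for i in range(chunks)
    ]


for index, start, length in plan_chunks(959662):
    print(index, start, length)
```

Using integer ceiling division avoids the float rounding that int(x / y) can introduce for very large files.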

17. send_text: public function, send a text message

request:

url: https://wx.qq.com/cgi-bin/mmwebwx-bin/webwxsendmsg
method: POST
params:
    pass_ticket: AAwe7cZ4AQF3Wy%2BVRL30nJy8WWePRwl9BWViTSP7gu8pyV61n70B96v4x%2FS8WY3s
data:
    BaseRequest: {
        DeviceID: e911771485005848
        Sid: nEdiHXamlIgS+eRL
        Skey: @crypt_af16f3b1_e1df9a5ea92377384a3b2907b0ecf084
        Uin: 2137425061
    }
    Msg: {
        ClientMsgId: 16000731732860028 # str(int(time.time() * 1000)) + str(random.random())[2:6]
        Content: hello world
        FromUserName: @b41046f864104defff92de8b4410bd301f792e1cecf55e1abcc7c607184a58d7 # self.account_me['UserName']
        LocalID: 16000731732860028 # str(int(time.time() * 1000)) + str(random.random())[2:6]
        ToUserName: filehelper
        Type: 1
    }
    Scene: 0

response:

BaseResponse: {
	ErrMsg: ""
	Ret: 0
}
MsgID: "7154066925497929403",
LocalID: "16000731732860028"

Send the data to the contact with a POST request:

self.session.post(url, params=params, data=json.dumps(data, ensure_ascii=False).encode('utf8'), headers=headers, proxies=self.proxies)
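
The request body for a text message can be assembled as in the sketch below. The helper names are hypothetical. The random suffix is generated with random.randint rather than by slicing str(random.random()), a substitution made here so that the id always has four trailing digits. Encoding with ensure_ascii=False keeps non-ASCII text such as Chinese as raw UTF-8 instead of \uXXXX escapes.

```python
import json
import random
import time


def make_msg_id():
    """Millisecond timestamp followed by four random digits."""
    return str(int(time.time() * 1000)) + '%04d' % random.randint(0, 9999)


def build_text_payload(base_request, from_user, to_user, content):
    """Build the JSON body of a webwxsendmsg request (Type 1 means text)."""
    msg_id = make_msg_id()
    return {
        'BaseRequest': base_request,
        'Msg': {
            'ClientMsgId': msg_id,
            'Content': content,
            'FromUserName': from_user,
            'LocalID': msg_id,  # same value as ClientMsgId
            'ToUserName': to_user,
            'Type': 1,
        },
        'Scene': 0,
    }


def encode_payload(payload):
    """Serialize without ASCII escaping, then encode as UTF-8 bytes."""
    return json.dumps(payload, ensure_ascii=False).encode('utf8')


payload = build_text_payload({}, '@me', 'filehelper', '你好')
print(encode_payload(payload))
```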

18. send_image: public function, send an image message

request:

url: https://wx.qq.com/cgi-bin/mmwebwx-bin/webwxsendmsgimg
method: POST
params:
    func: async
    f: json
    lang: en_US
    pass_ticket: AAwe7cZ4AQF3Wy%2BVRL30nJy8WWePRwl9BWViTSP7gu8pyV61n70B96v4x%2FS8WY3s
data:
    BaseRequest: {
        DeviceID: e911771485005848
        Sid: nEdiHXamlIgS+eRL
        Skey: @crypt_af16f3b1_e1df9a5ea92377384a3b2907b0ecf084
        Uin: 2137425061
    }
    Msg: {
        ClientMsgId: 16000731732860028 # str(int(time.time() * 1000)) + str(random.random())[2:6]
        Content: ''
        MediaId: @crypt_xxxxx # self.__upload_media()
        FromUserName: @b41046f864104defff92de8b4410bd301f792e1cecf55e1abcc7c607184a58d7 # self.account_me['UserName']
        LocalID: 16000731732860028 # str(int(time.time() * 1000)) + str(random.random())[2:6]
        ToUserName: filehelper
        Type: 3
    }
    Scene: 0

response:

BaseResponse: {
	ErrMsg: ""
	Ret: 0
}
MsgID: "7154066925497929403",
LocalID: "16000731732860028"

Send the data to the contact with a POST request:

self.session.post(url, params=params, data=json.dumps(data, ensure_ascii=False).encode('utf8'), headers=headers, proxies=self.proxies)

19. send_video: public function, send a video message

request:

url: https://wx.qq.com/cgi-bin/mmwebwx-bin/webwxsendvideomsg
method: POST
params:
    func: async
    f: json
    lang: en_US
    pass_ticket: AAwe7cZ4AQF3Wy%2BVRL30nJy8WWePRwl9BWViTSP7gu8pyV61n70B96v4x%2FS8WY3s
data:
    BaseRequest: {
        DeviceID: e911771485005848
        Sid: nEdiHXamlIgS+eRL
        Skey: @crypt_af16f3b1_e1df9a5ea92377384a3b2907b0ecf084
        Uin: 2137425061
    }
    Msg: {
        ClientMsgId: 16000731732860028 # str(int(time.time() * 1000)) + str(random.random())[2:6]
        Content: ''
        MediaId: @crypt_xxxxx # self.__upload_media()
        FromUserName: @b41046f864104defff92de8b4410bd301f792e1cecf55e1abcc7c607184a58d7 # self.account_me['UserName']
        LocalID: 16000731732860028 # str(int(time.time() * 1000)) + str(random.random())[2:6]
        ToUserName: filehelper
        Type: 43
    }
    Scene: 0

response:

BaseResponse: {
	ErrMsg: ""
	Ret: 0
}
MsgID: "7154066925497929403",
LocalID: "16000731732860028"

Send the data to the contact with a POST request:

self.session.post(url, params=params, data=json.dumps(data, ensure_ascii=False).encode('utf8'), headers=headers, proxies=self.proxies)

20. send_file: public function, send a regular file message

request:

url: https://wx.qq.com/cgi-bin/mmwebwx-bin/webwxsendappmsg
method: POST
params:
    func: async
    f: json
    lang: en_US
    pass_ticket: AAwe7cZ4AQF3Wy%2BVRL30nJy8WWePRwl9BWViTSP7gu8pyV61n70B96v4x%2FS8WY3s
data:
    BaseRequest: {
        DeviceID: e911771485005848
        Sid: nEdiHXamlIgS+eRL
        Skey: @crypt_af16f3b1_e1df9a5ea92377384a3b2907b0ecf084
        Uin: 2137425061
    }
    Msg: {
        ClientMsgId: 16000731732860028 # str(int(time.time() * 1000)) + str(random.random())[2:6]
        Content: xxx # XML data composed of the file name, file size, and mediaId
        FromUserName: @b41046f864104defff92de8b4410bd301f792e1cecf55e1abcc7c607184a58d7 # self.account_me['UserName']
        LocalID: 16000731732860028 # str(int(time.time() * 1000)) + str(random.random())[2:6]
        ToUserName: filehelper
        Type: 6
    }
    Scene: 0

response:

BaseResponse: {
	ErrMsg: ""
	Ret: 0
}
MsgID: "7154066925497929403",
LocalID: "16000731732860028"

Send the data to the contact with a POST request:

self.session.post(url, params=params, data=json.dumps(data, ensure_ascii=False).encode('utf8'), headers=headers, proxies=self.proxies)

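21. login: public function, QR-code login or automatic login from cache

This function takes one parameter to support different login modes:

  1. enable_relogin
    If True, the cache saved by the previous login is loaded first. If the cache is still valid, no new scan is needed and messages can be sent and received right away; if the cache has expired, or no cache exists because this is the first login, the QR-code login process starts
    If False, the cache is not read and the QR-code login starts directly
    The parameter defaults to True, meaning automatic login from cache is used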
def login(self, enable_relogin=True):
    if enable_relogin == False or self.__load_pickle() == False:
        self.__get_uuid()
        self.__gen_qrcode()
        self.__login()
        self.__get_params()
        self.__initinate()
        self.__status_notify()
        self.__get_contact()
        self.__get_group_members()
        if enable_relogin:
            self.__save_pickle()
    print('login success')
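
The cache helpers (__load_pickle, __save_pickle) are not shown in this article. Purely as an assumption about how such a cache might work, the sketch below stores a state dict with pickle and returns None when the file is missing or unreadable. The names save_cache and load_cache and the file name webwx.pkl are hypothetical, and checking whether the cached session is still accepted by the server is outside the scope of this sketch.

```python
import os
import pickle
import tempfile


def save_cache(path, state):
    """Persist the session state (skey, sid, uin, cookies, ...) to disk."""
    with open(path, 'wb') as fptr:
        pickle.dump(state, fptr)


def load_cache(path):
    """Return the cached state, or None if the file is missing or unreadable."""
    try:
        with open(path, 'rb') as fptr:
            return pickle.load(fptr)
    except (OSError, EOFError, pickle.UnpicklingError):
        return None


cache_path = os.path.join(tempfile.mkdtemp(), 'webwx.pkl')
if load_cache(cache_path) is None:
    # No usable cache: this is where the QR-code login would run.
    save_cache(cache_path, {'skey': '@crypt_xxx', 'sid': 'xxx'})
print(load_cache(cache_path))
```

Only load pickle files that the program wrote itself, since unpickling untrusted data can execute arbitrary code.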

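22. run: public function, receive and process messages in a loop

The loop repeats the following steps:

  1. Call __sync_check to check whether new messages have arrived
  2. Call __parse_msg to parse the received messages
  3. Check whether the received message is a control command sent by yourself
    If the message you sent is "enable", __process_msg is called to process messages
    If the message you sent is "disable", no message processing is performed
    If the message you sent is "logout", log out
    If logout is tapped on the phone, log out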
def run(self):
    handle_enable = True

    while True:
        retcode, selector = self.__sync_check()
        if retcode == '0':
            if selector == '2': # recv new msg
                msg_list = self.__webwx_sync() or [] # __webwx_sync returns None when Ret != 0
                for msg in msg_list:
                    parsed_msg = self.__parse_msg(msg)
                    if parsed_msg['senderType'] == 'MYSELF' and parsed_msg['msgType'] == 'TEXT':
                        if parsed_msg['content'] == 'enable':
                            handle_enable = True
                            print('enable msg handle')
                            continue
                        elif parsed_msg['content'] == 'disable':
                            handle_enable = False
                            print('disable msg handle')
                            continue
                        elif parsed_msg['content'] == 'logout':
                            self.logout()
                            return

                    if handle_enable:
                        self.__handle_msg(parsed_msg)
        elif retcode == '1101': # logout by phone
            self.__delete_pickle()
            print('logout success')
            return
        else:
            print('unsupported retcode:%s' %retcode)
        time.sleep(1)
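
The control-command handling inside this loop amounts to a small state machine, sketched below as a pure function. The name apply_control_command and the action strings are hypothetical; the caller would skip the message on 'continue', log out on 'logout', and pass the message to the handler on 'handle' only while the flag is True.

```python
def apply_control_command(content, handle_enable):
    """Return (handle_enable, action) for a text message sent by yourself.

    action is 'continue' for enable/disable, 'logout' for logout,
    and 'handle' for any other content.
    """
    if content == 'enable':
        return True, 'continue'
    if content == 'disable':
        return False, 'continue'
    if content == 'logout':
        return handle_enable, 'logout'
    return handle_enable, 'handle'


enabled = True
for text in ['hello', 'disable', 'hello again', 'enable', 'logout']:
    enabled, action = apply_control_command(text, enabled)
    print(text, enabled, action)
```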

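23. register_msg_handle: public function, register a custom message handler

__process_msg is the default message handler and only prints the parsed message. A custom handler can be registered through register_msg_handle to replace the default one: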

import webwx

def msg_handle(self, msg):
    if msg['msgType'] == 'TEXT' and msg['senderType'] == 'CONTACT':
        if msg['contactRemarkName']:
            print(msg['contactRemarkName'])
        else:
            print(msg['contactNickName'])
        print('    ' + msg['content'])

weChat = webwx.webwx()
weChat.register_msg_handle(msg_handle)
weChat.login()
weChat.run()
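
How the registration works internally is not shown in this article. One plausible arrangement, given that the handler above takes self as its first argument, is sketched below with a hypothetical stand-in class MiniClient: the handler is stored as a plain function and the client passes itself explicitly when dispatching. This is an assumption for illustration, not the webwx module's actual code.

```python
class MiniClient:
    """Hypothetical stand-in for webwx.webwx, reduced to handler dispatch."""

    def __init__(self):
        # Store the default handler as a plain function, not a bound method.
        self._msg_handle = MiniClient._default_handle

    def _default_handle(self, msg):
        print(msg)  # default behaviour: just print the parsed message

    def register_msg_handle(self, func):
        """Replace the default handler with func(client, msg)."""
        self._msg_handle = func

    def dispatch(self, msg):
        # The handler is a plain function, so the client passes itself in.
        self._msg_handle(self, msg)


received = []


def collect_text(client, msg):
    if msg['msgType'] == 'TEXT':
        received.append(msg['content'])


client = MiniClient()
client.register_msg_handle(collect_text)
client.dispatch({'msgType': 'TEXT', 'content': 'hello world'})
print(received)
```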

Source code

https://github.com/chenwenhuiGithub/pythonScript/tree/master/webwx

Features to be implemented

  1. Optimize the cache to improve loading efficiency
  2. Filter emoji content
  3. Add exception handling