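Contents
- Preface
- 1. __get_uuid: fetch the uuid
- 2. __gen_qrcode: generate the QR code
- 3. __login: log in by scanning with the phone
- 4. __get_params: fetch the login parameters
- 5. __initinate: fetch recent contacts
- 6. __status_notify: report the login status
- 7. __get_contact: fetch the contact list
- 8. __get_group_members: fetch group members
- 9. __sync_check: check for new messages
- 10. __webwx_sync: fetch the new messages
- 11. __parse_msg: parse messages
- 12. __img_download: download a received image
- 13. __voice_download: download a received voice message
- 14. __video_download: download a received video
- 15. __file_download: download a received file
- 16. __upload_media: upload media
- 17. send_text: public function, send a text message
- 18. send_image: public function, send an image message
- 19. send_video: public function, send a video message
- 20. send_file: public function, send a file message
- 21. login: public function, QR-code login or automatic login from cache
- 22. run: public function, receive and process messages in a loop
- 23. register_msg_handle: public function, register a custom message handler
- Source code
- To do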
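Preface
I previously implemented a WeChat client in Python. It supports QR-code login, automatic login from a cached session, receiving messages from contacts, groups and official accounts, and parsing text, image, voice, video, location, sticker, file and revoke messages.
This article is the first in the series. It walks through the protocol behind each step and the core of the corresponding code.
Other articles in the series:
Python Web WeChat application (2): webwx module source code
Python Web WeChat application (5): automatically downloading received images, voice messages and videos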
1. __get_uuid: fetch the uuid
request:
url | https://login.wx.qq.com/jslogin |
---|---|
method | GET |
params | appid: wx782c26e4c19acffb (fixed value); redirect_uri: https://wx.qq.com/cgi-bin/mmwebwx-bin/webwxnewloginpage (fixed value); fun: new; lang: en_US; _: 1600072411492 (timestamp, int(time.time() * 1000)) |
response:
window.QRLogin.code = 200; window.QRLogin.uuid = "QbKzIfgT3w==";
Extract the uuid with a regular expression:
r = self.session.get(url, params=params, headers=self.headers, proxies=self.proxies)
regx = r'window\.QRLogin\.code = (\d+); window\.QRLogin\.uuid = "(\S+?)";'
data = re.search(regx, r.text)
if data and data.group(1) == '200':  # guard against a response that does not match
    self.uuid = data.group(2)
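As a minimal, self-contained sketch of this parsing step, the helper below extracts the uuid from a jslogin response and returns None when the response does not match or the code is not 200. The name `parse_uuid` is an assumption for illustration; the original module does this inline in `__get_uuid`.

```python
import re

# Hypothetical helper, not part of the original webwx module.
QRLOGIN_RE = re.compile(
    r'window\.QRLogin\.code = (\d+); window\.QRLogin\.uuid = "(\S+?)";'
)


def parse_uuid(text):
    """Return the uuid from a jslogin response, or None on failure."""
    match = QRLOGIN_RE.search(text)
    if match is None or match.group(1) != '200':
        return None
    return match.group(2)


sample = 'window.QRLogin.code = 200; window.QRLogin.uuid = "QbKzIfgT3w==";'
print(parse_uuid(sample))
```

Returning None instead of raising lets the caller decide whether to retry the request.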
2. __gen_qrcode: generate the QR code
Print the URL as a QR code directly in the terminal, then wait for the phone to scan it:
url = 'https://login.weixin.qq.com/l/' + self.uuid
qr = qrcode.QRCode()
qr.add_data(url)
qr.print_ascii(invert=False)
3. __login: log in by scanning with the phone
request:
url | https://login.wx.qq.com/cgi-bin/mmwebwx-bin/login |
---|---|
method | GET |
params | loginicon: true; uuid: QbKzIfgT3w==; tip: 1 (0 = already scanned, 1 = not yet scanned); r: -1950363289 (timestamp, -int(time.time())); _: 1600072411493 (timestamp, int(time.time() * 1000)) |
Login takes two GET requests.
The first, with tip=1, waits for the scan. Return code 201 means the QR code was scanned.
response:
window.code=201;window.userAvatar = 'xxx';
The second, with tip=0, waits for the login to complete. Return code 200 means login succeeded.
response:
window.code=200;
window.redirect_uri="https://wx.qq.com/cgi-bin/mmwebwx-bin/webwxnewloginpage?ticket=AS7a6pAQk9lfSpMhdCh_gfhs@qrticket_0&uuid=QbKzIfgT3w==&lang=en_US&scan=1600072446";
Extract redirect_uri with a regular expression and keep it for the following requests:
tip = 1
while not self.isLogin:
    params['tip'] = tip  # send the current tip value with every poll
    r = self.session.get(url, params=params, headers=self.headers, proxies=self.proxies)
    data = re.search(r'window\.code=(\d+)', r.text)
    if data.group(1) == '408':  # timeout
        tip = 1
    elif data.group(1) == '201':  # scanned
        tip = 0
        print("scan success")
    elif data.group(1) == '200':  # success
        param = re.search(r'window\.redirect_uri="(\S+?)";', r.text)
        self.redirect_uri = param.group(1)
        self.isLogin = True
        print("login success")
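The polling logic can be separated from the network call. The sketch below parses one login response and decides the next tip value. Both helper names (`parse_login_response`, `next_tip`) are assumptions for illustration, and the sample responses are the ones shown in this section.

```python
import re


def parse_login_response(text):
    """Return (code, redirect_uri); redirect_uri is None unless code is '200'."""
    code_match = re.search(r'window\.code=(\d+)', text)
    if code_match is None:
        return None, None
    code = code_match.group(1)
    uri_match = re.search(r'window\.redirect_uri="(\S+?)";', text)
    redirect_uri = uri_match.group(1) if code == '200' and uri_match else None
    return code, redirect_uri


def next_tip(code, tip):
    """tip for the next poll: 0 after a scan (201), 1 after a timeout (408)."""
    if code == '201':
        return 0
    if code == '408':
        return 1
    return tip


scanned = "window.code=201;window.userAvatar = 'xxx';"
print(parse_login_response(scanned))
```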
4. __get_params: fetch the login parameters
request:
url | https://wx.qq.com/cgi-bin/mmwebwx-bin/webwxnewloginpage |
---|---|
method | GET |
params | ticket: AS7a6pAQk9lfSpMhdCh_gfhs@qrticket_0; uuid: QbKzIfgT3w==; lang: en_US; scan: 1600072446; fun: new; version: v2 |
The ticket, uuid, lang and scan parameters are already contained in the redirect_uri stored in the previous step.
response:
<error>
<ret>0</ret>
<message></message>
<skey>@crypt_af16f3b1_e1df9a5ea92377384a3b2907b0ecf084</skey>
<wxsid>nEdiHXamlIgS+eRL</wxsid>
<wxuin>2137425061</wxuin>
<pass_ticket>AAwe7cZ4AQF3Wy%2BVRL30nJy8WWePRwl9BWViTSP7gu8pyV61n70B96v4x%2FS8WY3s</pass_ticket>
<isgrayscale>1</isgrayscale>
</error>
Parse the XML response to obtain skey, sid, uin and pass_ticket, and use them to build base_request for the following requests. device_id is the letter 'e' followed by 15 digits, generated as 'e' + repr(random.random())[2:17]:
url = self.redirect_uri + '&fun=new&version=v2'
r = self.session.get(url, headers=self.headers, allow_redirects=False, proxies=self.proxies)
nodes = xml.dom.minidom.parseString(r.text).documentElement.childNodes
for node in nodes:
    if node.nodeName == 'skey':
        self.skey = node.childNodes[0].data
        self.base_request['Skey'] = self.skey
    elif node.nodeName == 'wxsid':
        self.sid = node.childNodes[0].data
        self.base_request['Sid'] = self.sid
    elif node.nodeName == 'wxuin':
        self.uin = node.childNodes[0].data
        self.base_request['Uin'] = self.uin
    elif node.nodeName == 'pass_ticket':
        self.pass_ticket = node.childNodes[0].data
self.base_request['DeviceID'] = self.device_id
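The same step can be sketched with `xml.etree.ElementTree` from the standard library. The helper names (`make_device_id`, `parse_login_params`) and the check on `ret` are assumptions for illustration; the sample XML is the response shown above.

```python
import random
import xml.etree.ElementTree as ET


def make_device_id():
    """'e' followed by exactly 15 random decimal digits."""
    return 'e' + ''.join(random.choice('0123456789') for _ in range(15))


def parse_login_params(xml_text, device_id):
    """Return (base_request, pass_ticket) parsed from the XML response."""
    root = ET.fromstring(xml_text)
    fields = {child.tag: (child.text or '') for child in root}
    # Assumption: a non-zero <ret> is treated as a failed login.
    if fields.get('ret') != '0':
        raise ValueError('login failed: ret=%s' % fields.get('ret'))
    base_request = {
        'Skey': fields['skey'],
        'Sid': fields['wxsid'],
        'Uin': fields['wxuin'],
        'DeviceID': device_id,
    }
    return base_request, fields['pass_ticket']


SAMPLE_XML = (
    '<error><ret>0</ret><message></message>'
    '<skey>@crypt_af16f3b1_e1df9a5ea92377384a3b2907b0ecf084</skey>'
    '<wxsid>nEdiHXamlIgS+eRL</wxsid><wxuin>2137425061</wxuin>'
    '<pass_ticket>AAwe7cZ4AQF3Wy%2BVRL30nJy8WWePRwl9BWViTSP7gu8pyV61n70B96v4x%2FS8WY3s</pass_ticket>'
    '<isgrayscale>1</isgrayscale></error>'
)
base_request, pass_ticket = parse_login_params(SAMPLE_XML, make_device_id())
print(base_request['Uin'])
```

Building the device id digit by digit avoids depending on the length of `repr(random.random())`, which is occasionally shorter than 15 digits.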
5. __initinate: fetch recent contacts
request:
url | https://wx.qq.com/cgi-bin/mmwebwx-bin/webwxinit |
---|---|
method | POST |
params | r: -1950363289 (timestamp, -int(time.time())); pass_ticket: AAwe7cZ4AQF3Wy%252BVRL30nJy8WWePRwl9BWViTSP7gu8pyV61n70B96v4x%252FS8WY3s |
data | BaseRequest: { DeviceID: e911771485005848, Sid: nEdiHXamlIgS+eRL, Skey: @crypt_af16f3b1_e1df9a5ea92377384a3b2907b0ecf084, Uin: 2137425061 } |
The response contains the recently contacted users and official accounts, the logged-in account itself, and more. Note that SyncKey changes constantly: every request must send the SyncKey returned by the previous response:
r = self.session.post(url, params=params, data=json.dumps(data), headers=headers, proxies=self.proxies)
r.encoding = 'utf-8'
dic = json.loads(r.text)
self.sync_key = dic['SyncKey']
self.sync_key_str = '|'.join([str(keyVal['Key']) + '_' + str(keyVal['Val']) for keyVal in self.sync_key['List']])
self.account_me = dic['User']
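The SyncKey is used in two forms: as a JSON object in POST bodies and as a flattened string in the synccheck query. A minimal sketch of the flattening, assuming a hypothetical helper name `format_sync_key` and the sample values from the requests in this article:

```python
def format_sync_key(sync_key):
    """Flatten a SyncKey object into 'Key_Val|Key_Val|...'."""
    return '|'.join(
        '%s_%s' % (item['Key'], item['Val']) for item in sync_key['List']
    )


sample_sync_key = {
    'Count': 4,
    'List': [
        {'Key': 1, 'Val': 724394418},
        {'Key': 2, 'Val': 724396065},
        {'Key': 3, 'Val': 724395965},
        {'Key': 1000, 'Val': 1600068197},
    ],
}
print(format_sync_key(sample_sync_key))
```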
6. __status_notify: report the login status
request:
url | https://wx.qq.com/cgi-bin/mmwebwx-bin/webwxstatusnotify |
---|---|
method | POST |
params | pass_ticket: AAwe7cZ4AQF3Wy%252BVRL30nJy8WWePRwl9BWViTSP7gu8pyV61n70B96v4x%252FS8WY3s |
data | BaseRequest: { DeviceID: e911771485005848, Sid: nEdiHXamlIgS+eRL, Skey: @crypt_af16f3b1_e1df9a5ea92377384a3b2907b0ecf084, Uin: 2137425061 }; ClientMsgId: 1600072449311 (timestamp, int(time.time() * 1000)); Code: 3; FromUserName: @b41046f864104defff92de8b4410bd301f792e1cecf55e1abcc7c607184a58d7 (self.account_me['UserName']); ToUserName: @b41046f864104defff92de8b4410bd301f792e1cecf55e1abcc7c607184a58d7 (self.account_me['UserName']) |
response:
BaseResponse: {
ErrMsg: ""
Ret: 0
}
MsgID: "662549729070653462"
Report the login status to the phone with a POST request:
self.session.post(url, params=params, data=json.dumps(data), headers=headers, proxies=self.proxies)
7. __get_contact: fetch the contact list
request:
url | https://wx.qq.com/cgi-bin/mmwebwx-bin/webwxgetcontact |
---|---|
method | POST |
params | pass_ticket: AAwe7cZ4AQF3Wy%2BVRL30nJy8WWePRwl9BWViTSP7gu8pyV61n70B96v4x%2FS8WY3s; r: 1950363289 (timestamp, int(time.time() * 1000)); seq: 0; skey: @crypt_af16f3b1_e1df9a5ea92377384a3b2907b0ecf084 |
response:
BaseResponse: {
ErrMsg: ""
Ret: 0
}
MemberCount: 262
MemberList: [
    {
        "Uin": 0,
        "UserName": "@c2faf944f9d0655aa23f29e3cf3e2eda", # "@" prefix: contact or official account; "@@" prefix: group
        "NickName": "xxx", # nickname
        "HeadImgUrl": "/cgi-bin/mmwebwx-bin/webwxgeticon?seq=633328356&username=@c2faf944f9d0655aa23f29e3cf3e2eda&skey=@crypt_af16f3b1_9ce63c8962ee0c50e999f828d023b8ea", # avatar image URL
        "ContactFlag": 1,
        "MemberCount": 0,
        "MemberList": [],
        "RemarkName": "", # remark name
        "HideInputBarFlag": 0,
        "Sex": 0, # gender: 0 = unknown, 1 = male, 2 = female
        "Signature": "", # signature
        "VerifyFlag": 8, # 8, 24, 56: official account or official WeChat account; 0: contact
        "OwnerUin": 0,
        "PYInitial": "YSXT",
        "PYQuanPin": "yinshuixintang",
        "RemarkPYInitial": "",
        "RemarkPYQuanPin": "",
        "StarFriend": 0, # whether this is a starred friend
        "AppAccountFlag": 0,
        "Statues": 0,
        "AttrStatus": 0,
        "Province": "Liaoning",
        "City": "Panjin",
        "Alias": "",
        "SnsFlag": 0,
        "UniFriend": 0,
        "DisplayName": "", # used for groups: the display name within the group
        "ChatRoomId": 0,
        "KeyWord": "gh_",
        "EncryChatRoomId": "",
        "IsOwner": 0 # used for groups: whether the account is the group owner
    },
...
]
Seq: 0
The response returns all contacts, official accounts and groups. The group entries do not include their members, which must be fetched with a separate request. If the returned Seq field is not 0, more entries remain: keep sending requests with seq set to the value from the previous response:
member_list = []
r = self.session.post(url, params=params, headers=headers, timeout=180, proxies=self.proxies)
r.encoding = 'utf-8'
dic = json.loads(r.text)
member_list.extend(dic['MemberList'])
while dic["Seq"] != 0:
    params['seq'] = dic["Seq"]
    r = self.session.post(url, params=params, headers=headers, timeout=180, proxies=self.proxies)
    r.encoding = 'utf-8'
    dic = json.loads(r.text)
    member_list.extend(dic['MemberList'])
for member in member_list:
    if member['UserName'].startswith('@@'):
        self.account_groups[member['UserName']] = member  # groups, without member details
    elif member['VerifyFlag'] & 8 != 0:
        self.account_subscriptions[member['UserName']] = member  # official accounts, including weixin and weixinzhifu
    else:
        self.account_contacts[member['UserName']] = member  # contacts, including filehelper
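The paging and classification rules can be exercised without a network connection. In the sketch below, `collect_members`, `classify` and the fake two-page server are all assumptions for illustration; `fetch_page` stands in for the POST request above.

```python
def collect_members(fetch_page):
    """Call fetch_page(seq) repeatedly until the response carries Seq == 0."""
    members, seq = [], 0
    while True:
        page = fetch_page(seq)
        members.extend(page['MemberList'])
        seq = page['Seq']
        if seq == 0:
            return members


def classify(members):
    """Split members into (groups, subscriptions, contacts) keyed by UserName."""
    groups, subscriptions, contacts = {}, {}, {}
    for member in members:
        name = member['UserName']
        if name.startswith('@@'):
            groups[name] = member
        elif member['VerifyFlag'] & 8:
            subscriptions[name] = member
        else:
            contacts[name] = member
    return groups, subscriptions, contacts


# Fake server: the first page asks the client to continue with seq=7.
PAGES = {
    0: {'MemberList': [{'UserName': '@@g1', 'VerifyFlag': 0},
                       {'UserName': '@s1', 'VerifyFlag': 24}], 'Seq': 7},
    7: {'MemberList': [{'UserName': '@c1', 'VerifyFlag': 0}], 'Seq': 0},
}
groups, subscriptions, contacts = classify(collect_members(lambda seq: PAGES[seq]))
print(sorted(groups), sorted(subscriptions), sorted(contacts))
```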
8. __get_group_members: fetch group members
request:
url | https://wx.qq.com/cgi-bin/mmwebwx-bin/webwxbatchgetcontact |
---|---|
method | POST |
params | type: ex; r: 1950363289 (timestamp, int(time.time() * 1000)); lang: en_US |
data | BaseRequest: { DeviceID: e911771485005848, Sid: nEdiHXamlIgS+eRL, Skey: @crypt_af16f3b1_e1df9a5ea92377384a3b2907b0ecf084, Uin: 2137425061 }; Count: 15 (len(self.account_groups)); List: grouplist (one {'UserName': group['UserName'], 'ChatRoomId': group['ChatRoomId']} entry for each group in self.account_groups.values()) |
response:
BaseResponse: {
ErrMsg: ""
Ret: 0
}
Count: 15
ContactList: [
    {
        "Uin": 0,
        "UserName": "@@8cc2e21be6342b93a9d179f8d9465838f38989f3afb4bd17e3a8db42028c7a32", # "@@" prefix: group
        "NickName": "xxx", # group name
        "HeadImgUrl": "/cgi-bin/mmwebwx-bin/webwxgeticon?seq=633328356&username=@c2faf944f9d0655aa23f29e3cf3e2eda&skey=@crypt_af16f3b1_9ce63c8962ee0c50e999f828d023b8ea", # avatar image URL
        "ContactFlag": 1,
        "MemberCount": 5, # number of group members
        "MemberList": [ # list of group members
            {
                "Uin": 0,
                "UserName": "@31d8eccdefdd22d15297a98c0d7e6387",
                "NickName": "xxx", # nickname
                "AttrStatus": 33626215,
                "PYInitial": "",
                "PYQuanPin": "",
                "RemarkPYInitial": "",
                "RemarkPYQuanPin": "",
                "MemberStatus": 0,
                "DisplayName": "xxx", # the name this member chose to display in the group, "" if not set
                "KeyWord": "Smo"
            },
            {
                "Uin": 0,
                "UserName": "@24503c11a83b31b286a8441d700bc52c",
                "NickName": "xxx",
                "AttrStatus": 231463,
                "PYInitial": "",
                "PYQuanPin": "",
                "RemarkPYInitial": "",
                "RemarkPYQuanPin": "",
                "MemberStatus": 0,
                "DisplayName": "",
                "KeyWord": "zha"
            }
            ...
        ],
        "RemarkName": "", # remark name
        "HideInputBarFlag": 0,
        "Sex": 0,
        "Signature": "", # signature
        "VerifyFlag": 0,
        "OwnerUin": 0,
        "PYInitial": "YSXT",
        "PYQuanPin": "yinshuixintang",
        "RemarkPYInitial": "",
        "RemarkPYQuanPin": "",
        "StarFriend": 0,
        "AppAccountFlag": 0,
        "Statues": 0,
        "AttrStatus": 0,
        "Province": "Liaoning",
        "City": "Panjin",
        "Alias": "",
        "SnsFlag": 0,
        "UniFriend": 0,
        "DisplayName": "",
        "ChatRoomId": 0,
        "KeyWord": "gh_",
        "EncryChatRoomId": "",
        "IsOwner": 0 # used for groups: whether the account is the group owner
    },
    ...
]
The response returns the groups together with their members:
resp = self.session.post(url, params=params, data=json.dumps(data), headers=headers, proxies=self.proxies)
resp.encoding = 'utf-8'
dic = json.loads(resp.text)
for member in dic['ContactList']:
    self.account_groups_members[member['UserName']] = member
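A small sketch of both ends of this request: building the POST body from the known groups and looking up a member in the stored response. The helper names (`build_batch_request`, `member_name`) are assumptions for illustration, and the sample data is made up.

```python
def build_batch_request(base_request, groups):
    """Build the webwxbatchgetcontact body: one List entry per known group."""
    group_list = [
        {'UserName': group['UserName'], 'ChatRoomId': group.get('ChatRoomId', '')}
        for group in groups.values()
    ]
    return {'BaseRequest': base_request, 'Count': len(group_list), 'List': group_list}


def member_name(group, user_name):
    """Name to show for a member: DisplayName if set, otherwise NickName."""
    for member in group['MemberList']:
        if member['UserName'] == user_name:
            return member['DisplayName'] or member['NickName']
    return ''


groups = {'@@g1': {'UserName': '@@g1', 'ChatRoomId': 0}}
body = build_batch_request({'Uin': '1'}, groups)

group_detail = {'MemberList': [
    {'UserName': '@a', 'NickName': 'Alice', 'DisplayName': 'Al'},
    {'UserName': '@b', 'NickName': 'Bob', 'DisplayName': ''},
]}
print(body['Count'], member_name(group_detail, '@a'), member_name(group_detail, '@b'))
```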
9. __sync_check: check for new messages
request:
url | https://webpush.wx.qq.com/cgi-bin/mmwebwx-bin/synccheck |
---|---|
method | GET |
params | r: 1950363289 (timestamp, int(time.time() * 1000)); skey: @crypt_af16f3b1_e1df9a5ea92377384a3b2907b0ecf084; sid: nEdiHXamlIgS+eRL; uin: 2137425061; deviceid: e628001908664392; synckey: 1_724394418|2_724396065|3_724395965|1000_1600068197; _: 1600072411495 (timestamp, int(time.time() * 1000)) |
response:
window.synccheck={retcode:"0",selector:"2"}
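retcode:
0: normal
1101: the phone logged out of Web WeChat
x: xxx (not yet supported)
selector:
0: normal
2: new message received
x: xxx (not yet supported)
Extract both values with a regular expression: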
r = self.session.get(url, params=params, headers=self.headers, proxies=self.proxies)
data = re.search(r'window.synccheck=\{retcode:"(\d+)",selector:"(\d+)"\}', r.text)
retcode = data.group(1)
selector = data.group(2)
return retcode, selector
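A defensive variant of this parsing step is sketched below. The name `parse_sync_check` is an assumption for illustration; it returns (None, None) instead of raising when the response has an unexpected shape.

```python
import re

SYNC_CHECK_RE = re.compile(
    r'window\.synccheck=\{retcode:"(\d+)",selector:"(\d+)"\}'
)


def parse_sync_check(text):
    """Return (retcode, selector) as strings, or (None, None) if unparsable."""
    match = SYNC_CHECK_RE.search(text)
    if match is None:
        return None, None
    return match.group(1), match.group(2)


print(parse_sync_check('window.synccheck={retcode:"0",selector:"2"}'))
```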
10. __webwx_sync: fetch the new messages
request:
url | https://wx.qq.com/cgi-bin/mmwebwx-bin/webwxsync |
---|---|
method | POST |
params | sid: nEdiHXamlIgS+eRL; skey: @crypt_af16f3b1_e1df9a5ea92377384a3b2907b0ecf084; pass_ticket: AAwe7cZ4AQF3Wy%2BVRL30nJy8WWePRwl9BWViTSP7gu8pyV61n70B96v4x%2FS8WY3s |
data | BaseRequest: { DeviceID: e911771485005848, Sid: nEdiHXamlIgS+eRL, Skey: @crypt_af16f3b1_e1df9a5ea92377384a3b2907b0ecf084, Uin: 2137425061 }; SyncKey: { Count: 4, List: [ { Key: 1, Val: 724394418 }, { Key: 2, Val: 724396065 }, { Key: 3, Val: 724395965 }, { Key: 1000, Val: 1600068197 } ] }; rr: -1950351786 (timestamp, -int(time.time())) |
response:
BaseResponse: {
ErrMsg: ""
Ret: 0
}
AddMsgCount: 3
AddMsgList: [
    {
        "MsgId": "2403776930812656442", # message id, also used to download received images, voice messages and videos
        "FromUserName": "@da500aa9163f99fe32bfe2439e2467bcde8bc61c619f696a92e9f98c9d14c7ce",
        "ToUserName": "filehelper",
        "MsgType": 1, # message type: 1 = text, 3 = image, 34 = voice, 42 = contact card, 43 = video, 47 = sticker, 49 = file, 10002 = revoke
        "Content": "hello world", # message content
        "Status": 3,
        "ImgStatus": 1,
        "CreateTime": 1599529496,
        "VoiceLength": 0, # voice length in milliseconds
        "PlayLength": 0, # video length in seconds
        "FileName": "", # file name
        "FileSize": "", # file size in bytes
        "MediaId": "",
        "Url": "",
        "AppMsgType": 0,
        "StatusNotifyCode": 0,
        "StatusNotifyUserName": "",
        "RecommendInfo": {
            "UserName": "",
            "NickName": "",
            "QQNum": 0,
            "Province": "",
            "City": "",
            "Content": "",
            "Signature": "",
            "Alias": "",
            "Scene": 0,
            "VerifyFlag": 0,
            "AttrStatus": 0,
            "Sex": 0,
            "Ticket": "",
            "OpCode": 0
        },
        "ForwardFlag": 0,
        "AppInfo": {
            "AppID": "",
            "Type": 0
        },
        "HasProductId": 0,
        "Ticket": "",
        "ImgHeight": 0,
        "ImgWidth": 0,
        "SubMsgType": 0, # sub-type: a text message is further divided into plain text, link and location
        "NewMsgId": 2403776930812656442,
        "OriContent": "",
        "EncryFileName": ""
    },
    ...
]
DelContactCount: 0
DelContactList: []
ModContactCount: 0
ModContactList: []
ModChatRoomMemberCount: 0
ModChatRoomMemberList: []
SyncKey: { # updated SyncKey, to be sent with the next request
Count: 5,
List: [
{ Key: 1, Val: 724394418 },
{ Key: 2, Val: 724396065},
{ Key: 3, Val: 724395965},
{ Key: 1000, Val: 1600068197 },
{ Key: 1001, Val: 1600063457 }
]
}
AddMsgList is the list of received messages:
r = self.session.post(url, params=params, data=json.dumps(data), headers=headers, proxies=self.proxies)
r.encoding = 'utf-8'
dic = json.loads(r.text)
if dic['BaseResponse']['Ret'] == 0:
    self.sync_key = dic['SyncKey']
    self.sync_key_str = '|'.join([str(keyVal['Key']) + '_' + str(keyVal['Val']) for keyVal in self.sync_key['List']])
    return dic['AddMsgList']
return []  # on error, return an empty list so the caller can still iterate
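The SyncKey round trip can be sketched as two small functions: one builds the request body from the current key, the other applies a response. The names (`build_sync_data`, `apply_sync_response`) and the plain-dict `state` are assumptions for illustration.

```python
import time


def build_sync_data(base_request, sync_key):
    """Request body for webwxsync, carrying the SyncKey from the last response."""
    return {
        'BaseRequest': base_request,
        'SyncKey': sync_key,
        'rr': -int(time.time()),
    }


def apply_sync_response(state, dic):
    """Store the new SyncKey and return the messages; return [] on error."""
    if dic['BaseResponse']['Ret'] != 0:
        return []
    state['sync_key'] = dic['SyncKey']
    return dic.get('AddMsgList', [])


state = {'sync_key': {'Count': 1, 'List': [{'Key': 1, 'Val': 1}]}}
response = {
    'BaseResponse': {'Ret': 0, 'ErrMsg': ''},
    'SyncKey': {'Count': 1, 'List': [{'Key': 1, 'Val': 2}]},
    'AddMsgList': [{'MsgId': '1'}],
}
messages = apply_sync_response(state, response)
print(len(messages), state['sync_key']['List'][0]['Val'])
```

Keeping the old SyncKey when `Ret` is non-zero means the next request retries from the last known good position.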
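11. __parse_msg: parse messages
Supported message sources: groups, official accounts, contacts, and the logged-in account itself
Supported message types: text, location, link, image, voice, video, contact card, sticker, file, revoke
The fields carried by a parsed message depend on its source and type:
Type | Fields |
---|---|
Always present | 'senderType': string, one of "GROUP/SUBSCRIPTION/CONTACT/MYSELF/UNSUPPORTED", the message comes from a group / official account / contact / the account itself / an unsupported source; 'senderName': string, the sender's identifier assigned by the system, "@@" prefix for groups, "@" prefix for contacts and official accounts; 'msgType': string, one of "TEXT/POSITION/IMAGE/VOICE/VIDEO/CARD/ANIMATION/FILE/REVOKE/UNSUPPORTED", the message is text / location / image / voice / video / contact card / sticker / file / revoke / unsupported; 'msgId': string, the unique message id assigned by the system |
senderType: | |
GROUP | 'groupNickName': string, name of the group the sender belongs to; 'userNickName': string, the sender's nickname; 'userDisplayName': string, the name the sender chose to display in this group, '' if not set; 'meIsAt': bool, whether the logged-in account was @-mentioned |
SUBSCRIPTION | 'subscriptionNickName': string, name of the sending official account |
CONTACT | 'contactNickName': string, the sender's nickname; 'contactRemarkName': string, the sender's remark name |
MYSELF | 'myNickName': string, the logged-in account's own nickname |
msgType: | |
TEXT | 'content': string, the received message content |
POSITION | 'x': string holding a float, latitude; 'y': string holding a float, longitude; 'scale': string holding an integer, zoom level; 'label': string, label of the location; 'poiname': string, specific name of the location |
IMAGE | 'imgHeight': integer, image height; 'imgWidth': integer, image width; 'mediaId': string, the image's resource id on the server, assigned by the system and used for downloading; 'downloadFunc': function that downloads the image. Calling msg['downloadFunc'](msg) saves the image to the current directory as img_mediaId.jpg |
VOICE | 'voiceLength': integer, voice length in milliseconds; 'mediaId': string, the voice message's resource id on the server, assigned by the system and used for downloading; 'downloadFunc': function that downloads the voice message. Calling msg['downloadFunc'](msg) saves it to the current directory as voice_mediaId.mp3 |
VIDEO | 'imgHeight': integer, video height; 'imgWidth': integer, video width; 'playLength': integer, video length in seconds; 'mediaId': string, the video's resource id on the server, assigned by the system and used for downloading; 'downloadFunc': function that downloads the video. Calling msg['downloadFunc'](msg) saves it to the current directory as video_mediaId.mp4 |
CARD | 'username': string, WeChat id; 'nickname': string, nickname; 'alias': string, alias; 'province': string, province; 'city': string, city; 'sex': string, gender, 0 = unknown, 1 = male, 2 = female; 'regionCode': string, region of registration |
ANIMATION | 'imgHeight': integer, sticker height; 'imgWidth': integer, sticker width |
FILE | 'fileName': string, file name; 'encryFileName': string, encrypted file name; 'fileSize': string, file size in bytes; 'mediaId': string, the file's media id, assigned by the system and used for downloading; 'downloadFunc': function that downloads the file. Calling msg['downloadFunc'](msg) saves it to the current directory under the name in 'fileName' |
REVOKE | 'revokedMsgId': string, id of the message that was revoked |
UNSUPPORTED | no optional fields |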
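To show how a consumer might use these fields, the sketch below derives a printable sender label from a parsed message. The function name `sender_label` and the sample message are assumptions for illustration; only the field names come from the table above.

```python
def sender_label(parsed_msg):
    """Pick a human-readable sender name based on senderType."""
    sender_type = parsed_msg['senderType']
    if sender_type == 'GROUP':
        user = parsed_msg['userDisplayName'] or parsed_msg['userNickName']
        return '%s / %s' % (parsed_msg['groupNickName'], user)
    if sender_type == 'SUBSCRIPTION':
        return parsed_msg['subscriptionNickName']
    if sender_type == 'CONTACT':
        return parsed_msg['contactRemarkName'] or parsed_msg['contactNickName']
    if sender_type == 'MYSELF':
        return parsed_msg['myNickName']
    # UNSUPPORTED: fall back to the raw identifier, which is always present
    return parsed_msg['senderName']


sample_msg = {
    'senderType': 'GROUP', 'senderName': '@@g1', 'msgType': 'TEXT', 'msgId': '1',
    'groupNickName': 'Family', 'userNickName': 'Alice', 'userDisplayName': '',
    'meIsAt': False, 'content': 'hello',
}
print(sender_label(sample_msg))
```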
The implementation of __parse_msg and its group helper:
def __parse_group_msg(self, msg, parsed_msg):
    parsed_msg['userNickName'] = ''
    parsed_msg['userDisplayName'] = ''
    parsed_msg['meIsAt'] = False
    my_displayname_in_group = ''  # stays empty if the account is not found in the member list
    found_flag1 = False
    found_flag2 = False
    ret = re.match('(@[0-9a-z]*?):<br/>(.*)$', msg['Content'])  # username:<br/>text
    if ret is None:  # content without a sender prefix
        return
    user_name, text = ret.groups()
    for item in self.account_groups_members[msg['FromUserName']]['MemberList']:
        if item['UserName'] == user_name:
            parsed_msg['userNickName'] = item['NickName']
            parsed_msg['userDisplayName'] = item['DisplayName']
            found_flag1 = True
        if item['UserName'] == self.account_me['UserName']:
            my_displayname_in_group = item['DisplayName']
            found_flag2 = True
        if found_flag1 and found_flag2:
            break
    # an @-mention is "@" + name + U+2005; the group display name takes priority
    str_at = "@" + self.account_me['NickName'] + '\u2005'
    if my_displayname_in_group:
        str_at = "@" + my_displayname_in_group + '\u2005'
    if str_at in text:
        parsed_msg['meIsAt'] = True

def __parse_msg(self, msg):
    parsed_msg = {}
    parsed_msg['senderType'] = 'UNSUPPORTED'
    parsed_msg['senderName'] = msg['FromUserName']
    parsed_msg['msgType'] = 'UNSUPPORTED'
    parsed_msg['msgId'] = msg['MsgId']
    if msg['FromUserName'] in self.account_groups:
        parsed_msg['senderType'] = 'GROUP'
        parsed_msg['groupNickName'] = self.account_groups[msg['FromUserName']]['NickName']
        self.__parse_group_msg(msg, parsed_msg)  # fills userNickName/userDisplayName/meIsAt
    elif msg['FromUserName'] in self.account_subscriptions:
        parsed_msg['senderType'] = 'SUBSCRIPTION'
        parsed_msg['subscriptionNickName'] = self.account_subscriptions[msg['FromUserName']]['NickName']
    elif msg['FromUserName'] in self.account_contacts:
        parsed_msg['senderType'] = 'CONTACT'
        parsed_msg['contactNickName'] = self.account_contacts[msg['FromUserName']]['NickName']
        parsed_msg['contactRemarkName'] = self.account_contacts[msg['FromUserName']]['RemarkName']
    elif self.account_me['UserName'] == msg['FromUserName']:
        parsed_msg['senderType'] = 'MYSELF'
        parsed_msg['myNickName'] = self.account_me['NickName']
    msg_type = msg['MsgType']
    if msg_type == 1:  # text/link/position
        sub_msg_type = msg['SubMsgType']
        if sub_msg_type == 0:  # text/link
            parsed_msg['msgType'] = 'TEXT'
            parsed_msg['content'] = msg['Content']
            if parsed_msg['senderType'] == 'GROUP':
                ret = re.match('(@[0-9a-z]*?):<br/>(.*)$', msg['Content'])
                if ret:
                    parsed_msg['content'] = ret.groups()[1]  # strip the sender username prefix
        elif sub_msg_type == 48:  # position
            parsed_msg['msgType'] = 'POSITION'
            doc = xml.dom.minidom.parseString(msg['OriContent']).documentElement
            node = doc.getElementsByTagName('location')[0]
            parsed_msg['x'] = node.getAttribute('x')
            parsed_msg['y'] = node.getAttribute('y')
            parsed_msg['scale'] = node.getAttribute('scale')
            parsed_msg['label'] = node.getAttribute('label')
            parsed_msg['poiname'] = node.getAttribute('poiname')
    elif msg_type == 3:  # image
        parsed_msg['msgType'] = 'IMAGE'
        parsed_msg['mediaId'] = msg['MsgId']
        parsed_msg['imgHeight'] = msg['ImgHeight']
        parsed_msg['imgWidth'] = msg['ImgWidth']
        parsed_msg['downloadFunc'] = self.__img_download
    elif msg_type == 34:  # voice
        parsed_msg['msgType'] = 'VOICE'
        parsed_msg['mediaId'] = msg['MsgId']
        parsed_msg['voiceLength'] = msg['VoiceLength']
        parsed_msg['downloadFunc'] = self.__voice_download
    elif msg_type == 42:  # card
        parsed_msg['msgType'] = 'CARD'
        content = html.unescape(msg['Content'])  # TODO: delete emoji info
        content = content.replace('<br/>', '\n')
        doc = xml.dom.minidom.parseString(content).documentElement
        parsed_msg['username'] = doc.getAttribute('username')
        parsed_msg['nickname'] = doc.getAttribute('nickname')
        parsed_msg['alias'] = doc.getAttribute('alias')
        parsed_msg['province'] = doc.getAttribute('province')
        parsed_msg['city'] = doc.getAttribute('city')
        parsed_msg['sex'] = doc.getAttribute('sex')
        parsed_msg['regionCode'] = doc.getAttribute('regionCode')
    elif msg_type == 43:  # video
        parsed_msg['msgType'] = 'VIDEO'
        parsed_msg['mediaId'] = msg['MsgId']
        parsed_msg['playLength'] = msg['PlayLength']
        parsed_msg['imgHeight'] = msg['ImgHeight']
        parsed_msg['imgWidth'] = msg['ImgWidth']
        parsed_msg['downloadFunc'] = self.__video_download
    elif msg_type == 47:  # animation
        parsed_msg['msgType'] = 'ANIMATION'
        parsed_msg['imgHeight'] = msg['ImgHeight']
        parsed_msg['imgWidth'] = msg['ImgWidth']
    elif msg_type == 49:  # attachment
        app_msg_type = msg['AppMsgType']
        if app_msg_type == 6:  # file
            parsed_msg['msgType'] = 'FILE'
            parsed_msg['mediaId'] = msg['MediaId']
            parsed_msg['fileName'] = msg['FileName']
            parsed_msg['encryFileName'] = msg['EncryFileName']
            parsed_msg['fileSize'] = msg['FileSize']
            parsed_msg['downloadFunc'] = self.__file_download
    elif msg_type == 10002:  # revoke
        parsed_msg['msgType'] = 'REVOKE'
        ret = re.search('<msgid>(.*?)<', msg['Content'])
        parsed_msg['revokedMsgId'] = ret.group(1) if ret else ''
    return parsed_msg
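Two pieces of the group handling are easy to test in isolation: splitting the "username:<br/>text" prefix and detecting an @-mention. The sketch below uses hypothetical helper names (`split_group_content`, `is_me_at`); the regular expression and the U+2005 separator are the ones used in the code above.

```python
import re

GROUP_PREFIX_RE = re.compile(r'(@[0-9a-z]*?):<br/>(.*)$', re.S)


def split_group_content(content):
    """Return (sender_user_name, text); sender is '' when there is no prefix."""
    match = GROUP_PREFIX_RE.match(content)
    if match is None:
        return '', content
    return match.group(1), match.group(2)


def is_me_at(text, my_nick_name, my_display_name=''):
    """True if text @-mentions the account; the group display name wins if set."""
    name = my_display_name or my_nick_name
    return ('@' + name + '\u2005') in text


print(split_group_content('@31d8eccdefdd22d1:<br/>hello'))
```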
12. __img_download: download a received image
request:
url | https://wx.qq.com/cgi-bin/mmwebwx-bin/webwxgetmsgimg |
---|---|
method | GET |
params | MsgID: 2526013959745980362; skey: @crypt_af16f3b1_e1df9a5ea92377384a3b2907b0ecf084 |
Fetch the image content with a GET request, then write it to a local file:
url = 'https://wx.qq.com/cgi-bin/mmwebwx-bin/webwxgetmsgimg?MsgID=%s&skey=%s' % (msg_id, self.skey)
resp = self.session.get(url, stream=True, headers=self.headers, proxies=self.proxies)
file_name = 'img_' + msg_id + '.jpg'
with open(file_name, 'wb') as fptr:
    fptr.write(resp.content)
13. __voice_download: download a received voice message
request:
url | https://wx.qq.com/cgi-bin/mmwebwx-bin/webwxgetvoice |
---|---|
method | GET |
params | msgID: 621497633495131085; skey: @crypt_af16f3b1_e1df9a5ea92377384a3b2907b0ecf084 |
Fetch the voice content with a GET request, then write it to a local file:
url = 'https://wx.qq.com/cgi-bin/mmwebwx-bin/webwxgetvoice?msgID=%s&skey=%s' % (msg_id, self.skey)
resp = self.session.get(url, stream=True, headers=self.headers, proxies=self.proxies)
file_name = 'voice_' + msg_id + '.mp3'
with open(file_name, 'wb') as fptr:
    fptr.write(resp.content)
14. __video_download: download a received video
request:
url | https://wx.qq.com/cgi-bin/mmwebwx-bin/webwxgetvideo |
---|---|
method | GET |
params | msgID: 6534098630059551744; skey: @crypt_af16f3b1_e1df9a5ea92377384a3b2907b0ecf084 |
Fetch the video content with a GET request, then write it to a local file:
url = 'https://wx.qq.com/cgi-bin/mmwebwx-bin/webwxgetvideo?msgID=%s&skey=%s' % (msg_id, self.skey)
headers = {
    'Range': 'bytes=0-',
    'User-Agent': self.headers['User-Agent']
}
resp = self.session.get(url, stream=True, headers=headers, proxies=self.proxies)
file_name = 'video_' + msg_id + '.mp4'
with open(file_name, 'wb') as fptr:
    fptr.write(resp.content)
15. __file_download: download a received file
request:
url | https://file.wx.qq.com/cgi-bin/mmwebwx-bin/webwxgetmedia |
---|---|
method | GET |
params | sender: @b41046f864104defff92de8b4410bd301f792e1cecf55e1abcc7c607184a58d7 (msg['senderName']); mediaid: crypt_7ffae8d8_63f76c1db9efd9e1fcb412a4bb9188b005dfb172e34b6172da265300f4d09b4f0f18c0aef911afdf6c731a8a7d574698bec55c453e2038fbc44a9c80f6392346e5d3c59f105179215ea1ea860e8a3b9ccd0782f6e7a646a67374ccaa010cd7369f1a896c3f395842675d5e02c7b3890c6f4dc4f88d97b7c5fc5bda15eb0cff5c132174484cc6ee133390b8fec97dbeb8db14d7b296de4621057a347651ec06cf5a0467182061b59814275b29b86fa81631e7b3f0eb2f295ebf9184fa1f429227aba5608bc2b6ed0caed50892861f77fcb5e24adcb67ec28fd0cbe887187b904c (msg['mediaId']); encryfilename: configmanager%2Ejson (msg['encryFileName']); fromuser: 2137425061 (self.uin); pass_ticket: AAwe7cZ4AQF3Wy%2BVRL30nJy8WWePRwl9BWViTSP7gu8pyV61n70B96v4x%2FS8WY3s; webwx_data_ticket: gSdr8RGWron61xkus54F8BE8 |
Fetch the file content with a GET request, then write it to a local file:
resp = self.session.get(url, params=params, stream=True, headers=self.headers, proxies=self.proxies)
with open(msg['fileName'], 'wb') as fptr:
    fptr.write(resp.content)
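The four download functions above request the response with stream=True but then read resp.content, which loads the whole body into memory. A possible alternative, sketched here as an assumption rather than the module's actual behavior, is to write the body chunk by chunk. The helper name `save_chunks` is hypothetical, and the demo feeds it a plain list of byte strings so it runs without a network connection.

```python
import os
import tempfile


def save_chunks(chunks, file_name):
    """Write an iterable of byte chunks to file_name; return bytes written."""
    written = 0
    with open(file_name, 'wb') as fptr:
        for chunk in chunks:
            if chunk:  # skip empty keep-alive chunks
                fptr.write(chunk)
                written += len(chunk)
    return written


demo_path = os.path.join(tempfile.mkdtemp(), 'img_demo.jpg')
size = save_chunks([b'abc', b'', b'def'], demo_path)
print(size)
```

With requests, the chunks could come from `resp.iter_content(chunk_size=8192)` on a response obtained with `stream=True`.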
16. __upload_media: upload media
request:
url | https://file.wx.qq.com/cgi-bin/mmwebwx-bin/webwxuploadmedia |
---|---|
method | POST |
params | f: json |
files | id: WU_FILE_0 (the trailing index increases by 1 for every uploaded file); name: test.mp4; type: video/mp4; lastModifiedDate: Thu Jan 16 14:07:27 2020; size: 959662; chunks: 2 (total number of upload requests, each carrying up to 2^19 bytes); chunk: 0 (index of the current request; when a single request is enough, the chunk/chunks fields are omitted); mediatype: video (one of pic, video, doc); uploadmediarequest: { UploadType: 2, BaseRequest: { DeviceID: e911771485005848, Sid: nEdiHXamlIgS+eRL, Skey: @crypt_af16f3b1_e1df9a5ea92377384a3b2907b0ecf084, Uin: 2137425061 }, ClientMediaId: 1600313075093 (str(get_timestamp()) + str(random.random())[2:6]), TotalLen: 959662, StartPos: 0, DataLen: 959662, MediaType: 4, FromUserName: @b41046f864104defff92de8b4410bd301f792e1cecf55e1abcc7c607184a58d7 (self.account_me['UserName']), ToUserName: filehelper, FileMd5: e29a55fe9213035ad92d8a40c0adc19 }; webwx_data_ticket: gSdr8RGWron61xkus54F8BE8; pass_ticket: AAwe7cZ4AQF3Wy%2BVRL30nJy8WWePRwl9BWViTSP7gu8pyV61n70B96v4x%2FS8WY3s; filename: (binary) ((os.path.basename(file_name), fptr.read(1 << 19), file_type.split('/')[1])) |
response:
BaseResponse: {
ErrMsg: ""
Ret: 0
}
MediaId: "@crypt_xxxxx", # returned only after the last chunk is uploaded; earlier responses return ''
StartPos: 0
Fill the files field of a POST request to upload the media:
files = {
    'id': (None, 'WU_FILE_%s' % str(self.file_index)),
    'name': (None, os.path.basename(file_name)),
    'type': (None, file_type),
    'lastModifiedDate': (None, '%s' % time.ctime(os.path.getmtime(file_name))),
    'size': (None, str(file_len)),
    'mediatype': (None, media_type),
    'uploadmediarequest': (None, json.dumps({
        'UploadType': 2,
        'BaseRequest': self.base_request,
        'ClientMediaId': get_msg_id(),
        'TotalLen': str(file_len),
        'StartPos': 0,
        'DataLen': str(file_len),
        'MediaType': 4,
        'FromUserName': self.account_me['UserName'],
        'ToUserName': to_user_name,
        'FileMd5': md5
    })),
    'webwx_data_ticket': (None, self.session.cookies['webwx_data_ticket']),
    'pass_ticket': (None, self.pass_ticket),
}
chunk_size = 1 << 19  # each request uploads at most 524288 bytes
chunks = max(1, (file_len + chunk_size - 1) // chunk_size)
with open(file_name, 'rb') as fptr:  # the file is closed even if a request fails
    for chunk in range(chunks):
        f_bytes = fptr.read(chunk_size)
        if chunks > 1:  # chunk/chunks fields are only sent for multi-part uploads
            files['chunks'] = (None, str(chunks))
            files['chunk'] = (None, str(chunk))
        files['filename'] = (os.path.basename(file_name), f_bytes, file_type.split('/')[1])
        resp = self.session.post(url, files=files, headers=self.headers, proxies=self.proxies)
dic = json.loads(resp.text)  # only the last response carries the MediaId
self.file_index += 1
return dic['MediaId']
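The chunk arithmetic can be checked on its own. The sketch below computes, for a given file length, which byte range each upload request carries, and shows one way to obtain an MD5 hex digest for the FileMd5 field. The helper names (`plan_chunks`, `file_md5`) are assumptions for illustration; the 2^19 chunk size comes from the request description above.

```python
import hashlib

CHUNK_SIZE = 1 << 19  # 524288 bytes per upload request


def plan_chunks(file_len, chunk_size=CHUNK_SIZE):
    """Return a list of (chunk_index, start_offset, length) tuples."""
    count = max(1, (file_len + chunk_size - 1) // chunk_size)
    plan = []
    for index in range(count):
        start = index * chunk_size
        length = max(0, min(chunk_size, file_len - start))
        plan.append((index, start, length))
    return plan


def file_md5(data):
    """Hex MD5 digest of the file content."""
    return hashlib.md5(data).hexdigest()


# The 959662-byte example file from the request table needs two requests.
print(plan_chunks(959662))
```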
17. send_text: public function, send a text message
request:
url | https://wx.qq.com/cgi-bin/mmwebwx-bin/webwxsendmsg |
---|---|
method | POST |
params | pass_ticket: AAwe7cZ4AQF3Wy%2BVRL30nJy8WWePRwl9BWViTSP7gu8pyV61n70B96v4x%2FS8WY3s |
data | BaseRequest: { DeviceID: e911771485005848, Sid: nEdiHXamlIgS+eRL, Skey: @crypt_af16f3b1_e1df9a5ea92377384a3b2907b0ecf084, Uin: 2137425061 }; Msg: { ClientMsgId: 16000731732860028 (str(int(time.time() * 1000)) + str(random.random())[2:6]), Content: hello world, FromUserName: @b41046f864104defff92de8b4410bd301f792e1cecf55e1abcc7c607184a58d7 (self.account_me['UserName']), LocalID: 16000731732860028 (str(int(time.time() * 1000)) + str(random.random())[2:6]), ToUserName: filehelper, Type: 1 }; Scene: 0 |
response:
BaseResponse: {
ErrMsg: ""
Ret: 0
}
MsgID: "7154066925497929403",
LocalID: "16000731732860028"
Send the message to the contact with a POST request:
self.session.post(url, params=params, data=json.dumps(data, ensure_ascii=False).encode('utf8'), headers=headers, proxies=self.proxies)
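A minimal sketch of building and encoding the request body follows. The helper names (`make_msg_id`, `build_text_payload`, `encode_payload`) are assumptions for illustration. The message id here uses a zero-padded random integer instead of slicing `str(random.random())`, which is a deliberate substitution so the suffix is always four digits.

```python
import json
import random
import time


def make_msg_id():
    """Millisecond timestamp followed by four random digits."""
    return str(int(time.time() * 1000)) + '%04d' % random.randint(0, 9999)


def build_text_payload(base_request, from_user, to_user, content):
    """Body of a webwxsendmsg request for a text message (Type 1)."""
    msg_id = make_msg_id()
    return {
        'BaseRequest': base_request,
        'Msg': {
            'ClientMsgId': msg_id,
            'Content': content,
            'FromUserName': from_user,
            'LocalID': msg_id,
            'ToUserName': to_user,
            'Type': 1,
        },
        'Scene': 0,
    }


def encode_payload(payload):
    """Serialize without ASCII escaping so non-ASCII text is sent as UTF-8."""
    return json.dumps(payload, ensure_ascii=False).encode('utf8')


payload = build_text_payload({'Uin': '1'}, '@me', 'filehelper', '你好')
body = encode_payload(payload)
print(payload['Msg']['ToUserName'])
```

Passing `ensure_ascii=False` keeps the characters as UTF-8 bytes in the body instead of `\uXXXX` escapes.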
18. send_image: public function, send an image message
request:
url | https://wx.qq.com/cgi-bin/mmwebwx-bin/webwxsendmsgimg |
---|---|
method | POST |
params | func: async; f: json; lang: en_US; pass_ticket: AAwe7cZ4AQF3Wy%2BVRL30nJy8WWePRwl9BWViTSP7gu8pyV61n70B96v4x%2FS8WY3s |
data | BaseRequest: { DeviceID: e911771485005848, Sid: nEdiHXamlIgS+eRL, Skey: @crypt_af16f3b1_e1df9a5ea92377384a3b2907b0ecf084, Uin: 2137425061 }; Msg: { ClientMsgId: 16000731732860028 (str(int(time.time() * 1000)) + str(random.random())[2:6]), Content: '', MediaId: @crypt_xxxxx (returned by self.__upload_media()), FromUserName: @b41046f864104defff92de8b4410bd301f792e1cecf55e1abcc7c607184a58d7 (self.account_me['UserName']), LocalID: 16000731732860028 (str(int(time.time() * 1000)) + str(random.random())[2:6]), ToUserName: filehelper, Type: 3 }; Scene: 0 |
response:
BaseResponse: {
ErrMsg: ""
Ret: 0
}
MsgID: "7154066925497929403",
LocalID: "16000731732860028"
Send the message to the contact with a POST request:
self.session.post(url, params=params, data=json.dumps(data, ensure_ascii=False).encode('utf8'), headers=headers, proxies=self.proxies)
19. send_video: public function, send a video message
request:
url | https://wx.qq.com/cgi-bin/mmwebwx-bin/webwxsendvideomsg |
---|---|
method | POST |
params | func: async; f: json; lang: en_US; pass_ticket: AAwe7cZ4AQF3Wy%2BVRL30nJy8WWePRwl9BWViTSP7gu8pyV61n70B96v4x%2FS8WY3s |
data | BaseRequest: { DeviceID: e911771485005848, Sid: nEdiHXamlIgS+eRL, Skey: @crypt_af16f3b1_e1df9a5ea92377384a3b2907b0ecf084, Uin: 2137425061 }; Msg: { ClientMsgId: 16000731732860028 (str(int(time.time() * 1000)) + str(random.random())[2:6]), Content: '', MediaId: @crypt_xxxxx (returned by self.__upload_media()), FromUserName: @b41046f864104defff92de8b4410bd301f792e1cecf55e1abcc7c607184a58d7 (self.account_me['UserName']), LocalID: 16000731732860028 (str(int(time.time() * 1000)) + str(random.random())[2:6]), ToUserName: filehelper, Type: 43 }; Scene: 0 |
response:
BaseResponse: {
ErrMsg: ""
Ret: 0
}
MsgID: "7154066925497929403",
LocalID: "16000731732860028"
Send the message to the contact with a POST request:
self.session.post(url, params=params, data=json.dumps(data, ensure_ascii=False).encode('utf8'), headers=headers, proxies=self.proxies)
20. send_file: public function, send a file message
request:
url | https://wx.qq.com/cgi-bin/mmwebwx-bin/webwxsendappmsg |
---|---|
method | POST |
params | func: async; f: json; lang: en_US; pass_ticket: AAwe7cZ4AQF3Wy%2BVRL30nJy8WWePRwl9BWViTSP7gu8pyV61n70B96v4x%2FS8WY3s |
data | BaseRequest: { DeviceID: e911771485005848, Sid: nEdiHXamlIgS+eRL, Skey: @crypt_af16f3b1_e1df9a5ea92377384a3b2907b0ecf084, Uin: 2137425061 }; Msg: { ClientMsgId: 16000731732860028 (str(int(time.time() * 1000)) + str(random.random())[2:6]), Content: xxx (XML built from the file name, file size and mediaId), FromUserName: @b41046f864104defff92de8b4410bd301f792e1cecf55e1abcc7c607184a58d7 (self.account_me['UserName']), LocalID: 16000731732860028 (str(int(time.time() * 1000)) + str(random.random())[2:6]), ToUserName: filehelper, Type: 6 }; Scene: 0 |
response:
BaseResponse: {
ErrMsg: ""
Ret: 0
}
MsgID: "7154066925497929403",
LocalID: "16000731732860028"
Send the message to the contact with a POST request:
self.session.post(url, params=params, data=json.dumps(data, ensure_ascii=False).encode('utf8'), headers=headers, proxies=self.proxies)
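21. login: public function, QR-code login or automatic login from cache
The function takes one parameter, enable_relogin, which selects the login mode:
If it is True, the cached state saved by the previous login is loaded first. If the cache is still valid, no scan is needed and messages can be sent and received right away. If the cache has expired, or there is no cache because this is the first login, the QR-code login flow starts.
If it is False, the cache is ignored and the QR-code login flow starts directly.
The default is True, meaning automatic login from the cache.
def login(self, enable_relogin=True):
    if enable_relogin == False or self.__load_pickle() == False:
        self.__get_uuid()
        self.__gen_qrcode()
        self.__login()
        self.__get_params()
        self.__initinate()
        self.__status_notify()
        self.__get_contact()
        self.__get_group_members()
        if enable_relogin:
            self.__save_pickle()
    print('login success')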
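The bodies of __load_pickle and __save_pickle are not shown in this article. As a rough sketch of how such a cache could work, assuming the session state fits in a plain dictionary, the two hypothetical helpers below save and restore it with the standard pickle module. A real implementation would also need to verify that the restored session is still accepted by the server.

```python
import os
import pickle
import tempfile


def save_state(path, state):
    """Persist the login state (a plain dict in this sketch) to disk."""
    with open(path, 'wb') as fptr:
        pickle.dump(state, fptr)


def load_state(path):
    """Return the cached state, or None if it is missing or unreadable."""
    try:
        with open(path, 'rb') as fptr:
            return pickle.load(fptr)
    except (OSError, EOFError, pickle.UnpicklingError):
        return None


cache_path = os.path.join(tempfile.mkdtemp(), 'webwx.pkl')
save_state(cache_path, {'skey': '@crypt_x', 'sync_key_str': '1_2'})
restored = load_state(cache_path)
print(restored['skey'])
```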
22. run: public function, receive and process messages in a loop
The loop repeats the following steps:
- Call __sync_check to check whether new messages have arrived
- Call __webwx_sync to fetch the new messages
- Call __parse_msg to parse each received message
- Check whether the message is a control command sent by the logged-in account itself
  - If the content is "enable", the following messages are passed to __process_msg
  - If the content is "disable", the following messages are not processed
  - If the content is "logout", log out
- If the phone logs out of Web WeChat, exit the loop
def run(self):
    handle_enable = True
    while True:
        retcode, selector = self.__sync_check()
        if retcode == '0':
            if selector == '2':  # recv new msg
                msg_list = self.__webwx_sync()
                for msg in msg_list:
                    parsed_msg = self.__parse_msg(msg)
                    if parsed_msg['senderType'] == 'MYSELF' and parsed_msg['msgType'] == 'TEXT':
                        if parsed_msg['content'] == 'enable':
                            handle_enable = True
                            print('enable msg handle')
                            continue
                        elif parsed_msg['content'] == 'disable':
                            handle_enable = False
                            print('disable msg handle')
                            continue
                        elif parsed_msg['content'] == 'logout':
                            self.logout()
                            return
                    if handle_enable:
                        self.__handle_msg(parsed_msg)
        elif retcode == '1101':  # logout by phone
            self.__delete_pickle()
            print('logout success')
            return
        else:
            print('unsupported retcode:%s' % retcode)
        time.sleep(1)
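The control-command handling inside the loop is a small state machine that can be tested without a network connection. The sketch below restates it as a pure function; the name `control_action` and the action labels are assumptions for illustration.

```python
def control_action(parsed_msg, handle_enable):
    """Return (new_handle_enable, action).

    action is 'control' for enable/disable commands, 'logout' for the
    logout command, 'handle' when the message should be processed and
    'skip' when processing is currently disabled.
    """
    if parsed_msg['senderType'] == 'MYSELF' and parsed_msg['msgType'] == 'TEXT':
        content = parsed_msg['content']
        if content == 'enable':
            return True, 'control'
        if content == 'disable':
            return False, 'control'
        if content == 'logout':
            return handle_enable, 'logout'
    return handle_enable, ('handle' if handle_enable else 'skip')


me = {'senderType': 'MYSELF', 'msgType': 'TEXT', 'content': 'disable'}
friend = {'senderType': 'CONTACT', 'msgType': 'TEXT', 'content': 'disable'}
print(control_action(me, True), control_action(friend, True))
```

Only text messages from the logged-in account count as commands, so a contact sending "disable" is handled like any other message.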
23. register_msg_handle: public function, register a custom message handler
__process_msg is the default message handler and only prints the parsed message. A custom handler can be registered with register_msg_handle to replace it:
import webwx

def msg_handle(self, msg):
    if msg['msgType'] == 'TEXT' and msg['senderType'] == 'CONTACT':
        if msg['contactRemarkName']:
            print(msg['contactRemarkName'])
        else:
            print(msg['contactNickName'])
        print(' ' + msg['content'])

weChat = webwx.webwx()
weChat.register_msg_handle(msg_handle)
weChat.login()
weChat.run()
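How register_msg_handle stores and calls the handler is not shown in this article. One possible mechanism, sketched here as an assumption, keeps the handler as a plain function on the instance and passes the client explicitly as the first argument, which matches the `msg_handle(self, msg)` signature used above. The class `MiniClient` and its `dispatch` method are hypothetical.

```python
class MiniClient:
    """Hypothetical stand-in for the webwx class, handler plumbing only."""

    def __init__(self):
        # Default handler: a plain function taken from the class, not bound.
        self._handler = MiniClient._print_msg

    def _print_msg(self, msg):
        print(msg)

    def register_msg_handle(self, func):
        """Replace the default handler with func(client, msg)."""
        self._handler = func

    def dispatch(self, msg):
        # Instance attributes are not bound, so pass self explicitly.
        self._handler(self, msg)


seen = []


def msg_handle(self, msg):
    seen.append(msg['content'])


client = MiniClient()
client.register_msg_handle(msg_handle)
client.dispatch({'content': 'hi'})
print(seen)
```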
Source code
https://github.com/chenwenhuiGithub/pythonScript/tree/master/webwx
To do
- Optimize the cache so that it loads faster
- Filter emoji content
- Add exception handling