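Today I ran into a rather annoying problem. While scraping a website, the cookie returned by the login was different from the cookie sent with the data requests I saw in a packet capture. The captured cookie had one extra parameter, and I spent a long time looking for its source without finding it.
URL: https://i.keking.cn/user_index.html
The cookie returned by the login looks like this: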
acw_tc=2f624a7115548746919093682e53ca410b002b05e6d61724dbcfaaa50d7b58; UM_distinctid=16a05ca88f1231-051276fcfa61fb-7a1437-100200-16a05ca88f214a; companyName=%E6%B7%B1%E5%9C%B3%E9%AA%90%E7%BF%94%E7%89%A9%E6%B5%81%E5%86%87%E9%99%90%E8%B4%A3%E4%BB%BB%E5%85%AC%E5%8F%B8; token=eyJlbmNyeXB0ZWREYXRhIjoiWHY2YU1JZjZPTEhqT3pOeFJmZzBERlNCbVVWUCtuUTFjc3BSd0E5bGtWNXUyWnhrV2ZJdFUzeGxZU3F2Y0Z1Z2NWR1BIQlNVVTY2WkRqc1lRUnhENUI1dnZDUUU0MmdQN1hjM2pwNUNXSk9rWTNQQ1JsTjRGclVqZ3g4K1VYTDN0MW13KzMwLzkySERkNFBqalVDc1lwejJpcGg4MlZHMElGcHQyM05OQ1JJPSIsIndyYXBwZWRLZXkiOiJtRTIvcTlRb2RmUUxNRi85UEFIc3NsNVJoNEJ3aE95Y3RXUkVhYVhTU3VldW1ZZTlWTk5TZk80cDBSS1FPLzNaQi9PbVBQRnNONHNGWFNlZms1SmFkMkxZSmkyNVphdWRXOWVJYlhyNElTbWdScWtDZVdDcHZmdzJiTzJCMHc3MldFZFk3TkF2YWFMOE0xOXJxTFI3VlRwVVpUVVVyc0FuR0JCam9ZZ3Y1Q0k9In0=;
The cookie on the captured data request looks like this:
Cookie: acw_tc=2f624a7115548746919093682e53ca410b002b05e6d61724dbcfaaa50d7b58; UM_distinctid=16a05ca88f1231-051276fcfa61fb-7a1437-100200-16a05ca88f214a; companyName=%E6%B7%B1%E5%9C%B3%E9%AA%90%E7%BF%94%E7%89%A9%E6%B5%81%E5%86%87%E9%99%90%E8%B4%A3%E4%BB%BB%E5%85%AC%E5%8F%B8; token=eyJlbmNyeXB0ZWREYXRhIjoiWHY2YU1JZjZPTEhqT3pOeFJmZzBERlNCbVVWUCtuUTFjc3BSd0E5bGtWNXUyWnhrV2ZJdFUzeGxZU3F2Y0Z1Z2NWR1BIQlNVVTY2WkRqc1lRUnhENUI1dnZDUUU0MmdQN1hjM2pwNUNXSk9rWTNQQ1JsTjRGclVqZ3g4K1VYTDN0MW13KzMwLzkySERkNFBqalVDc1lwejJpcGg4MlZHMElGcHQyM05OQ1JJPSIsIndyYXBwZWRLZXkiOiJtRTIvcTlRb2RmUUxNRi85UEFIc3NsNVJoNEJ3aE95Y3RXUkVhYVhTU3VldW1ZZTlWTk5TZk80cDBSS1FPLzNaQi9PbVBQRnNONHNGWFNlZms1SmFkMkxZSmkyNVphdWRXOWVJYlhyNElTbWdScWtDZVdDcHZmdzJiTzJCMHc3MldFZFk3TkF2YWFMOE0xOXJxTFI3VlRwVVpUVVVyc0FuR0JCam9ZZ3Y1Q0k9In0=; tmsToken="eyJlbmNyeXB0ZWREYXRhIjoid3lYNWRhc0dEeGMxTEpYdEFRMlNWYlAxVFVFNldySlI2L1pLdkJ5Zjc4Yms3RCs2cEMyWTF3UHdqK1E1L2YrMXpyVDQxdGRNbXBVU0R3dmR5TjV1VkZzaXRXbkhkM0hERVhBT1RFa3BIaU1QOW16bDRJeGIwUk94N1h2WnB6WDZJMDFHTDhUN0IwbFNHNU9lM2h4YVNxaUd1UVlxZjVySEI1Z3p4RTBQOHVpcGZidk1uMzJYY0E5dUsxenZwdHUvRFhNeSsrWlcvMU5ZUEtxblJwRHpmc251TkNmK0pCVzVwdlppd1FRVGdBL3hnY1dJei9GUkNJUVZuL0R5bCtZSE5zTlpBb2VuVzhFZjBMVVhYODBnUzdTWWRRU3FIN2xubjBDR2I4dlJ2YVRTdElyNGsvajVRaGpPR2dLSVlqVGdkNWU3bnNyYms3S3dINmlJdVN6UnlKZ3YxV3Bqemp6bzJTVXV3S3NVaUhVM0l5M3BHU0FoS0g0VWk5eWJoTk1XaVlqWUJqRWxadmJCYkJxM0IwQ3hGTzUwNCtxRjFOTjNSWFdMUUk4Ni9NVmt3c1UrcnpJQlI0ZFFwZ1BxZXNkMG5mRmpFamFDekxJSk9reU5YZEZBaC91aS9tYXRFUk50NFFGQnQzNnNmTTlSSjFFeE1nSXdia3V4dUNvdngrMnNRd21KbW9LM0F2RE41bHA3a0FraGZSVGg0bnNSMklBMmR6TzU1T0JTbHM3T2k3c01ZRit5SS9aMXBCVTg0bm1LT21oL083U0tBc1RJSUVVSDlieHN2R296Q0xHaEJsUjlWcWRzUHJLQ2JsUkRmK2ViS0xzV3NwTVlON3hyOFlhWVVCWmtiSUlpaFM5aEpYYTh3cHJwVWxFUWpHQXlKWndCUTZ0bTFwNGI5d1plVkxrYitGUzlPVVhXUjA0emVYb0k0OXNxU1oxWVE1L0NVditYU00wV3VCZHlzTHBqV09IazhrekRYd05ac2pTZEk3OUNsaHZFNUpJTDh5WkhTMEtOdnEyWUI2T3BvQkFHbENiQ3JJandIeUpJdHhFWXdLV1Z3MzNPUTkyMGY3TUU3Z3JEdWsxeTBKZHkrTk5FWG1PcFZyUTRQbnVaZ1BUd3IxRVVNWmZiYWtGSnNnais4c1E1R0NpSTRabHNtbDE0MVBPcTl
1cmJCWG9RNlVLaFJXenZUZ3dnR2VZQk1NZGNpcXU4UFlFNTRDVGlDUzlONlRpZEwwcmJkRzFXdUIwWUFPZzIwc0E3TzRqeWVtMmFkUkN2dUx4M0tUVzFVRWdZOTFYNWZMTyt6UWxEdFRUZmc0Ri81YlNVRzNIWXRZanhkNTM5Vk9jZTBIalZROWRxQjR5d1dUeWVRTFB5NWdMcUZ5RnF3aWg1VW5QU1JHQWorSEgwaHlDeHFUQW5SN20rRnhRYkI1dHNIZDRoSzJvZDZxVXZLeE5LVmdJNS9HV3E2NGZpSFkya0w4VXY5TE5Yd1p2bS9VZmdQaVJhUU9qV3VEZUdxdm81WEFPV1l6NHZtQ25GeWpaQWd2VFJHYnF2bWlhZ2F5bGVodGF5SURIazQ5SWpSWmphL2ViUExqc2lwTHpGR0lmR2E3NTAwYWhsdlN0MGlyV1UvcnBTOUsydzlTR1M1RkdCd2V3S25mdTM4THBoSGY3bGpmcU5ZUVVvUVdvdzk5b1NzdVdjcWt5NkNjTG5TcXFoTTRMSnJlNDJGV0NDeUV4MDNSd2JHUUVKSCt4ZENqZlc0WHpHUlByVlRuWWkyenNDUmpKYVlGRk92T0R3Nm1naWpoQktUYUNoTUxSME9EV1NNU2d5NFlTd3NCY3o1dDRjS1ljWXE4RCs3dGIyOUVSZW1GRzhvbVF2M285TEFqRjMvbEw3ZHlPeVdlOFQ0RGg2WUFTZWxsTUtyNE1GUmYzYmR6N0Jicm0vSm5WS1ZYcFVGaG5FTlpkUXN3ZFFVZ0xwUm50LzFEaVV3ZWxDWVhMRFNNQWR1cUZkaWdzUFRuNkM1Zy9CWUlBZStMVjgxT05BYUE2SjlVZTVLSTBqdVVydXVYajVqRXBFQlhzNERpN1E5QT09Iiwid3JhcHBlZEtleSI6ImNYcTB5STZkaU1pUWQrL1NJdUxSNXVucUFRT0tLUTJBTVR1QThDaGRaYTV2NjRjWXVJbkZPUGVnZ29OMHgzZ3I0bysyV0RKR2YrY1VIYmlMN0xORDFnPT0ifQ=="
The captured cookie has one extra parameter, tmsToken. Where does it come from?
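Spotting the extra parameter by eye is painful with values this long, so here is a minimal sketch of diffing the two cookie strings by name. The helper names `parse_cookie_header` and `extra_cookies` are my own, and the short placeholder values stand in for the real cookies above.

```python
def parse_cookie_header(header):
    """Split a Cookie header value into a {name: value} dict."""
    # Accept the raw captured line, which starts with "Cookie:".
    if header.lower().startswith("cookie:"):
        header = header.split(":", 1)[1]
    cookies = {}
    for part in header.split(";"):
        name, sep, value = part.strip().partition("=")
        if sep:  # skip empty pieces such as the one after a trailing ";"
            cookies[name] = value.strip('"')
    return cookies


def extra_cookies(login_header, captured_header):
    """Return the cookie names present in the capture but not after login."""
    login = parse_cookie_header(login_header)
    captured = parse_cookie_header(captured_header)
    return sorted(set(captured) - set(login))


# Placeholder values; paste the two real cookie strings here instead.
login_cookie = "acw_tc=aaa; companyName=bbb; token=ccc;"
captured_cookie = 'Cookie: acw_tc=aaa; companyName=bbb; token=ccc; tmsToken="ddd"'
print(extra_cookies(login_cookie, captured_cookie))
```

With the real strings, the only name reported should be tmsToken.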
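The plan was to find the response that sets this cookie (the one with a Set-Cookie header) and see how the value is produced. At first I assumed it was generated by JavaScript encryption, but after a long search I found no sign of that.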
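If the capture tool can export a HAR file, the search for the Set-Cookie response can be scripted instead of clicked through. This is only a sketch under that assumption: `find_set_cookie` is a hypothetical helper, and the example.com URLs in the sample data are made up, not the site's real endpoints.

```python
def find_set_cookie(har, cookie_name):
    """Return the URLs whose responses set a cookie with the given name."""
    hits = []
    for entry in har["log"]["entries"]:
        for header in entry["response"]["headers"]:
            if header["name"].lower() != "set-cookie":
                continue
            # Some tools merge several Set-Cookie headers into one value,
            # separated by newlines, so check each line on its own.
            for line in header["value"].split("\n"):
                if line.split("=", 1)[0].strip() == cookie_name:
                    hits.append(entry["request"]["url"])
    return hits


# Tiny hand-written stand-in for a parsed HAR export (hypothetical URLs).
sample_har = {
    "log": {"entries": [
        {"request": {"url": "https://example.com/login"},
         "response": {"headers": [
             {"name": "Set-Cookie", "value": "token=abc; Path=/"}]}},
        {"request": {"url": "https://example.com/step-two"},
         "response": {"headers": [
             {"name": "set-cookie", "value": 'tmsToken="xyz"; Path=/'}]}},
    ]}
}
print(find_set_cookie(sample_har, "tmsToken"))
```

For a real capture, load the exported file with `json.load` and pass the resulting dict in place of `sample_har`.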
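So I went back to the packet capture and kept analyzing the requests. Happily, one of the captured requests turned out to carry the very same cookie:
The URL of that request is: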
/app/approval/verifyOpenState?ignoreLoadingBar=true&userCenterToken=eyJlbmNyeXB0ZWREYXRhIjoiamJXZ0hPZUkrYWtCNW9QWEpCTHpOOHVtZW1oakhKVnRKM09aMjhEQllySlA3RW93OGJuaUtGNloyMnVialBhVE5iV2RsRSthMEx4MytSbDFsWHA1MUUyMDl2dXIwend5RVZFMVdtUEFmYnBQSjU4blFvVGd4VVoyM1hjY0xINXdYb0wyelp3Y3NEblhyRVZDUHV0YlgzUHdqZ2NXT0FoaFlCM3lESnU4eXRJd3JtVnVtaHhtcFZ6M2p4UUVHbVJjUnJ3TDBMaWdKQjA4MG95TE5rYkplNDBOR1dRVlZHS1UvYnpoT3liRlNyV0xTWk9McjlHcUh1c2lscFFCb2YzN3VDbEd2azJUdFAzd0RWZWF1KzQyb1RWdCtYOFlDTk4xMXVEOHZnZm5EVzZiYjkvR0xHTlJEL05NUThKSXlUUXJLVks5STRuenlMV2dBeVd2Q1JGUnFkdWkwMXdPeHd3MjVGMlJJdU5aMkZHQXIyYitXZjVyODZvTEFBQ01jenE1NkhzaWJ2elZ3Z3lrbjk5dEV1SzVkZ09YaWo1bXRkOGhFZ0kwdjE3T0toc295MVJ1dXh2SHVFai9KRlVDZUZqNit3Sk40Q2JZMlhNQzgxclYyMHhMWllPRDZEQ1hnSGh6Zityei9hcHRCYWM9Iiwid3JhcHBlZEtleSI6IlhpcEUrQlhKbGJOdldtRGZDVkRRVEhUTjFBUVMxMHMzT2c0RjlXM05sUEQ3UTh2SXBhRVVkUk1WQ3hqZ1hsWlR5L1RMeUJldUdaL01aSE5YYnlxR1pkUEhFS2RqcENGNE94MW1SNFJQWENlMmFQN3VRT2ppbUFmSE9HaHVPcHF5d2F6UFF1L0N5TWJyL09TcHgyL3JxUGZteUFFeHJlQjJndDBXVEMxbnBMUT0ifQ
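It carries yet another parameter, userCenterToken. So if we request this URL and use a session to keep the cookies across requests, that should be enough, right?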
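The idea of letting a session carry the cookie forward can be tried out without touching the real site. The sketch below uses the standard library's cookie jar as a stand-in for `requests.Session` so that it runs offline. The local server, its `/verify` and `/data` paths, and the `demo-value` cookie are all invented for the demo.

```python
import threading
import urllib.request
from http.cookiejar import CookieJar
from http.server import BaseHTTPRequestHandler, HTTPServer


class DemoHandler(BaseHTTPRequestHandler):
    """Stand-in server: /verify sets a cookie, every path echoes the Cookie header."""

    def do_GET(self):
        self.send_response(200)
        if self.path.startswith("/verify"):
            self.send_header("Set-Cookie", "tmsToken=demo-value; Path=/")
        body = (self.headers.get("Cookie") or "").encode()
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def run_demo():
    """Hit /verify, then /data, and return the Cookie header the server saw."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base = "http://127.0.0.1:{}".format(server.server_address[1])
    # The opener plays the role of the session: it stores Set-Cookie values
    # in the jar and sends them back on later requests.
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({}),  # ignore any proxy settings
        urllib.request.HTTPCookieProcessor(CookieJar()),
    )
    try:
        opener.open(base + "/verify?userCenterToken=abc").read()
        return opener.open(base + "/data").read().decode()
    finally:
        server.shutdown()
        server.server_close()


print(run_demo())
```

The second request sends the cookie that the first response set, which is the behaviour the script at the end of this post relies on from `requests.Session`.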
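The next goal is to find out where userCenterToken comes from.
Continuing to work backwards, I came across this:
Comparing the two, the appToken value here is the same as userCenterToken. The developers simply changed the parameter name, presumably to throw us off.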
https://customer.api.keking.cn/product/getProductOpened/
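The comparison can also be done in code. This sketch pulls appToken out of a productAccessUrl-style link, where the parameter sits inside the URL fragment after `#/transfer?`, and compares it with userCenterToken while ignoring trailing `=` padding, since the script below strips `=` before building the tms URLs. The helper names and the short token are placeholders of mine.

```python
from urllib.parse import urlsplit


def app_token_from_url(url):
    """Extract appToken from a link like http://host/#/transfer?appToken=..."""
    # The parameter lives in the fragment, so split the fragment by hand.
    query = urlsplit(url).fragment.partition("?")[2]
    for pair in query.split("&"):
        name, _, value = pair.partition("=")
        if name == "appToken":
            # Returned as-is: no URL-decoding, so "+" and "/" survive.
            return value
    return None


def same_token(a, b):
    """Compare two base64 tokens while ignoring trailing '=' padding."""
    return a.rstrip("=") == b.rstrip("=")


# Placeholder token; the real values are the long strings shown above.
access_url = "http://cloud.keking.cn/#/transfer?appToken=eyJhIjoxfQ=="
app_token = app_token_from_url(access_url)
print(app_token)
print(same_token(app_token, "eyJhIjoxfQ"))
```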
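Requesting the getProductOpened endpoint would give us userCenterToken, but annoyingly its request headers contain yet another parameter, token. Now what?
Keep digging. Tracing further back, this token turns out to be identical to the token in the cookie returned after login.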
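In other words, the token header can be filled straight from the login cookies. A small sketch of that step, assuming the cookies are available as a plain dict (the `product_headers` helper and the placeholder value are mine):

```python
def product_headers(login_cookies):
    """Build the getProductOpened headers from the cookies returned by login."""
    token = login_cookies.get("token")
    if not token:
        raise ValueError("login cookies contain no 'token' entry")
    return {
        "Referer": "https://i.keking.cn/user_index.html",
        "token": token,  # same value as the token cookie
    }


# Placeholder cookies; after a real login these would come from the session.
headers = product_headers({"acw_tc": "aaa", "token": "placeholder-token"})
print(headers["token"])
```

With requests, `session.cookies.get_dict()` should give a dict that can be passed in directly.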
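And with that, the puzzle is solved.
That is the whole flow: login returns the token cookie, the token header unlocks getProductOpened, that response contains appToken (userCenterToken), and the tms endpoints take userCenterToken while the session keeps the cookies they set.
The code is as follows: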
import re
import time
import json
import traceback

import requests
# The selenium imports are presumably used by the login step, which this post does not show.
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By


class KaiJing():
    def __init__(self, username, password):
        self.username = username
        self.password = password
        self.s = requests.Session()
        self.token = None  # must be filled in by login()

    def login(self):
        # Not included in the original post. It has to log in and store the
        # value of the `token` cookie in self.token.
        raise NotImplementedError('login() is not shown in this post')

    def get_product(self):
        # The token header carries the same value as the token cookie from login.
        headers = {
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36',
            'Referer': 'https://i.keking.cn/user_index.html',
            "token": self.token
        }
        url = 'https://customer.api.keking.cn/product/getProductOpened'
        r = self.s.get(url, headers=headers)
        print(r.text)
        # [1] picks the second match in the response, as in the original script.
        self.productId = re.findall(r'"productId":"(.*?)"', r.text)[1]
        self.corpId = re.findall(r'"corpId":"(.*?)"', r.text)[1]
        # print(self.productId)
        # print(self.corpId)
        # appToken in productAccessUrl is the value later sent as userCenterToken.
        self.userCenterToken = re.findall(
            r'"productAccessUrl":"http://cloud.keking.cn/#/transfer\?appToken=(.*?)"', r.text)[0]

    def get_tms_token(self):
        headers = {
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36',
            'Referer': 'http://cloud.keking.cn/',
            # 'Cookie': 'token={0}'.format(self.token)
        }
        # The token is sent without its '=' padding, matching the captured URL.
        user_center_token = self.userCenterToken.replace('=', '')
        url1 = 'https://tms.api.keking.cn/app/approval/verifyOpenState?ignoreLoadingBar=true&userCenterToken={0}'.format(user_center_token)
        url2 = 'https://tms.api.keking.cn/app/approval/tokenLogin?ignoreLoadingBar=true&userCenterToken={0}'.format(user_center_token)
        # The session stores any cookies these responses set.
        r = self.s.get(url1, headers=headers)
        print(r.text)
        print(r.headers)
        r = self.s.get(url2, headers=headers)
        print(r.text)
        print(r.headers)

    def start_to_pay(self):
        url = 'https://tms.api.keking.cn/api/tms/pay/listDeparturePay?actualPayee=&applyDateFirst=2019-04-01&applyDateLast=2019-04-30&arriveCity=&arriveDistrict=&arriveProvince=&carNo=&carType=&currentPage=1&driverName=&globalCondition=&isCanLoan=&projects=&receiver=&rows=10&searchCondition=&searchContent=&searchMode=global&sendCity=&sendDistrict=&sendProvince=&supplierName='
        headers = {
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36',
            'Referer': 'http://cloud.keking.cn/?v={0}'.format(time.time()*1000),
        }
        res = self.s.get(url, headers=headers)
        print(res.text)


if __name__ == '__main__':
    kj = KaiJing('******', '*****')
    kj.login()
    kj.get_product()
    # kj.get_tms_token()
    # kj.start_to_pay()