Fetching the 驾校一点通 question bank with python3, requests, and multiple processes

Latest recommended article published on 2022-08-16 14:25:28

哎呦喂，别急

Category: python. Tags: python

Article link: https://blog.csdn.net/qq_42259275/article/details/111943894

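Use the browser's developer tools to find the URL that serves the questions;
Analyze that URL: the index parameter is the question id, so construct the random number (the r parameter in the sample URL in the script) and generate every URL with range or by filling down a column in Excel;
Analyze the fields of the returned JSON data.
I split the work into two scripts, one that fetches the data and one that converts it to Excel; this post mainly covers fetching the data with multiple processes.
Development environment: python3.9.1 / windows10 / vscode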
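The URL-generation step can be sketched roughly as follows. This is only an illustration, not the author's script: the helper name `build_url_lines` and the id range are hypothetical, and treating `r` as a plain random float is an assumption based on the sample URL. The lines are produced without a scheme because the fetch script prepends `http://` itself.

```python
import random


def build_url_lines(start, stop, host="mnks.jxedt.com"):
    """Return one scheme-less URL per question id in range(start, stop)."""
    lines = []
    for index in range(start, stop):
        # Assumption: r is just a random float, as in the sample URL.
        r = random.random()
        lines.append("%s/get_question?r=%s&index=%d" % (host, r, index))
    return lines


if __name__ == "__main__":
    # The id range 1..3 is a placeholder; the real upper bound is not given here.
    for line in build_url_lines(1, 4):
        print(line)
```

Writing these lines to a text file, one per line, gives the kind of input that the fetch script reads from `ip.txt`.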

# coding: utf-8
import json

import requests
from concurrent.futures import ProcessPoolExecutor

# Example of a question URL:
# url = 'http://mnks.jxedt.com/get_question?r=0.5376675619396274&index=3'
# ip.txt holds one URL per line without the leading 'http://',
# because get_page() adds the scheme itself.
urls_list = []
with open('D:/YYFX/ip.txt', 'r') as f:
    for line in f:
        line = line.strip()
        if line:
            urls_list.append(line)

# Header that imitates a browser
hea = {'User-Agent':'Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36'}


def get_page(url):
    # Runs in a worker process: download one question and return the body as text.
    # verify=False only matters if the request is redirected to HTTPS.
    response = requests.get('http://%s' % url, headers=hea, timeout=30, verify=False)
    # Convert the bytes to a string
    return response.content.decode('utf-8')


def read_data(future, *args, **kwargs):
    # Runs in the main process each time a download finishes
    response = future.result()
    state = json.loads(response)
    print(state)
    # The with block closes the file, so no explicit f.close() is needed
    with open('%s.json' % 'data', 'a', encoding='utf-8') as f:
        # ensure_ascii=False keeps the Chinese text readable in the saved JSON
        f.write(json.dumps(state, ensure_ascii=False) + '\n')


def main():
    # Pool of 20 worker processes; leaving the with block waits for all tasks
    with ProcessPoolExecutor(20) as pool:
        for url in urls_list:
            done = pool.submit(get_page, url)
            done.add_done_callback(read_data)


if __name__ == '__main__':
    main()
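The second script (the conversion to Excel) is not shown in this post. A minimal sketch of that step, under stated assumptions, might look like the following. It writes a CSV file that Excel can open instead of a real .xlsx workbook, so it needs only the standard library. The function name `jsonl_to_csv` is hypothetical, and it assumes each line of `data.json` is a flat JSON object; nested values would be written as their Python string form.

```python
import csv
import json


def jsonl_to_csv(jsonl_path, csv_path):
    """Convert a file with one JSON object per line to CSV; return the row count."""
    records = []
    with open(jsonl_path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                records.append(json.loads(line))

    # Column names: the union of all keys, in first-seen order
    fieldnames = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)

    # utf-8-sig writes a BOM so that Excel detects the encoding
    with open(csv_path, "w", encoding="utf-8-sig", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(records)
    return len(records)
```

Calling `jsonl_to_csv('data.json', 'data.csv')` after the fetch script has finished would produce one row per question, with empty cells where a record lacks a field.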
