web_day7

This time I still couldn't quite get the DXY (丁香园) scrape to work; I'll try again tomorrow.

import requests,re
from lxml import etree
from bs4 import BeautifulSoup
temp_cookie = ('DXY_USER_GROUP=94; _ga=GA1.2.1871919213.1551584788; __auc=b9e4194916942b5c086bda9844a; _gid=GA1.2.1190762227.1551910319; JUTE_BBS_DATA=a8af2e763e978851dff3a9239d38a02d7c6b0489f02eb861a7a460171664f34a9bd67cde26d1b090ce1a560cb426bf0c0381768dff7b5934a2750aee4f8a1437f0d2ab09f9ed0fb52453cafec73b6cae; JUTE_SESSION_ID=82c3ed3c-4a92-4126-a41e-02835aaceb23; JUTE_TOKEN=ad09358c-8d63-436a-b05c-243161d7ee9b; __utmc=3004402; __utmz=3004402.1551910345.8.4.utmcsr=open.weixin.qq.com|utmccn=(referral)|utmcmd=referral|utmcct=/; JSESSIONID=472407EBA68314DCB616347322B31BE0; __utma=3004402.1796184836.1551584668.1551910345.1551926410.9; __utmt=1; __utmb=3004402.2.10.1551926410; JUTE_SESSION=283af6700f44e65fa077bf34acf8b1babce45dc800dc1578a6a3f2c69ad1ac6161aeee98c80130e62e3363d493f47ea09dfb7e97f73cd0746528b054defce3757d78105b17910ddd')
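# Parse the raw Cookie header string into a dict that requests can use.
Cookie = {}
for i in temp_cookie.split(';'):
    # Split on the first '=' only: some values (e.g. __utmz) contain '=' themselves.
    key, _, value = i.strip().partition('=')
    Cookie[key] = value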
headers = {
    'User-Agent':'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36',
    'Connection':'keep-alive',
    'Referer':'http://3g.dxy.cn/bbs/topic/509959'
}

url = 'http://3g.dxy.cn/bbs/topic/509959'
r = requests.get(url=url, headers=headers, cookies=Cookie, timeout=10)
# Print the raw HTML to check whether the topic content actually came back.
print(r.text)
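The script imports BeautifulSoup but stops at printing the raw HTML. As a rough sketch of a possible next step, the snippet below pulls author/content pairs out of a page. It runs on a small stand-in HTML string rather than the real response, and the class names `reply`, `author`, and `content` are hypothetical placeholders, not the actual DXY markup; they would need to be replaced after inspecting `r.text`.

```python
from bs4 import BeautifulSoup

# Stand-in HTML for illustration only. The real DXY page structure is not
# known here, so these class names are assumptions.
SAMPLE_HTML = """
<div class="reply"><span class="author">user_a</span><p class="content">First reply</p></div>
<div class="reply"><span class="author">user_b</span><p class="content"> Second reply </p></div>
"""


def extract_replies(html):
    """Return a list of (author, content) tuples found in the given HTML."""
    soup = BeautifulSoup(html, "html.parser")
    replies = []
    for node in soup.select("div.reply"):
        author = node.select_one(".author")
        content = node.select_one(".content")
        # Skip blocks that lack either field instead of raising an error.
        if author is None or content is None:
            continue
        replies.append((author.get_text(strip=True), content.get_text(strip=True)))
    return replies


replies = extract_replies(SAMPLE_HTML)
for author, content in replies:
    print(author, content)
```

If `extract_replies(r.text)` comes back empty on the real page, that would suggest either the selectors do not match or the server did not return the topic content at all (for example, because the cookies have expired).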