How to Simulate a Browser in Python: Python Crawler (2), Setting a Proxy Server and Simulating a Browser

add_header(): adds a header to the request

Example:

from urllib import request as sa

url = 'https://blog.csdn.net/dQCFKyQDXYm3F8rB0/article/details/84302896'
r = sa.Request(url)
# Present the request as coming from a desktop browser instead of Python-urllib
r.add_header('User-Agent','Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.26 Safari/537.36 Core/1.63.6776.400 QQBrowser/10.3.2601.400')
d = sa.urlopen(r).read()  # response body as bytes
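The same header can also be supplied when the Request is constructed, through its headers argument. The sketch below is a minimal illustration: the helper name build_browser_request and the shortened User-Agent string are assumptions made for this example and do not come from the original post. It only builds the request and does not touch the network.

```python
from urllib import request

# Hypothetical, shortened User-Agent string used only for this illustration
BROWSER_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/63.0 Safari/537.36"
)


def build_browser_request(url, user_agent=BROWSER_UA):
    """Return a Request that carries a browser-like User-Agent header."""
    # Equivalent to creating the Request and then calling add_header()
    return request.Request(url, headers={"User-Agent": user_agent})


req = build_browser_request("https://blog.csdn.net/")
# urllib stores header names capitalized, so the key is "User-agent"
print(req.get_header("User-agent"))
```

Passing the headers up front keeps the request setup in one expression, which can be convenient when several headers are needed.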

Sending POST data: urlencode() assembles the form fields into a query string, and encode() converts that string to bytes

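Example:

from urllib import request as sa
from urllib import parse as sp

url = 'http://www.iqianyue.com/mypost/'
# urlencode() builds 'name=111&pass=222'; encode() turns it into the bytes Request expects
p = sp.urlencode({
    'name':111,
    'pass':222,
}).encode('utf-8')
r = sa.Request(url,p)  # supplying data makes this a POST request
d = sa.urlopen(r).read()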
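To see what urlencode() and encode() produce before anything is sent, the request can be built and inspected without calling urlopen(). This is a minimal sketch; the helper name build_post_request is an assumption for illustration.

```python
from urllib import parse, request


def build_post_request(url, fields):
    """Encode a dict of form fields and attach it to a Request as POST data."""
    query = parse.urlencode(fields)  # e.g. 'name=111&pass=222'
    body = query.encode("utf-8")     # Request requires bytes, not str
    return request.Request(url, data=body)


req = build_post_request("http://www.iqianyue.com/mypost/", {"name": 111, "pass": 222})
print(req.get_method())  # POST, because the request carries data
print(req.data)          # b'name=111&pass=222'
```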

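List of proxy server addresses: http://yum.iqianyue.com/proxy

Crawling a website through a proxy server

ProxyHandler(): sets the proxy server to use for each URL scheme

build_opener(): creates an opener from the given handlers

install_opener(): installs the opener as the global default, so that later urlopen() calls use it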

Example:

from urllib import request as sa

def up(p,url):
    # Route http:// requests through the proxy at address p ('host:port')
    pr = sa.ProxyHandler({'http':p})
    op = sa.build_opener(pr,sa.HTTPHandler)
    sa.install_opener(op)  # later urlopen() calls now go through this opener
    da = sa.urlopen(url).read().decode('utf-8')
    return da

p = '219.234.5.128:3128'
url = 'http://www.baidu.com'
da = up(p,url)
print(da)
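install_opener() changes the behavior of every later urlopen() call in the process. When only some requests should go through the proxy, the opener can instead be used directly through its open() method. The following is a sketch under that assumption: the function names make_proxy_opener and fetch_via_proxy are hypothetical, the 10-second timeout is an arbitrary choice, and the proxy address from the example above may no longer be reachable.

```python
from urllib import error, request


def make_proxy_opener(proxy):
    """Build an opener that sends http:// requests through proxy ('host:port')."""
    handler = request.ProxyHandler({"http": proxy})
    return request.build_opener(handler)


def fetch_via_proxy(opener, url, timeout=10):
    """Fetch url with the given opener; return the decoded body, or None on failure."""
    try:
        with opener.open(url, timeout=timeout) as resp:
            return resp.read().decode("utf-8")
    except (error.URLError, OSError) as exc:
        # Free proxies often go offline, so report the problem instead of crashing
        print("request failed:", exc)
        return None


opener = make_proxy_opener("219.234.5.128:3128")
# text = fetch_via_proxy(opener, "http://www.baidu.com")  # uncomment to send the request
```

Because nothing is installed globally, plain urlopen() calls elsewhere in the program keep their default behavior.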

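DebugLog settings

HTTPHandler(debuglevel=1): turns on the debug log for HTTP requests

HTTPSHandler(debuglevel=1): turns on the debug log for HTTPS requests

build_opener(): creates an opener that uses the settings of the HTTPHandler and HTTPSHandler passed to it

install_opener(): installs that opener as the global default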

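Example:

from urllib import request as sa

ht = sa.HTTPHandler(debuglevel=1)
hs = sa.HTTPSHandler(debuglevel=1)
op = sa.build_opener(ht,hs)
sa.install_opener(op)
da = sa.urlopen("http://edu.51cto.com")  # the debug log is printed while the request runs
print(da)  # prints the response object; call da.read() to get the body

The request prints the following debug log: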

send: b'GET / HTTP/1.1\r\nAccept-Encoding: identity\r\nHost: edu.51cto.com\r\nUser-Agent: Python-urllib/3.7\r\nConnection: close\r\n\r\n'

reply: 'HTTP/1.1 200 OK\r\n'

header: Date: Tue, 27 Nov 2018 03:30:42 GMT

header: Content-Type: text/html; charset=UTF-8

header: Transfer-Encoding: chunked

header: Connection: close

header: Set-Cookie: acw_tc=276aedef15432894423986507e64d81a2f2aba60d34ca9de13e960bac343d2;path=/;HttpOnly;Max-Age=2678401

header: Server: nginx

header: Vary: Accept-Encoding

header: Vary: Accept-Encoding

header: X-Powered-By: PHP/7.1.9

header: Set-Cookie: acw_tc=276aedef15432894423986507e64d81a2f2aba60d34ca9de13e960bac343d2;path=/;HttpOnly;Max-Age=2678401

header: Set-Cookie: acw_tc=276aedef15432894423986507e64d81a2f2aba60d34ca9de13e960bac343d2;path=/;HttpOnly;Max-Age=2678401

header: Set-Cookie: acw_tc=276aedef15432894423986507e64d81a2f2aba60d34ca9de13e960bac343d2;path=/;HttpOnly;Max-Age=2678401

header: Load-Balancing: web01

header: Load-Balancing: web01
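The debug log is written to standard output by the underlying http.client connection, so it can be captured when it needs to be inspected later. The sketch below scopes the debug handlers to one opener rather than installing them globally, and collects the log with contextlib.redirect_stdout. The function names make_debug_opener and fetch_with_log are assumptions for illustration, and the 10-second timeout is an arbitrary choice.

```python
import contextlib
import io
from urllib import request


def make_debug_opener(level=1):
    """Build an opener whose HTTP and HTTPS connections print a debug log."""
    return request.build_opener(
        request.HTTPHandler(debuglevel=level),
        request.HTTPSHandler(debuglevel=level),
    )


def fetch_with_log(opener, url, timeout=10):
    """Return (body_bytes, debug_log_text) for a single request."""
    buffer = io.StringIO()
    # http.client emits the 'send:', 'reply:' and 'header:' lines with print()
    with contextlib.redirect_stdout(buffer):
        with opener.open(url, timeout=timeout) as resp:
            body = resp.read()
    return body, buffer.getvalue()


opener = make_debug_opener()
# body, log = fetch_with_log(opener, "http://edu.51cto.com")  # uncomment to send the request
```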
