Fetching a Web Page with Python


Jredreamer

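Copyright notice: This is an original article by the author, released under the CC 4.0 BY-SA license. When reposting, please include a link to the original source and this notice.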

Article link: https://blog.csdn.net/zhang_Red/article/details/8856187

# -*- coding: utf-8 -*-

import urllib.parse
import urllib.request


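def getCharSet(response):
    '''
    Return the page's character encoding, taken from the Content-Type
    header of the response. Fall back to UTF-8 when none is declared.
    '''
    # response.headers is an email.message.Message subclass, so it can
    # parse the charset parameter (e.g. "text/html; charset=GBK") itself.
    return response.headers.get_content_charset() or 'utf-8'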

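def getData(url, *params):
    '''
    Fetch the page at the given URL and return its content as a string.
    Each extra argument is a (key, value) pair added to the query string.
    '''
    theurl = url
    if params:
        # urlencode accepts a sequence of (key, value) pairs
        data = urllib.parse.urlencode(params)
        theurl = url + "?" + data

    req = urllib.request.Request(theurl)
    # Close the connection once the body has been read and decoded
    with urllib.request.urlopen(req) as response:
        contype = getCharSet(response)
        # 'ignore' drops bytes that cannot be decoded instead of raising
        return response.read().decode(contype, 'ignore')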

if __name__ == '__main__':
    url='http://list.taobao.com/browse/cat-0.htm'
    data=getData(url)
    print(data)
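
The two steps the script above relies on, reading the charset from a Content-Type value and building a query string, can be tried without a network connection. The following is only a minimal sketch: the helper names `charset_from_content_type` and `build_url`, the UTF-8 fallback, and the `example.com` URL are assumptions made for illustration and are not part of the original script.

```python
from email.message import Message
from urllib.parse import urlencode


def charset_from_content_type(content_type, default="utf-8"):
    """Return the charset named in a Content-Type value, or `default`.

    Hypothetical helper: it reuses the standard library's header parser
    rather than searching the string for '=' by hand.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() returns None when no charset parameter exists
    return msg.get_content_charset() or default


def build_url(base, params=None):
    """Append URL-encoded query parameters to `base` when any are given."""
    if not params:
        return base
    return base + "?" + urlencode(params)


if __name__ == "__main__":
    print(charset_from_content_type("text/html; charset=GBK"))
    print(charset_from_content_type("text/html"))
    print(build_url("http://example.com/search", {"q": "python", "page": 1}))
```

A header with no charset parameter, such as a bare `text/html`, yields the fallback value here, so the later `decode` call is never handed a media type in place of a codec name.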
