Web Scraping for Beginners: Example 7

Latest recommended article published 2024-10-29 19:06:15

秋瑾先生

Category: Daily Notes. Tags: web scraping

Article link: https://blog.csdn.net/dldl1718/article/details/88603778


import urllib.request
import urllib.parse
import string

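def get_method_params():
    url = "http://www.baidu.com/s?wd="
    name = "美女"
    # Alternative: pass the parameters as a dict and let urlencode() escape them.
    # params = {"wd": "中文"}
    # final_url = "http://www.baidu.com/s?" + urllib.parse.urlencode(params)
    final_url = url + name
    # Percent-encode the Chinese characters in the URL. Without this step,
    # urlopen() raises:
    #   UnicodeEncodeError: 'ascii' codec can't encode
    #   characters in position 10-11: ordinal not in range(128)
    # An ASCII-only name works unchanged; non-ASCII characters must be encoded.
    # safe=string.printable leaves every printable ASCII character
    # (including ":", "/", "?" and "=") untouched.
    change_url = urllib.parse.quote(final_url, safe=string.printable)
    r = urllib.request.urlopen(change_url)
    print(r)  # an http.client.HTTPResponse object (the response, not a request)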
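    print(urllib.parse.quote('afe||*beff/c'))  # leaves "/" unencoded but encodes "|" and "*"

    print(urllib.parse.quote_plus('afdsdsf&b/c'))  # encodes "/" as well as "&"

    print(urllib.parse.unquote('9+2'))  # unquote() does not decode "+"
    # 9+2
    print(urllib.parse.quote_plus('9+2'))  # encodes "+" as %2B (it does not turn it into a space)
    # 9%2B2; quote() gives the same output for this input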
    data = r.read().decode()  # decode() defaults to UTF-8
    # print(data)
    print(type(data))
    # <class 'str'>; without decode(), read() returns bytes.
    with open("02-get_params.html", "w", encoding="utf-8") as f:
        f.write(data)



get_method_params()
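
The script above mentions passing the parameters as a dict but leaves that path commented out. Below is a minimal sketch of that variant, assuming the query is built with `urllib.parse.urlencode()` and appended after a `?`. The helper name `build_search_url` is hypothetical and not part of the original script, and no network request is made.

```python
import urllib.parse


def build_search_url(base_url, params):
    """Return base_url with params appended as a percent-encoded query string.

    Hypothetical helper for illustration; base_url must not already
    contain a query string.
    """
    # urlencode() escapes keys and values with quote_plus(), so non-ASCII
    # text becomes UTF-8 percent escapes and spaces become "+".
    query = urllib.parse.urlencode(params)
    return base_url + "?" + query


url = build_search_url("http://www.baidu.com/s", {"wd": "中文"})
print(url)
# http://www.baidu.com/s?wd=%E4%B8%AD%E6%96%87

# parse_qs() reverses the encoding, which is a quick way to check the result.
print(urllib.parse.parse_qs(urllib.parse.urlsplit(url).query))
# {'wd': ['中文']}
```

Because the result is already pure ASCII, it can be passed to `urllib.request.urlopen()` directly, without the extra `quote(..., safe=string.printable)` step.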