Encoding Chinese Characters in URLs
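GBK encoding (a superset of GB2312): each Chinese character is two bytes, so it becomes two %xx groups, i.e. %xx%xx
UTF-8 encoding: each common Chinese character is three bytes, so it becomes three %xx groups, i.e. %xx%xx%xx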
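The byte counts above can be checked directly. The following is a minimal sketch using only the standard library; the sample character 中 and the variable names are chosen here for illustration.

```python
from urllib.parse import quote

ch = "中"  # a single common Chinese character (illustrative choice)

# Raw byte lengths in each encoding
gbk_bytes = ch.encode("gbk")      # 2 bytes in GBK
utf8_bytes = ch.encode("utf-8")   # 3 bytes in UTF-8

# Percent-encoding emits one %xx group per byte
gbk_quoted = quote(ch, encoding="gbk")
utf8_quoted = quote(ch, encoding="utf-8")

print(len(gbk_bytes), gbk_quoted)    # 2 %D6%D0
print(len(utf8_bytes), utf8_quoted)  # 3 %E4%B8%AD
```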
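Baidu search can be used to encode and decode URLs; it defaults to GBK. The query string in the following URL is 中国 percent-encoded as UTF-8:
https://www.baidu.com/s?wd=%E4%B8%AD%E5%9B%BD

Python 3 encoding and decoding example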
# -*- coding: utf-8 -*-
# @File : urldecode_demo.py
# @Date : 2018-05-11

# quote and unquote live in urllib.parse
from urllib.parse import quote, unquote

# Encode
url1 = "https://www.baidu.com/s?wd=中国"

# UTF-8 encoding, with an explicit set of safe characters left unescaped
ret1 = quote(url1, safe=";/?:@&=+$,", encoding="utf-8")
print(ret1)
# https://www.baidu.com/s?wd=%E4%B8%AD%E5%9B%BD

# GBK encoding (default safe="/", so ":", "?" and "=" are escaped as well)
ret2 = quote(url1, encoding="gbk")
print(ret2)
# https%3A//www.baidu.com/s%3Fwd%3D%D6%D0%B9%FA

# Decode
url3 = "https://www.baidu.com/s?wd=%E4%B8%AD%E5%9B%BD"
ret3 = unquote(url3, encoding="utf-8")
print(ret3)
# https://www.baidu.com/s?wd=中国
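The decoding example above assumes the encoding is known in advance. When it is not, one common heuristic is to try strict UTF-8 first and fall back to GBK. The sketch below illustrates that idea; `unquote_guess` is a hypothetical helper name, not part of `urllib`, and the heuristic can misfire on the rare GBK byte sequences that also happen to be valid UTF-8.

```python
from urllib.parse import unquote


def unquote_guess(text):
    """Hypothetical helper: decode as strict UTF-8, else fall back to GBK."""
    try:
        # errors="strict" makes invalid UTF-8 raise instead of being replaced
        return unquote(text, encoding="utf-8", errors="strict")
    except UnicodeDecodeError:
        # Assumption: anything that is not valid UTF-8 is treated as GBK
        return unquote(text, encoding="gbk", errors="replace")


print(unquote_guess("wd=%E4%B8%AD%E5%9B%BD"))  # wd=中国
print(unquote_guess("wd=%D6%D0%B9%FA"))        # wd=中国
```

The strict mode matters here: `unquote` defaults to `errors="replace"`, which would silently turn GBK bytes into replacement characters rather than signalling a mismatch.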
Reference:
URL decoding with Python