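urllib.parse is divided into two parts: URL parsing and URL quoting.
The URL parsing functions focus on splitting a URL string into its components, or on combining URL components into a URL string.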
urllib.parse.urlparse(urlstring, scheme='', allow_fragments=True)
>>> from urllib.parse import urlparse
>>> o = urlparse('http://www.cwi.nl:80/%7Eguido/Python.html')
>>> o
ParseResult(scheme='http', netloc='www.cwi.nl:80', path='/%7Eguido/Python.html', params='', query='', fragment='')
>>> o.scheme
'http'
>>> o.port
80
>>> o.geturl()
'http://www.cwi.nl:80/%7Eguido/Python.html'
>>> from urllib.parse import urlparse
>>> urlparse('//www.cwi.nl:80/%7Eguido/Python.html')
ParseResult(scheme='', netloc='www.cwi.nl:80', path='/%7Eguido/Python.html', params='', query='', fragment='')
>>> urlparse('www.cwi.nl/%7Eguido/Python.html')
ParseResult(scheme='', netloc='', path='www.cwi.nl/%7Eguido/Python.html', params='', query='', fragment='')
>>> urlparse('help/Python.html')
ParseResult(scheme='', netloc='', path='help/Python.html', params='', query='', fragment='')
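The session above shows that urlparse only recognizes a netloc when it is introduced by '//'; otherwise the host name ends up in path. A minimal sketch of one way to work around this follows. The helper name parse_host_first and its default_scheme parameter are hypothetical, and the sketch assumes the caller already knows the input starts with a host name (it would misread a relative path such as 'help/Python.html').

```python
from urllib.parse import urlparse


def parse_host_first(url, default_scheme='http'):
    """Hypothetical helper: parse a URL that is assumed to start with a host.

    If the string contains no '//', prepend it so that urlparse treats the
    leading part as the netloc rather than as part of the path.
    """
    if '//' not in url:
        url = '//' + url
    # The scheme argument is only used when the URL itself has no scheme.
    return urlparse(url, scheme=default_scheme)


result = parse_host_first('www.cwi.nl/%7Eguido/Python.html')
print(result.scheme)  # http
print(result.netloc)  # www.cwi.nl
print(result.path)    # /%7Eguido/Python.html
```

The scheme argument of urlparse acts as a fallback only, so a URL that already carries its own scheme keeps it.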
urllib.parse.urlsplit(urlstring, scheme='', allow_fragments=True)
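Used the same way as urlparse, except that it does not split params out of the path, so the result is a SplitResult with five fields instead of six.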
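To make the difference between the two functions concrete, here is a small sketch that parses the same URL with both. The ';v=1' parameter, the query and the fragment are made up for illustration and are not part of the original example URL.

```python
from urllib.parse import urlparse, urlsplit

# Illustrative URL: the ';v=1' params, query and fragment are assumptions.
url = 'http://www.cwi.nl:80/%7Eguido/Python.html;v=1?lang=en#intro'

parsed = urlparse(url)  # six fields, including params
split = urlsplit(url)   # five fields, params stay inside path

print(parsed.path, parsed.params)  # /%7Eguido/Python.html v=1
print(split.path)                  # /%7Eguido/Python.html;v=1
print(len(parsed), len(split))     # 6 5
```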
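Combining URLs
urllib.parse.urlencode()
urlencode converts a mapping, or a sequence of two-element tuples, into a percent-encoded query string.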
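A short sketch of how urlencode might be used to build a query string and attach it to a URL. The parameter names and the '/search' path are invented for illustration.

```python
from urllib.parse import urlencode

# Illustrative parameters; the keys and values are assumptions.
params = {'q': 'python urllib', 'page': 2}
query = urlencode(params)
print(query)  # q=python+urllib&page=2

# The '/search' path is hypothetical.
url = 'http://www.cwi.nl/search?' + query
print(url)  # http://www.cwi.nl/search?q=python+urllib&page=2

# With doseq=True, a sequence value produces one key=value pair per element.
tags = urlencode({'tag': ['a', 'b']}, doseq=True)
print(tags)  # tag=a&tag=b
```

By default, spaces are encoded as '+' because urlencode quotes values with quote_plus.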
urllib.parse.urlunsplit(parts)
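urlunsplit takes a five-item iterable (scheme, netloc, path, query, fragment) and joins it back into a URL string. The sketch below shows this directly and as a round trip through urlsplit; the query 'lang=en' and fragment 'intro' are made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit

# (scheme, netloc, path, query, fragment); query and fragment are assumptions.
parts = ('http', 'www.cwi.nl:80', '/%7Eguido/Python.html', 'lang=en', 'intro')
url = urlunsplit(parts)
print(url)  # http://www.cwi.nl:80/%7Eguido/Python.html?lang=en#intro

# Round trip: split, change one component, and reassemble.
# SplitResult is a named tuple, so _replace returns a modified copy.
without_fragment = urlunsplit(urlsplit(url)._replace(fragment=''))
print(without_fragment)  # http://www.cwi.nl:80/%7Eguido/Python.html?lang=en
```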
urllib.parse.urljoin(base, url, allow_fragments=True)
>>> from urllib.parse import urljoin
>>> urljoin('http://www.cwi.nl/%7Eguido/Python.html', 'FAQ.html')
'http://www.cwi.nl/%7Eguido/FAQ.html'
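The result of urljoin depends on the form of the second argument. The sketch below runs a few further cases against the same base URL; the relative references and the www.python.org URLs are chosen for illustration only.

```python
from urllib.parse import urljoin

base = 'http://www.cwi.nl/%7Eguido/Python.html'

# '..' climbs one directory above the base document's directory.
parent = urljoin(base, '../FAQ.html')
print(parent)  # http://www.cwi.nl/FAQ.html

# A leading '/' replaces the whole path but keeps scheme and host.
rooted = urljoin(base, '/FAQ.html')
print(rooted)  # http://www.cwi.nl/FAQ.html

# A leading '//' replaces the host as well and keeps only the scheme.
other_host = urljoin(base, '//www.python.org/%7Eguido')
print(other_host)  # http://www.python.org/%7Eguido

# An absolute URL ignores the base entirely.
absolute = urljoin(base, 'https://www.python.org/')
print(absolute)  # https://www.python.org/
```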