urlparse()
from urlparse import urlparse  # Python 3: from urllib.parse import urlparse
url = 'http://netloc/path;param?query=arg#frag'
parsed = urlparse(url)
print(parsed)
Result: ParseResult(scheme='http', netloc='netloc', path='/path', params='param', query='query=arg', fragment='frag')
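The fields of the parsed result can be read by name or by position. The following is a minimal sketch; the try/except import is an assumption added so it runs on both Python 3 (urllib.parse) and Python 2 (urlparse), and the variable names are illustrative only.

```python
try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse  # Python 2, as used in these notes

url = 'http://netloc/path;param?query=arg#frag'
parsed = urlparse(url)

# The result behaves like a 6-item tuple with named fields.
scheme_by_name = parsed.scheme
scheme_by_index = parsed[0]

# hostname and port are derived from netloc; port is None when the URL has none.
host = parsed.hostname
port = parsed.port

print(scheme_by_name, scheme_by_index, host, port)
```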
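urlsplit()
from urlparse import urlsplit  # Python 3: from urllib.parse import urlsplit
url = 'http://user:pwd@NetLoc:80/p1;param/p2;param?query=arg#frag'
parsed = urlsplit(url)
print(parsed)
Result: SplitResult(scheme='http', netloc='user:pwd@NetLoc:80', path='/p1;param/p2;param', query='query=arg', fragment='frag')
Note: the tuple returned by urlsplit() has one item fewer than the one returned by urlparse(). There is no params field, so any parameters stay inside path.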
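To make the difference concrete, the sketch below runs both functions on the same URL. The try/except import is an assumption added so the code works on Python 2 and Python 3; the variable names are illustrative.

```python
try:
    from urllib.parse import urlparse, urlsplit  # Python 3
except ImportError:
    from urlparse import urlparse, urlsplit  # Python 2

url = 'http://user:pwd@NetLoc:80/p1;param/p2;param?query=arg#frag'

six = urlparse(url)   # scheme, netloc, path, params, query, fragment
five = urlsplit(url)  # scheme, netloc, path, query, fragment

# urlparse() strips the parameters of the last path segment into params;
# urlsplit() leaves the path untouched.
print(len(six), six.path, six.params)
print(len(five), five.path)

# Both result types expose the pieces of netloc as attributes.
print(five.username, five.password, five.hostname, five.port)
```

urlsplit() is usually the better choice when a URL may carry parameters on more than one path segment, because nothing is removed from the path.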
urldefrag() splits off the fragment: it returns the URL without the fragment, together with the fragment itself.
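A short sketch of urldefrag() on the first example URL. The try/except import is an assumption for Python 2/3 compatibility; the names base and fragment are illustrative.

```python
try:
    from urllib.parse import urldefrag  # Python 3
except ImportError:
    from urlparse import urldefrag  # Python 2

url = 'http://netloc/path;param?query=arg#frag'

# The result unpacks into the URL without its fragment, and the fragment.
base, fragment = urldefrag(url)

print(base)
print(fragment)
```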
geturl()
parsed = urlparse(url)
print(parsed.geturl())
Result: the original URL (possibly in a normalized but equivalent form).
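urlunparse() rebuilds a URL from its parsed components, dropping redundant parts of the original (for example, empty parameters, an empty query, or an empty fragment).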
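The following sketch shows both behaviours: a round trip that returns the same URL, and a URL with empty trailing delimiters that comes back shorter. The second URL, 'http://netloc/path;?#', is a made-up example and not taken from these notes; the import fallback is an assumption for Python 2/3 compatibility.

```python
try:
    from urllib.parse import urlparse, urlunparse  # Python 3
except ImportError:
    from urlparse import urlparse, urlunparse  # Python 2

url = 'http://netloc/path;param?query=arg#frag'

# Round trip: parsing and rebuilding a clean URL gives the same string.
rebuilt = urlunparse(urlparse(url))
print(rebuilt)

# Hypothetical URL with empty params, query and fragment.
messy = 'http://netloc/path;?#'
cleaned = urlunparse(urlparse(messy))
print(cleaned)
```

The empty ';', '?' and '#' delimiters carry no information after parsing, so they are not written back when the URL is rebuilt.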
urljoin()
print(urljoin('http://www.example.com/path/file.html', 'anotherfile.html'))
Result: http://www.example.com/path/anotherfile.html
print(urljoin('http://www.example.com/path/file.html', '../anotherfile.html'))
Result: http://www.example.com/anotherfile.html
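As a sketch of how urljoin() resolves different kinds of relative references against a base URL: the third case, with a leading slash, is an added illustration and not one of the calls in these notes. The import fallback is an assumption for Python 2/3 compatibility.

```python
try:
    from urllib.parse import urljoin  # Python 3
except ImportError:
    from urlparse import urljoin  # Python 2

base = 'http://www.example.com/path/file.html'

# A plain relative name replaces the last segment of the base path.
sibling = urljoin(base, 'anotherfile.html')

# '..' climbs one directory above the base file's directory.
parent = urljoin(base, '../anotherfile.html')

# A leading '/' restarts from the root of the site (added illustration).
rooted = urljoin(base, '/subpath/file.html')

print(sibling)
print(parent)
print(rooted)
```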