yarl: a Python library for working with URLs

Introduction

yarl (Yet Another URL Library) provides an immutable URL class that makes URLs easy to parse and modify.




Installation

pip install yarl




First steps

 http://user:pass@example.com:8042/over/there?name=ferret#nose
 \__/   \__/ \__/ \_________/ \__/\_________/ \_________/ \__/
  |      |    |        |       |      |           |        |
scheme  user password host    port   path       query   fragment



from yarl import URL

url = URL('https://www.python.org/~guido?arg=1#frag')
print(url)  # https://www.python.org/~guido?arg=1#frag
print(url.scheme)  # https
print(url.host)  # www.python.org
print(url.path)  # /~guido
print(url.query_string)  # arg=1
print(url.query)  # <MultiDictProxy('arg': '1')>
print(url.fragment)  # frag
print(url.parts)  # ('/', '~guido')

print(URL('http://example.com:8080').explicit_port)  # 8080
print(URL('http://example.com/path/to.txt').suffix)  # .txt
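
The component diagram above can be checked against yarl directly. The sketch below parses the same example URL and reads each labeled component. Besides the attributes already shown, it uses user, password, and port.

```python
from yarl import URL

# The example URL from the component diagram.
url = URL('http://user:pass@example.com:8042/over/there?name=ferret#nose')

# Collect each labeled component by reading the matching attribute.
components = {
    'scheme': url.scheme,
    'user': url.user,
    'password': url.password,
    'host': url.host,
    'port': url.port,
    'path': url.path,
    'query': url.query_string,
    'fragment': url.fragment,
}

for name, value in components.items():
    print(f'{name:<9}{value}')
```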

For more attributes, see the "URL properties" section of the yarl documentation.




Manipulating URLs

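  • /: appends path segments
  • %: adds query parameters
  • Strings are percent-encoded automatically
  • url.human_repr(): returns a human-readable (decoded) string
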
from yarl import URL

url = URL('https://www.python.org/~guido?arg=1#frag')
print(url.parent / 'downloads/source')  # https://www.python.org/downloads/source

url = URL('https://www.python.org')
print(url / 'foo' / 'bar')  # https://www.python.org/foo/bar
print(url / 'foo' % {'bar': 'baz'})  # https://www.python.org/foo?bar=baz

url = URL('https://www.python.org/你好')
print(url)  # https://www.python.org/%E4%BD%A0%E5%A5%BD
print(url.human_repr())  # https://www.python.org/你好
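
Because URL objects are immutable, each operator returns a new URL and leaves the original untouched. Here is a minimal sketch of a helper built on / and %. The name endpoint and its parameters are hypothetical and not part of yarl.

```python
from yarl import URL


def endpoint(base, *segments, **params):
    """Append path segments and, if given, query parameters to base.

    Hypothetical helper for illustration; not part of yarl.
    """
    url = base
    for segment in segments:
        url = url / segment  # "/" returns a new URL with one more path segment
    if params:
        url = url % params   # "%" returns a new URL with updated query parameters
    return url


base = URL('https://www.python.org')
search = endpoint(base, 'search', q='你好')

print(base)               # base itself is unchanged
print(search)             # percent-encoded form
print(search.query['q'])  # the decoded value is still available
```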




Absolute and relative URLs

from yarl import URL

print(URL('http://example.com').is_absolute())  # True
print(URL('//example.com').is_absolute())  # True
print(URL('/path/to').is_absolute())  # False
print(URL('path').is_absolute())  # False
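
One common use of this check is deciding whether a link must be resolved against a base URL before it can be fetched. The sketch below assumes a hypothetical helper named resolve. It relies on URL.join(), which yarl provides for combining a base with a relative reference.

```python
from yarl import URL


def resolve(base, link):
    """Return link unchanged if it is absolute, else resolve it against base.

    Hypothetical helper for illustration; not part of yarl.
    """
    if link.is_absolute():
        return link
    return base.join(link)


base = URL('http://example.com/path/index.html')

print(resolve(base, URL('page.html')))               # resolved relative to base
print(resolve(base, URL('http://python.org/page')))  # already absolute, returned as-is
```

Note that a scheme-relative link such as //example.com counts as absolute here, so this helper returns it without a scheme.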




Building URLs

from yarl import URL

# Build a URL from its components
print(URL.build())  # prints an empty line: the result is the empty URL('')
print(URL.build(scheme='http', host='example.com'))  # http://example.com
print(URL.build(scheme='http', host='example.com', query={'a': 'b'}))  # http://example.com/?a=b
print(URL.build(scheme='http', host='example.com', query_string='a=b'))  # http://example.com/?a=b

# Replace the host or port (None removes the explicit port)
print(URL('http://example.com/path/to').with_host('python.org'))  # http://python.org/path/to
print(URL('http://example.com:8888').with_port(9999))  # http://example.com:9999
print(URL('http://example.com:8888').with_port(None))  # http://example.com

# Replace the path
print(URL('http://example.com/').with_path('/path/to'))  # http://example.com/path/to

# with_query() replaces the whole query string
print(URL('http://example.com/path?a=b').with_query('c=d'))  # http://example.com/path?c=d
print(URL('http://example.com/path?a=b').with_query({'c': 'd'}))  # http://example.com/path?c=d
print(URL('http://example.com/path?a=b').with_query({'c': [1, 2]}))  # http://example.com/path?c=1&c=2
print(URL('http://example.com/path?a=b').with_query(None))  # http://example.com/path
print(URL('http://example.com/path?a=b&b=1').with_query(b='2'))  # http://example.com/path?b=2
print(URL('http://example.com/path?a=b&b=1').with_query([('b', '2')]))  # http://example.com/path?b=2

# update_query() keeps existing parameters and adds or overwrites the given ones
print(URL('http://example.com/path?a=b').update_query('c=d'))  # http://example.com/path?a=b&c=d
print(URL('http://example.com/path?a=b').update_query({'c': 'd'}))  # http://example.com/path?a=b&c=d
print(URL('http://example.com/path?a=b').update_query({'c': [1, 2]}))  # http://example.com/path?a=b&c=1&c=2
print(URL('http://example.com/path?a=b&b=1').update_query(b='2'))  # http://example.com/path?a=b&b=2
print(URL('http://example.com/path?a=b&b=1').update_query([('b', '2')]))  # http://example.com/path?a=b&b=2
print(URL('http://example.com/path?a=b&c=e&c=f').update_query(c='d'))  # http://example.com/path?a=b&c=d
print(URL('http://example.com/path?a=b').update_query('c=d&c=f'))  # http://example.com/path?a=b&c=d&c=f
print(URL('http://example.com/path?a=b') % {'c': 'd'})  # http://example.com/path?a=b&c=d

# Replace the last path segment (query and fragment are dropped)
print(URL('http://example.com/path/to?arg#frag').with_name('new'))  # http://example.com/path/new

# Parent path (query and fragment are dropped)
print(URL('http://example.com/path/to?arg#frag').parent)  # http://example.com/path

# Origin: scheme, host, and port only
print(URL('http://example.com/path/to?arg#frag').origin())  # http://example.com
print(URL('http://user:pass@example.com/path').origin())  # http://example.com

# Relative part: path, query, and fragment only
print(URL('http://example.com/path/to?arg#frag').relative())  # /path/to?arg#frag

base = URL('http://example.com/path/index.html')
print(base.join(URL('page.html')))  # http://example.com/path/page.html
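
The difference between with_query() and update_query() matters when only one parameter should change. As a sketch, the hypothetical helper next_page below increments a page parameter while keeping every other query parameter. The parameter name page and the starting value of 1 are assumptions made for the example.

```python
from yarl import URL


def next_page(url, param='page'):
    """Return url with the page parameter incremented by one.

    Hypothetical helper for illustration; assumes the parameter,
    when present, holds an integer and that numbering starts at 1.
    """
    current = int(url.query.get(param, '1'))
    # update_query() replaces only the named parameter; with_query()
    # would discard every other parameter.
    return url.update_query({param: str(current + 1)})


url = URL('http://example.com/path?a=b&page=2')
print(next_page(url))
print(next_page(URL('http://example.com/path')))
```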




Checking for the default port

Scheme  Port
http    80
https   443
ws      80
wss     443

from yarl import URL

print(URL('http://example.com').is_default_port())  # True
print(URL('http://example.com:80').is_default_port())  # True
print(URL('http://example.com:8080').is_default_port())  # False
print(URL('/path/to').is_default_port())  # False
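
A typical use of this check is to omit the port when displaying or logging a host. The sketch below works under that assumption. The helper name display_host is hypothetical, and it uses url.port, which yarl fills in with the scheme's default when no port is written explicitly.

```python
from yarl import URL


def display_host(url):
    """Return "host" or "host:port", omitting the scheme's default port.

    Hypothetical helper for illustration; expects an absolute URL
    with one of the schemes listed in the table (http, https, ws, wss).
    """
    if url.is_default_port():
        return url.host
    return f'{url.host}:{url.port}'


print(display_host(URL('http://example.com')))
print(display_host(URL('http://example.com:80')))
print(display_host(URL('https://example.com:8443')))

# port falls back to the scheme default; explicit_port does not.
print(URL('http://example.com').port)
print(URL('http://example.com').explicit_port)
```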




Miscellaneous




References

  1. yarl Documentation
  2. Python中这样操作url也太爽了吧 ("Working with URLs this way in Python is a joy")