Web Scraping: The urllib Library's parse Module API Explained, Part 2

chengqiuming

Categories: web scraping, python. Tags: web scraping, python

Article link: https://blog.csdn.net/chengqiuming/article/details/86314376

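I urlunparse()

1 Code

# urlunparse() accepts any iterable, but it must contain exactly 6 items;
# otherwise an error is raised because there are too few or too many values.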
from urllib.parse import urlunparse

data = ['http', 'www.baidu.com', 'index.html', 'user', 'a=6', 'comment']
print(urlunparse(data))

2 Output

E:\WebSpider\venv\Scripts\python.exe E:/WebSpider/3_1_3.py
http://www.baidu.com/index.html;user?a=6#comment

3 Notes

Here the data argument is a list. You can also pass other types, such as a tuple or a specific data structure.
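As a minimal sketch of the point above, the same six components can be passed as a tuple or as a `ParseResult` named tuple. Reading "a specific data structure" as `ParseResult` is an assumption made for illustration; the variable names are hypothetical.

```python
from urllib.parse import ParseResult, urlunparse

# The same six components as in the list example, this time as a tuple.
as_tuple = ('http', 'www.baidu.com', 'index.html', 'user', 'a=6', 'comment')

# ParseResult is the named tuple type that urlparse() returns.
# Building one by hand is an illustrative choice, not something the article does.
as_named = ParseResult(scheme='http', netloc='www.baidu.com', path='index.html',
                       params='user', query='a=6', fragment='comment')

url_from_tuple = urlunparse(as_tuple)
url_from_named = urlunparse(as_named)

print(url_from_tuple)
print(url_from_named)
```

Both calls should produce the same URL as the list version, since urlunparse() only cares that it receives six items in the expected order.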

With that, the URL has been constructed successfully.
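The six-item requirement mentioned in the code comment can be checked with a small sketch. The helper `try_build` is a hypothetical name introduced here, and the assumption is that CPython reports the wrong length as a `ValueError` from tuple unpacking.

```python
from urllib.parse import urlunparse


def try_build(components):
    """Return the built URL, or the error text if urlunparse() rejects the input."""
    try:
        return urlunparse(components)
    except ValueError as exc:
        # Assumption: a wrong number of items surfaces as ValueError in CPython.
        return f'ValueError: {exc}'


# Only three items: too few.
too_short = try_build(['http', 'www.baidu.com', 'index.html'])

# Seven items: too many.
too_long = try_build(['http', 'www.baidu.com', 'index.html',
                      'user', 'a=6', 'comment', 'extra'])

# Exactly six items; unused parts are passed as empty strings.
ok = try_build(['http', 'www.baidu.com', 'index.html', '', '', ''])

print(too_short)
print(too_long)
print(ok)
```

Passing empty strings for the parts you do not need is a simple way to satisfy the length requirement without inventing values.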

II urlsplit()

1 Code 1

from urllib.parse import urlsplit

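# This function is very similar to urlparse(), except that it does not parse
# params as a separate part, so it returns only 5 components.
# The params are kept as part of path.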
result = urlsplit('http://www.baidu.com/index.html;user?id=5#comment')
print(result)

2 Output 1
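For the URL used in Code 1, urlsplit() should return a `SplitResult` with five fields, with `;user` left inside `path`. The sketch below checks this and compares it with urlparse() on the same URL; the variable names are hypothetical.

```python
from urllib.parse import urlparse, urlsplit

url = 'http://www.baidu.com/index.html;user?id=5#comment'

split_result = urlsplit(url)   # 5 fields: scheme, netloc, path, query, fragment
parse_result = urlparse(url)   # 6 fields: adds a separate params field

print(split_result)

# SplitResult is a named tuple, so fields can be read by name or by index.
print(split_result.scheme, split_result[0])

# urlsplit() leaves ';user' in the path, while urlparse() moves it to params.
print(split_result.path)
print(parse_result.path, parse_result.params)
```

Because the result is a named tuple, attribute access and index access are interchangeable; attribute access is usually easier to read.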
