How to split a string in Python: how do I split a URL string into separate parts?

I decided that I'll learn Python tonight :)

I know C pretty well (I wrote an OS in it), so I'm not a noob at programming and everything in Python seems pretty easy, but I don't know how to solve this problem:

Let's say I have this address:

http://example.com/random/folder/path.html

Now how can I create two strings from this, one containing the "base" name of the server, so in this example it would be

http://example.com/

and another containing the thing without the last filename, so in this example it would be

http://example.com/random/folder/

.

Of course, I know I could just find the third slash and the last slash respectively, but maybe you know a better way :]

It would also be cool to have the trailing slash in both cases, but I don't really care since it can be added easily.

So does anyone have a good, fast, effective solution for this? Or is there only "my" solution, finding the slashes?

Thanks!

Solution

The urlparse module in Python 2.x (or urllib.parse in Python 3.x) is the way to do it.

>>> from urllib.parse import urlparse
>>> url = 'http://example.com/random/folder/path.html'
>>> parse_object = urlparse(url)
>>> parse_object.netloc
'example.com'
>>> parse_object.path
'/random/folder/path.html'
>>> parse_object.scheme
'http'
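To get the "base" string the question asks for, the scheme and netloc can be put back together. This is only a minimal sketch, assuming Python 3; the helper name `base_url` is made up for illustration and is not part of urllib.

```python
from urllib.parse import urlsplit, urlunsplit


def base_url(url):
    """Return 'scheme://netloc/' for the given URL (hypothetical helper)."""
    parts = urlsplit(url)
    # Keep only the scheme and netloc; use "/" as the path so the
    # result ends with a trailing slash. Query and fragment are dropped.
    return urlunsplit((parts.scheme, parts.netloc, "/", "", ""))


print(base_url("http://example.com/random/folder/path.html"))
# prints: http://example.com/
```

Using urlunsplit instead of manual string concatenation leaves the "://" handling to the standard library.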

If you want to do more work on the path of the file under the URL, you can use the posixpath module:

>>> from posixpath import basename, dirname
>>> basename(parse_object.path)
'path.html'
>>> dirname(parse_object.path)
'/random/folder'

After that, you can use posixpath.join to glue the parts together.

EDIT: I totally forgot that Windows users will choke on the path separator in os.path, which is why posixpath is used here instead. I read the posixpath module docs, and they have a special reference to URL manipulation, so all's good.
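Putting the pieces together for the second string from the question (the URL without the last filename) might look like the sketch below. It assumes Python 3, and the helper name `folder_url` is hypothetical, not something the standard library provides.

```python
from posixpath import dirname
from urllib.parse import urlsplit, urlunsplit


def folder_url(url):
    """Return the URL with the last path component removed (hypothetical helper)."""
    parts = urlsplit(url)
    # dirname() strips the filename: '/random/folder/path.html' -> '/random/folder'
    folder = dirname(parts.path)
    # Add the trailing slash the question asked for, unless it is already there.
    if not folder.endswith("/"):
        folder += "/"
    # Rebuild the URL without query or fragment.
    return urlunsplit((parts.scheme, parts.netloc, folder, "", ""))


print(folder_url("http://example.com/random/folder/path.html"))
# prints: http://example.com/random/folder/
```

posixpath is used rather than os.path so the separator is always "/", regardless of the platform the script runs on.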
