How to split a string in Python: how do I split a URL string into separate parts?

Latest recommended article published on 2023-02-01 20:13:13

優嫿

I decided to learn Python tonight :)

I know C pretty well (I wrote an OS in it), so I'm not new to programming and most of Python seems pretty easy, but I don't know how to solve this problem:

Let's say I have this address:

http://example.com/random/folder/path.html

How can I create two strings from this? The first should contain the "base" name of the server, so in this example it would be

http://example.com/

and the second should contain everything except the last filename, so in this example it would be

http://example.com/random/folder/

I know I could just find the third slash and the last slash respectively, but maybe you know a better way :]

It would be nice to keep the trailing slash in both cases, but I don't mind if it's missing since it can be added easily.

So does anyone have a good, fast, effective solution for this? Or is "my" solution, finding the slashes, the only one?

Thanks!

Solution

The urlparse module in Python 2.x (or urllib.parse in Python 3.x) is the way to do it.

>>> from urllib.parse import urlparse
>>> url = 'http://example.com/random/folder/path.html'
>>> parse_object = urlparse(url)
>>> parse_object.netloc
'example.com'
>>> parse_object.path
'/random/folder/path.html'
>>> parse_object.scheme
'http'
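The question asks for the "base" of the server as a single string. A minimal sketch of one way to assemble it from the parsed scheme and netloc, assuming Python 3; the helper name base_url is hypothetical and not part of the standard library:

```python
from urllib.parse import urlparse, urlunparse


def base_url(url):
    """Return 'scheme://netloc/' for the given URL (hypothetical helper)."""
    parts = urlparse(url)
    # Keep the scheme and netloc, set the path to "/",
    # and drop params, query and fragment.
    return urlunparse((parts.scheme, parts.netloc, "/", "", "", ""))


print(base_url("http://example.com/random/folder/path.html"))
```

Using urlunparse instead of manual string concatenation leaves the placement of "://" to the library.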

If you want to do more work on the path of the file under the URL, you can use the posixpath module:

>>> from posixpath import basename, dirname
>>> basename(parse_object.path)
'path.html'
>>> dirname(parse_object.path)
'/random/folder'

After that, you can use posixpath.join to glue the parts together.
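As a rough sketch of that gluing step, the second string from the question (the URL without the last filename) could be built like this. It assumes Python 3, and the helper name parent_url is hypothetical:

```python
import posixpath
from urllib.parse import urlparse, urlunparse


def parent_url(url):
    """Return the URL with the last path component removed (hypothetical helper)."""
    parts = urlparse(url)
    directory = posixpath.dirname(parts.path)
    # Joining with an empty string appends a trailing slash when it is missing.
    directory = posixpath.join(directory, "")
    # Rebuild the URL without params, query or fragment.
    return urlunparse((parts.scheme, parts.netloc, directory, "", "", ""))


print(parent_url("http://example.com/random/folder/path.html"))
```

The posixpath.join(directory, "") call is what supplies the trailing slash the question mentions; leave it out if the slash is not wanted.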

EDIT: I totally forgot that Windows users will choke on the path separator in os.path, which uses backslashes on that platform. I read the posixpath module docs, and they make special reference to URL manipulation, so all's good.