Python: Getting the path components of a URL

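The best option for working with the path component of a URL is the posixpath module. It has the same interface as os.path, and it always operates on POSIX-style paths, whether it runs on a POSIX platform or on a Windows NT based one.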
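As a small illustration of why the choice of module matters, the sketch below splits the same string with both posixpath and ntpath. The sample path is invented for this example; both modules can be imported on any platform.

```python
import ntpath
import posixpath

# Hypothetical URL path (already percent-decoded) that contains a backslash.
path = "/a/b\\c"

# posixpath treats only "/" as a separator, on every platform.
posix_head, posix_tail = posixpath.split(path)

# ntpath also treats "\" as a separator, which is not what a URL means.
nt_head, nt_tail = ntpath.split(path)

print(posix_head, posix_tail)
print(nt_head, nt_tail)
```

With posixpath the backslash stays inside the last component; with ntpath it starts a new one.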

Example code:

    #!/usr/bin/env python3

    import ntpath
    import posixpath
    import sys
    import urllib.parse


    def path_parse(path_string, *, normalize=True, module=posixpath):
        """Split a path into its components using the given path module."""
        result = []
        if normalize:
            tmp = module.normpath(path_string)
        else:
            tmp = path_string
        while True:
            head, item = module.split(tmp)
            if head == tmp:
                # Reached the root (or an empty path): nothing left to split.
                # Comparing against "/" alone would loop forever on "" or "//".
                break
            result.insert(0, item)
            tmp = head
        return result


    def dump_array(array):
        """Format a list of strings as [ "a", "b" ]."""
        return "[ " + ", ".join('"{}"'.format(item) for item in array) + " ]"


    def test_url(url, *, normalize=True, module=posixpath):
        """Parse the URL, decode its path, and print the path components."""
        url_parsed = urllib.parse.urlparse(url)
        path_parsed = path_parse(urllib.parse.unquote(url_parsed.path),
                                 normalize=normalize, module=module)
        sys.stdout.write("{}\n --[n={},m={}]-->\n {}\n".format(
            url, normalize, module.__name__, dump_array(path_parsed)))


    test_url("http://eg.com/hithere/something/else")
    test_url("http://eg.com/hithere/something/else/")
    test_url("http://eg.com/hithere/something/else/", normalize=False)
    test_url("http://eg.com/hithere/../else")
    test_url("http://eg.com/hithere/../else", normalize=False)
    test_url("http://eg.com/hithere/../../else")
    test_url("http://eg.com/hithere/../../else", normalize=False)
    test_url("http://eg.com/hithere/something/./else")
    test_url("http://eg.com/hithere/something/./else", normalize=False)
    test_url("http://eg.com/hithere/something/./else/./")
    test_url("http://eg.com/hithere/something/./else/./", normalize=False)
    test_url("http://eg.com/see%5C/if%5C/this%5C/works", normalize=False)
    test_url("http://eg.com/see%5C/if%5C/this%5C/works", normalize=False,
             module=ntpath)

Output:

    http://eg.com/hithere/something/else
     --[n=True,m=posixpath]-->
     [ "hithere", "something", "else" ]
    http://eg.com/hithere/something/else/
     --[n=True,m=posixpath]-->
     [ "hithere", "something", "else" ]
    http://eg.com/hithere/something/else/
     --[n=False,m=posixpath]-->
     [ "hithere", "something", "else", "" ]
    http://eg.com/hithere/../else
     --[n=True,m=posixpath]-->
     [ "else" ]
    http://eg.com/hithere/../else
     --[n=False,m=posixpath]-->
     [ "hithere", "..", "else" ]
    http://eg.com/hithere/../../else
     --[n=True,m=posixpath]-->
     [ "else" ]
    http://eg.com/hithere/../../else
     --[n=False,m=posixpath]-->
     [ "hithere", "..", "..", "else" ]
    http://eg.com/hithere/something/./else
     --[n=True,m=posixpath]-->
     [ "hithere", "something", "else" ]
    http://eg.com/hithere/something/./else
     --[n=False,m=posixpath]-->
     [ "hithere", "something", ".", "else" ]
    http://eg.com/hithere/something/./else/./
     --[n=True,m=posixpath]-->
     [ "hithere", "something", "else" ]
    http://eg.com/hithere/something/./else/./
     --[n=False,m=posixpath]-->
     [ "hithere", "something", ".", "else", ".", "" ]
    http://eg.com/see%5C/if%5C/this%5C/works
     --[n=False,m=posixpath]-->
     [ "see\", "if\", "this\", "works" ]
    http://eg.com/see%5C/if%5C/this%5C/works
     --[n=False,m=ntpath]-->
     [ "see", "if", "this", "works" ]
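The normalized cases above all follow from how posixpath.normpath rewrites a path before it is split. A minimal sketch of that step in isolation, using the paths from the test URLs:

```python
import posixpath

# Paths taken from the test URLs above.
samples = [
    "/hithere/something/else/",      # trailing slash
    "/hithere/../else",              # one parent reference
    "/hithere/../../else",           # ".." cannot climb above the root
    "/hithere/something/./else/./",  # "." segments
]

normalized = [posixpath.normpath(p) for p in samples]
for original, result in zip(samples, normalized):
    print(f"{original!r} -> {result!r}")
```

This is why both /hithere/../else and /hithere/../../else end up as the single component "else": the extra ".." has nothing left to remove once the root is reached.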

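Notes:

On Windows NT based platforms, os.path is ntpath.

On Unix/POSIX based platforms, os.path is posixpath.

ntpath does not handle backslashes (\) correctly for URL paths (see the last two cases in the code and output), which is why posixpath is recommended.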

Remember to use ^{}

Consider using ^{}

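The semantics of multiple adjacent path separators (/) are not defined by RFC 3986. However, posixpath collapses runs of adjacent path separators (that is, it treats ///, // and / as the same), apart from a leading //, which posixpath.normpath preserves.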

Although POSIX paths and URL paths have similar syntax and semantics, they are not identical.
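Two of these differences can be seen directly. The sketch below is illustrative only: split_raw_segments is a hypothetical helper, not part of any library, and the sample URL is invented. It compares decoding before splitting (as the sample code does) with splitting the raw path first, and it shows how posixpath.normpath treats repeated slashes.

```python
import posixpath
from urllib.parse import unquote, urlsplit


def split_raw_segments(url):
    """Hypothetical helper: split the raw path on "/", then decode each segment."""
    raw_path = urlsplit(url).path
    return [unquote(segment) for segment in raw_path.split("/")[1:]]


url = "http://eg.com/a%2Fb/c"  # %2F is a percent-encoded "/"

# Decoding first turns the encoded slash into a real separator: three segments.
decoded_first = unquote(urlsplit(url).path).split("/")[1:]

# Splitting first keeps "a/b" together as one segment: two segments.
split_first = split_raw_segments(url)

# normpath collapses runs of slashes inside a path ...
collapsed = posixpath.normpath("/a//b///c")

# ... but preserves exactly two leading slashes.
two_leading = posixpath.normpath("//a/b")

print(decoded_first, split_first, collapsed, two_leading)
```

Whether an encoded slash should count as a separator depends on the application, so neither order is right in every case; the point is only that the two orders give different answers.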
