1. Why URL encoding is needed
RFC 1738: Uniform Resource Locators (URL) specification
The URL specification (RFC 1738, December 1994) restricts the characters that may appear in a URL to a small subset of the US-ASCII character set:
"...Only alphanumerics [0-9a-zA-Z], the special characters "$-_.+!*'()," [not including the quotes - ed], and reserved characters used for their reserved purposes may be used unencoded within a URL."
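In other words, only letters and digits [0-9a-zA-Z], the special characters "$-_.+!*'()," (without the surrounding double quotes), and certain reserved characters used for their reserved purpose may appear in a URL without encoding.
According to the specification, a URL can only be sent over the Internet using the ASCII character set, so any other character has to be encoded first.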
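The rule quoted above can be sketched as a small check that reports which characters of a string fall outside the unencoded set. This is only an illustration: the name `needs_encoding` is hypothetical, and reserved characters such as "/" or "=" are treated as "needs encoding" here because the sketch does not know whether they are serving their reserved purpose.

```python
import string

# Characters RFC 1738 allows unencoded: alphanumerics plus the listed specials.
# Reserved characters (such as "/", "?" or "=") are deliberately left out,
# since whether they may stay unencoded depends on how they are used.
RFC1738_SAFE = set(string.ascii_letters + string.digits + "$-_.+!*'(),")

def needs_encoding(text):
    """Return the characters of `text` outside the safe set, in order of first appearance."""
    found = []
    for ch in text:
        if ch not in RFC1738_SAFE and ch not in found:
            found.append(ch)
    return found

print(needs_encoding("hello world"))  # [' ']
print(needs_encoding("abc-123"))      # []
```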
2. How URL encoding works
URL encoding of a character consists of a "%" symbol, followed by the two-digit hexadecimal representation (case-insensitive) of the ISO-Latin code point for the character.
Example
- Space = decimal code point 32 in the ISO-Latin set.
- 32 decimal = 20 in hexadecimal
- The URL encoded representation will be "%20"
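A character is URL-encoded as a "%" sign followed by its two-digit hexadecimal value.
Note: the ASCII code of a space is 32, which is 20 in hexadecimal, so its URL-encoded form is %20. A space may also appear as "+", but that convention comes from HTML form encoding (application/x-www-form-urlencoded), not from RFC 1738; the current URI specification, RFC 3986, keeps %20.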
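As a rough illustration of the space example, the sketch below percent-encodes a string by hand and compares the result with Python's standard `urllib.parse` functions. The helper name `percent_encode` is hypothetical, and the sketch assumes UTF-8 as the byte encoding (the common modern practice) rather than ISO-Latin; for a space the two give the same result.

```python
from urllib.parse import quote, quote_plus, unquote

# Bytes left as-is in this sketch: ASCII letters, digits, and "-._~".
UNRESERVED = set(b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~")

def percent_encode(text):
    """Encode `text` as UTF-8 and replace every other byte with %XX (uppercase hex)."""
    out = []
    for byte in text.encode("utf-8"):
        if byte in UNRESERVED:
            out.append(chr(byte))
        else:
            out.append(f"%{byte:02X}")
    return "".join(out)

print(percent_encode("a b"))  # a%20b  (space = decimal 32 = hex 20)
print(quote("a b"))           # a%20b  (standard library, same result)
print(quote_plus("a b"))      # a+b    (form-style encoding uses "+" for a space)
print(unquote("a%20b"))       # a b
```

`quote` and `quote_plus` differ only in how they treat the space (and "/"), which is where the "%20" versus "+" confusion usually comes from.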
3. Differences between Base64 encoding and URL encoding
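Base64 is mainly used to encode binary data as text so it can pass safely through text-only channels. It is not encryption: the result is not readable at a glance, but anyone can decode it without a key.
URL encoding is mainly used to make a URL safe to transmit by escaping the characters that RFC 1738 does not allow unencoded.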
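The two encodings can be contrasted with a short sketch using Python's standard `base64` and `urllib.parse` modules. The sample bytes are chosen only so that the Base64 output contains "+", "/" and "=", which shows that Base64 text may itself need URL encoding before it is placed in a URL.

```python
import base64
from urllib.parse import quote

raw = b"\xfb\xff"  # arbitrary binary data, not valid text

# Base64 turns bytes into ASCII text; decoding needs no key, so it is not encryption.
b64 = base64.b64encode(raw).decode("ascii")
print(b64)                            # +/8=
print(base64.b64decode(b64) == raw)   # True

# "+", "/" and "=" have special meaning in URLs, so the Base64 text is escaped again.
print(quote(b64, safe=""))            # %2B%2F8%3D

# The URL-safe Base64 alphabet avoids "+" and "/" by using "-" and "_" instead.
print(base64.urlsafe_b64encode(raw).decode("ascii"))  # -_8=
```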