iOS Development: URL Encoding and Decoding

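The problem: when the URL of a network request contains Chinese or other special characters, the request fails (this mostly affects GET requests). The URL therefore has to be percent-encoded before the request is sent.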

URL encoding and decoding in Objective-C

encode

- (NSString*)urlEncode
{
    NSString *encode = [self stringByAddingPercentEncodingWithAllowedCharacters:[NSCharacterSet URLQueryAllowedCharacterSet]];
    if (encode.length) {
        return encode;
    }
    return self;
}

The encoding uses [NSCharacterSet URLQueryAllowedCharacterSet], which we will look at in detail shortly.

decode

- (NSString*)urlDecode
{
    NSString *decode = [self stringByRemovingPercentEncoding];
    if (decode.length) {
        return decode;
    }
    return self;
}
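
The two methods above are shown without their surrounding declaration. As a minimal sketch, assuming they live in a category on NSString (the category name URLCodingSketch and the helper RoundTrip are hypothetical, and ARC is assumed), they could be wired up and exercised like this:

```objc
#import <Foundation/Foundation.h>

// Hypothetical category name; the article only shows the method bodies.
@interface NSString (URLCodingSketch)
- (NSString *)urlEncode;
- (NSString *)urlDecode;
@end

@implementation NSString (URLCodingSketch)

- (NSString *)urlEncode {
    // Percent-encode every character that is not in the query-allowed set.
    NSString *encoded = [self stringByAddingPercentEncodingWithAllowedCharacters:
                         [NSCharacterSet URLQueryAllowedCharacterSet]];
    return encoded.length ? encoded : self;
}

- (NSString *)urlDecode {
    // Returns nil for malformed percent sequences, so fall back to self.
    NSString *decoded = [self stringByRemovingPercentEncoding];
    return decoded.length ? decoded : self;
}

@end

// Hypothetical helper: encode, then decode again.
static NSString *RoundTrip(NSString *input) {
    return [[input urlEncode] urlDecode];
}
```

Calling [@"name=中国" urlEncode] yields name=%E4%B8%AD%E5%9B%BD, and RoundTrip returns the original string.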

NSCharacterSet character sets

An NSCharacterSet object represents a set of Unicode-compliant characters. The API we use to encode a string is:

// Returns a new string made from the receiver by replacing all characters not in the allowedCharacters set with percent encoded characters. UTF-8 encoding is used to determine the correct percent encoded characters. Entire URL strings cannot be percent-encoded. This method is intended to percent-encode a URL component or subcomponent string, NOT the entire URL string. Any characters in allowedCharacters outside of the 7-bit ASCII range are ignored.
- (nullable NSString *)stringByAddingPercentEncodingWithAllowedCharacters:(NSCharacterSet *)allowedCharacters API_AVAILABLE(macos(10.9), ios(7.0), watchos(2.0), tvos(9.0));

In other words, the string is encoded as UTF-8, and every character that is not in the allowedCharacters set is replaced with its percent-encoded form. The method is meant for a single URL component or subcomponent, not for a whole URL string, so scheme, host, path, and query should each be handled separately.
Key point: only characters that are NOT in allowedCharacters get encoded. The set you pass in lists the characters to leave untouched, not the characters to encode. This is a detail that other blog posts do not spell out.

For allowedCharacters you can either build a custom set or use one of the NSCharacterSet class properties.
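
To illustrate the per-component encoding described above, here is a minimal sketch that encodes a path and a query value separately and only then assembles the URL string. The function name BuildListURLString, the fixed scheme and host, and the query key name are assumptions made for illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: encodes the path and the query value with the
// character set that matches each component, then joins the pieces.
static NSString *BuildListURLString(NSString *path, NSString *nameValue) {
    NSString *encodedPath =
        [path stringByAddingPercentEncodingWithAllowedCharacters:
         [NSCharacterSet URLPathAllowedCharacterSet]] ?: path;
    NSString *encodedValue =
        [nameValue stringByAddingPercentEncodingWithAllowedCharacters:
         [NSCharacterSet URLQueryAllowedCharacterSet]] ?: nameValue;
    // Scheme and host are plain ASCII here, so they are used as-is.
    return [NSString stringWithFormat:@"https://192.168.1.1%@?name=%@",
            encodedPath, encodedValue];
}
```

Note that URLQueryAllowedCharacterSet leaves & and = untouched, so a value that itself contains those characters would still need a stricter, custom set.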

Common character sets

NSCharacterSet class-property API

@interface NSCharacterSet (NSURLUtilities)
// Predefined character sets for the six URL components and subcomponents which allow percent encoding. These character sets are passed to -stringByAddingPercentEncodingWithAllowedCharacters:.

// Returns a character set containing the characters allowed in a URL's user subcomponent.
@property (class, readonly, copy) NSCharacterSet *URLUserAllowedCharacterSet API_AVAILABLE(macos(10.9), ios(7.0), watchos(2.0), tvos(9.0));

// Returns a character set containing the characters allowed in a URL's password subcomponent.
@property (class, readonly, copy) NSCharacterSet *URLPasswordAllowedCharacterSet API_AVAILABLE(macos(10.9), ios(7.0), watchos(2.0), tvos(9.0));

// Returns a character set containing the characters allowed in a URL's host subcomponent.
@property (class, readonly, copy) NSCharacterSet *URLHostAllowedCharacterSet API_AVAILABLE(macos(10.9), ios(7.0), watchos(2.0), tvos(9.0));

// Returns a character set containing the characters allowed in a URL's path component. ';' is a legal path character, but it is recommended that it be percent-encoded for best compatibility with NSURL (-stringByAddingPercentEncodingWithAllowedCharacters: will percent-encode any ';' characters if you pass the URLPathAllowedCharacterSet).
@property (class, readonly, copy) NSCharacterSet *URLPathAllowedCharacterSet API_AVAILABLE(macos(10.9), ios(7.0), watchos(2.0), tvos(9.0));

// Returns a character set containing the characters allowed in a URL's query component.
@property (class, readonly, copy) NSCharacterSet *URLQueryAllowedCharacterSet API_AVAILABLE(macos(10.9), ios(7.0), watchos(2.0), tvos(9.0));

// Returns a character set containing the characters allowed in a URL's fragment component.
@property (class, readonly, copy) NSCharacterSet *URLFragmentAllowedCharacterSet API_AVAILABLE(macos(10.9), ios(7.0), watchos(2.0), tvos(9.0));

@end

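How do these class properties differ? The official documentation alone does not make the differences easy to see, so let's write a quick test and encode https://小明:pwd123@192.168.1.1:80/app/home/list?name=中国&address=BJ&page=2&pageCount=&role=1#index with each of them.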

URL structure

                    hierarchical part
        ┌───────────────────┴─────────────────────┐
                    authority               path
        ┌───────────────┴───────────────┐┌───┴────┐
  abc://username:password@example.com:123/path/data?key=value&key2=value2#fragid1
  └┬┘   └───────┬───────┘ └────┬────┘ └┬┘           └─────────┬─────────┘ └──┬──┘
scheme  user information     host     port                  query         fragment

  urn:example:mammal:monotreme:echidna
  └┬┘ └────────────┬───────────────┘
scheme              path

URL breakdown

Component  Value
scheme     https
host       192.168.1.1
path       /app/home/list
query      name=中国&address=BJ&page=2&pageCount=&role=1
port       80
user       小明
password   pwd123
fragment   index

Encoding results

Class property                  Encoded text
URLUserAllowedCharacterSet      https%3A%2F%2F%E5%B0%8F%E6%98%8E%3Apwd123%40192.168.1.1%3A80%2Fapp%2Fhome%2Flist%3Fname=%E4%B8%AD%E5%9B%BD&address=BJ&page=2&pageCount=&role=1%23index
URLPasswordAllowedCharacterSet  https%3A%2F%2F%E5%B0%8F%E6%98%8E%3Apwd123%40192.168.1.1%3A80%2Fapp%2Fhome%2Flist%3Fname=%E4%B8%AD%E5%9B%BD&address=BJ&page=2&pageCount=&role=1%23index
URLHostAllowedCharacterSet      https%3A%2F%2F%E5%B0%8F%E6%98%8E%3Apwd123%40192.168.1.1%3A80%2Fapp%2Fhome%2Flist%3Fname=%E4%B8%AD%E5%9B%BD&address=BJ&page=2&pageCount=&role=1%23index
URLPathAllowedCharacterSet      https%3A//%E5%B0%8F%E6%98%8E:pwd123@192.168.1.1:80/app/home/list%3Fname=%E4%B8%AD%E5%9B%BD&address=BJ&page=2&pageCount=&role=1%23index
URLQueryAllowedCharacterSet     https://%E5%B0%8F%E6%98%8E:pwd123@192.168.1.1:80/app/home/list?name=%E4%B8%AD%E5%9B%BD&address=BJ&page=2&pageCount=&role=1%23index
URLFragmentAllowedCharacterSet  https://%E5%B0%8F%E6%98%8E:pwd123@192.168.1.1:80/app/home/list?name=%E4%B8%AD%E5%9B%BD&address=BJ&page=2&pageCount=&role=1%23index

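The details are hard to compare in the table above, but we do know that each set targets a different component and encodes a different group of characters. The list most widely circulated online, giving the characters each set encodes, looks like this: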

URLFragmentAllowedCharacterSet  "#%<>[\]^`{|}
URLHostAllowedCharacterSet      "#%/<>?@\^`{|}
URLPasswordAllowedCharacterSet  "#%/:<>?@[\]^`{|}
URLPathAllowedCharacterSet      "#%;<>?[\]^`{|}
URLQueryAllowedCharacterSet     "#%<>[\]^`{|}
URLUserAllowedCharacterSet      "#%/:<>?@[\]^`

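Is that correct, and what is it based on? I could not find anything on Apple's site to confirm it, so let's simply run an experiment: encode the printable ASCII characters with each NSCharacterSet.
The string to encode is: NSString *code = @" !\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~"; that is, ASCII code points 32 through 126.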
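
The experiment can be sketched as a small function that feeds each printable ASCII character through the encoder and collects the ones that come back changed. The function name EncodedASCIICharacters is hypothetical; it is one possible way to reproduce the test, not necessarily the code used to produce the table in this article.

```objc
#import <Foundation/Foundation.h>

// Returns the printable ASCII characters (code points 32...126) that are
// percent-encoded when `allowed` is passed to
// -stringByAddingPercentEncodingWithAllowedCharacters:.
static NSString *EncodedASCIICharacters(NSCharacterSet *allowed) {
    NSMutableString *encoded = [NSMutableString string];
    for (unichar c = 32; c <= 126; c++) {
        NSString *single = [NSString stringWithCharacters:&c length:1];
        NSString *result =
            [single stringByAddingPercentEncodingWithAllowedCharacters:allowed];
        // A character was encoded if the output differs from the input.
        if (![result isEqualToString:single]) {
            [encoded appendString:single];
        }
    }
    return encoded;
}
```

Logging EncodedASCIICharacters([NSCharacterSet URLQueryAllowedCharacterSet]), and the same call for the other five class properties, gives the "encoded characters" list for each set directly.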

Encoding results

Each entry lists the class property, the encoded text, and the characters that were encoded.

URLUserAllowedCharacterSet
    Encoded text:       %20!%22%23$%25&'()*+,-.%2F0123456789%3A;%3C=%3E%3F%40ABCDEFGHIJKLMNOPQRSTUVWXYZ%5B%5C%5D%5E_%60abcdefghijklmnopqrstuvwxyz%7B%7C%7D~
    Encoded characters: (space) "#%/:<>?@[\]^`{|}
URLPasswordAllowedCharacterSet
    Encoded text:       %20!%22%23$%25&'()*+,-.%2F0123456789%3A;%3C=%3E%3F%40ABCDEFGHIJKLMNOPQRSTUVWXYZ%5B%5C%5D%5E_%60abcdefghijklmnopqrstuvwxyz%7B%7C%7D~
    Encoded characters: (space) "#%/:<>?@[\]^`{|}
URLHostAllowedCharacterSet
    Encoded text:       %20!%22%23$%25&'()*+,-.%2F0123456789%3A;%3C=%3E%3F%40ABCDEFGHIJKLMNOPQRSTUVWXYZ%5B%5C%5D%5E_%60abcdefghijklmnopqrstuvwxyz%7B%7C%7D~
    Encoded characters: (space) "#%/:<>?@[\]^`{|}
URLPathAllowedCharacterSet
    Encoded text:       %20!%22%23$%25&'()*+,-./0123456789:%3B%3C=%3E%3F@ABCDEFGHIJKLMNOPQRSTUVWXYZ%5B%5C%5D%5E_%60abcdefghijklmnopqrstuvwxyz%7B%7C%7D~
    Encoded characters: (space) "#%;<>?[\]^`{|}
URLQueryAllowedCharacterSet
    Encoded text:       %20!%22%23$%25&'()*+,-./0123456789:;%3C=%3E?@ABCDEFGHIJKLMNOPQRSTUVWXYZ%5B%5C%5D%5E_%60abcdefghijklmnopqrstuvwxyz%7B%7C%7D~
    Encoded characters: (space) "#%<>[\]^`{|}
URLFragmentAllowedCharacterSet
    Encoded text:       %20!%22%23$%25&'()*+,-./0123456789:;%3C=%3E?@ABCDEFGHIJKLMNOPQRSTUVWXYZ%5B%5C%5D%5E_%60abcdefghijklmnopqrstuvwxyz%7B%7C%7D~
    Encoded characters: (space) "#%<>[\]^`{|}

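Conclusion: the list circulating online is not accurate, which is what my own experiment shows. In day-to-day development you would normally use URLQueryAllowedCharacterSet or URLFragmentAllowedCharacterSet (both encode the same characters in this test), since neither of them encodes the ?, / and : that commonly appear in URLs.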

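Custom character sets

The analysis above gives us a reasonable understanding of the encoding. But URLQueryAllowedCharacterSet does not encode special characters such as '()*+,-. so what do we do when that leads to garbled data in exchanges with other platforms? This is where a custom character set comes in.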

    NSString *code = @" !\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~";
    NSCharacterSet *invertedSet = [[NSCharacterSet characterSetWithCharactersInString:@" \"#%<>[\\]^`{|}'()*+,-."] invertedSet];
    NSString *encode = [code stringByAddingPercentEncodingWithAllowedCharacters:invertedSet];

// Value of encode after encoding: %20!%22%23$%25&%27%28%29%2A%2B%2C%2D%2E/0123456789:;%3C=%3E?@ABCDEFGHIJKLMNOPQRSTUVWXYZ%5B%5C%5D%5E_%60abcdefghijklmnopqrstuvwxyz%7B%7C%7D~

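Why call invertedSet on the set built from the characters (space) "#%<>[\]^`{|}'()*+,-. ? Because the character set passed to stringByAddingPercentEncodingWithAllowedCharacters: is the set of characters that will NOT be encoded. After inverting it, the characters we listed are exactly the ones that get encoded.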
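
An alternative to inverting a hand-written list, sketched here under the assumption that you want everything URLQueryAllowedCharacterSet already encodes plus a few extra characters, is to take a mutable copy of the built-in set and remove the extra characters from it. The function name StrictQueryAllowedCharacterSet is hypothetical, and ARC is assumed.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: starts from the built-in query set and additionally
// disallows '()*+,-. so that those characters are percent-encoded too.
static NSCharacterSet *StrictQueryAllowedCharacterSet(void) {
    NSMutableCharacterSet *allowed =
        [[NSCharacterSet URLQueryAllowedCharacterSet] mutableCopy];
    // Removing a character from the allowed set means it will be encoded.
    [allowed removeCharactersInString:@"'()*+,-."];
    return [allowed copy];
}
```

With this set, [@"a+b (c)" stringByAddingPercentEncodingWithAllowedCharacters:StrictQueryAllowedCharacterSet()] produces a%2Bb%20%28c%29. The design difference is that this version tracks whatever the built-in set disallows, while the inverted-set version encodes only the characters you list yourself.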

End.
