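A URL cannot contain Chinese characters, so Chinese data cannot be sent in one as-is. There are two ways around this:
- convert everything with base64;
- store the Chinese text as UTF-8 and then encode those bytes in a printable form, which is also what browsers do.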
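As a rough illustration of the first option, the sketch below Base64-encodes the raw bytes of a string. The name Base64Encode is hypothetical and not part of the original code. It assumes the standard alphabet, whose '+', '/' and '=' characters are not safe everywhere in a URL, so a real deployment might prefer the URL-safe alphabet ('-' and '_') instead.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: Base64-encode the bytes of `in` with the standard alphabet.
std::string Base64Encode(const std::string & in)
{
    static const char table[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    for (std::size_t i = 0; i < in.size(); i += 3)
    {
        // Pack up to three bytes into a 24-bit group, padding missing bytes with zero bits.
        const std::size_t n = in.size() - i < 3 ? in.size() - i : 3;
        unsigned long group = 0;
        for (std::size_t j = 0; j < 3; ++j)
        {
            group <<= 8;
            if (j < n)
            {
                group |= static_cast<unsigned char>(in[i + j]);
            }
        }
        // Emit four 6-bit symbols; positions not backed by input bytes become '='.
        out.push_back(table[(group >> 18) & 0x3F]);
        out.push_back(table[(group >> 12) & 0x3F]);
        out.push_back(n > 1 ? table[(group >> 6) & 0x3F] : '=');
        out.push_back(n > 2 ? table[group & 0x3F] : '=');
    }
    return out;
}
```

For example, the six UTF-8 bytes of 老王 (E8 80 81 E7 8E 8B) come out as 6ICB546L.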
First, let's look at what a URL turns into inside the browser. An online URL encoder will do the conversion for us.
before: https://blog.csdn.net/FlushHip?type=1&name=老王&str={}%
after: https://blog.csdn.net/FlushHip?type=1&name=%E8%80%81%E7%8E%8B&str=%7B%7D%25
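As you can see, each byte of the UTF-8 encoding is written out as two hexadecimal digits with a % in front of it, and that is all there is to it.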
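To make the byte-to-%XX step concrete, here is a small sketch that escapes every byte of a string unconditionally. PercentEncodeAllBytes is a hypothetical name used only for illustration, and the input is assumed to already hold UTF-8 bytes.

```cpp
#include <string>

// Hypothetical helper: write every byte of `bytes` as '%' followed by two uppercase hex digits.
std::string PercentEncodeAllBytes(const std::string & bytes)
{
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (const char ch : bytes)
    {
        // Use the unsigned byte value so the shift does not depend on the signedness of char.
        const unsigned char c = static_cast<unsigned char>(ch);
        out.push_back('%');
        out.push_back(hex[c >> 4]);   // high nibble
        out.push_back(hex[c & 0x0F]); // low nibble
    }
    return out;
}
```

Feeding it the six UTF-8 bytes of 老王 yields %E8%80%81%E7%8E%8B, which matches the browser output above.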
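Because UTF-8 is compatible with ASCII (an ASCII character is a single byte whose highest bit is 0), most ASCII characters need no conversion. A few of them do, though. Running every printable ASCII character through the online encoder shows that the following ones get converted: "%<>[]^_` {|}
With that, we can write the following code: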
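#include <algorithm>
#include <cstdio>
#include <iostream>
#include <iterator>
#include <stdexcept>
#include <string>
struct UTF8Url
{
    // Percent-encode non-ASCII bytes and the ASCII characters in ASCII_EXCEPTION().
    static std::string Encode(const std::string & url);
    // Turn %XX escapes back into raw bytes.
    static std::string Decode(const std::string & url);
private:
    static const std::string & HEX_2_NUM_MAP();
    static const std::string & ASCII_EXCEPTION();
    // Despite the name, this combines two hex digits (high, low) into one byte.
    static unsigned char NUM_2_HEX(const char h, const char l);
};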
const std::string & UTF8Url::HEX_2_NUM_MAP()
{
    static const std::string str("0123456789ABCDEF");
    return str;
}
const std::string & UTF8Url::ASCII_EXCEPTION()
{
    static const std::string str(R"("%<>[\]^_`{|})");
    return str;
}
unsigned char UTF8Url::NUM_2_HEX(const char h, const char l)
{
    // Look up each digit's value; anything outside 0-9 and A-F is rejected.
    const std::string::size_type hh = HEX_2_NUM_MAP().find(h);
    const std::string::size_type ll = HEX_2_NUM_MAP().find(l);
    if (hh == std::string::npos || ll == std::string::npos)
    {
        throw std::invalid_argument("url is invalid");
    }
    return static_cast<unsigned char>((hh << 4) + ll);
}
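std::string UTF8Url::Encode(const std::string & url)
{
    std::string ret;
    for (auto it = url.begin(); it != url.end(); ++it)
    {
        // Work on the unsigned byte value so the shifts do not depend on the signedness of char.
        const unsigned char c = static_cast<unsigned char>(*it);
        // Escape every non-ASCII byte (high bit set) and every ASCII character in the exception list.
        if ((c & 0x80) || std::count(std::begin(ASCII_EXCEPTION()), std::end(ASCII_EXCEPTION()), *it))
        {
            ret.push_back('%');
            ret.push_back(HEX_2_NUM_MAP()[c >> 4]);
            ret.push_back(HEX_2_NUM_MAP()[c & 0x0F]);
        }
        else
        {
            ret.push_back(*it);
        }
    }
    return ret;
}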
std::string UTF8Url::Decode(const std::string & url)
{
    std::string ret;
    for (auto it = url.begin(); it != url.end(); ++it)
    {
        if (*it == '%')
        {
            // Make sure both hex digits exist before dereferencing them.
            if (std::distance(it, url.end()) < 3)
            {
                throw std::invalid_argument("url is invalid");
            }
            ret.push_back(NUM_2_HEX(*std::next(it), *std::next(it, 2)));
            // Move to the second digit; the loop's ++it then steps past the whole escape.
            std::advance(it, 2);
        }
        else
        {
            ret.push_back(*it);
        }
    }
    return ret;
}
Let's try it out:
int main()
{
    // Redirect stdout to a file named "output".
    std::freopen("output", "w", stdout);
    // Note: a u8 literal is const char[] only up to C++17; since C++20 it is char8_t[] and cannot initialize std::string.
    std::string url = u8"https://blog.csdn.net/FlushHip?type=1&name=老王&str={}%";
    std::cout << "before encode: " << url << std::endl;
    std::cout << "after encode: " << UTF8Url::Encode(url) << std::endl;
    std::cout << "after decode: " << UTF8Url::Decode(UTF8Url::Encode(url)) << std::endl;
    return 0;
}
which gives the following result:
before encode: https://blog.csdn.net/FlushHip?type=1&name=老王&str={}%
after encode: https://blog.csdn.net/FlushHip?type=1&name=%E8%80%81%E7%8E%8B&str=%7B%7D%25
after decode: https://blog.csdn.net/FlushHip?type=1&name=老王&str={}%
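One caveat: the Decode shown above looks hex digits up in an uppercase-only table, whereas RFC 3986 treats lowercase a-f in percent-escapes as equivalent to A-F. The following is a minimal sketch of a decoder that accepts both cases and rejects truncated or non-hex escapes. DecodeAnyCase and HexValue are hypothetical names and not part of the UTF8Url struct.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: value of one hex digit (either case), or -1 if `c` is not a hex digit.
int HexValue(const char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

// Hypothetical variant of Decode that accepts upper- and lowercase hex digits.
std::string DecodeAnyCase(const std::string & url)
{
    std::string ret;
    for (std::size_t i = 0; i < url.size(); ++i)
    {
        if (url[i] != '%')
        {
            ret.push_back(url[i]);
            continue;
        }
        // A '%' must be followed by two more characters.
        if (url.size() - i < 3)
        {
            throw std::invalid_argument("url is invalid");
        }
        const int hi = HexValue(url[i + 1]);
        const int lo = HexValue(url[i + 2]);
        if (hi < 0 || lo < 0)
        {
            throw std::invalid_argument("url is invalid");
        }
        ret.push_back(static_cast<char>((hi << 4) | lo));
        i += 2; // skip the two digits just consumed
    }
    return ret;
}
```

With this variant, an input such as name=%e8%80%81%e7%8e%8b decodes to the same bytes as its uppercase form, while inputs like %4 or %G1 raise std::invalid_argument.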