c5 URLs and URIs - x-www-form-urlencoded

Why is encoding needed?

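Differences between operating systems cause problems for URLs. For example, some systems allow spaces in filenames and others do not. Most operating systems allow # in a filename, but in a URL # has a special meaning: it marks the end of the filename and is followed by a fragment identifier. Other special characters, and characters that are not alphanumeric, raise similar problems in URLs or on other operating systems. In addition, Unicode was not yet in widespread use when the Web was invented, so not every system could handle Unicode characters such as Chinese characters.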

To solve this problem, the characters used in a URL must come from a fixed subset of ASCII:

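Uppercase and lowercase letters A-Z and a-z; digits 0-9; and the punctuation characters - _ . ! ~ * ' ( ) and ,. The characters : / & ? @ # ; $ + = and % are reserved for special purposes. If they appear as data inside a path or query string, they must be encoded.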

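Encoding is simple. Any character that is not an ASCII letter, a digit, or one of the punctuation characters listed above is converted to bytes, and each byte is written as a percent sign followed by two hexadecimal digits. For example, a space becomes %20 and a slash becomes %2F. The URL class does not encode or decode automatically; the programmer has to do it explicitly.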
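The rule above can be sketched by hand. The following is only an illustration, not how any library implements it: the class name PercentEncodeSketch and the method percentEncode are hypothetical, UTF-8 is assumed as the byte encoding, and the set of characters left unencoded is taken from the list given earlier in this section.

```java
import java.nio.charset.StandardCharsets;

public class PercentEncodeSketch {
  // Punctuation left unencoded (assumption: the set listed earlier in this section)
  private static final String SAFE_PUNCTUATION = "-_.!~*'(),";

  // Hypothetical helper: percent-encodes every byte that is not a safe ASCII character
  public static String percentEncode(String s) {
    StringBuilder out = new StringBuilder();
    for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
      int v = b & 0xFF;  // treat the byte as an unsigned value 0..255
      char c = (char) v;
      boolean safe = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
          || (c >= '0' && c <= '9') || SAFE_PUNCTUATION.indexOf(c) >= 0;
      if (safe) {
        out.append(c);
      } else {
        // '%' followed by exactly two uppercase hexadecimal digits
        out.append('%').append(String.format("%02X", v));
      }
    }
    return out.toString();
  }

  public static void main(String[] args) {
    System.out.println(percentEncode("I/O"));                // I%2FO
    System.out.println(percentEncode("caf\u00e9 au lait"));  // caf%C3%A9%20au%20lait
  }
}
```

Note that a character outside ASCII, such as é, turns into more than one byte in UTF-8 and therefore into more than one %XX group.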


URLEncoder

String encoded = URLEncoder.encode("This*string*has*asterisks", "UTF-8");

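Encoding is simple. URLEncoder.encode() leaves ASCII letters, digits, and the punctuation characters _ - . * unchanged, converts a space to a plus sign (+), and percent-encodes everything else. Even ~ ' ! ( ) are encoded.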

Although you can choose the character set, it is best to use "UTF-8", because it has the broadest compatibility.
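To see why the character set matters, the sketch below encodes the same character with two different charsets. It assumes Java 10 or later, where URLEncoder.encode also accepts a Charset and throws no checked exception; the class name CharsetChoice and its two helper methods are hypothetical.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class CharsetChoice {
  // Encodes with UTF-8 (Charset overload, assumed available: Java 10+)
  public static String encodeUtf8(String s) {
    return URLEncoder.encode(s, StandardCharsets.UTF_8);
  }

  // Encodes with ISO-8859-1, shown only for comparison
  public static String encodeLatin1(String s) {
    return URLEncoder.encode(s, StandardCharsets.ISO_8859_1);
  }

  public static void main(String[] args) {
    String e = "\u00e9";  // the character é
    System.out.println(encodeUtf8(e));    // %C3%A9 (two bytes in UTF-8)
    System.out.println(encodeLatin1(e));  // %E9 (one byte in ISO-8859-1)
  }
}
```

The two results are not interchangeable: whoever decodes the string has to assume the same character set, which is a practical reason to settle on UTF-8 everywhere.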

import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class EncoderTest {
  public static void main(String[] args) {
    try {
      System.out.println(URLEncoder.encode("This string has spaces",
                                              "UTF-8"));
      System.out.println(URLEncoder.encode("This*string*has*asterisks",
                                              "UTF-8"));
      System.out.println(URLEncoder.encode("This%string%has%percent%signs",
                                              "UTF-8"));
      System.out.println(URLEncoder.encode("This+string+has+pluses",
                                              "UTF-8"));
      System.out.println(URLEncoder.encode("This/string/has/slashes",
                                              "UTF-8"));
      System.out.println(URLEncoder.encode("This\"string\"has\"quote\"marks",
                                              "UTF-8"));
      System.out.println(URLEncoder.encode("This:string:has:colons",
                                              "UTF-8"));
      System.out.println(URLEncoder.encode("This~string~has~tildes",
                                              "UTF-8"));
      System.out.println(URLEncoder.encode("This(string)has(parentheses)",
                                              "UTF-8"));
      System.out.println(URLEncoder.encode("This.string.has.periods",
                                              "UTF-8"));
      System.out.println(URLEncoder.encode("This=string=has=equals=signs",
                                              "UTF-8"));
      System.out.println(URLEncoder.encode("This&string&has&ampersands",
                                              "UTF-8"));
      System.out.println(URLEncoder.encode("Thiséstringéhasénon-ASCII characters",
                                              "UTF-8"));
    } catch (UnsupportedEncodingException ex) {
      throw new RuntimeException("Broken VM does not support UTF-8");
    }
  }
}

Output:

This+string+has+spaces
This*string*has*asterisks
This%25string%25has%25percent%25signs
This%2Bstring%2Bhas%2Bpluses
This%2Fstring%2Fhas%2Fslashes
This%22string%22has%22quote%22marks
This%3Astring%3Ahas%3Acolons
This%7Estring%7Ehas%7Etildes
This%28string%29has%28parentheses%29
This.string.has.periods
This%3Dstring%3Dhas%3Dequals%3Dsigns
This%26string%26has%26ampersands
This%C3%A9string%C3%A9has%C3%A9non-ASCII+characters

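Note that encode() also encodes / & : and =, with no regard for the special roles these characters play in a URL. You should therefore encode a URL piece by piece rather than encoding the entire URL in a single method call.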

For example:

String query = URLEncoder.encode(
    "https://www.google.com/search?hl=en&as_q=Java&as_epq=I/O", "UTF-8");
System.out.println(query);
Output:

https%3A%2F%2Fwww.google.com%2Fsearch%3Fhl%3Den%26as_q%3DJava%26as_epq%3DI%2FO

Instead, build the URL like this:

String url = "https://www.google.com/search?";
url += URLEncoder.encode("hl", "UTF-8");
url += "=";
url += URLEncoder.encode("en", "UTF-8");
url += "&";
url += URLEncoder.encode("as_q", "UTF-8");
url += "=";
url += URLEncoder.encode("Java", "UTF-8");
url += "&";
url += URLEncoder.encode("as_epq", "UTF-8");
url += "=";
url += URLEncoder.encode("I/O", "UTF-8");
System.out.println(url);
Output:

https://www.google.com/search?hl=en&as_q=Java&as_epq=I%2FO

The following class can be used to encode a query string:

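import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class QueryString {
  private StringBuilder query = new StringBuilder();
  public QueryString() {
  }
  public synchronized void add(String name, String value) {
    // Separate pairs with '&', but do not put a leading '&' before the first pair
    if (query.length() > 0) {
      query.append('&');
    }
    encode(name, value);
  }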
  private synchronized void encode(String name, String value) {
    try {
      query.append(URLEncoder.encode(name, "UTF-8"));
      query.append('=');
      query.append(URLEncoder.encode(value, "UTF-8"));
    } catch (UnsupportedEncodingException ex) {
      throw new RuntimeException("Broken VM does not support UTF-8");
    }
  }
  public synchronized String getQuery() {
    return query.toString();
  }
  @Override
  public String toString() {
    return getQuery();
  }
}
Usage:

QueryString qs = new QueryString();
qs.add("hl", "en");
qs.add("as_q", "Java");
qs.add("as_epq", "I/O");
String url = "http://www.google.com/search?" + qs;
System.out.println(url);
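When the parameters are already held in a map, the same piece-by-piece idea can be written with a StringJoiner, which places the & separators only between pairs. This is a sketch of one alternative, not a replacement for the class above; the class name QueryFromMap and the method toQuery are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class QueryFromMap {
  // Hypothetical helper: encodes each name and value separately, then joins pairs with '&'
  public static String toQuery(Map<String, String> params) {
    StringJoiner joiner = new StringJoiner("&");
    for (Map.Entry<String, String> entry : params.entrySet()) {
      joiner.add(encode(entry.getKey()) + "=" + encode(entry.getValue()));
    }
    return joiner.toString();
  }

  private static String encode(String s) {
    try {
      return URLEncoder.encode(s, "UTF-8");
    } catch (UnsupportedEncodingException ex) {
      throw new RuntimeException("Broken VM does not support UTF-8");
    }
  }

  public static void main(String[] args) {
    // LinkedHashMap keeps the insertion order of the parameters
    Map<String, String> params = new LinkedHashMap<>();
    params.put("hl", "en");
    params.put("as_q", "Java");
    params.put("as_epq", "I/O");
    System.out.println(toQuery(params));  // hl=en&as_q=Java&as_epq=I%2FO
  }
}
```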

URLDecoder

public static String decode(String s, String encoding)
    throws UnsupportedEncodingException

String input = "https://www.google.com/" +
    "search?hl=en&as_q=Java&as_epq=I%2FO";
String output = URLDecoder.decode(input, "UTF-8");
System.out.println(output);
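URLDecoder.decode() turns each + back into a space and each %XX group back into a byte, and leaves all other characters alone. When a query string has to be taken apart into names and values, it is safer to split on & and = first and decode each piece afterwards, so that an encoded & or = inside a value is not mistaken for a separator. The following is a minimal sketch under that assumption; the class name QueryParser and the method parse are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

public class QueryParser {
  // Hypothetical helper: splits the raw query first, then decodes each name and value
  public static Map<String, String> parse(String query) {
    Map<String, String> params = new LinkedHashMap<>();
    for (String pair : query.split("&")) {
      if (pair.isEmpty()) {
        continue;  // skip empty segments such as those produced by "a=1&&b=2"
      }
      int eq = pair.indexOf('=');
      String name = eq >= 0 ? pair.substring(0, eq) : pair;
      String value = eq >= 0 ? pair.substring(eq + 1) : "";
      params.put(decode(name), decode(value));
    }
    return params;
  }

  private static String decode(String s) {
    try {
      return URLDecoder.decode(s, "UTF-8");
    } catch (UnsupportedEncodingException ex) {
      throw new RuntimeException("Broken VM does not support UTF-8");
    }
  }

  public static void main(String[] args) {
    Map<String, String> params = parse("hl=en&as_q=Java&as_epq=I%2FO");
    System.out.println(params.get("as_epq"));              // I/O
    System.out.println(parse("q=a%26b%3Dc+d").get("q"));   // a&b=c d
  }
}
```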

