A Summary of Encoding in Java

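Garbled text with URLConnection

    // urlNameString holds the target URL
    URL realUrl = new URL(urlNameString);
    URLConnection connection = realUrl.openConnection();
    // Output must be enabled before getOutputStream() is called
    connection.setDoOutput(true);
    OutputStreamWriter out = new OutputStreamWriter(
            connection.getOutputStream(), "UTF-8");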

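When creating the OutputStreamWriter, specify the encoding explicitly; otherwise the platform default encoding is used. Here is the OutputStreamWriter constructor that takes no encoding: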

    /**
     * Creates an OutputStreamWriter that uses the default character encoding.
     *
     * @param out An OutputStream
     */
    public OutputStreamWriter(OutputStream out) {
        super(out);
        try {
            se = StreamEncoder.forOutputStreamWriter(out, this, (String)null);
        } catch (UnsupportedEncodingException e) {
            throw new Error(e);
        }
    }

Now look at the forOutputStreamWriter method of StreamEncoder:

    // Factories for java.io.OutputStreamWriter
    public static StreamEncoder forOutputStreamWriter(OutputStream out,
                                                      Object lock,
                                                      String charsetName)
        throws UnsupportedEncodingException
    {
        String csn = charsetName;
        if (csn == null)
            csn = Charset.defaultCharset().name();
        try {
            if (Charset.isSupported(csn))
                return new StreamEncoder(out, lock, Charset.forName(csn));
        } catch (IllegalCharsetNameException x) { }
        throw new UnsupportedEncodingException (csn);
    }

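As shown, when no encoding name is passed in, the default charset is used. On a Chinese-locale Windows system, Charset.defaultCharset().name() is typically GBK (on JDK 18 and later the default is UTF-8 regardless of platform). The default can be overridden with a JVM startup option:

    -Dfile.encoding=UTF-8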
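To make the effect of the charset argument concrete, here is a minimal sketch that writes the same character through an OutputStreamWriter with two different charsets and compares the bytes produced. The class name WriterCharsetDemo and the helper encode are made up for this illustration; ISO-8859-1 stands in for "the wrong default" because it is guaranteed to be available on every JVM.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original article.
class WriterCharsetDemo {

    // Encodes text through an OutputStreamWriter using the given charset.
    static byte[] encode(String text, Charset charset) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (Writer out = new OutputStreamWriter(sink, charset)) {
            out.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The writer is closed (and therefore flushed) at this point.
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        String text = "\u4e2d"; // the Chinese character "zhong"

        // What a writer without an explicit charset would use on this JVM.
        System.out.println("default charset: " + Charset.defaultCharset().name());

        // UTF-8 needs three bytes for this character.
        System.out.println("UTF-8 bytes: " + encode(text, StandardCharsets.UTF_8).length);

        // ISO-8859-1 cannot represent it, so the writer substitutes '?'.
        System.out.println("ISO-8859-1 bytes: " + encode(text, StandardCharsets.ISO_8859_1).length);
    }
}
```

If the receiving side expects UTF-8 and the writer silently used some other default, the bytes on the wire simply do not match, which is why passing the charset explicitly is the safer habit.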

Garbled text in Tomcat

URI encoding

Set it to UTF-8:

    <Connector ...
               maxSpareThreads="75" enableLookups="false" redirectPort="8443"
               acceptCount="100" debug="99" connectionTimeout="20000"
               disableUploadTimeout="true" URIEncoding="UTF-8"/>

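In older Tomcat versions (7 and earlier), the default encoding for the URI is ISO-8859-1; Tomcat 8 and later default to UTF-8 in the standard configuration. Setting URIEncoding="UTF-8" on the Connector specifies the encoding explicitly.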
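A quick way to see what the URI encoding setting changes is to percent-decode the same bytes with two charsets. The sketch below uses java.net.URLDecoder as a stand-in for the container's decoder (it targets form encoding and is not what Tomcat uses internally); the class name UriDecodingDemo and its decode helper are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class, not part of the original article.
class UriDecodingDemo {

    // Percent-decodes the input, interpreting the resulting bytes in the given charset.
    static String decode(String encoded, String charsetName) {
        try {
            return URLDecoder.decode(encoded, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // %E4%B8%AD is the UTF-8 encoding of U+4E2D.
        String encoded = "%E4%B8%AD";

        String asUtf8 = decode(encoded, "UTF-8");
        String asLatin1 = decode(encoded, "ISO-8859-1");

        // Decoded as UTF-8, the three bytes form one character.
        System.out.println("UTF-8 length: " + asUtf8.length());
        // Decoded as ISO-8859-1, each byte becomes its own character: garbled text.
        System.out.println("ISO-8859-1 length: " + asLatin1.length());
    }
}
```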

The encoding-related code in Tomcat:

    // org.apache.catalina.connector.CoyoteAdapter#convertURI
    /**
     * Character conversion of the URI.
     */
    protected void convertURI(MessageBytes uri, Request request)
        throws Exception {

        ByteChunk bc = uri.getByteChunk();
        int length = bc.getLength();
        CharChunk cc = uri.getCharChunk();
        cc.allocate(length, -1);

        String enc = connector.getURIEncoding();
        if (enc != null) {
            B2CConverter conv = request.getURIConverter();
            try {
                if (conv == null) {
                    conv = new B2CConverter(enc, true);
                    request.setURIConverter(conv);
                } else {
                    conv.recycle();
                }
            } catch (IOException e) {
                log.error("Invalid URI encoding; using HTTP default");
                connector.setURIEncoding(null);
            }
            if (conv != null) {
                try {
                    conv.convert(bc, cc, true);
                    uri.setChars(cc.getBuffer(), cc.getStart(), cc.getLength());
                    return;
                } catch (IOException ioe) {
                    // Should never happen as B2CConverter should replace
                    // problematic characters
                    request.getResponse().sendError(
                            HttpServletResponse.SC_BAD_REQUEST);
                }
            }
        }

        // Default encoding: fast conversion for ISO-8859-1
        byte[] bbuf = bc.getBuffer();
        char[] cbuf = cc.getBuffer();
        int start = bc.getStart();
        for (int i = 0; i < length; i++) {
            cbuf[i] = (char) (bbuf[i + start] & 0xff);
        }
        uri.setChars(cbuf, 0, length);
    }
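The ISO-8859-1 fast path at the end of convertURI maps every byte directly to one char. The following sketch reproduces that loop on its own and shows what it does to UTF-8 bytes, along with the common workaround of re-encoding the garbled string as ISO-8859-1 and decoding it as UTF-8. The class and method names (Latin1FastPathDemo, latin1FastPath, repair) are invented for this example, and the repair only works because ISO-8859-1 maps all 256 byte values losslessly.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original article.
class Latin1FastPathDemo {

    // Same idea as the fast path above: one byte becomes one char.
    static String latin1FastPath(byte[] bytes) {
        char[] chars = new char[bytes.length];
        for (int i = 0; i < bytes.length; i++) {
            chars[i] = (char) (bytes[i] & 0xff);
        }
        return new String(chars);
    }

    // Recovers the original text from a string that was wrongly decoded as ISO-8859-1.
    static String repair(String garbled) {
        byte[] original = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(original, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] utf8 = "\u4e2d".getBytes(StandardCharsets.UTF_8);

        String garbled = latin1FastPath(utf8);
        System.out.println("garbled length: " + garbled.length());

        String repaired = repair(garbled);
        System.out.println("repaired equals original: " + repaired.equals("\u4e2d"));
    }
}
```

The repair trick is a workaround for code that cannot change the container configuration; setting the correct encoding up front avoids the round trip entirely.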

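Request encoding

Even with the URI encoding set as above, request parameters can still come out garbled. In that case, a filter that sets the request encoding is needed.

Tomcat

Tomcat ships its own encoding filter: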

    // org.apache.catalina.filters.SetCharacterEncodingFilter
    /**
     * Select and set (if specified) the character encoding to be used to
     * interpret request parameters for this request.
     *
     * @param request The servlet request we are processing
     * @param response The servlet response we are creating
     * @param chain The filter chain we are processing
     *
     * @exception IOException if an input/output error occurs
     * @exception ServletException if a servlet error occurs
     */
    @Override
    public void doFilter(ServletRequest request, ServletResponse response,
                         FilterChain chain)
        throws IOException, ServletException {

        // Conditionally select and set the character encoding to be used
        if (ignore || (request.getCharacterEncoding() == null)) {
            String characterEncoding = selectEncoding(request);
            if (characterEncoding != null) {
                request.setCharacterEncoding(characterEncoding);
            }
        }

        // Pass control on to the next filter
        chain.doFilter(request, response);
    }

The corresponding configuration in web.xml:

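    <filter>
        <filter-name>SetCharacterEncoding</filter-name>
        <filter-class>filters.SetCharacterEncodingFilter</filter-class>
        <init-param>
            <param-name>encoding</param-name>
            <param-value>GBK</param-value>
        </init-param>
    </filter>
    <filter-mapping>
        <filter-name>SetCharacterEncoding</filter-name>
        <url-pattern>/*</url-pattern>
    </filter-mapping>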

SpringMVC

In Spring MVC, the following configuration can be used:

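    <filter>
        <filter-name>encodingFilter</filter-name>
        <filter-class>org.springframework.web.filter.CharacterEncodingFilter</filter-class>
        <init-param>
            <param-name>encoding</param-name>
            <param-value>UTF-8</param-value>
        </init-param>
        <init-param>
            <param-name>forceEncoding</param-name>
            <param-value>true</param-value>
        </init-param>
        <!-- assumed element name; the source only preserves the value "true" here -->
        <async-supported>true</async-supported>
    </filter>
    <filter-mapping>
        <filter-name>encodingFilter</filter-name>
        <url-pattern>/*</url-pattern>
    </filter-mapping>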

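This filter must be configured as the first filter in the chain; otherwise it may still have no effect.

Its implementation is also simple: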

    // org.springframework.web.filter.CharacterEncodingFilter#doFilterInternal
    @Override
    protected void doFilterInternal(
            HttpServletRequest request, HttpServletResponse response, FilterChain filterChain)
            throws ServletException, IOException {

        String encoding = getEncoding();
        if (encoding != null) {
            if (isForceRequestEncoding() || request.getCharacterEncoding() == null) {
                request.setCharacterEncoding(encoding);
            }
            if (isForceResponseEncoding()) {
                response.setCharacterEncoding(encoding);
            }
        }
        filterChain.doFilter(request, response);
    }
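Both filters share the same decision: apply the configured encoding when it is forced, or when the request did not declare one. Stripped of the servlet API, that rule can be sketched as a pure function. The class EncodingDecision and its method are hypothetical names used only to illustrate the condition; they are not part of Tomcat or Spring.

```java
// Hypothetical demo class, not part of the original article.
class EncodingDecision {

    /**
     * Returns the encoding a request ends up with after the filter runs.
     *
     * @param declared   encoding the request already carries, or null
     * @param configured encoding configured on the filter, or null
     * @param force      whether the filter overrides a declared encoding
     */
    static String effectiveRequestEncoding(String declared, String configured, boolean force) {
        if (configured != null && (force || declared == null)) {
            return configured;
        }
        return declared;
    }

    public static void main(String[] args) {
        // No declared encoding: the configured one is applied.
        System.out.println(effectiveRequestEncoding(null, "UTF-8", false));
        // Declared encoding and no force: the declared one wins.
        System.out.println(effectiveRequestEncoding("GBK", "UTF-8", false));
        // Declared encoding with force: the configured one wins.
        System.out.println(effectiveRequestEncoding("GBK", "UTF-8", true));
    }
}
```

Without the force flag, a client that sends its own charset in the Content-Type header keeps it, which is why forceEncoding is often set to true when the application expects a single encoding.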

setCharacterEncoding is a method defined by the Servlet specification. Here is how Tomcat uses the resulting encoding when it parses parameters:

    // org.apache.catalina.core.ApplicationHttpRequest#mergeParameters
    /**
     * Merge the parameters from the saved query parameter string (if any), and
     * the parameters already present on this request (if any), such that the
     * parameter values from the query string show up first if there are
     * duplicate parameter names.
     */
    private void mergeParameters() {

        if ((queryParamString == null) || (queryParamString.length() < 1))
            return;

        HashMap<String, String[]> queryParameters = new HashMap<String, String[]>();
        String encoding = getCharacterEncoding();
        if (encoding == null)
            encoding = "ISO-8859-1";
        RequestUtil.parseParameters(queryParameters, queryParamString,
                encoding);
        Iterator<String> keys = parameters.keySet().iterator();
        while (keys.hasNext()) {
            String key = keys.next();
            Object value = queryParameters.get(key);
            if (value == null) {
                queryParameters.put(key, parameters.get(key));
                continue;
            }
            queryParameters.put
                (key, mergeValues(value, parameters.get(key)));
        }
        parameters = queryParameters;
    }

As shown, the default encoding is "ISO-8859-1". So why must the encoding filter be the first one? Look at the place where mergeParameters is called:

    // org.apache.catalina.core.ApplicationHttpRequest#parseParameters
    /**
     * Parses the parameters of this request.
     *
     * If parameters are present in both the query string and the request
     * content, they are merged.
     */
    void parseParameters() {

        if (parsedParams) {
            return;
        }

        parameters = new HashMap<String, String[]>();
        parameters = copyMap(getRequest().getParameterMap());
        mergeParameters();
        parsedParams = true;
    }

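As shown, parseParameters does its work only once. If an earlier filter reads any request parameter, Tomcat parses the incoming parameters with the default encoding at that point, and a later call to setCharacterEncoding has no effect on them.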
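The parse-once behavior is easy to reproduce in isolation. Below is a simplified model, not Tomcat code: a hypothetical ParseOnceRequest class that parses its query string lazily on the first getParameter call, using whatever encoding has been set by then and falling back to ISO-8859-1. URLDecoder is used here only as a convenient stand-in for the container's parameter decoder.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of a request that parses its parameters only once.
class ParseOnceRequest {
    private final String rawQuery;
    private String characterEncoding; // null until someone sets it
    private Map<String, String> parameters;
    private boolean parsed;

    ParseOnceRequest(String rawQuery) {
        this.rawQuery = rawQuery;
    }

    void setCharacterEncoding(String encoding) {
        this.characterEncoding = encoding;
    }

    String getParameter(String name) {
        parseParameters();
        return parameters.get(name);
    }

    // Parses on the first call only; later calls return immediately.
    private void parseParameters() {
        if (parsed) {
            return;
        }
        String enc = (characterEncoding != null) ? characterEncoding : "ISO-8859-1";
        parameters = new HashMap<>();
        for (String pair : rawQuery.split("&")) {
            int eq = pair.indexOf('=');
            if (eq < 0) {
                continue; // skip entries without a value
            }
            parameters.put(decode(pair.substring(0, eq), enc),
                           decode(pair.substring(eq + 1), enc));
        }
        parsed = true;
    }

    private static String decode(String s, String enc) {
        try {
            return URLDecoder.decode(s, enc);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Encoding set before the first read: the value decodes correctly.
        ParseOnceRequest early = new ParseOnceRequest("name=%E4%B8%AD");
        early.setCharacterEncoding("UTF-8");
        System.out.println("set first, length: " + early.getParameter("name").length());

        // A parameter is read before the encoding is set: the garbled result sticks.
        ParseOnceRequest late = new ParseOnceRequest("name=%E4%B8%AD");
        late.getParameter("name");
        late.setCharacterEncoding("UTF-8");
        System.out.println("read first, length: " + late.getParameter("name").length());
    }
}
```

In this model the second request keeps its three garbled characters even after the encoding is corrected, which mirrors why the encoding filter has to run before anything touches the parameters.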
