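Preface
I came across an article by ch1ng and found it fascinating; the author is seriously impressive. Looking at those long functions, though, I had a feeling there were even trickier things hiding in them, and as it turns out there are. I'm taking this chance to write them up and dig in, and along the way I can't help appreciating how useful the RFC documents are. Note that the techniques in this article only concern bypassing WAFs at the traffic level.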
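Background
Here is a quick recap of ch1ng's approach.
The servlet that deploys and handles uploaded WAR files is org.apache.catalina.manager.HTMLManagerServlet.
A file upload is ultimately handled by org.apache.catalina.manager.HTMLManagerServlet#upload,
and the call resolves to the implementing class's method org.apache.catalina.core.ApplicationPart#getSubmittedFileName.
The way the filename is obtained there is interesting.
The comment in that code refers to RFC 6266, which makes the same point: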
Avoid including the "\" character in the quoted-string form of the filename parameter, as escaping is not implemented by some user agents, and "\" can be considered an illegal path character.
So how does Tomcat handle this? It passes the value through the function HttpParser.unquote:
public static String unquote(String input) {
    if (input == null || input.length() < 2) {
        return input;
    }

    int start;
    int end;

    // Skip surrounding quotes if there are any
    if (input.charAt(0) == '"') {
        start = 1;
        end = input.length() - 1;
    } else {
        start = 0;
        end = input.length();
    }

    StringBuilder result = new StringBuilder();
    for (int i = start ; i < end; i++) {
        char c = input.charAt(i);
        if (input.charAt(i) == '\\') {
            i++;
            result.append(input.charAt(i));
        } else {
            result.append(c);
        }
    }
    return result.toString();
}
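To summarize: unquote is only reached when the filename contains a \ character. If the first character is ", processing starts at the second character and the end index moves back by one, so the last character is dropped as well. Every \ is discarded and the character after it is copied through unchanged. ch1ng only mentioned examples like test.\war,
but based on this behavior we can go further and build some nastier-looking values, such as filename=""y\4.\w\arK". Assuming the parameter parser has already stripped the outer pair of quotes, unquote then skips the leading " and the trailing K and drops each \, which should leave y4.war.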
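To check that behavior without a Tomcat instance, here is a minimal sketch. The unquote method is a copy of the listing above; the class name UnquoteSketch is made up for illustration, and the inputs assume the parameter parser has already removed the outer pair of quotes before unquote sees the value.

```java
class UnquoteSketch {

    // Copy of the HttpParser.unquote logic quoted above, so the example
    // runs without Tomcat on the classpath.
    static String unquote(String input) {
        if (input == null || input.length() < 2) {
            return input;
        }
        int start;
        int end;
        // A leading quote shifts the start forward and the end backward by one.
        if (input.charAt(0) == '"') {
            start = 1;
            end = input.length() - 1;
        } else {
            start = 0;
            end = input.length();
        }
        StringBuilder result = new StringBuilder();
        for (int i = start; i < end; i++) {
            char c = input.charAt(i);
            if (c == '\\') {
                // Drop the backslash and copy the following character as-is.
                i++;
                result.append(input.charAt(i));
            } else {
                result.append(c);
            }
        }
        return result.toString();
    }

    public static void main(String[] args) {
        // Assumed inputs: the values as they would reach unquote after the
        // parser has stripped the outer quotes (an assumption of this sketch).
        System.out.println(unquote("test.\\war"));      // test.war
        System.out.println(unquote("\"y\\4.\\w\\arK")); // y4.war
    }
}
```

In the second call the leading " and the trailing K are both skipped, which is why a stray character can be appended after the real extension.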
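Digging deeper
We are still in org.apache.catalina.core.ApplicationPart#getSubmittedFileName.
As soon as I saw the string being converted into a map, I suspected there would be something trickier inside: the incoming parameters are parsed first and only then looked up, so anything exploitable during parsing also affects the lookup that follows. Enough digression, back to the point.
The method first reads the value of the Content-Disposition header.
If the value starts with form-data
or attachment,
it goes on to parse it. Stepping in, sure enough: the moment I saw RFC2231Utility,
I was wide awake.
The latter part needs little explanation, since you are probably already familiar with it: it supports QP encoding. If you have forgotten, dig up my earlier article Java文件上传大杀器-绕waf(针对commons-fileupload组件) (a Java file upload trick for bypassing WAFs, targeting the commons-fileupload component). I won't repeat that here; let's focus on the part before the ternary operator.
So let's first look at what hasEncodedValue checks: whether the parameter name ends with *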
public static boolean hasEncodedValue(final String paramName) {
    if (paramName != null) {
        return paramName.lastIndexOf('*') == (paramName.length() - 1);
    }
    return false;
}
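As a quick illustration of that check, the sketch below copies the method into a stand-alone class (EncodedNameSketch is a made-up name) and runs it against a few parameter names chosen for this example. Only a name whose last character is * selects the RFC 2231 decoding path.

```java
class EncodedNameSketch {

    // Copy of the hasEncodedValue logic quoted above.
    static boolean hasEncodedValue(final String paramName) {
        if (paramName != null) {
            return paramName.lastIndexOf('*') == (paramName.length() - 1);
        }
        return false;
    }

    public static void main(String[] args) {
        // Illustrative parameter names, not taken from a real request.
        System.out.println(hasEncodedValue("filename"));  // false
        System.out.println(hasEncodedValue("filename*")); // true
        System.out.println(hasEncodedValue("file*name")); // false, the * is not last
    }
}
```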
Before looking at the decoding function, let's first read how RFC 2231 describes this. The wording is straightforward, so I quote it directly:
Asterisks ("*") are reused to provide the indicator that language and character set information is present and encoding is being used. A single quote ("'") is used to delimit the character set and language information at the beginning of the parameter value. Percent signs ("%") are used as the encoding flag, which agrees with RFC 2047.
Specifically, an asterisk at the end of a parameter name acts as an indicator that character set and language information may appear at the beginning of the parameter value. A single quote is used to separate the character set, language, and actual value information in the parameter value string, and an percent sign is used to flag octets encoded in hexadecimal. For example:
Content-Type: application/x-stuff; title*=us-ascii'en-us'This%20is%20%2A%2A%2Afun%2A%2A%2A
Back to the main thread: let's see what this decoding actually does.
public static String decodeText(final String encodedText) throws UnsupportedEncodingException {
    final int langDelimitStart = encodedText.indexOf('\'');
    if (langDelimitStart == -1) {
        // missing charset
        return encodedText;
    }
    final String mimeCharset = encodedText.substring(0, langDelimitStart);
    final int langDelimitEnd = encodedText.indexOf('\'', langDelimitStart + 1);
    if (langDelimitEnd == -1) {
        // missing language
        return encodedText;
    }
    final byte[] bytes = fromHex(encodedText.substring(langDelimitEnd + 1));
    return new String(bytes, getJavaCharset(mimeCharset));
}
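The Javadoc gives the expected format: @param encodedText - Text to be decoded has a format of {@code <charset>'<language>'<encoded_value>}.
The three parts are the charset, the language, and the string to decode. The method also handles URL-encoded input through the fromHex
function, shown below, which is essentially URL (percent) decoding: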
private static byte[] fromHex(final String text) {
    final int shift = 4;
    final ByteArrayOutputStream out = new ByteArrayOutputStream(text.length());
    for (int i = 0; i < text.length();) {
        final char c = text.charAt(i++);
        if (c == '%') {
            if (i > text.length() - 2) {
                break; // unterminated sequence
            }
            final byte b1 = HEX_DECODE[text.charAt(i++) & MASK];
            final byte b2 = HEX_DECODE[text.charAt(i++) & MASK];
            out.write((b1 << shift) | b2);
        } else {
            out.write((byte) c);
        }
    }
    return out.toByteArray();
}
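To see the two functions working together, here is a simplified, self-contained sketch rather than Tomcat's actual code. It makes two assumptions: Character.digit stands in for the HEX_DECODE/MASK lookup, which is not shown above, and Charset.forName stands in for getJavaCharset. The class name and the second input value are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;

class Rfc2231DecodeSketch {

    // Percent-decoding in the spirit of fromHex. Character.digit is a
    // stand-in for the HEX_DECODE table, so invalid hex digits may behave
    // differently from Tomcat here.
    static byte[] fromHex(String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(text.length());
        for (int i = 0; i < text.length();) {
            char c = text.charAt(i++);
            if (c == '%') {
                if (i > text.length() - 2) {
                    break; // unterminated sequence
                }
                int b1 = Character.digit(text.charAt(i++), 16);
                int b2 = Character.digit(text.charAt(i++), 16);
                out.write((b1 << 4) | b2);
            } else {
                out.write((byte) c);
            }
        }
        return out.toByteArray();
    }

    // Splits <charset>'<language>'<encoded_value> and decodes the last part.
    // Charset.forName is an assumed replacement for getJavaCharset.
    static String decodeText(String encodedText) {
        int langDelimitStart = encodedText.indexOf('\'');
        if (langDelimitStart == -1) {
            return encodedText; // missing charset
        }
        String charset = encodedText.substring(0, langDelimitStart);
        int langDelimitEnd = encodedText.indexOf('\'', langDelimitStart + 1);
        if (langDelimitEnd == -1) {
            return encodedText; // missing language
        }
        byte[] bytes = fromHex(encodedText.substring(langDelimitEnd + 1));
        return new String(bytes, Charset.forName(charset));
    }

    public static void main(String[] args) {
        // The example value from RFC 2231.
        System.out.println(decodeText("us-ascii'en-us'This%20is%20%2A%2A%2Afun%2A%2A%2A"));
        // A hypothetical filename* value with most characters percent-encoded.
        System.out.println(decodeText("UTF-8'en'%74%65%73%74.%77%61%72"));
    }
}
```

With these inputs the sketch prints This is ***fun*** and test.war, so a percent-encoded value no longer contains the literal extension on the wire.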
So let's list the noteworthy points about the value:
- It is decoded according to the charset it declares
- It may be URL-encoded
- @code<charset>'