Multibyte character exploits - PHP/MySQL

Summary. Yes, the issue is that, in some character encodings (like UTF-8), a single character is represented as multiple bytes. One way that some programmers try to prevent SQL injection is to escape all single quotes in untrusted input before inserting it into their SQL query. However, many standard quote-escaping functions are ignorant of the character encoding that the database will use. They process their input as a sequence of bytes, oblivious to the fact that a single character might span several bytes. This means that the quote-escaping function interprets the string differently than the database will. As a result, there are cases where the quote-escaping function fails to escape portions of the string that the database will interpret as a multi-byte encoding of a single quote, or where it inadvertently breaks up a multi-byte character encoding in a way that introduces a single quote where none was previously present. Thus, multi-byte character exploits give attackers a way to perform SQL injection even when the programmer thought they were adequately escaping their inputs to the database.

The impact. If you use prepared/parametrized statements for all database queries, you are safe. Multi-byte attacks will fail. (Barring bugs in the database and the library, of course. But empirically, those seem to be rare.)

However, if you try to escape untrusted inputs and then form a SQL query dynamically using string concatenation, you may be vulnerable to multi-byte attacks. Whether you are in fact vulnerable depends upon specific details of the escaping function you use, the database you use, the character encoding that you're using with the database, and possibly other factors. It can be hard to predict whether multi-byte attacks will succeed. As a result, forming SQL queries using string concatenation is fragile and not recommended.
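As a rough illustration of the difference between the two approaches, here is a minimal sketch. It uses Python's built-in sqlite3 module purely as a stand-in for a PHP/MySQL stack (that substitution, the `users` table, and the `find_user` helper are my assumptions for the example, not anything from the question). It shows a classic single-byte injection rather than a multi-byte one; the point is only that a bound parameter is never parsed as SQL, so the escaping question does not arise.

```python
import sqlite3

def find_user(conn, name):
    # Hypothetical helper: `name` is passed as a bound parameter,
    # separately from the SQL text, so it is never parsed as SQL.
    cur = conn.execute("SELECT id, name FROM users WHERE name = ?", (name,))
    return cur.fetchall()

# In-memory database used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

hostile = "x' OR '1'='1"

# Parametrized: the hostile string is compared as a literal name.
safe_rows = find_user(conn, hostile)

# String concatenation: the quote in the input terminates the literal
# and the rest is interpreted as part of the query.
unsafe_sql = "SELECT id, name FROM users WHERE name = '" + hostile + "'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

print(safe_rows)         # no user has that literal name
print(len(unsafe_rows))  # the injected OR clause matches every row
```

The concatenated version is the pattern that escaping functions try to rescue, and it is that rescue attempt that multi-byte encodings can defeat.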

Technical details. If you'd like to read about the details of the attacks, I can provide you with a number of links that explain the attacks in great detail. There are several attacks:

  • Basic attacks on, e.g., UTF-8 and other character encodings by eating up extra backslashes/quotes introduced by the quoting function: see, e.g., here.

  • Sneaky attacks on, e.g., GBK, that work by tricking the quoting function to introduce an extra quote for you: see, e.g., Chris Shiflett's blog here, or here.

  • Attacks on, e.g., UTF-8, that conceal the presence of a quote by using an invalid non-canonical (over-long) encoding of the single quote: see, e.g., here. Basically, the normal way of encoding a single quote has it fit into a single-byte sequence (namely, 0x27). However, there are also multi-byte sequences that the database might decode as a single quote, and that do not contain the 0x27 byte or any other suspicious byte value. As a result, standard quote-escaping functions may fail to escape those quotes.
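To make the over-long encoding case concrete, here is a small sketch in Python. The `lenient_decode_2byte` function is a hypothetical decoder I wrote for illustration: it applies the UTF-8 two-byte bit layout without checking that the result actually needed two bytes, which is the kind of mistake a vulnerable decoder would make. It is not the code of any real database.

```python
def lenient_decode_2byte(seq: bytes) -> int:
    """Decode a 2-byte UTF-8 sequence WITHOUT rejecting over-long forms.

    Hypothetical, deliberately sloppy decoder: it takes the low 5 bits of
    the lead byte and the low 6 bits of the trail byte, and never checks
    that the resulting code point is >= 0x80.
    """
    lead, trail = seq
    return ((lead & 0x1F) << 6) | (trail & 0x3F)

def strict_rejects(seq: bytes) -> bool:
    """Return True if Python's standard UTF-8 decoder refuses the bytes."""
    try:
        seq.decode("utf-8")
    except UnicodeDecodeError:
        return True
    return False

# 0xC0 0xA7 is an over-long (invalid) two-byte form of U+0027.
overlong_quote = b"\xc0\xa7"

code_point = lenient_decode_2byte(overlong_quote)
contains_quote_byte = b"'" in overlong_quote

print(hex(code_point))                 # the lenient decoder yields 0x27
print(contains_quote_byte)             # no 0x27 byte for an escaper to find
print(strict_rejects(overlong_quote))  # a conforming decoder refuses it
```

A byte-oriented escaping function scanning for 0x27 sees nothing to escape here, so the attack only works if something downstream decodes as leniently as the sketch does.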



    Multi-byte attacks are not limited to SQL injection. In a general sense, multi-byte attacks lead to a "byte consumption" condition in which the attacker is removing control characters. This is the opposite of the classic ' or 1=1--, in which the attacker is introducing the single-quote control character. For MySQL there is mysql_real_escape_string(), which is designed to take care of character encoding problems. Parametrized query libraries like PDO will automatically use this function. MySQLi actually sends the parameters of the query as a separate element within a struct, which avoids the problem entirely.
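The "byte consumption" idea can be sketched with the well-known GBK case. The `naive_escape` function below is a hypothetical byte-level escaper in the style of PHP's addslashes(), written here in Python only for illustration; it is an assumption about how an encoding-unaware escaper behaves, not the source of any real function.

```python
def naive_escape(raw: bytes) -> bytes:
    """Hypothetical encoding-unaware escaper: put a backslash (0x5C)
    in front of every single quote (0x27) and backslash byte."""
    out = bytearray()
    for b in raw:
        if b in (0x27, 0x5C):
            out.append(0x5C)
        out.append(b)
    return bytes(out)

# Attacker input: a stray 0xBF byte followed by a single quote.
payload = b"\xbf' OR 1=1 -- "

# The escaper inserts 0x5C before the quote, producing 0xBF 0x5C 0x27.
escaped = naive_escape(payload)

# A GBK-aware consumer reads 0xBF 0x5C as ONE two-byte character,
# so the backslash is consumed and the quote is left unescaped.
decoded = escaped.decode("gbk")

print(escaped[:3])  # the three bytes 0xBF 0x5C 0x27
print(decoded[1:])  # everything after the first GBK character
```

After decoding, the text contains no backslash at all: the escaping byte has become the second half of a multi-byte character, and the quote that follows it is live again.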

    If an HTML page is rendered as Shift-JIS, then it is possible to consume control characters to obtain XSS. An excellent example of this is provided in "The Tangled Web" (fantastic book!) on page 207:

    <img src="http://fuzzybunnies.com/[0xE0]">
    ...this is still a part of the markup...
    ...but the server doesn't know...
    " onload="alert('this will execute!')"
    <div>
    ...page content continues...
    </div>
    

    In this case the 0xE0 is a special byte that signifies the start of a 3-byte symbol. When the browser renders this HTML, the following "> will be consumed and turned into a single Shift-JIS symbol. If the attacker controls the following input by means of another variable, then they can introduce an event handler to obtain code execution.

