From: Stephen (sa7ori_at_broken.blackroses.com)
Date: Tue Oct 08 2002 - 15:53:09 CDT
Many people have purported to be able to go from the hex of the shellcode
back to the actual human-readable asm. Many of them don't seem to do it
properly, so I started writing something of my own to do it. One of the
biggest difficulties I had (specifically on x86) is demonstrated
below. Many assume that all you need to do is construct a big
struct or array of the hex values of all the x86 instructions and simply step
through the shellcode, translating each byte back to the corresponding asm
instruction. This method is REALLY unreliable, and is basically
impossible, because x86 encodes the same instruction differently depending
on its operands.
for example:
0x80483b0 <main+20>: mov $0xb,%eax
0x80483b5 <main+25>: mov %esi,%ebx
Two mov instructions that presumably have the same opcode, right!?
So if we x/bx main+20 and main+25, the same hex opcode should presumably
be there. This isn't the case.
(gdb) x/bx main+20
0x80483b0 <main+20>: 0xb8
(gdb) x/bx main+25
0x80483b5 <main+25>: 0x89
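If you get the Intel x86 developer's notes (24319101.pdf) you can generally
get a list of the hex opcodes for the instructions.
We can see that MOV has many faces, one of which is 0x89 (register to
register/memory). 0xb8 is another: the immediate-to-register form, with the
destination register encoded in the low three bits of the opcode byte itself.
As demonstrated above, we can't rely on one opcode per mnemonic as a general
rule, so it is not as easy as it looks.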
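
To make the two encodings concrete, here is a minimal Python sketch that
rebuilds the bytes for the two mov instructions in the gdb output. The helper
names (encode_mov_imm32, encode_mov_reg_reg) are made up for illustration, and
it only covers 32-bit register operands, nothing else:

```python
# 32-bit register numbers as used in x86 opcode and ModRM fields.
REGS32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def encode_mov_imm32(reg, imm):
    """mov $imm,%reg: opcode is 0xB8 ORed with the register number."""
    opcode = 0xB8 | REGS32.index(reg)
    return bytes([opcode]) + imm.to_bytes(4, "little")

def encode_mov_reg_reg(src, dst):
    """mov %src,%dst: opcode 0x89 followed by a ModRM byte.

    ModRM = mod (11 = register direct) | reg (source) | rm (destination).
    """
    modrm = 0xC0 | (REGS32.index(src) << 3) | REGS32.index(dst)
    return bytes([0x89, modrm])

if __name__ == "__main__":
    print(encode_mov_imm32("eax", 0xB).hex())     # mov $0xb,%eax
    print(encode_mov_reg_reg("esi", "ebx").hex()) # mov %esi,%ebx
```

The first instruction is five bytes long, which matches the gap between
main+20 and main+25 above. The register is folded into the opcode byte in one
case and into a separate ModRM byte in the other, which is why a flat
one-byte-per-mnemonic table falls over.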
Many "disassemblers" just construct a large matrix of opcodes, their sizes
and such, but this really isn't accurate. What I see most people
do is take the hex opcodes, convert them to binary,
take the bits that correspond to the actual x86 instruction, OR them
with the values of the operands of the operation (registers, etc.), then
convert them back to hex and test whether they match values in the
shellcode string. This is VERY painstaking and, again, considerably
unreliable. I suggest perusing the source
code of gdb to see how it does the OR and all its other x86 handling.
opcodes/i386-dis.c is a good place to get started (in the gdb src tree).
When it comes down to it, x86 is VERY nasty. Good luck. I would try to
start small and just keep building upon the routines that do the
conversion. Using the bitwise OR is just as good a method to start with as
any. For most x86 shellcode, building a really rough matrix of conversion
values and doing ORs has worked in most GENERAL cases.
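
As a rough idea of what "start small" could look like, here is a Python
sketch of a mask-based decoder. It goes the other way round from the OR
approach: it masks the register bits off each opcode byte and compares what is
left. The disassemble function and its tiny opcode coverage are my own
illustration, not how gdb does it. It only knows a handful of register-only
forms and dumps everything else as raw bytes, so treat it as a starting
point, not a disassembler:

```python
# 32-bit register numbers as used in x86 opcode and ModRM fields.
REGS32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def disassemble(code):
    """Decode a few x86 forms into AT&T syntax; unknown bytes become .byte."""
    out = []
    i = 0
    while i < len(code):
        op = code[i]
        if op & 0xF8 == 0xB8 and i + 5 <= len(code):
            # B8+r: mov imm32 into the register held in the low 3 opcode bits.
            imm = int.from_bytes(code[i + 1:i + 5], "little")
            out.append(f"mov $0x{imm:x},%{REGS32[op & 7]}")
            i += 5
        elif op in (0x89, 0x31) and i + 2 <= len(code) and code[i + 1] & 0xC0 == 0xC0:
            # 89 /r (mov) or 31 /r (xor) with mod=11: register to register.
            modrm = code[i + 1]
            name = "mov" if op == 0x89 else "xor"
            src = REGS32[(modrm >> 3) & 7]
            dst = REGS32[modrm & 7]
            out.append(f"{name} %{src},%{dst}")
            i += 2
        elif op & 0xF8 == 0x50:
            # 50+r: push a 32-bit register.
            out.append(f"push %{REGS32[op & 7]}")
            i += 1
        elif op == 0xCD and i + 2 <= len(code):
            # CD ib: software interrupt with an 8-bit vector.
            out.append(f"int $0x{code[i + 1]:x}")
            i += 2
        else:
            # Anything not handled above is emitted as a raw byte.
            out.append(f".byte 0x{op:02x}")
            i += 1
    return out

if __name__ == "__main__":
    for line in disassemble(bytes.fromhex("b80b00000089f3cd80")):
        print(line)
```

Memory operands, prefixes, and two-byte opcodes are all missing, and those
are exactly where the nastiness lives. Once a byte is misdecoded, everything
after it is out of step, which is a good reason to lean on i386-dis.c for
anything serious.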
On Tue, 8 Oct 2002, Sean Zadig wrote:
> Hi,
> I'm doing some research into creating variants of common attacks, but I ran
> into a problem of sorts. For most of the attacks I have, the shellcode
> consists of the overflow and the actual malicious code that is run. I want
> to be able to isolate the overflow from the rest of the shellcode and use
> that to create attack variants. Problem is, I don't know where one ends and
> the other begins! I figure if I turn the hex-encoded shellcode back into
> assembly code, I could probably figure it out. I'm familiar with how to do
> the reverse in gdb, but is it possible to do what I want? To restate:
> shellcode -> asm is what I need. If this is a simple thing, my apologies -
> but the security-basics list rejected my post =)
> -Sean Zadig
>
> -----
> Sean Zadig
> Student, UC Davis
> PGP Key ID: 0xDE44A79F
> 7EE1 C80A A0C1 B224 45CE F74B 5835 0115 DE44 A79F
>
>