
Decompiling

Name: Anonymous 2009-02-21 3:19

decompilers (fail to) work like this: identify the architecture -> disassemble -> search for common code segments/idioms (language dependent) -> turn the matches into higher-level code. it seems to me that what's needed is a good database of high-level -> assembly idioms that the decompiler consults at run time and that keeps getting updated as new links are found (i'm thinking the decompiler would be part of a suite with other tools that help find these links). how many different ways do different compilers really have to emit "if(a==b)"? more complex code is just a combination of these smaller parts. assuming compilers optimize, that's what's really needed: a way of keeping track of the different ways of doing the same thing, the slight variations, etc., and of course a way to find them in the first place (the suite i mentioned, which i also have ideas for; properly formatted assembly for starters). i don't get why decompilers don't work.
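To make the idiom-database idea concrete, here is a rough sketch of what a lookup might look like. Everything in it is an assumption for illustration: the `IDIOMS` table, the `parse` and `match_idiom` helpers, and the two-instruction window are made up, and it only covers a few x86-style ways of testing equality. A real decompiler works on control flow graphs, not on flat text like this.

```python
from typing import Optional

# Hypothetical idiom table: (mnemonic pair) -> C-like template.
# The conditional jump skips the body, so the recovered condition
# is the negation of the jump condition.
IDIOMS = {
    ("cmp", "jne"): "if ({a} == {b})",
    ("cmp", "je"): "if ({a} != {b})",
    ("sub", "jnz"): "if ({a} == {b})",  # sub sets ZF exactly when a == b
    ("xor", "jnz"): "if ({a} == {b})",  # xor sets ZF exactly when a == b
}


def parse(line: str) -> tuple[str, list[str]]:
    """Split 'cmp eax, ebx' into ('cmp', ['eax', 'ebx'])."""
    mnemonic, _, rest = line.strip().partition(" ")
    operands = [op.strip() for op in rest.split(",") if op.strip()]
    return mnemonic.lower(), operands


def match_idiom(lines: list[str]) -> Optional[str]:
    """Return a high-level template for a two-instruction window, if known."""
    if len(lines) != 2:
        return None
    (m1, ops1), (m2, _) = parse(lines[0]), parse(lines[1])
    template = IDIOMS.get((m1, m2))
    if template is None or len(ops1) != 2:
        return None
    return template.format(a=ops1[0], b=ops1[1])


print(match_idiom(["cmp eax, ebx", "jne skip"]))
print(match_idiom(["xor eax, ebx", "jnz skip"]))
```

Both calls map to the same `if (eax == ebx)` template, which is the point of the database: several instruction sequences collapse onto one high-level form. Unknown sequences return `None` instead of a guess.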

Name: Anonymous 2009-02-22 5:03

>>13
A skilled reverse engineer can read auto-generated asm fluently in a lot of cases, so he doesn't need a decompiler. Automated decompilers such as Hex-Rays can be helpful when one is dealing with fairly simple but repetitive code, but they're not really required.

I'm a bit curious how much better an automatic decompiler would work if it were fed all the needed structures from header files and prototypes from PDBs. However, as you say, a reverser's job is to extract meaning from code, which usually means giving proper names to functions and documenting them, and this is not something a decompiler can do.

As for those who say that decompilation is NP-complete, this depends on your definition of decompilation.

The compilation process discards or transforms information that isn't needed to generate an executable (such as comments, function names, types, macros (in C), code structure and so on), and there's no way a decompiler can recover most of it. At best it can infer some things (for example, that x is an integer and that the code treats it as signed).
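As a rough illustration of that kind of inference, here is a sketch that guesses signedness from the instructions applied to a value. The hint sets and the `infer_signedness` helper are hypothetical and far from complete. They rely on the fact that x86 uses different conditional jumps, divides, shifts and extensions for signed and unsigned operands.

```python
# x86 mnemonics that usually imply a signed interpretation of their operand.
SIGNED_HINTS = {"jl", "jle", "jg", "jge", "idiv", "sar", "movsx"}
# x86 mnemonics that usually imply an unsigned interpretation.
UNSIGNED_HINTS = {"jb", "jbe", "ja", "jae", "div", "shr", "movzx"}


def infer_signedness(mnemonics):
    """Vote on signedness from the mnemonics seen operating on one value."""
    seen = [m.lower() for m in mnemonics]
    signed = sum(m in SIGNED_HINTS for m in seen)
    unsigned = sum(m in UNSIGNED_HINTS for m in seen)
    if signed > unsigned:
        return "signed"
    if unsigned > signed:
        return "unsigned"
    return "unknown"  # no evidence, or conflicting evidence


print(infer_signedness(["cmp", "jl"]))
print(infer_signedness(["cmp", "ja"]))
print(infer_signedness(["add", "mov"]))
```

Instructions like `add` or `mov` behave the same for both interpretations, so they carry no evidence, and the sketch returns "unknown" when nothing tips the vote.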

If you think of decompilation as a reverse transformation of the code into the language it was originally written in (or some other high-level one), then that is possible (and I don't mean simply translating asm instructions to C directly, but trying to recover code similar to what was originally there, bar the macros, comments, unrecoverable types (unless the user provides them), compiler optimizations and function names).

If you think of decompilation as recovering the exact same original code as it was before compiling, then that is impossible.
