Changelog

User-defined Function Declarations

The main feature of this release allows you (the advanced user) to modify the C Decompiler’s program model by providing declarations for functions at the boundary of the model: an example can be found further down, and a detailed feature description is here.

Improved scrolling and navigation speed in the GUI

We solved the problem of the GUI being slow.

GUI Zoom

We added the zoom capability (Ctrl-MouseWheel) for all GUI screens.

Token Navigation becomes Character Navigation

We changed the token-wise navigation in the source code pane to character-wise navigation. The source code pane behaves much more like an editor now.

More

Did I mention the bugs we fixed and the decompilation process improvements we introduced?

User-defined Function Declaration Example

Consider decompiling the HelloWorld example with the default settings in the Load Dialog.

UserDefinedFunctionDecl-Example-1

The analysis of the root function wmainCRTStartup comes up with a 64-bit integer (long long) return value. We know from the original C source that wmainCRTStartup is defined strictly with a 32-bit return value.

C4Decompiler cannot detect this, because it does not have full access to this function’s signature. The caller’s use of wmainCRTStartup’s return value would usually define the return value’s size; however, the caller is part of the Windows OS and not available for analysis. Therefore, the C Decompiler uses the location defined by the root function’s ABI. This is RAX (64-bit) and translates to a ‘long long’ return value.

Our correction:

UserDefinedFunctionDecl-Example-2

We select “Edit Function Declaration” from wmainCRTStartup’s context menu …

UserDefinedFunctionDecl-Example-3

… and change the return type from ‘long long’ to ‘int’.

The result:

UserDefinedFunctionDecl-Example-4

The return value of wmainCRTStartup changes as expected. Besides applying the requested change itself, the C Decompiler propagates the update throughout the dependencies in the program: thus, the variable’s type and the return type of the callee function _tmainCRTStartup change to unsigned int.

As we can see, small changes at the boundary of the model can have a huge impact on the model and the resulting C source.

Basic Vector Detection

The address calculation code for vector access is: vectorBaseAdr + index * elementSize. The Decompiler detects vector access and combines it with other type information to translate most vectors correctly.

For static vectors of structures the C Compiler combines the vector’s address and the structure field offset to a single constant, i.e. the address calculation becomes ((vectorBaseAdr + fieldOffset) + index * elementSize). This case is not correctly handled and results in the definition of multiple, overlapping vectors.
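To make the two address forms concrete, here is a minimal sketch in C. The type `struct item`, the array `items` and both helper functions are hypothetical names chosen for illustration; they are not taken from C4Decompiler or from any decompiled program.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ITEM_COUNT 8

/* Hypothetical element type, used only for illustration. */
struct item {
    int32_t id;
    int32_t weight;
};

static struct item items[ITEM_COUNT];

/* Plain form: vectorBaseAdr + index * elementSize, field offset added last. */
uintptr_t weight_addr_plain(size_t index)
{
    assert(index < ITEM_COUNT);
    uintptr_t base = (uintptr_t)items;
    return base + index * sizeof(struct item) + offsetof(struct item, weight);
}

/* Folded form: base address and field offset merged into one constant,
   i.e. (vectorBaseAdr + fieldOffset) + index * elementSize. */
uintptr_t weight_addr_folded(size_t index)
{
    assert(index < ITEM_COUNT);
    uintptr_t folded_base = (uintptr_t)items + offsetof(struct item, weight);
    return folded_base + index * sizeof(struct item);
}
```

Both functions yield the address of `items[index].weight`. In the folded form the constant no longer points at the start of the vector, which is why each accessed field can look like the base of its own vector.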

If-Switch

The C Compiler translates C switch statements into translation and jump vectors or into an if-else chain. This new version detects and handles most switch implementations correctly.
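As a rough illustration of these implementation styles, the sketch below writes the same mapping three ways in C. All function names are hypothetical. The table variant is a simplification: in a real binary the jump vector holds code addresses rather than result values.

```c
/* The source-level switch. */
int classify_switch(int c)
{
    switch (c) {
    case 0:  return 10;
    case 1:  return 20;
    case 2:  return 30;
    default: return -1;
    }
}

/* If-else chain, as a compiler might emit for few or sparse cases. */
int classify_chain(int c)
{
    if (c == 0)
        return 10;
    else if (c == 1)
        return 20;
    else if (c == 2)
        return 30;
    return -1;
}

/* Table-driven form: a range check followed by an indexed lookup.
   This only models the shape of a jump vector dispatch. */
int classify_table(int c)
{
    static const int results[3] = { 10, 20, 30 };

    if (c < 0 || c > 2)
        return -1;          /* default case */
    return results[c];
}
```

All three functions return the same value for every input, so a decompiler that recognizes the pattern can present each of them as the original switch.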

Function Tail Call

Function tail calls are a C Compiler optimization where the last call in a function is implemented using a jump instead of a call to save resources. These are now handled correctly by the C4Decompiler.
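A minimal sketch of a function that is a candidate for this optimization; the names `scale` and `scale_plus_one` are made up for illustration. Whether a compiler really emits a jump depends on the compiler and its optimization settings.

```c
/* Hypothetical helper. */
static int scale(int x)
{
    return x * 3;
}

/* The call to scale() is the last action of the function and its result is
   returned unchanged. An optimizing compiler may therefore emit a jump to
   scale() instead of a call followed by a return. */
int scale_plus_one(int x)
{
    return scale(x + 1);
}
```

In the binary, `scale_plus_one` then ends in a jump into another function, which a decompiler has to recognize as a call plus return rather than as ordinary control flow.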

Shared Conditional Jump

Normal, unoptimized code for conditions combines (1) an instruction that sets the Flag register with (2) a conditional jump instruction that jumps (or not) according to the Flag register content. In the example below, a single jump is shared by two flag-setting instructions with different operands.

The ‘flag’ variable represents the result from the flag setting instructions for the shared conditional jump.
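The following is only a sketch of what such output can look like, not actual C4Decompiler output. The function `pick` and its parameters are hypothetical; the point is that two different comparisons feed one shared test of ‘flag’.

```c
/* Two flag-setting comparisons on different paths, one shared jump. */
int pick(int use_first, int a, int b, int c, int d)
{
    int flag;

    if (use_first)
        flag = (a < b);     /* first flag-setting instruction */
    else
        flag = (c < d);     /* second flag-setting instruction, other operands */

    if (flag)               /* the single, shared conditional jump */
        return 1;
    return 0;
}
```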

C main() Function Detection

The signature files distributed with C4Decompiler allow the C Decompiler to recognize known static C runtime functions. They also provide the necessary information to detect the user provided ‘main’ function.

FunctionTree-Main

C Strings

The detection of C strings is a guess, based on the initial content of static variables and the use of data types (e.g. char *) that can indicate a C string. C4Decompiler now lists possible C strings as comments at the definition and use of static variables.

CStringDetection
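To give an idea of what a content-based guess can look like, here is a small sketch in C. It is an assumption for illustration only and not C4Decompiler's actual heuristic; the function name `looks_like_c_string` is made up.

```c
#include <ctype.h>
#include <stddef.h>

/* Returns 1 if the buffer starts with a non-empty run of printable or
   whitespace characters that is terminated by a NUL byte, otherwise 0. */
int looks_like_c_string(const unsigned char *data, size_t size)
{
    size_t i;

    for (i = 0; i < size; i++) {
        if (data[i] == '\0')
            return i > 0;                       /* terminator found */
        if (!isprint(data[i]) && !isspace(data[i]))
            return 0;                           /* binary data */
    }
    return 0;                                   /* no terminator in range */
}
```

A check like this can only say "possibly a string", which is why the result is best shown as a comment and not baked into the generated types.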

GUI improvements

The C Decompiler GUI received many improvements to simplify navigation and display more information. The key improvements are:

  • Incremental search in the Function Tree and Source Windows

GUI-IncrementalSearch

  • Rework of the Tooltip and Property Window content

GUI-Tooltip

  • New Decompile Progress Bar

Bug Fixes

A number of bugs were fixed as well.

Keyboard Shortcuts

Global

  F1                  Help
  F5                  Re-Analyse
  Ctrl-N              New Session
  Ctrl-F4             Close current Window
  Ctrl-X              Close Session
  Alt-F4              Close Application
  Alt-O               Options
  Alt-1               Recent File 1
  Alt-2               Recent File 2
  Alt-3               Recent File 3

Source and Property Window

  Ctrl-Click on Node  Navigate to definition

Source Window

  Ctrl-M              Toggle Assembler
  Ctrl-I              Open/Next Incremental Search
  Ctrl-Shift-I        Previous Incremental Search
  Esc                 Close Search
  Ctrl- -             Navigate backward
  Ctrl- +             Navigate forward
  Arrow Keys          Next Node in that direction
  Home                Start of Line
  End                 End of Line
  Page Down           Page Down
  Page Up             Page Up
  Ctrl-Page Down      Next Window
  Ctrl-End            Bottom of Document
  Ctrl-Home           Top of Document
  Ctrl-Page Up        Previous Window

Function Tree

  Ctrl-I              Open/Next Incremental Search
  Ctrl-Shift-I        Previous Incremental Search
  Arrow Keys          Navigation in Tree

0.7.6 Alpha Release

Drag & Drop

… makes your life easier. Drop your binary on the GUI to start a new session.

Struct detection, propagation and merging

Our C Decompiler has had structure detection and merging for quite some time. It worked OK for smaller examples, but failed to create correct structures for the bigger Rogue example. The main structure detection deficits were:

  • Missed merging opportunities resulted in too many remaining structures (> 100 instead of 2).
  • Aggressive structure propagation created monster structures, by combining unrelated structures.

The new release now merges the structures in our Rogue example correctly. More importantly, structure propagation is minimized to avoid monster structures. An example of problematic struct propagation:

Unlimited struct type propagation learns from the first malloc call that malloc returns a ‘struct big *’ and then wrongly assigns this type to the variable ‘s’. From there the wrong type spreads like an infection to other expressions, variables, functions…
The solution is to detect functions with generic (void *) parameters or return values and to limit struct type propagation for them.
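The situation can be sketched in C as follows. The layouts of `struct big` and `struct small` and the function `make_both` are invented for illustration; only the names ‘struct big’ and ‘s’ come from the description above.

```c
#include <stdlib.h>

/* Hypothetical layouts, chosen only to be clearly unrelated. */
struct big {
    int a;
    int b;
    double values[16];
};

struct small {
    char tag;
};

int make_both(void)
{
    /* First call: naive propagation concludes "malloc returns struct big *". */
    struct big *b = malloc(sizeof *b);

    /* Second call: unrelated type. Because malloc returns a generic void *,
       the type learned above must not be carried over to s. */
    struct small *s = malloc(sizeof *s);

    int ok = (b != NULL && s != NULL);
    if (ok) {
        b->a = 1;
        s->tag = 'x';
    }
    free(s);
    free(b);
    return ok;
}
```

Treating the `void *` return of malloc as a propagation barrier keeps the two allocations independent, so each pointer gets its type from how it is used.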

Result example

The detected ‘struct S750’ side by side compared with the original ‘struct obj’.

Struct S750 is an exact match, except for some undetectable ‘unsigned’ modifiers. And of course the names and comments ;), we are working on it.

Vectors as Structs

We don’t have full vector detection yet and some vectors become structures instead.

The structure results from the constant offsets used to access the vector: to the C Decompiler, vector[ 1 ] looks very similar to vector->f1, and only access to vector elements through a variable index can indicate the difference.
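A short C sketch of why the two are hard to tell apart. The type `struct pair_like` and the three functions are hypothetical names for illustration.

```c
#include <stddef.h>

/* A struct whose second field sits where element 1 of an int vector would. */
struct pair_like {
    int f0;
    int f1;
};

/* Constant index: a load from base + sizeof(int). */
int read_const_index(const int *vector)
{
    return vector[1];
}

/* Field access: a load from the same constant offset. */
int read_field(const struct pair_like *p)
{
    return p->f1;
}

/* Variable index: base + i * sizeof(int). Only this form shows that the
   memory is accessed as a vector. */
int read_var_index(const int *vector, size_t i)
{
    return vector[i];
}
```

The first two functions compile to the same kind of access at a fixed offset, so without a variable index somewhere in the program the struct interpretation is just as plausible as the vector one.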

0.7.5 Alpha Release

XMM break up for better float type detection and tracking

XMM registers can hold either four 32-bit values or two 64-bit values to support SIMD (Single Instruction, Multiple Data) instructions that perform the same operation on multiple values at the same time. The challenge for a C Decompiler is to identify through live analysis the parts of the XMM registers that are actually used and to track the types correctly. The BITFLD_GET and BITFLD_MERGE macros used in the generated C code above show how the previous versions had to extract single values from, and merge them into, the XMM register.
The monolithic nature of the 128-bit register complicates the propagation of types, e.g. stat0277ac in the last line was not identified as a float and therefore the assignment required a float cast.

The new version breaks the XMM register in our IR representation (not shown) into two 64-bit and four 32-bit sub-registers which overlay the original 128-bit register. The XMM registers are simply represented as multiple ‘normal’ registers which hold only one data item each. The following analysis then works on the converted registers as it would on any other register.
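The overlay idea can be pictured with a C union. This is a sketch of the concept only, not our IR; `union xmm_model` and the two accessor functions are hypothetical, and the code assumes a 4-byte float.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* One XMM register modelled as overlapping views of the same 16 bytes. */
union xmm_model {
    uint8_t  raw[16];   /* the original 128-bit register */
    uint64_t q[2];      /* two 64-bit sub-registers */
    uint32_t d[4];      /* four 32-bit sub-registers */
};

/* Read 32-bit sub-register `lane` as if it were a normal float register. */
float xmm_get_float(const union xmm_model *r, unsigned lane)
{
    float f;

    assert(lane < 4);
    memcpy(&f, &r->d[lane], sizeof f);
    return f;
}

/* Write one float into sub-register `lane`; the other lanes stay untouched. */
void xmm_set_float(union xmm_model *r, unsigned lane, float f)
{
    assert(lane < 4);
    memcpy(&r->d[lane], &f, sizeof f);
}
```

Once each lane is its own register, a float stored in one lane can be typed and tracked like any other variable, without extract and merge operations on the full 128 bits.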

Function return minimization

Types are propagated between functions via the function parameters and the function return value. The size of the return value is not only relevant for correct typing but also for the live analysis. Having a 32-bit EAX return value will result in more definitions or larger locations live throughout the function than having an 8-bit return value in AL. The live property separates relevant code from irrelevant code and therefore minimizing live definitions is a priority for a C Decompiler.
Function ABIs define return value locations, e.g. Windows ABIs define EAX (I86) and RAX (I64) as integer returns. The register (EAX) used by the compiler suggests a return value size as you can see in the example below.

Initial guess: Looks like the function returns a 32-bit value.
Only analyzing the use of the return value in all callers gives us the exact live state of the return value. If no caller uses the return value at all, the function has no return value; if the callers use no more than the lower 16 bits, the return value of the function is AX (the lower 16-bit sub-register of EAX). There is no assumption-free way of identifying the return value without analyzing the callers (see Decompiler Logic).
While we always had the caller analysis to detect the correct return register we now apply the minimization of the detected return registers.
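The minimization step can be sketched as follows. This is a simplified illustration under our own assumptions, not the decompiler's implementation: each caller is reduced to a bit mask of the RAX bits it reads, and `minimal_return_bits` is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/* used_bits[i] is the mask of return register bits that caller i reads.
   Returns the smallest width in bits (0, 8, 16, 32 or 64) that covers
   every bit read by any caller. */
unsigned minimal_return_bits(const uint64_t *used_bits, size_t callers)
{
    uint64_t all = 0;
    size_t i;

    for (i = 0; i < callers; i++)
        all |= used_bits[i];

    if (all == 0)
        return 0;               /* no caller uses it: no return value */
    if (all <= 0xFFu)
        return 8;               /* AL */
    if (all <= 0xFFFFu)
        return 16;              /* AX */
    if (all <= 0xFFFFFFFFu)
        return 32;              /* EAX */
    return 64;                  /* RAX */
}
```

For example, two callers that read masks 0x00FF and 0xFF00 together need only AX, even though the callee writes the full EAX.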

Other

  • As usual, we fixed several reported and detected errors in the loader, decoder, core and GUI.
  • This version is our auto update premiere! Please let us know if you have any problems.

0.7.3 Alpha Release

This is a snapshot release fresh from the developer machine to show the progress of the last weeks.

  • The project focus is for now I64 (Intel x64) only. I32 (Intel x32) is disabled.
  • All functions in our Rogue example now decompile (sometimes correctly).
  • Better function parameter and return value identification (fewer of them).
  • Basic float support (XMM).
  • Countless fixes, corrections, … everywhere.
  • GUI improvements (toolbar, better scroll performance, …).
  • The new installer requires no administrator rights and installs only into user directories.
  • Automatic update added.

0.7.2 Alpha Release

This is a bug fix release. It addresses most of the user-reported problems and handles more functions, but does not improve the decompile quality much.

  • 64-bit Session mode and ABIs added.
  • Corrected decoding of ‘mov reg, imm64’ instruction.
  • Added some missing instruction functions.
  • Exe loader handles ordinal imports from DLLs.
  • Overlapping Basic Blocks are treated as separate Basic Block instances.
  • Fixed problem with the CFG column in mixed C/Assembler view.
  • Fixed the errors for reported bug stack traces.

0.7.1 Alpha Release

  • IR expression simplifications of operation combinations that cancel each other out, such as truncation and extension operations. The generated C source is cleaner and shorter.
  • Unnecessary C type casts in assignments removed, e.g. int2 = (short)int4 becomes int2 = int4.
  • C type cast chains simplified, e.g. (unsigned short)(short)x becomes (unsigned short)x.
  • Several small bug fixes.
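The two cast simplifications above can be checked with a few lines of C. The function names are made up for illustration, and the sketch assumes the usual behaviour of mainstream compilers, where converting an out-of-range value to a 16-bit signed type keeps the low 16 bits.

```c
#include <stdint.h>

/* int2 = (short)int4 */
int16_t assign_with_cast(int32_t int4)
{
    int16_t int2 = (int16_t)int4;
    return int2;
}

/* int2 = int4: the assignment performs the same conversion implicitly. */
int16_t assign_without_cast(int32_t int4)
{
    int16_t int2 = int4;
    return int2;
}

/* (unsigned short)(short)x */
uint16_t chain_cast(int32_t x)
{
    return (uint16_t)(int16_t)x;
}

/* (unsigned short)x: the inner cast did not change the low 16 bits. */
uint16_t single_cast(int32_t x)
{
    return (uint16_t)x;
}
```

Each pair returns the same value for every input, so dropping the redundant cast changes only the readability of the generated source, not its meaning.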

0.7.0 Alpha Release

  • More robust decompile framework to handle real world programs. Rogue example with 300+ functions runs through and generates some useable code.
  • Countless decompile and GUI improvements.
  • Installs (again) for all users of a machine and needs Administrator rights for installation.

RogueScreenshot

0.6.1 (Alpha)

  • Installer changed to per-user installation. It now works without administrator rights, but a user without administrator rights still has to change the program file installation directory manually.

0.6.0 (Alpha)

  • Everything, this is the first Alpha release.