Hex-Rays v7.2 vs. v7.1 Decompiler Comparison Page
Below you will find side-by-side comparisons of v7.1 and v7.2 decompilations. Please maximize the window to see both columns simultaneously.
The following examples are displayed on this page:
- Magic divisions in 64-bit code
- More aggressive ‘if’ to ‘boolean’ folding
- Better type of ‘this’ argument
- Improved union field selection
- Improved recognition of ‘for’ loops
- Added support for shifted pointers
- Better recognition of inlined standard functions
- Improved application of pre-increment and pre-decrement
- Added support for RRX addressing mode in ARM
- Improved constant propagation in global memory
- Added support for Objective C blocks
- Improved recognition of 64-bit comparisons
- Merged common code in ‘if’ branches
- Added forced stack variables
- Added support for virtual calls
NOTE: these are just some selected examples that can be illustrated as side-by-side differences. There are many other improvements and new features that are not mentioned on this page.
Magic divisions in 64-bit code
In the past the Decompiler was able to recognize magic divisions in 32-bit code. We now support magic divisions in 64-bit code too.
{% tabs %} {% tab title=“PSEUDOCODE V7.2” %}
return 21600 * (t / 21600);
{% endtab %}
{% tab title=“PSEUDOCODE V7.1” %}
return 21600
* (((signed __int64)((unsigned __int128)(1749024623285053783LL
* (signed __int128)t) >> 64) >> 11) - (t >> 63));
{% endtab %} {% endtabs %}
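To see why the two listings compute the same value, the sketch below writes out the v7.1 expression in plain C and compares it with an ordinary division. It assumes a GCC or Clang style compiler that offers the `__int128` extension and uses an arithmetic right shift for signed values. The function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the v7.1 expression, one step at a time.
   Assumes __int128 (GCC/Clang extension) and arithmetic >> on signed types. */
int64_t div21600_magic(int64_t t)
{
    /* 1749024623285053783 == ceil(2^75 / 21600) */
    __int128 product = (__int128)1749024623285053783LL * (__int128)t;
    int64_t high = (int64_t)(product >> 64);    /* upper 64 bits of the product */
    /* t >> 63 is -1 for negative t, so subtracting it rounds toward zero */
    return (high >> 11) - (t >> 63);
}

/* What v7.2 shows instead. */
int64_t round_down_to_21600(int64_t t)
{
    return 21600 * (t / 21600);
}

/* Compares the magic form with a plain division on a few inputs. */
void check_magic_division(void)
{
    const int64_t samples[] = { 0, 1, 21599, 21600, 43201, -1, -21600, -43201,
                                INT64_MAX, INT64_MIN };
    for ( unsigned i = 0; i < sizeof samples / sizeof samples[0]; ++i )
        assert(div21600_magic(samples[i]) == samples[i] / 21600);
}
```

The multiply and the two shifts together divide by 2^75, which is why the constant is 2^75 / 21600 rounded up.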
More aggressive ‘if’ to ‘boolean’ folding
More aggressive folding of if_one_else_zero constructs; the output is much shorter and easier to grasp.
{% tabs %} {% tab title=“PSEUDOCODE V7.2” %}
return a1 << 28 != 0 && (a1 & (unsigned __int8)(a1 - 1)) == 0;
{% endtab %}
{% tab title=“PSEUDOCODE V7.1” %}
v1 = 1;
v2 = 1;
if ( !(a1 << 28) )
v2 = 0;
if ( !((unsigned __int8)a1 & (unsigned __int8)(a1 - 1)) )
v1 = 0;
return v2 && !v1;
{% endtab %} {% endtabs %}
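As a quick sanity check that the folded expression means the same as the flag-based code, here is a small C sketch of both forms. The 32-bit unsigned type for `a1` and the function names are assumptions; the listings do not show the real prototype.

```c
#include <assert.h>
#include <stdint.h>

/* v7.1 style: two flags set by separate 'if' statements. */
int unfolded(uint32_t a1)
{
    int v1 = 1;
    int v2 = 1;
    if ( !(a1 << 28) )
        v2 = 0;
    if ( !((uint8_t)a1 & (uint8_t)(a1 - 1)) )
        v1 = 0;
    return v2 && !v1;
}

/* v7.2 style: a single boolean expression. */
int folded(uint32_t a1)
{
    return (a1 << 28) != 0 && (a1 & (uint8_t)(a1 - 1)) == 0;
}

/* Both forms agree on every 16-bit input. */
void check_folding(void)
{
    for ( uint32_t a1 = 0; a1 < 0x10000; ++a1 )
        assert(unfolded(a1) == folded(a1));
}
```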
Better type of ‘this’ argument
The decompiler tries to guess the type of the first argument of a constructor. This leads to an improved listing.
{% tabs %} {% tab title=“PSEUDOCODE V7.2” %}
XImage *__fastcall XImage::setHotSpot(XImage *this, int a2, int a3)
{
LOWORD(this->height) = a2;
HIWORD(this->height) = a3;
return this;
}
{% endtab %}
{% tab title=“PSEUDOCODE V7.1” %}
int __fastcall XImage::setHotSpot(int this, int a2, int a3)
{
*(_WORD *)(this + 4) = a2;
*(_WORD *)(this + 6) = a3;
return this;
}
{% endtab %} {% endtabs %}
Improved union field selection
The decompiler has a better algorithm to find the correct union field. This reduces the number of casts in the output.
{% tabs %} {% tab title=“PSEUDOCODE V7.2” %}
float __fastcall ret4f(__n128 a1)
{
return a1.n128_f32[2];
}
{% endtab %}
{% tab title=“PSEUDOCODE V7.1” %}
float __fastcall ret4f(__n128 a1)
{
return *(float *)&a1.n128_u32[2];
}
{% endtab %} {% endtabs %}
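The difference between the two listings can be reproduced in ordinary C. The sketch below uses a hypothetical union `n128_like` that mimics two members of `__n128`; it assumes `float` is 32 bits wide. The cast from the v7.1 listing is expressed with `memcpy` so that the example stays within C aliasing rules.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for __n128: the same 16 bytes seen as ints or floats. */
typedef union {
    uint32_t n128_u32[4];
    float    n128_f32[4];
} n128_like;

/* v7.1 style: pick the integer member, then reinterpret its bits as a float. */
float ret_via_u32(n128_like a1)
{
    float f;
    memcpy(&f, &a1.n128_u32[2], sizeof f);
    return f;
}

/* v7.2 style: pick the float member directly, no cast needed. */
float ret_via_f32(n128_like a1)
{
    return a1.n128_f32[2];
}

void check_union_fields(void)
{
    n128_like v = { { 0, 0, 0, 0 } };
    v.n128_f32[2] = 1.5f;
    assert(ret_via_u32(v) == 1.5f);
    assert(ret_via_f32(v) == 1.5f);
}
```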
Improved recognition of ‘for’ loops
We improved recognition of ‘for’ loops. The resulting loops are shorter and much easier to understand.
{% tabs %} {% tab title=“PSEUDOCODE V7.2” %}
for ( i = 0; i < 16; ++i )
{
printf("%x", *(unsigned __int8 *)(i + v2) >> 4);
printf("%x", *(_BYTE *)(i + v2) & 0xF);
}
{% endtab %}
{% tab title=“PSEUDOCODE V7.1” %}
v3 = 0;
do
{
printf("%x", (unsigned int)*(unsigned __int8 *)(v3 + v2) >> 4);
printf("%x", *(_BYTE *)(v3++ + v2) & 0xF);
}
while ( v3 < 16 );
{% endtab %} {% endtabs %}
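The rewrite is safe here because the counter starts at 0, which is known to be below 16, so the body runs at least once in both forms. A minimal C sketch of the two shapes follows. It writes into a caller-supplied buffer of at least 33 bytes instead of calling printf directly; the function names are invented for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* v7.1 style: do/while with the increment hidden in the second call. */
void hex_do_while(const unsigned char *v2, char *out)
{
    int v3 = 0;
    do
    {
        out += sprintf(out, "%x", (unsigned int)v2[v3] >> 4);
        out += sprintf(out, "%x", (unsigned int)(v2[v3++] & 0xF));
    }
    while ( v3 < 16 );
}

/* v7.2 style: the same work as a 'for' loop. */
void hex_for(const unsigned char *v2, char *out)
{
    for ( int i = 0; i < 16; ++i )
    {
        out += sprintf(out, "%x", (unsigned int)(v2[i] >> 4));
        out += sprintf(out, "%x", (unsigned int)(v2[i] & 0xF));
    }
}

void check_loops(void)
{
    unsigned char data[16];
    char a[33], b[33];
    for ( int i = 0; i < 16; ++i )
        data[i] = (unsigned char)(i * 17);      /* 0x00, 0x11, ..., 0xff */
    hex_do_while(data, a);
    hex_for(data, b);
    assert(strcmp(a, b) == 0);
    assert(strcmp(a, "00112233445566778899aabbccddeeff") == 0);
}
```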
Added support for shifted pointers
Please note that the v7.1 output is completely illegible; the assembler code is probably easier to work with in this case.
However, the v7.2 output is very neat.
For reference, below is the class hierarchy for this example:
struct __cppobj B1
{
B1_vtbl *__vftable /*VFT*/;
char d1[4];
};
struct __cppobj B2
{
B2_vtbl *__vftable /*VFT*/;
char d2[4];
};
struct __cppobj A : B1, B2
{
char d3[4];
};
Also please note that the source code had
A::a2(A *this)
but at the assembler level we have
A::a2(B2 *this)
Visual Studio plays such tricks.
{% tabs %} {% tab title=“PSEUDOCODE V7.2” %}
int __thiscall A::a2(B2 *__shifted(A,8) this)
{
printf("A::a2 %p\n", ADJ(this));
printf("A::d2 %p\n", ADJ(this)->d2);
return ADJ(this)->d3[0];
}
{% endtab %}
{% tab title=“PSEUDOCODE V7.1” %}
int __thiscall A::a2(B2 *this)
{
B2 *v1; // ST08_4
v1 = this;
printf("A::a2 %p\n", this - 1);
printf("A::d2 %p\n", (char *)v1 + 4);
return *((char *)v1 + 8);
}
{% endtab %} {% endtabs %}
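The idea behind a shifted pointer can be modelled in plain C: the callee receives a pointer into the middle of an object and steps back by a fixed offset to reach the enclosing structure. The sketch below is only an illustration. The struct definitions are C stand-ins for the classes above, `adj_to_A` is a hypothetical equivalent of `ADJ(this)`, and `offsetof` is used so that the offset matches the build target (8 in the 32-bit listing, possibly larger on a 64-bit build).

```c
#include <assert.h>
#include <stddef.h>

/* Plain C stand-ins for the classes; vtable pointers are modelled as void *. */
struct B1 { void *vftable; char d1[4]; };
struct B2 { void *vftable; char d2[4]; };
struct A  { struct B1 b1; struct B2 b2; char d3[4]; };

/* Hypothetical equivalent of ADJ(this): step back from the embedded B2
   to the start of the enclosing A. */
struct A *adj_to_A(struct B2 *shifted_this)
{
    return (struct A *)((char *)shifted_this - offsetof(struct A, b2));
}

void check_shifted_pointer(void)
{
    struct A a = { { 0, "d1" }, { 0, "d2" }, "d3" };
    struct B2 *p = &a.b2;                 /* what the callee actually receives */
    assert(adj_to_A(p) == &a);
    assert(adj_to_A(p)->d3[0] == 'd');
    assert(adj_to_A(p)->b2.d2 == p->d2);
}
```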
Better recognition of inlined standard functions
Yes, both listings do the same thing. We very much prefer the v7.2 version.
{% tabs %} {% tab title=“PSEUDOCODE V7.2” %}
if ( !memcmp(i + 10, "AMIBIOSC", 8u) )
return i + 10;
{% endtab %}
{% tab title=“PSEUDOCODE V7.1” %}
v2 = 0;
v3 = 1;
v4 = i + 10;
v5 = "AMIBIOSC";
v6 = 8;
do
{
if ( !v6 )
break;
v2 = *v4 < (const unsigned __int8)*v5;
v3 = *v4++ == *v5++;
--v6;
}
while ( v3 );
if ( (!v2 && !v3) == v2 )
return i + 10;
{% endtab %} {% endtabs %}
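For readers who want to convince themselves that the v7.1 loop really is a memcmp, here is a C sketch of it with the flags spelled out. Reading `v2` as a "below" flag and `v3` as an "equal" flag is our interpretation of the listing, and the function name is made up.

```c
#include <assert.h>
#include <string.h>

/* v7.1 style: compare up to 8 bytes, tracking "below" (v2) and "equal" (v3). */
int inlined_equal(const unsigned char *v4, const char *v5)
{
    int v2 = 0;
    int v3 = 1;
    int v6 = 8;
    do
    {
        if ( !v6 )
            break;
        v2 = *v4 < (unsigned char)*v5;
        v3 = *v4++ == (unsigned char)*v5++;
        --v6;
    }
    while ( v3 );
    /* "above" == "below" holds only when neither is set, i.e. all bytes matched */
    return (!v2 && !v3) == v2;
}

void check_inlined_memcmp(void)
{
    const char *cases[] = { "AMIBIOSC", "AMIBIOSD", "AMIBIOSB", "XMIBIOSC" };
    for ( unsigned i = 0; i < sizeof cases / sizeof cases[0]; ++i )
    {
        int expected = !memcmp(cases[i], "AMIBIOSC", 8u);
        assert(inlined_equal((const unsigned char *)cases[i], "AMIBIOSC") == expected);
    }
}
```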
Improved application of pre-increment and pre-decrement
Minor stuff, one would say, and we’d completely agree. However, these minor details make reading the output a pleasure.
{% tabs %} {% tab title=“PSEUDOCODE V7.2” %}
v5 = *++v4;
result = --a4;
{% endtab %}
{% tab title=“PSEUDOCODE V7.1” %}
v5 = (v4++)[1];
result = a4-- - 1;
{% endtab %} {% endtabs %}
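Both spellings leave the variables in the same state, which a few lines of C can confirm. The array contents and variable names below are arbitrary.

```c
#include <assert.h>

void check_pre_increment(void)
{
    int data[3] = { 10, 20, 30 };

    int *p_old = data, *p_new = data;
    int old_form = (p_old++)[1];    /* v7.1: read the next element, then advance */
    int new_form = *++p_new;        /* v7.2: advance, then read */
    assert(old_form == 20 && new_form == 20);
    assert(p_old == p_new);

    int a_old = 5, a_new = 5;
    int r_old = a_old-- - 1;        /* v7.1 */
    int r_new = --a_new;            /* v7.2 */
    assert(r_old == 4 && r_new == 4);
    assert(a_old == a_new);
}
```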
Added support for RRX addressing mode in ARM
This is a rare addressing mode that is nevertheless used by compilers. Now we support it nicely.
{% tabs %} {% tab title=“PSEUDOCODE V7.2” %}
__int64 __fastcall sar64(__int64 a1)
{
return a1 >> 1;
}
{% endtab %}
{% tab title=“PSEUDOCODE V7.1” %}
__int64 __fastcall sar64(__int64 a1)
{
__int64 result; // r0
SHIDWORD(a1) >>= 1;
__asm { MOV R0, R0,RRX }
return result;
}
{% endtab %} {% endtabs %}
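The sketch below models in C what the two-instruction sequence does on a 32-bit ARM: an arithmetic shift of the high word that leaves the dropped bit in the carry flag, followed by RRX on the low word, which rotates the carry in at bit 31. It is a hand-written model, not decompiler output. It assumes the usual two's complement conversions and, in the check, an arithmetic `>>` on signed values.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 64-bit "a1 >> 1" built from two 32-bit halves. */
int64_t sar64_via_rrx(int64_t a1)
{
    uint32_t lo = (uint32_t)a1;
    uint32_t hi = (uint32_t)((uint64_t)a1 >> 32);

    uint32_t carry = hi & 1;            /* bit shifted out of the high word */
    uint32_t sign  = hi & 0x80000000u;
    hi = (hi >> 1) | sign;              /* ASR #1: the sign bit is kept */
    lo = (lo >> 1) | (carry << 31);     /* RRX: the carry enters at bit 31 */

    return (int64_t)(((uint64_t)hi << 32) | lo);
}

void check_rrx(void)
{
    const int64_t samples[] = { 0, 1, 7, -3, 0x100000000LL, INT64_MAX, INT64_MIN };
    for ( unsigned i = 0; i < sizeof samples / sizeof samples[0]; ++i )
        assert(sar64_via_rrx(samples[i]) == samples[i] >> 1);
}
```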
Improved constant propagation in global memory
The new decompiler managed to disentangle the obfuscated code and convert it into a nice strcpy().
{% tabs %} {% tab title=“PSEUDOCODE V7.2” %}
strcpy((char *)&dword_1005DF9A, "basic_string");
{% endtab %}
{% tab title=“PSEUDOCODE V7.1” %}
dword_1005DF9A = 0xADB0A3A3;
dword_1005DF9E = 0xBCB499A6;
dword_1005DFA2 = 0xABA5A3BB;
LOBYTE(dword_1005DF9A) = 'b';
BYTE1(dword_1005DF9A) ^= 0xC2u;
HIWORD(dword_1005DF9A) = 'is';
LOBYTE(dword_1005DF9E) = 'c';
BYTE1(dword_1005DF9E) ^= 0xC6u;
HIWORD(dword_1005DF9E) = 'ts';
LOBYTE(dword_1005DFA2) = 'r';
BYTE1(dword_1005DFA2) ^= 0xCAu;
HIWORD(dword_1005DFA2) = 'gn';
byte_1005DFA6 = 0;
{% endtab %} {% endtabs %}
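The v7.1 statements can be replayed by hand to confirm the string. The C sketch below does so on a 13-byte buffer that stands in for the globals starting at dword_1005DF9A. It assumes a little-endian target and that a two-character constant such as 'is' has the value 0x6973; the helper names are invented.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Store a 32-bit value in little-endian byte order. */
static void put32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)v;         p[1] = (unsigned char)(v >> 8);
    p[2] = (unsigned char)(v >> 16); p[3] = (unsigned char)(v >> 24);
}

/* Store a 16-bit value in little-endian byte order. */
static void put16(unsigned char *p, uint16_t v)
{
    p[0] = (unsigned char)v;
    p[1] = (unsigned char)(v >> 8);
}

/* Replays the v7.1 stores in order; buf must hold at least 13 bytes. */
void replay_stores(unsigned char *buf)
{
    put32(buf + 0, 0xADB0A3A3u);
    put32(buf + 4, 0xBCB499A6u);
    put32(buf + 8, 0xABA5A3BBu);
    buf[0] = 'b';  buf[1] ^= 0xC2;  put16(buf + 2, 0x6973);   /* 'is' */
    buf[4] = 'c';  buf[5] ^= 0xC6;  put16(buf + 6, 0x7473);   /* 'ts' */
    buf[8] = 'r';  buf[9] ^= 0xCA;  put16(buf + 10, 0x676E);  /* 'gn' */
    buf[12] = 0;
}

void check_constant_propagation(void)
{
    unsigned char buf[13];
    replay_stores(buf);
    assert(strcmp((const char *)buf, "basic_string") == 0);
}
```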
Added support for Objective C blocks
The new version knows about ObjC blocks and can represent them correctly in the output. See the Edit, Other, Objective-C submenu in IDA; it contains the necessary actions to analyze the blocks.
{% tabs %} {% tab title=“PSEUDOCODE V7.2” %}
__int64 __fastcall sub_181450634(__int64 a1, __int64 a2, __int64 a3)
{
Block_layout_18145064C blk; // [xsp+0h] [xbp-30h]
blk.isa = _NSConcreteStackBlock;
*(_QWORD *)&blk.flags = 0x42000000LL;
blk.invoke = sub_181450694;
blk.descriptor = (Block_descriptor_1 *)&unk_1B0668958;
blk.lvar1 = *(_QWORD *)(a1 + 32);
blk.lvar2 = a3;
return sub_18144BD0C(a2, &blk);
}
{% endtab %}
{% tab title=“PSEUDOCODE V7.1” %}
__int64 __fastcall sub_181450634(__int64 a1, __int64 a2, __int64 a3)
{
void *(*v4)[32]; // [xsp+0h] [xbp-30h]
__int64 v5; // [xsp+8h] [xbp-28h]
__int64 (__fastcall *v6)(); // [xsp+10h] [xbp-20h]
void *v7; // [xsp+18h] [xbp-18h]
__int64 v8; // [xsp+20h] [xbp-10h]
__int64 v9; // [xsp+28h] [xbp-8h]
v4 = _NSConcreteStackBlock;
v5 = 1107296256LL;
v6 = sub_181450694;
v7 = &unk_1B0668958;
v8 = *(_QWORD *)(a1 + 32);
v9 = a3;
return sub_18144BD0C(a2, &v4);
}
{% endtab %} {% endtabs %}
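To relate the six anonymous stack slots of the v7.1 listing to the named fields of v7.2, the sketch below spells out a block header in C. The flag values come from the Clang block ABI description. The struct itself is a hypothetical rendering for a 64-bit target, not the type that IDA generates.

```c
#include <assert.h>
#include <stdint.h>

/* Flag bits as given in the Clang block ABI description. */
enum {
    BLOCK_HAS_COPY_DISPOSE = 1 << 25,
    BLOCK_HAS_SIGNATURE    = 1 << 30
};

/* Hypothetical C rendering of the stack block; lvar1/lvar2 are captured values. */
struct block_layout_sketch {
    void    *isa;
    int32_t  flags;
    int32_t  reserved;
    void   (*invoke)(void *);
    void    *descriptor;
    int64_t  lvar1;
    int64_t  lvar2;
};

void check_block_layout(void)
{
    /* v7.1 printed the flags constant in decimal */
    assert(1107296256LL == 0x42000000LL);
    assert(0x42000000 == (BLOCK_HAS_SIGNATURE | BLOCK_HAS_COPY_DISPOSE));
    /* with 8-byte pointers the struct covers the slots [xbp-30h] .. [xbp-8h] */
    assert(sizeof(void *) != 8 || sizeof(struct block_layout_sketch) == 0x30);
}
```

The single 64-bit store to `blk.flags` in the v7.2 listing therefore sets `flags` and clears the adjacent `reserved` field in one go.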
Improved recognition of 64-bit comparisons
We continue to improve recognition of 64-bit arithmetic. While it is impossible to handle all cases, we do not give up.
{% tabs %} {% tab title=“PSEUDOCODE V7.2” %}
gettimeofday(&tv, 0);
v0 = 90 * (v3 / 1000 + 1000LL * *(_QWORD *)&tv);
if ( v0 < 0xFFFFFFFFFFFFFFFFLL )
stamp = 90 * (v3 / 1000 + 1000LL * *(_QWORD *)&tv);
{% endtab %}
{% tab title=“PSEUDOCODE V7.1” %}
gettimeofday(&tv, 0);
v0 = 1000LL * (unsigned int)tv.tv_usec;
HIDWORD(v0) = (unsigned __int64)(1000LL * *(_QWORD *)&tv) >> 32;
v1 = 90LL * (unsigned int)(v4 / 1000 + v0);
HIDWORD(v1) = (unsigned __int64)(90 * (v4 / 1000 + v0)) >> 32;
if ( HIDWORD(v1) < 0xFFFFFFFF || -1 == HIDWORD(v1) && (unsigned int)stamp > (unsigned int)v1 )
stamp = v1;
{% endtab %} {% endtabs %}
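The general pattern behind such code is that a 32-bit compiler compares the high words first and looks at the low words only on a tie. The C sketch below shows that generic pattern for an unsigned "less than"; it is not the exact expression from the listing above, and the function names are made up.

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit style: compare high words first, fall back to low words on a tie. */
int less_than_by_halves(uint32_t a_hi, uint32_t a_lo, uint32_t b_hi, uint32_t b_lo)
{
    return a_hi < b_hi || (a_hi == b_hi && a_lo < b_lo);
}

/* Splits both operands and applies the half-word comparison. */
int less_than_split(uint64_t a, uint64_t b)
{
    return less_than_by_halves((uint32_t)(a >> 32), (uint32_t)a,
                               (uint32_t)(b >> 32), (uint32_t)b);
}

void check_compare(void)
{
    const uint64_t v[] = { 0, 5, 0xFFFFFFFFull, 0x100000000ull,
                           0x1FFFFFFFFull, 0x200000000ull, UINT64_MAX };
    const unsigned n = sizeof v / sizeof v[0];
    for ( unsigned i = 0; i < n; ++i )
        for ( unsigned j = 0; j < n; ++j )
            assert(less_than_split(v[i], v[j]) == (v[i] < v[j]));
}
```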
Merged common code in ‘if’ branches
Yet another optimization rule that lifts common code from ‘if’ branches. We made it even more aggressive.
{% tabs %} {% tab title=“PSEUDOCODE V7.2” %}
mywcscpy();
if ( a3 < 0 )
v4 = -a3;
{% endtab %}
{% tab title=“PSEUDOCODE V7.1” %}
if ( a3 >= 0 )
{
mywcscpy();
}
else
{
mywcscpy();
v4 = -a3;
}
{% endtab %} {% endtabs %}
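A small C sketch shows that hoisting the shared call does not change behavior. Here `mywcscpy_stub` is a hypothetical stand-in that only counts calls, and `v4` is passed in and returned so that the result can be compared.

```c
#include <assert.h>

static int calls;                          /* number of calls to the stub */
static void mywcscpy_stub(void) { ++calls; }

/* v7.1 style: the call is repeated in both branches. */
int branches_v71(int a3, int v4)
{
    if ( a3 >= 0 )
    {
        mywcscpy_stub();
    }
    else
    {
        mywcscpy_stub();
        v4 = -a3;
    }
    return v4;
}

/* v7.2 style: the common call is lifted out of the 'if'. */
int hoisted_v72(int a3, int v4)
{
    mywcscpy_stub();
    if ( a3 < 0 )
        v4 = -a3;
    return v4;
}

void check_hoisting(void)
{
    const int inputs[] = { -5, 0, 7 };
    for ( unsigned i = 0; i < 3; ++i )
    {
        int before = calls;
        int r1 = branches_v71(inputs[i], 3);
        int r2 = hoisted_v72(inputs[i], 3);
        assert(r1 == r2);
        assert(calls == before + 2);       /* exactly one call per variant */
    }
}
```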
Added forced stack variables
Sometimes compilers reuse the same stack slot for different purposes. Many of our users asked us to add a feature to handle this situation. The new decompiler addresses this issue by adding a command to force creation of a new variable at the specified point. Currently we support only aliasable stack variables because this is the most common case.
In the sample below the slot of the p_data_format variable is reused. Initially it holds a pointer to an integer (data_format) and then it holds a simple integer (errcode). Previous versions of the decompiler could not handle this situation nicely: the two different uses of the slot were represented by a single variable, so the output necessarily had casts and was quite difficult to read. You can see this in the v7.1 listing.
The new version produces clean code and displays two variables. Naturally, this happens only after applying the force new variable command.
{% tabs %} {% tab title=“PSEUDOCODE V7.2” %}
data_format = *p_data_format;
if ( *p_data_format < 0 || data_format > 13 )
{
errcode = 2;
SetError(&this->status, &errcode, "format not one of accepted types");
}
{% endtab %}
{% tab title=“PSEUDOCODE V7.1” %}
data_format = *p_data_format;
if ( *p_data_format < 0 || data_format > 13 )
{
p_data_format = (int *)2;
SetError(&this->status, (errcode_t *)&p_data_format, "format not one of accepted types");
}
{% endtab %} {% endtabs %}
Added support for virtual calls
Well, these listings require no comments: the new version clearly wins!
{% tabs %} {% tab title=“PSEUDOCODE V7.2” %}
void __cdecl test3(D7 *a1)
{
a1->f1(&a1->A1);
a1->f2(&a1->D3);
a1->f3(&a1->D5);
a1->f4(&a1->A4);
a1->f5(a1);
a1->f6(a1);
a1->g0(&a1->D5);
a1->g5(&a1->D5);
a1->g7(a1);
if ( a1 )
a1->~D7(a1);
}
{% endtab %}
{% tab title=“PSEUDOCODE V7.1” %}
void __cdecl test3(D7 *a1)
{
(**((void (__cdecl ***)(char *))a1 + 12))((char *)a1 + 48);
(*(void (__cdecl **)(char *))(*((_DWORD *)a1 + 10) + 12))((char *)a1 + 40);
(**((void (__cdecl ***)(char *))a1 + 6))((char *)a1 + 24);
(**((void (__cdecl ***)(char *))a1 + 26))((char *)a1 + 104);
(**(void (__cdecl ***)(D7 *))a1)(a1);
(*(void (__cdecl **)(D7 *))(*(_DWORD *)a1 + 12))(a1);
(*(void (__cdecl **)(char *))(*((_DWORD *)a1 + 6) + 4))((char *)a1 + 24);
(*(void (__cdecl **)(char *))(*((_DWORD *)a1 + 6) + 16))((char *)a1 + 24);
(*(void (__cdecl **)(D7 *))(*(_DWORD *)a1 + 16))(a1);
if ( a1 )
(*(void (__cdecl **)(D7 *))(*(_DWORD *)a1 + 8))(a1);
}
{% endtab %} {% endtabs %}