x64 MSVC C++ Exception Handler for the Decompiler
Hex-Rays’ support for exceptions in Microsoft Visual C++/x64
incorporates the C++ exception metadata for functions into their
decompilation, and presents the results to the user via built-in
constructs in the decompilation (try, catch, __wind,
__unwind). When the results cannot be presented entirely with
these constructs, they will be presented via helper calls in the
decompilation.
The documentation describes:
- Background behind C++ exception metadata. It is recommended that users read this first.
- Interactive operation via the GUI, configuration file, and keyboard shortcuts.
- A list of helper calls that may appear in the output.
- A note about the boundaries of try and __unwind regions.
- Miscellaneous notes about the plugin.
BACKGROUND ON C++ EXCEPTIONS
TRY, CATCH, AND THROW
The C++ language provides the try scoped construct in which the
developer expects that an exception might occur. try blocks must be
followed by one or more scoped catch constructs for catching
exceptions that may occur within. catch blocks may use ... to
catch any exception. Alternatively, catch blocks may name the type
of an exception, such as std::bad_alloc. catch blocks with named
types may or may not also catch the exception object itself. For
example, catch(std::bad_alloc *v10) and catch(std::bad_alloc *) are both valid. The former can access the exception object through
variable v10, whereas the latter cannot access the exception object.
C++ provides the throw keyword for throwing an exception, as in
std::bad_alloc ba; throw ba;. This is represented in the output as
(for example) throw v10;. C++ also allows code to rethrow the
current exception via throw;. This is represented in the output as
throw;.
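As a minimal sketch of the constructs just described, the following source-level example combines a typed handler that ignores the exception object, a typed handler that binds it, a catch-all handler, and a rethrow. The names thrower and classify are hypothetical and exist only for illustration; the handlers bind by reference, as is idiomatic in source code, rather than in the pointer style shown in the decompiler output above.

```cpp
#include <cassert>
#include <new>
#include <stdexcept>
#include <string>

// Hypothetical helper: throws a different exception type depending on `kind`.
void thrower(int kind)
{
    if (kind == 0) throw std::bad_alloc();
    if (kind == 1) throw std::runtime_error("boom");
    if (kind == 2) throw 42;
}

// Returns a label naming the handler that ended up handling the exception.
std::string classify(int kind)
{
    try
    {
        try
        {
            thrower(kind);
            return "no exception";
        }
        catch (const std::bad_alloc &)        // named type, object not accessible
        {
            return "bad_alloc";
        }
        catch (const std::runtime_error &e)   // named type, object bound to e
        {
            return std::string("runtime_error: ") + e.what();
        }
        catch (...)                           // catches any other exception
        {
            throw;                            // rethrow the current exception
        }
    }
    catch (int v)                             // receives the rethrown int
    {
        return "rethrown int " + std::to_string(v);
    }
}
```

Calling classify(2) exercises the catch-all handler and the rethrow: the int is first caught by catch (...), rethrown, and then handled by the outer catch (int v).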
WIND AND UNWIND
Exception metadata in C++ binaries is split into two categories: try
and catch blocks, as discussed above, and so-called wind and
unwind blocks. C++ does not have wind and unwind keywords,
but the compiler creates these blocks implicitly. In most binaries, they
outnumber try and catch blocks by about 20 to 1.
Consider the following code, which may or may not throw an int as an
exception at three places:
void may_throw()
{
    // Point -1
    if ( rand() % 2 )
        throw -1;
    string s0 = "0";
    // Point 0
    if ( rand() % 2 )
        throw 0;
    string s1 = "1";
    // Point 1
    if ( rand() % 2 )
        throw 1;
    // Point 2
    printf("%s %s\n",
           s0.c_str(),
           s1.c_str());
    // Implicit
    // destruction
    s1.~string();
    s0.~string();
}
If an exception is thrown at point -1, the function exits early without executing any of its remaining code. As no objects have been created on the stack, nothing needs to be cleaned up before the function returns.
If an exception is thrown at point 0, the function exits early as
before. However, since string s0 has been created on the stack, it
needs to be destroyed before exiting the function. Similarly, if an
exception is thrown at point 1, both string s1 and string s0
must be destroyed.
These destructor calls would normally happen at the end of their
enclosing scope, i.e. the bottom of the function, where the compiler
inserts implicitly-generated destructor calls. However, since the
function does not have any try blocks, none of the function’s
remaining code will execute after the exception is thrown. Therefore,
the destructor calls at the bottom will not execute. If there were no
other mechanism for destructing s0 and/or s1, the result would
be memory leaks or other state management issues involving those
objects. Therefore, the C++ exception management runtime provides
another mechanism to invoke their destructors: wind blocks and their
corresponding unwind handlers.
wind blocks are effectively try blocks that are inserted
invisibly by the compiler. They begin immediately after constructing
some object, and end immediately before destructing that object. Their
unwind blocks play the role of catch handlers, calling the
destructor upon the object when exceptional control flow would otherwise
cause the destructor call to be skipped.
Microsoft Visual C++ effectively transforms the previous example as follows:
void may_throw_transformed()
{
    if ( rand() % 2 )
        throw -1;
    string s0 = "0";
    // Implicit try
    __wind
    {
        if ( rand() % 2 )
            throw 0;
        string s1 = "1";
        // Implicit try
        __wind
        {
            if ( rand() % 2 )
                throw 1;
            printf("%s %s\n",
                   s0.c_str(),
                   s1.c_str());
        }
        // Implicit catch
        __unwind
        {
            s1.~string();
        }
        s1.~string();
    }
    // Implicit catch
    __unwind
    {
        s0.~string();
    }
    s0.~string();
}
unwind blocks always re-throw the current exception, unlike
catch handlers, which may or may not re-throw it. Re-throwing the
exception ensures that prior wind blocks will have a chance to
execute. So, for example, if an exception is thrown at point 1, after
the unwind handler destroys string s1, re-throwing the exception
causes the unwind handler for point 0 to execute, thereby allowing it to
destroy string s0 before re-throwing the exception out of the
function.
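The destruction order described above can be observed from ordinary C++ code. The sketch below is an illustration only: Tracer, may_throw_at, and destruction_order are hypothetical names, and the random throws of the original example are replaced by a point parameter so that the behavior is deterministic.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical tracer type: appends its name to a log when it is destroyed.
struct Tracer
{
    std::string name;
    std::vector<std::string> &log;
    Tracer(std::string n, std::vector<std::string> &l) : name(std::move(n)), log(l) {}
    ~Tracer() { log.push_back(name); }
};

// Mirrors may_throw(): throws at point -1, 0, or 1; any other value does not throw.
void may_throw_at(int point, std::vector<std::string> &log)
{
    if (point == -1) throw -1;
    Tracer s0("s0", log);
    if (point == 0) throw 0;
    Tracer s1("s1", log);
    if (point == 1) throw 1;
}

// Runs may_throw_at() and returns the order in which the objects were destroyed.
std::vector<std::string> destruction_order(int point)
{
    std::vector<std::string> log;
    try
    {
        may_throw_at(point, log);
    }
    catch (int)
    {
        // The exception has left may_throw_at(); its unwind handlers have run.
    }
    return log;
}
```

A throw at point -1 destroys nothing, a throw at point 0 destroys only s0, and a throw at point 1 destroys s1 and then s0, which matches the inner-to-outer order in which the unwind handlers run.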
STATE NUMBERS AND INSTRUCTION STATES
As we have discussed, the primary components of Microsoft Visual C++ x64
exception metadata are try blocks, catch handlers, wind
blocks, and unwind handlers. Generally speaking, these elements can
be nested within one another. For example, in C++ code, it is legal for
one try block to contain another, and a catch handler may
contain try blocks of its own. The same is true for wind and
unwind constructs: wind blocks may contain other wind blocks
(as in the previous example) or try blocks, and try and
catch blocks may contain wind blocks.
Exceptions must be processed in a particular sequence: namely, the most
nested handlers must be consulted first. For example, if a try block
contains another try block, any exceptions occurring within the
latter region must be processed by the innermost catch handlers
first. Only if none of the inner catch handlers can handle the
exception should the outer try block’s catch handlers be consulted.
Similarly, as in the previous example, unwind handlers must destruct
their corresponding objects before passing control to any previous
exception handlers (such as string s1’s unwind handler passing
control to string s0’s unwind handler).
Microsoft’s solution to ensure that exceptions are processed in the
proper sequence is simple. It assigns a “state number” to each
exception-handling construct. Each exception state has a “parent”
state number whose handler will be consulted if the current state’s
handler is unable to handle the exception. In the previous example, what
we called “point 0” is assigned the state number 0, while “point 1”
is assigned the state number 1. State 1 has a parent of 0. (State 0’s
parent is a dummy value, -1, that signifies that it has no parent.)
Since unwind handlers always re-throw exceptions, if state 1’s
unwind handler is ever invoked, the exception handling machinery
will always invoke state 0’s unwind handler afterwards. Because
state 0 has no parent, the exception machinery will re-throw the
exception out of the current function. This same machinery ensures that
the catch handlers for inner try blocks are consulted before outer
try blocks.
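The parent relationship can be modeled as a walk up a chain of state numbers. The following is a simplified sketch, not the layout used by the runtime: it assumes a hypothetical parent table indexed by state number, with -1 meaning "no parent", and returns the order in which handlers would be consulted.

```cpp
#include <cassert>
#include <vector>

// Simplified, hypothetical model: parent[i] is the parent of state i,
// and -1 means that the state has no parent.
std::vector<int> handler_order(const std::vector<int> &parent, int state)
{
    std::vector<int> order;
    // The size check on `order` guards against malformed (cyclic) tables.
    while (state >= 0
        && state < static_cast<int>(parent.size())
        && order.size() < parent.size())
    {
        order.push_back(state);   // consult this state's handler first
        state = parent[state];    // then move on to its parent
    }
    return order;
}
```

For the example above, the table is {-1, 0}: starting from state 1 yields the order 1, 0, after which the exception leaves the function.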
There is only one more piece to the puzzle: given that an exception
could occur anywhere, how does the exception machinery know which
exception handler should be consulted first? I.e., for every address
within a function with C++ exception metadata, what is the current
exception state? Microsoft C++/x64 binaries provide this information in
the IPtoStateMap metadata table, which is an array of address
ranges and their corresponding state numbers.
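Conceptually, finding the current state for an address is a search over sorted ranges. The sketch below assumes a simplified model in which each entry records where a range starts and the range extends to the start of the next entry; the struct and field names are illustrative and do not reproduce the on-disk format.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>
#include <vector>

// Simplified model of one map entry. The state applies from `start`
// up to (but not including) the next entry's `start`.
struct IpStateEntry
{
    std::uint32_t start;  // offset of the first instruction in the range
    int state;            // state number; -1 means no active state
};

// Returns the state for `ip`, assuming `map` is sorted by `start`.
int state_at(const std::vector<IpStateEntry> &map, std::uint32_t ip)
{
    // Find the first entry that starts strictly after ip.
    auto it = std::upper_bound(map.begin(), map.end(), ip,
        [](std::uint32_t value, const IpStateEntry &e) { return value < e.start; });
    if (it == map.begin())
        return -1;                 // ip precedes every recorded range
    return std::prev(it)->state;   // the entry just before it covers ip
}
```

With entries starting at 0x00 (state -1), 0x20 (state 0), 0x40 (state 1), and 0x60 (state -1), an exception raised at offset 0x45 would be looked up as state 1, so state 1's handler is consulted first.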
GUI OPERATION
This support is fully automated and requires no user interaction. However, the user can customize the display of C++ exception metadata elements for the global database, as well as for individual functions.
GLOBAL SETTINGS
Under the Edit -> Other -> C++ exception display settings menu item,
the user can edit the default settings to control which exception
constructs are shown in the listing. These are saved persistently in the
database (i.e., the user’s choices are remembered after saving,
closing, and re-opening), and can also be adjusted on a per-function
basis (described later).
The settings on the dialog are as follows:
- Default output mode:
When the plugin is able to represent C++ exception constructs via nice constructs like try, catch, __wind, and __unwind in the listings, these are called “structured” exception states. The plugin is not always able to represent exception metadata nicely, and may instead be forced to represent the metadata via helper calls in the listing (which are called “unstructured” states). As these can be messy and distracting, users may prefer not to see them by default. Alternatively, the user may prefer to see no exception metadata whatsoever, not even the structured ones. This setting allows the user to specify which types of metadata will be shown in the listing.
- Show wind states:
We discussed wind states and unwind handlers in the background material. Although these states can be very useful when reverse engineering C++ binaries (particularly when analyzing constructors), displaying them increases the amount of code in the listing, and sometimes the information they provide is more redundant than useful. Therefore, this checkbox allows the user to control whether they are shown by default.
- Inform user of hidden states:
The two settings just discussed can cause unstructured and/or wind states to be omitted from the default output. If this checkbox is enabled, then the plugin will inform the user of these omissions via messages at the top of the listing, such as this message indicating that one unstructured wind state was omitted:
// Hidden C++ exception states: #wind_helpers=1
The contents of these messages depend upon the settings from above, and upon which states were hidden from display. If output of exception blocks is entirely disabled, the messages will not appear, even if this setting is enabled.
The following notes all assume that output is enabled.
- If the settings indicated that unstructured states should be hidden, and there were hidden (unstructured) try states, the message will say something like #try_helpers=2.
- If the settings indicated that all wind states should be hidden, then the message might say something like #wind=3 to indicate the total number of hidden wind states.
- If the settings indicated that wind states should be shown, but that unstructured states should be hidden, then the message might show something like #wind_helpers=1 to indicate that one unstructured wind state was hidden.
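For scripted triage of saved listings, these messages can be read mechanically. The sketch below is an assumption-laden illustration: parse_hidden_states is a hypothetical helper, and it assumes only that each counter appears as a #name=number token, as in the examples above. How several counters are laid out on one line is not specified here.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <string>

// Extracts every "#name=number" token from a message line into a
// name -> count map. Tokens without digits after '=' are ignored.
std::map<std::string, int> parse_hidden_states(const std::string &line)
{
    std::map<std::string, int> counts;
    std::size_t pos = 0;
    while ((pos = line.find('#', pos)) != std::string::npos)
    {
        std::size_t eq = line.find('=', pos);
        if (eq == std::string::npos)
            break;                                  // no more counters
        std::string name = line.substr(pos + 1, eq - pos - 1);
        std::size_t end = eq + 1;
        int value = 0;
        bool has_digits = false;
        while (end < line.size()
            && std::isdigit(static_cast<unsigned char>(line[end])))
        {
            value = value * 10 + (line[end] - '0');
            has_digits = true;
            ++end;
        }
        if (has_digits && !name.empty())
            counts[name] = value;
        pos = end;                                  // continue after this token
    }
    return counts;
}
```

Applied to the message shown earlier, the result contains a single entry mapping wind_helpers to 1.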
There are three more elements on the settings dialog; most users should never have to use them. However, for completeness, we will describe them now.
- Warning behavior:
When internal warnings occur, they will either be printed to the output window at the bottom, or shown as a pop-up warning message box depending on this setting.
- Reset per-function settings:
The next section will discuss how the display settings described above can be customized on a per-function basis. This button allows the user to erase all such saved settings, such that all functions will use the global display settings the next time they are decompiled.
- Rebuild C++ metadata caches:
Before the plugin can show C++ exception metadata in the output, it must pre-process the metadata across the whole binary. Doing so crucially relies upon the ability to recognize the __CxxFrameHandler3 and/or __CxxFrameHandler4 unwind handler functions when they are referenced by the binary’s unwind metadata. If the plugin fails to recognize one of these functions, then it will be unable to display C++ exception metadata for any function that uses the unrecognized unwind handler(s).
If the user suspects that a failure like this has taken place (say, because they expect to see a try/catch in the output and it is missing, and they have confirmed that the output was not simply hidden due to the display settings above), then this button may help them to diagnose and repair the issue. Pressing this button flushes the existing caches from the database and rebuilds them. It also prints output to tell the user which unwind handlers were recognized and which ones were not. The user can use these messages to check whether the function’s corresponding unwind handler was recognized. If it was not recognized, the user can rename the unwind handler function to something that contains one of the two aforementioned names, and then rebuild the caches again.
Note that users should generally not need to use this button, as the plugin tries several methods to recognize the unwind handlers (such as FLIRT signatures, recognizing import names, and looking at the destination of “thunk” functions with a single jmp to a destination function). If the user sees any C++ exception metadata in the output, this almost always means that the recognition worked correctly. This button should only be used by experienced users as a last resort. Users are advised to save their database before pressing this button, and to keep the changes only if renaming unwind handlers and rebuilding the caches restores the missing metadata in the output.
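The renaming advice above relies on a name containing one of the two handler names. As a sketch of that substring rule only (the plugin's actual recognition logic also uses the other methods listed above and is not reproduced here), a hypothetical classifier could look like this:

```cpp
#include <cassert>
#include <string>

// Hypothetical classification of an unwind handler by its name.
enum class FrameHandler { Unknown, V3, V4 };

// Returns which handler a function name refers to, based purely on
// whether the name contains one of the two known handler names.
FrameHandler classify_handler_name(const std::string &name)
{
    if (name.find("__CxxFrameHandler4") != std::string::npos)
        return FrameHandler::V4;
    if (name.find("__CxxFrameHandler3") != std::string::npos)
        return FrameHandler::V3;
    return FrameHandler::Unknown;
}
```

Under this rule, an auto-generated name such as sub_140001000 is not recognized, whereas a renamed function such as my___CxxFrameHandler4 is.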
CONFIGURATION
The default options for the settings just described are controlled via
the %IDADIR%/cfg/eh34.cfg configuration file. Editing this file will
change the defaults for newly-created databases (but not affect existing
databases).
PER-FUNCTION SETTINGS
As just discussed, the user can control which C++ exception metadata is displayed in the output via the global menu item. Users can also customize these settings on a per-function basis (say, by enabling display of wind states for selected functions only), and they will be saved persistently in the database.
When a function has C++ exception metadata, one or more items will appear on Hex-Rays’ right click menu. The most general one is “C++ exception settings…”. Selecting this menu item will bring up a dialog that is similar to the global settings menu item with the following settings:
- Use global settings.
If the user previously changed the settings for the function, but wishes that the function be shown via the global settings in the future, they can select this item and press “OK”. This will delete the saved settings for the function, causing future decompilations to use the global settings.
- This function’s output mode:
This functions identically to “Default output mode” from the global settings dialog, but only affects the current function.
- Show wind states:
Again, identical to the global settings dialog item.
There is a button at the bottom, “Edit global settings”, which is
simply a shortcut to the same global settings dialog from the
Edit -> Other -> C++ exception display settings menu item.
The listing will automatically refresh if the user changes any settings.
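The relationship between global and per-function settings amounts to a lookup with a fallback. The sketch below models that behavior under stated assumptions: the type names, the enum values, and the use of a function address as the key are all hypothetical, and the sketch says nothing about how the plugin actually stores its settings.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical names for the three output modes described earlier.
enum class OutputMode { None, StructuredOnly, All };

struct DisplaySettings
{
    OutputMode mode;
    bool show_wind;
};

struct SettingsStore
{
    DisplaySettings global;
    std::map<std::uint64_t, DisplaySettings> per_function;  // keyed by function address

    // Per-function settings win; otherwise fall back to the global ones.
    DisplaySettings effective(std::uint64_t func) const
    {
        auto it = per_function.find(func);
        return it != per_function.end() ? it->second : global;
    }

    // Models "Use global settings": forget the override for one function.
    void use_global(std::uint64_t func) { per_function.erase(func); }

    // Models "Reset per-function settings": forget all overrides.
    void reset_all() { per_function.clear(); }
};
```

In this model, deleting an override is all that is needed for a function to pick up later changes to the global settings.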
Additionally, there are four other menu items that may or may not appear, depending upon the metadata present and whether the settings caused any metadata to be hidden. These menu items are shortcuts to editing the corresponding fields in the per-function settings dialog just discussed. They are:
- Show unstructured C++ states:
If the global or per-function default output setting was set to “Structured only”, and the function had unstructured states, this menu item will appear. Clicking it will enable display of unstructured states for the function and refresh the decompilation.
- Hide unstructured C++ states:
The inverse of the above: clicking it will hide unstructured states for the function and refresh the decompilation.
- Show wind states:
If the global or per-function “Show wind states” setting was disabled, and the function had wind states, this menu item will appear. Clicking it will enable display of wind states for the function and refresh the decompilation.
- Hide wind states:
The inverse of the above: clicking it will hide wind states for the function and refresh the decompilation.
KEYBOARD SHORTCUTS
The user can change (add, remove, or edit) the keyboard shortcuts for
the per-function settings right-click menu items from the
Edit -> Options -> Shortcuts dialog. The names of the corresponding
actions are:
| Title | Name |
|---|---|
| “Show unstructured C++ states” | eh34:enable_unstructured |
| “Hide unstructured C++ states” | eh34:disable_unstructured |
| “Show wind states” | eh34:enable_wind |
| “Hide wind states” | eh34:disable_wind |
| The global settings dialog | eh34:config_menu |
HELPER CALLS
Hex-Rays’ Microsoft C++ x64 exception support tries to hide low-level
details about exception state numbers as much as possible. However,
compiler optimizations can cause binaries to diverge from the original
source code. For example, inlined functions can produce goto
statements in the decompilation despite there being none in the source.
Optimizations can also cause C++ exception metadata to differ from the
original code. As a result, it is not always possible to represent
try, catch, wind, and unwind constructs as scoped regions that
hide the low-level details.
In these cases, Hex-Rays’ Microsoft C++ x64 exception support will
produce helper calls with informative names to indicate when exception
states are entered and exited, and to ensure that the user can see the
bodies of catch and unwind handlers in the output. The user can
hover their mouse over those calls to see their descriptions. They are
also catalogued below.
The following helper calls are used when exception states have multiple entrypoints, or multiple exits:
| Function name | Description |
|---|---|
| __eh34_enter_wind_state(s1, s2) | switch state from parent state s1 to child wind state s2 |
| __eh34_enter_try_state(s1, s2) | switch state from parent state s1 to child try state s2 |
| __eh34_exit_wind_state(s1, s2) | switch state from child wind state s1 to parent state s2 |
| __eh34_exit_try_state(s1, s2) | switch state from child try state s1 to parent state s2 |
The following helper calls are used when exception states had single
entry and exit points, but could not be represented via try or
__wind keywords:
| Function name | Description |
|---|---|
| __eh34_wind(s1, s2) | switch state from parent state s1 to child state s2; a new C++ object that requires a destructor has been created |
| __eh34_try(s1, s2) | switch state from parent state s1 to child state s2; mark beginning of a try block |
The following helper calls are used to display catch handlers for
exception states that could not be represented via the catch
keyword:
| Function name | Description |
|---|---|
| __eh34_catch(s) | beginning of catch blocks at state s; s corresponds to the second argument of the matching try call (if present) |
| __eh34_catch_type(s, "handler_address") | a catch statement for the type described at "handler_address" |
| __eh34_catch_ellipsis(s) | “catch all” statement |
The following helper calls should normally be removed from the final
output; if they do appear, they signify the boundary of a catch handler:
| Function name | Description |
|---|---|
| __eh34_try_continuation(s, i, ea) | end of catch handler i for state s, returning to address ea |
| __eh34_caught_type(s, "handler_address") | a pairing call for __eh34_catch_type when catch handler has no continuation |
| __eh34_caught_ellipsis(s) | “caught all”, paired with __eh34_catch_ellipsis when catch handler has no continuation |
The following helper calls are used to display unwind handlers for
exception states that could not be represented via the __unwind
keyword:
| Function name | Description |
|---|---|
| __eh34_unwind(s) | destruct the C++ object created immediately before entering state s; s corresponds to the second argument of the matching wind call (if present) |
The following helper calls are used to signify that an unwind
handler has finished executing, and will transfer control to a parent
exception state (or outside of the function):
| Function name | Description |
|---|---|
| __eh34_continue_unwinding(s1, s2) | after unwinding at child state s1, switch to parent state s2 and perform its unwind or catch action |
| __eh34_propagate_exception_into_caller(s1, s2) | after unwinding at child state s1, switch to root state s2; this corresponds to the exception being propagated into the calling function |
The following helper call is used when the exception metadata did not
specify a function pointer for an unwind handler, which causes
program termination:
| Function name | Description |
|---|---|
| __eh34_no_unwind_handler(s) | the state s did not have an unwind handler, which causes program termination in the event that an exception reaches it |
The following helper calls are used to signify that Hex-Rays was unable to display an exception handler in the decompilation:
| Function name | Description |
|---|---|
| __eh34_unwind_handler_absent(s, ea) | could not inline unwind handler at ea for wind state s |
| __eh34_catch_handler_absent(s, i, ea) | could not inline i’th catch handler at ea for try state s |
BOUNDARIES OF EXCEPTION REGIONS
From Microsoft Visual Studio versions 2005 (toolchain version 8.0) to 2017 Service Pack 2 (version 14.12), the compiler emitted detailed metadata that precisely defined the boundaries of all exception regions within C++ functions. This made binary files large, and not all of the metadata was strictly necessary for the runtime library to handle C++ exceptions correctly.
Starting from MSVC 2017 Service Pack 3 (version 14.13), the compiler
began applying optimizations to reduce the size of the C++ exception
metadata. An official Microsoft blog entry entitled:
“Making C++ Exception Handling Smaller on x64”
describes the first change as “dropping metadata for regions of code
that cannot throw and folding logically identical states”. (Version
14.23 later introduced the __CxxFrameHandler4 metadata format to
compress the metadata further.)
As a result of these changes, the C++ exception metadata in MSVC 14.13+ binaries is no longer fully precise. Exception states are frequently reported as beginning physically after where the source code would indicate. In order to produce usable output, Hex-Rays employs mathematical optimization algorithms to reconstruct more detailed C++ exception metadata configurations that can be displayed in a nicer format in the decompilation. These algorithms improve the listings by producing more structured regions and fewer helper calls in the output, but they introduce further imprecision as to the true starting and ending locations of exception regions when compared to the source code. They are an integral part of Hex-Rays C++/x64 Windows exception metadata support and cannot be disabled.
The takeaway is that, when processing MSVC 14.13+ binaries, Hex-Rays
C++/x64 Windows exception support frequently produces try and
__unwind blocks that begin and/or end earlier and/or later than
what the source code would indicate, were it available. This has
important consequences for vulnerability analysis.
For example, given accurate exception boundary information, the
destructor for a local object would ordinarily be situated after the end
of that object’s __wind and __unwind blocks, as in:
Object::Constructor(&v14);
__wind
{
    // ...
}
__unwind
{
    Object::Destructor(&v14);
}
// HERE: destructor after __wind region
Object::Destructor(&v14);
Yet, due to the imprecise boundary information, Hex-Rays might display
the destructor as being inside of the __wind block:
Object::Constructor(&v14);
__wind
{
    // ...
    // HERE: destructor inside of __wind region
    Object::Destructor(&v14);
}
__unwind
{
    Object::Destructor(&v14);
}
The latter output might indicate that v14’s destructor would be
called twice if its destructor were to throw an exception. However, this
indication is simply the result of imprecise exception region boundary
information. In short, users should be wary of diagnosing software bugs
or security issues based upon the positioning of statements near the
boundaries of try and __wind blocks. The example above
indicates something that might appear to be a bug in the code – a
destructor being called twice – but is in fact not one.
These considerations primarily apply when analyzing C++ binaries compiled with MSVC 14.13 or greater. They do not apply as much to binaries produced by MSVC 14.12 or earlier, when the compiler emitted fully precise information about exception regions.
Although Hex-Rays may improve its detection of exception region
boundaries in the future, because modern binaries lack the ground truth
of older binaries, the results will never be fully accurate. If the
imprecision is unacceptable to you, we recommend permanently disabling
C++ metadata display via the eh34.cfg file discussed previously.
MISCELLANEOUS
Hex-Rays’ support for exceptions in Microsoft Visual C++/x64 only works after auto-analysis has been completed. While auto-analysis is still running, users can explore the database and decompile functions as usual, but no C++ exception metadata will be shown. Users are advised to refresh any decompilation windows after auto-analysis has completed.
If users have enabled display of wind states, they may see empty
__wind or __unwind constructs in the output. This usually does not
indicate that an error occurred; rather, the region of the code
corresponding to the wind state was very small or contained dead code,
and Hex-Rays’ normal analysis and transformation made it empty.
Starting in IDA 9.0, IDA’s auto-analysis preprocesses C++ exception
metadata differently than in previous versions. In particular, on
MSVC/x64 binaries, __unwind and catch handlers are created as
standalone functions, not as chunks of their parent function as in
earlier versions. This is required to display the exception metadata
correctly in the decompilation. For databases created with older
versions, the plugin will still show the outline of the exception
metadata, but the bodies of the __unwind and catch handlers
will be displayed via the helper calls
__eh34_unwind_handler_absent and
__eh34_catch_handler_absent, respectively. The plugin will also
print a warning at the top of the decompilation such as Absent C++ exception handlers: # catch=1 (pre-9.0 IDB) in these situations.
Re-creating the IDB with a newer version will solve those issues,
although users might still encounter absent handlers in new databases
(rarely, and under different circumstances).