Title: Paper notes: Towards generic deobfuscation of Windows API calls
Date: 2018-02-15 14:00

- Complete title: Towards Generic Deobfuscation of Windows API Calls
- PDF:
	[3d2a34bac6eef17f3b556b0e02600c44059cf66c925616db1ef0beb479faf9f5_towards_generic_deobfuscation_of_Winapi_calls.pdf]({static}/files/papers/3d2a34bac6eef17f3b556b0e02600c44059cf66c925616db1ef0beb479faf9f5_towards_generic_dobfuscation_of_Windows_api_calls.pdf)

Malware often obfuscates its Windows API calls, typically by populating the
`Import Address Table` (IAT) at runtime after manually parsing the
`ExportTable` of the relevant DLLs.
This can be defeated by (boring) static analysis, or by dynamic analysis such
as sandboxes (which malware can often detect).

To automate the deobfuscation, the paper uses the arguments passed to API
calls. For example, [RegCreateKeyEx](https://msdn.microsoft.com/en-us/library/windows/desktop/ms724844(v=vs.85).aspx):

```C
LONG WINAPI RegCreateKeyEx(
  _In_       HKEY                  hKey,
  _In_       LPCTSTR               lpSubKey,
  _Reserved_ DWORD                 Reserved,
  _In_opt_   LPTSTR                lpClass,
  _In_       DWORD                 dwOptions,
  _In_       REGSAM                samDesired,
  _In_opt_   LPSECURITY_ATTRIBUTES lpSecurityAttributes,
  _Out_      PHKEY                 phkResult,
  _Out_opt_  LPDWORD               lpdwDisposition
);
```

This function takes 9 arguments, some of which are constants drawn from a small
keyspace, which makes it possible to recognise the function with some degree of
confidence.
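
To make the intuition concrete, here is a minimal sketch of scoring an argument vector against what `RegCreateKeyEx` would plausibly receive. This is not the paper's model: the function name `looks_like_reg_create_key_ex`, the choice of checks and the equal weighting are all assumptions made for illustration. The numeric constants are the usual values from the Windows headers.

```python
# Illustrative heuristic only; the paper uses learned models, not fixed rules.

# Predefined registry root handles (HKEY_* pseudo-handles).
PREDEFINED_HKEYS = {
    0x80000000,  # HKEY_CLASSES_ROOT
    0x80000001,  # HKEY_CURRENT_USER
    0x80000002,  # HKEY_LOCAL_MACHINE
    0x80000003,  # HKEY_USERS
    0x80000005,  # HKEY_CURRENT_CONFIG
}

# REG_OPTION_NON_VOLATILE, _VOLATILE, _CREATE_LINK, _BACKUP_RESTORE.
REG_OPTIONS = {0x0, 0x1, 0x2, 0x4}

# Simplification: only accept access masks contained in KEY_ALL_ACCESS.
KEY_ALL_ACCESS = 0xF003F


def looks_like_reg_create_key_ex(args):
    """Return the fraction of checks (0.0 to 1.0) that the arguments pass."""
    if len(args) != 9:
        return 0.0
    hkey, _sub_key, reserved, _cls, options, sam = args[:6]
    checks = [
        hkey in PREDEFINED_HKEYS,   # may also be a handle opened earlier
        reserved == 0,              # Reserved must be zero
        options in REG_OPTIONS,
        sam != 0 and (sam & ~KEY_ALL_ACCESS) == 0,
    ]
    return sum(checks) / len(checks)


# Hypothetical argument vector recovered from a call site.
candidate = [0x80000002, 0x403000, 0, 0, 0, 0xF003F, 0, 0x12FF70, 0]
print(looks_like_reg_create_key_ex(candidate))
```

A score rather than a yes/no answer is used because a legitimate call can fail some checks, for instance when `hKey` is a handle returned by an earlier call instead of a predefined root.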

Throwing some machine learning at the problem, namely
[HMM](https://en.wikipedia.org/wiki/Hidden_Markov_model) and
[MLR](https://en.wikipedia.org/wiki/Multinomial_logistic_regression),
yields around 80% accuracy. Getting to the calls is a simple matter of
symbolic execution: because it's a proof of concept, the paper uses a very simple
engine, which also avoids path explosion, since only `push/pop/mov/lea/…` need to be
handled to recover the arguments of function calls on Windows x86.
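
As a rough illustration of why so few instructions matter, the sketch below walks a straight-line instruction trace and collects the values pushed before each `call`. It is a toy, not the paper's engine: the tuple-based instruction format, the names `recover_call_args` and `resolve`, and the assumption that every value on the tracked stack belongs to the next call (stdcall-style, arguments pushed right to left) are all hypothetical simplifications.

```python
def resolve(operand, regs):
    """Return a concrete value for an operand, or None if it is unknown."""
    if isinstance(operand, int):
        return operand          # immediate value
    return regs.get(operand)    # register: known value, or None (symbolic)


def recover_call_args(instructions):
    """Collect (target, args) for each call in a straight-line trace."""
    regs = {}     # register name -> known value
    stack = []    # values pushed since the last call
    calls = []
    for mnemonic, *operands in instructions:
        if mnemonic in ("mov", "lea"):
            dst, src = operands
            regs[dst] = resolve(src, regs)
        elif mnemonic == "push":
            stack.append(resolve(operands[0], regs))
        elif mnemonic == "pop":
            regs[operands[0]] = stack.pop() if stack else None
        elif mnemonic == "call":
            # The last value pushed is the first argument.
            calls.append((operands[0], list(reversed(stack))))
            stack.clear()   # assumption: the callee cleans up its arguments
        # Any other instruction is ignored in this sketch.
    return calls


# Hypothetical trace: three arguments, one of them unknown (ebx).
trace = [
    ("mov", "eax", 0x80000002),
    ("push", 0x20019),
    ("push", "ebx"),
    ("push", "eax"),
    ("call", "esi"),
]
print(recover_call_args(trace))
```

Unknown values are kept as `None` rather than guessed, so a later classifier can treat them as missing features.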

The code and the results related to the publication are [openly
available](https://github.com/cylance/winapi-deobfuscation)
([local mirror]({static}/files/papers/towards_generic_dobfuscation_of_Windows_api_calls.zip), `2018-02-15`).
