Artificial truth

The more you see, the less you believe.


Paper notes: Towards generic deobfuscation of Windows API calls
Thu 15 February 2018 — download

Malware often obfuscates its API calls, typically by populating the Import Address Table (IAT) at runtime after manually parsing the export table of a DLL. This can be defeated by (boring) static analysis, or by dynamic analysis such as sandboxes, which malware can often detect.

To automate the deobfuscation, the paper identifies API functions from the arguments passed to them. For example, RegCreateKeyEx takes:

  _In_       HKEY                  hKey,
  _In_       LPCTSTR               lpSubKey,
  _Reserved_ DWORD                 Reserved,
  _In_opt_   LPTSTR                lpClass,
  _In_       DWORD                 dwOptions,
  _In_       REGSAM                samDesired,
  _In_opt_   LPSECURITY_ATTRIBUTES lpSecurityAttributes,
  _Out_      PHKEY                 phkResult,
  _Out_opt_  LPDWORD               lpdwDisposition

This function takes 9 arguments, and some of them are constants drawn from a small set of possible values, making it possible to recognise the function with a reasonable degree of certainty.
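To make the intuition concrete, here is a minimal sketch of matching a call by its constant arguments. It is a naive scoring heuristic, not the model from the paper: the `SIGNATURES` table, `score` and `guess_api` are names made up for this illustration, and the table only lists a few well-known Windows constants. Arguments that cannot be recovered statically (pointers, handles) are represented as `None`.

```python
# Toy signatures: argument count plus, for some argument positions, the set
# of constants that position commonly takes. Hypothetical table, not the
# paper's model.
HKEY_ROOTS = {0x80000000, 0x80000001, 0x80000002, 0x80000003}

SIGNATURES = {
    "RegCreateKeyEx": {
        "argc": 9,
        "constants": {
            0: HKEY_ROOTS,                   # hKey: predefined root keys
            2: {0},                          # Reserved: must be zero
            4: {0, 1},                       # REG_OPTION_NON_VOLATILE / _VOLATILE
            5: {0x20019, 0x20006, 0xF003F},  # KEY_READ, KEY_WRITE, KEY_ALL_ACCESS
        },
    },
    "CreateFile": {
        "argc": 7,
        "constants": {
            1: {0x80000000, 0x40000000, 0xC0000000},  # GENERIC_READ / _WRITE / both
            4: {1, 2, 3, 4, 5},                       # CREATE_NEW .. TRUNCATE_EXISTING
        },
    },
}


def score(args, signature):
    """Fraction of constant positions whose observed value is in the known set."""
    if len(args) != signature["argc"]:
        return 0.0
    constants = signature["constants"]
    hits = sum(1 for index, allowed in constants.items() if args[index] in allowed)
    return hits / len(constants)


def guess_api(args):
    """Return the best-scoring API name and its score for an argument list."""
    scores = {name: score(args, sig) for name, sig in SIGNATURES.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# HKEY_LOCAL_MACHINE, unknown pointer, 0, unknown, 0, KEY_ALL_ACCESS, unknowns
observed = [0x80000002, None, 0, None, 0, 0xF003F, None, None, None]
print(guess_api(observed))  # ('RegCreateKeyEx', 1.0)
```

A real classifier has to cope with missing arguments and with functions that share similar constants, which is where a statistical model earns its keep.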

Machine learning with hidden Markov models (HMM) and MLR yields around 80% accuracy. Recovering the arguments at each call site is a matter of symbolic execution. Because this is a proof of concept, the paper uses a very simple engine, which avoids path explosion; that is enough because only a handful of instructions (push/pop/mov/lea/…) are needed to set up a function call on Windows x86.
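As a rough illustration of how little is needed, the sketch below tracks registers and the stack over a straight-line instruction sequence and reports what sits on the stack when a call is reached. It is not the paper's engine: the tuple-based instruction format, `resolve` and `recover_call_args` are assumptions made for this example, only `mov`, `push`, `pop` and `call` are handled, and anything it cannot evaluate (memory operands, unset registers) becomes `None`.

```python
def resolve(operand, regs):
    """Return the concrete value of an operand, or None when it is unknown."""
    if isinstance(operand, int):
        return operand            # immediate value
    return regs.get(operand)      # register name; memory operands stay unknown


def recover_call_args(instructions, argc):
    """Follow mov/push/pop until the first call and return its arguments.

    Arguments are pushed right to left under stdcall, so the last value
    pushed is the first argument.
    """
    regs = {}
    stack = []
    for op, *operands in instructions:
        if op == "mov":
            dst, src = operands
            regs[dst] = resolve(src, regs)
        elif op == "push":
            stack.append(resolve(operands[0], regs))
        elif op == "pop":
            regs[operands[0]] = stack.pop() if stack else None
        elif op == "call":
            args = stack[::-1][:argc]
            return args + [None] * (argc - len(args))
    return None  # no call found in this sequence


# Hypothetical call site: three arguments, one of them passed through eax.
snippet = [
    ("push", 0),
    ("mov", "eax", 0xF003F),
    ("push", "eax"),
    ("push", 0x80000002),
    ("call", "esi"),              # obfuscated target, name unknown
]
print(recover_call_args(snippet, 3))  # [2147483650, 983103, 0]
```

The recovered list is exactly the kind of input a classifier over argument values needs, with `None` marking whatever could not be resolved statically.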

The code and the results related to the publication are openly available (local mirror, 2018-02-15).