Artificial truth

The more you see, the less you believe.

[archives] [latest] | [homepage] | [atom/rss]

Paper notes: Breaking the x86 ISA
Sun 01 October 2017 — download

The paper Christopher Domas (that wrote the movfuscator, REpsych and cantor dust) is about the exhaustive exploration of the x86 ISA (Instruction Set Architecture) using page-fault and guided fuzzing.

Fuzzing

The set of every possible x86 instruction is too large to be exhausted by pure brute-force (instructions from 1 to 15 bytes long, so 1329227995784915872903807060280344576 possibilities), so to achieve exhaustive coverage, the following approach is used, on a 15 bytes buffer, initially filled with 0:

  1. Execute the instruction in the buffer, and observe its effective length
  2. The byte at the end of the instruction is incremented,
  3. Repeat the process with the resulting new instruction,
  4. In case of a change of the length or an exception, the incrementation will start from the new end.
  5. If the increment reaches 256, the incrementation moves to the previous byte.

For example, for 0…0 the observed length is 2, so the next instruction to be executed will be 010…0.

This approach skips less significant portions of the instruction, such as immediate value or displacements

Measuring the length of the instruction

Using the trap instruction and comparing eip doesn't work for instructions that are generating exceptions, since they do not execute.

A more reliable way to measure the length of the instruction is to place the first byte of the instruction on an executable page, and the rest of it in a non-executable one. If a page fault occur during fetching, the processor trigger an exception, and the address of the page boundary is reported in a register, allowing to iteratively find the size of the instruction (either until there is not exception, or if the reported address isn't the boundary). Valid and invalid instructions can be told apart because the later will raise an UD (invalid opcode) exception upon execution. Privileged ones will raise a GP one.

To prevent self-corruption, the process is running in ring 3, with every register set to zero and every possible exception hooked. A side-effect of vastly ignoring different immediate values (like inc [0x0804a10c]) is that this reduces the odds of corruption, since there is nothing mapped at low level addresses.

To find interesting instructions, the length of the instruction is compared with the length of the one decoded by capstone.

Results

  • QEMU, IDA, Objdump, capstone, VS, … are getting some instructions wrong, allowing to hide malicious code from the disassembly listing since it's interpreted wrongly.
  • The tool (called sandsifter) rediscovered the famous lock cmpxchg8b eax bug
  • Disparities were found against AMD an Intel processors
  • An halt and catch fire instruction that could be triggered from ring3 was found in an unnamed processor

As usual with decent research paper, the code related to the publication is openly available and documented (local mirror, 2017-09-13).