- Complete title: Breaking the x86 ISA
- PDF: 31dd6431a3dd0d26814b4ff4d453c20e8d1292eef2f3ec1f777b8e4ee1852d08_domas_breaking_the_x86_isa_wp.pdf
The paper Christopher Domas (that wrote the movfuscator, REpsych and cantor dust) is about the exhaustive exploration of the x86 ISA (Instruction Set Architecture) using page-fault and guided fuzzing.
Fuzzing
The set of every possible x86 instruction is too large to be exhausted by pure
brute-force (instructions from 1 to 15 bytes long, so
1329227995784915872903807060280344576 possibilities), so to achieve
exhaustive coverage, the following approach is used, on a 15 bytes buffer,
initially filled with 0:
- Execute the instruction in the buffer, and observe its effective length
- The byte at the end of the instruction is incremented,
- Repeat the process with the resulting new instruction,
- In case of a change of the length or an exception, the incrementation will start from the new end.
- If the increment reaches
256, the incrementation moves to the previous byte.
For example, for 0…0 the observed length is 2, so the next instruction to be
executed will be 010…0.
This approach skips less significant portions of the instruction, such as immediate value or displacements
Measuring the length of the instruction
Using the trap instruction and comparing eip doesn't work for instructions
that are generating exceptions, since they do not execute.
A more reliable way to measure the length of the instruction is to place the
first byte of the instruction on an executable page, and the rest of it
in a non-executable one. If a page fault occur during fetching, the processor
trigger an exception, and the address of the page boundary is reported in a register,
allowing to iteratively find the size of the instruction (either until there is
not exception, or if the reported address isn't the boundary). Valid and
invalid instructions can be told apart because the later will raise an UD (invalid
opcode) exception upon execution. Privileged ones will raise a GP one.
To prevent self-corruption, the process is running in ring 3, with every
register set to zero and every possible exception hooked. A side-effect of
vastly ignoring different immediate values (like inc [0x0804a10c]) is that
this reduces the odds of corruption, since there is nothing mapped at low level
addresses.
To find interesting instructions, the length of the instruction is compared with the length of the one decoded by capstone.
Results
- QEMU, IDA, Objdump, capstone, VS, … are getting some instructions wrong, allowing to hide malicious code from the disassembly listing since it's interpreted wrongly.
- The tool (called sandsifter) rediscovered the famous
lock cmpxchg8b eaxbug - Disparities were found against AMD an Intel processors
- An
halt and catch fireinstruction that could be triggered from ring3 was found in an unnamed processor
As usual with decent research paper, the code related to the publication is openly
available and documented
(local mirror, 2017-09-13).