Artificial truth

The more you see, the less you believe.

Reven workshop
Fri 20 November 2020

A couple of days ago, I had the pleasure of beta testing the Reven HITB lab. Reven is the flagship product of Tetrane: a timeless analysis program that captures everything going on in a virtual machine (cpu, memory, hardware interrupts, …) and displays it as a huge trace with a timeline, allowing the user to freely inspect what happened. This makes it both a powerful and super-weird tool, at least when you're not used to dealing with traces. It's kind of magical to see the network card writing bytes directly to a physical address, to be read later by the driver, or to watch a bluescreen painting the screen blue line by line, character by character.

Anyway, to illustrate the mind shift, let's take the following program:

#include <stdio.h>

void a(){
  puts("in a");
}

void b(){
  puts("in b");
  a();
}

int main(){
  puts("start of main");
  a();
  b();
  puts("end of main");
  return 0;
}

It'll likely look like this in your disassembler:

int main (int argc, char **argv, char **envp);
      0x00001169      55             push rbp
      0x0000116a      4889e5         mov rbp, rsp
      0x0000116d      488d3d9a0e00.  lea rdi, qword str.start_of_main ; 0x200e ; "start of main"
      0x00001174      e8b7feffff     call sym.imp.puts           ; int puts(const char *s)
      0x00001179      b800000000     mov eax, 0
      0x0000117e      e8b6ffffff     call sym.a
      0x00001183      b800000000     mov eax, 0
      0x00001188      e8bfffffff     call sym.b
      0x0000118d      488d3d880e00.  lea rdi, qword str.end_of_main ; 0x201c ; "end of main"
      0x00001194      e897feffff     call sym.imp.puts           ; int puts(const char *s)
      0x00001199      b800000000     mov eax, 0
      0x0000119e      5d             pop rbp
      0x0000119f      c3             ret
sym.a ();
      0x00001139      55             push rbp
      0x0000113a      4889e5         mov rbp, rsp
      0x0000113d      488d3dc00e00.  lea rdi, qword str.in_a     ; 0x2004 ; "in a"
      0x00001144      e8e7feffff     call sym.imp.puts           ; int puts(const char *s)
      0x00001149      90             nop
      0x0000114a      5d             pop rbp
      0x0000114b      c3             ret
sym.b ();
      0x0000114c      55             push rbp
      0x0000114d      4889e5         mov rbp, rsp
      0x00001150      488d3db20e00.  lea rdi, qword str.in_b     ; 0x2009 ; "in b"
      0x00001157      e8d4feffff     call sym.imp.puts           ; int puts(const char *s)
      0x0000115c      b800000000     mov eax, 0
      0x00001161      e8d3ffffff     call sym.a
      0x00001166      90             nop
      0x00001167      5d             pop rbp
      0x00001168      c3             ret

But in Reven, it looks like this instead:

      0x00001169      55             push rbp
      0x0000116a      4889e5         mov rbp, rsp
      0x0000116d      488d3d9a0e00.  lea rdi, qword str.start_of_main
      0x00001174      e8b7feffff     call sym.imp.puts           ; int puts(const char *s)
      0x00080d94      4156           push r14
      0x00080d96      4155           push r13
      ; More of `puts` implementation…
      0x00080ed8      415d           pop r13
      0x00080eda      415e           pop r14
      0x00080edc      c3             ret
      0x00001179      b800000000     mov eax, 0                  ; back into `main`
      0x0000117e      e8b6ffffff     call sym.a
      0x00001139      55             push rbp                    ; start of `a`
      0x0000113a      4889e5         mov rbp, rsp
      0x0000113d      488d3dc00e00.  lea rdi, qword str.in_a   
      0x00001144      e8e7feffff     call sym.imp.puts           ; int puts(const char *s)
      0x00080d94      4156           push r14
      0x00080d96      4155           push r13
      ; More of `puts` implementation…
      0x00080ed8      415d           pop r13
      0x00080eda      415e           pop r14
      0x00080edc      c3             ret                         ; end of `puts`
      0x00001149      90             nop
      0x0000114a      5d             pop rbp
      0x0000114b      c3             ret                         ; end of `a`
      0x00001183      b800000000     mov eax, 0
      0x00001188      e8bfffffff     call sym.b
      0x0000114c      55             push rbp                    ; start of `b`
      0x0000114d      4889e5         mov rbp, rsp
      0x00001150      488d3db20e00.  lea rdi, qword str.in_b    
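      0x00001157      e8d4feffff     call sym.imp.puts           ; int puts(const char *s)
      0x00080d94      4156           push r14
      0x00080d96      4155           push r13
      ; More of `puts` implementation…
      0x00080ed8      415d           pop r13
      0x00080eda      415e           pop r14
      0x00080edc      c3             ret                         ; end of `puts`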
      0x0000115c      b800000000     mov eax, 0
      0x00001161      e8d3ffffff     call sym.a
      0x00001139      55             push rbp                    ; start of `a`
      0x0000113a      4889e5         mov rbp, rsp
      0x0000113d      488d3dc00e00.  lea rdi, qword str.in_a     
      0x00001144      e8e7feffff     call sym.imp.puts           ; int puts(const char *s)
      0x00080d94      4156           push r14
      0x00080d96      4155           push r13
      ; More of `puts` implementation…
      0x00080ed8      415d           pop r13
      0x00080eda      415e           pop r14
      0x00080edc      c3             ret                         ; end of `puts`
      0x00001149      90             nop
      0x0000114a      5d             pop rbp
      0x0000114b      c3             ret                         ; end of `a`
      0x00001166      90             nop
      0x00001167      5d             pop rbp
      0x00001168      c3             ret                         ; end of `b`
      0x0000118d      488d3d880e00.  lea rdi, qword str.end_of_main
      0x00001194      e897feffff     call sym.imp.puts           ; int puts(const char *s)
      0x00080d94      4156           push r14
      0x00080d96      4155           push r13
      ; More of `puts` implementation…
      0x00080ed8      415d           pop r13
      0x00080eda      415e           pop r14
      0x00080edc      c3             ret                         ; end of `puts`
      0x00001199      b800000000     mov eax, 0
      0x0000119e      5d             pop rbp
      0x0000119f      c3             ret

For simplicity's sake, I didn't show the kernel-land parts that randomly pop up due to various interrupts (a packet is received on the network card, your process is scheduled, …). Apparently, they're working on a way to filter those out.
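
To make the shift from listing to timeline a bit more concrete, here is a toy sketch in C: the same call sequence as the example program, instrumented so that each function appends to a flat, time-ordered log. The `record`, `run_example` and `dump_events` helpers are made up for this illustration and have nothing to do with how Reven actually captures a trace.

```c
#include <stdio.h>

#define MAX_EVENTS 32

/* Flat, time-ordered log of what ran: a toy stand-in for a trace. */
static const char *events[MAX_EVENTS];
static int event_count = 0;

/* Append one event to the log, silently dropping it if the log is full. */
static void record(const char *what) {
    if (event_count < MAX_EVENTS)
        events[event_count++] = what;
}

static void a(void) {
    record("enter a");
    puts("in a");
    record("leave a");
}

static void b(void) {
    record("enter b");
    puts("in b");
    a();
    record("leave b");
}

/* Same call sequence as main() in the example program. */
void run_example(void) {
    record("enter main");
    puts("start of main");
    a();
    b();
    puts("end of main");
    record("leave main");
}

/* Print the recorded timeline, one event per line. */
void dump_events(void) {
    for (int i = 0; i < event_count; i++)
        printf("%2d  %s\n", i, events[i]);
}
```

Calling run_example() and then dump_events() lists `a` twice, once called from `main` and once from `b`, in the order things happened, which is the same shape as the trace above.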

Features

The traces Reven deals with are gigantic, on the order of tens of millions of instructions, if not tens of billions, yet moving around is not only super-snappy but also kinda intuitive once you've made peace with the idea that you're looking at a timeline. Plus, the built-in taint analyser makes it easy to instantly find what you're looking for:

  • Stuck in a loop and want to get out of it? Just search for any ip pointing inside it, and jump to its first occurrence in the trace to land on the first iteration.
  • Want to understand why a segfault happened? Just do a taint from the exception to the beginning of your trace.
  • Wondering where this value written in memory comes from? Just check the address's history to see all the reads/writes.
  • What's this buffer overflow about? Taint the cookie, and see what overwrote it, and where the value is coming from.
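
The loop trick from the first bullet boils down to a search over the timeline. A minimal sketch in C, assuming a trace is nothing more than an array of executed instruction pointers; `trace_entry` and `first_occurrence` are names invented for this illustration and are not Reven's API:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified trace entry: only the instruction pointer. */
typedef struct {
    uint64_t ip;
} trace_entry;

/* Return the index of the first executed instruction located at `ip`,
 * or -1 if that address never shows up in the trace. */
long first_occurrence(const trace_entry *trace, size_t len, uint64_t ip) {
    for (size_t i = 0; i < len; i++) {
        if (trace[i].ip == ip)
            return (long)i;
    }
    return -1;
}
```

A linear scan is only there to show the idea; given how snappy the real tool is on huge traces, it presumably relies on some kind of index instead.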

Workshop

The workshop was held over Jitsi, with a Jupyter notebook for each of the three proposed exercises.

The first one was pretty basic, and involved moving around in a dummy program to grasp how Reven works. It was a trace of a crash caused by a division by zero, which was trivial to diagnose: find the exception, look at the backtrace, taint some stuff, …

The second one was more interesting, and consisted of an analysis of CVE-2020-16898 (aka Bad Neighbor), based on Quarkslab's article and PoC: see the framebuffer being filled with the familiar blue of a bluescreen, find the crash, look at the stack cookie being overwritten, look at the write history of this offset, walk the backtrace to find which function is responsible for giving the faulty value to the memcpy, taint again, and see that the value is coming from the network card: it's user-controlled! An interesting part of the exercise was to diff the trace of the vulnerable version against the fixed one, to see what changed and how the fix was implemented, without having to know anything about the vulnerability.

The last exercise was trickier: it was supposed to be a small contest with Reven licenses for the people able to answer the following questions given a trace of a crash:

  • Why is the OS crashing?
  • Where is the faulty memory access coming from?
  • What is the root cause of the issue?

Here is how I solved it:

  1. Searched for the symbol KeBugCheckEx
  2. Noticed that KiGeneralProtectionFault was in its backtrace. [image: reven_1]
  3. Noticed that the general protection fault was raised while executing mov rcx, qword ptr [rdx + 0x18]
  4. Tainted rax, since there is a mov rdx, rax right above. I couldn't taint [rdx + 0x18], since the instruction never completed because of the exception.
  5. CfgAdtFormatPropertyBlock being a gigantic loop, I searched for the first occurrence of the address of one of its instructions, to find the beginning of the loop. [image: reven_2]
  6. Right above it, there is the end of a call to BCryptAlloc, which is likely allocating a buffer in a wrong way, leading to the KiGeneralProtectionFault. [image: reven_3]
  7. I jumped to the beginning of BCryptAlloc with % (like in vim)
  8. The code is doing some arithmetic on a short int before passing the result as an argument to BCryptAlloc. [image: reven_4]
  9. It's a convoluted multiplication by 6 on a short int, which can (and will) overflow, leading to an underallocated buffer:
movzx ebp, dx
; […]
movzx eax, dx
add ax, ax
lea edi, [rax + rbp]
add di, di
movzx ecx, di
call BCryptAlloc
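
To see why this arithmetic is dangerous, here is a small C sketch that mirrors the listing step by step. The function names are made up for the illustration, and this is a reading of the assembly above, not the actual driver source.

```c
#include <stddef.h>
#include <stdint.h>

/* Mirrors the listing: the additions done on ax and di wrap at 16 bits. */
uint16_t times_six_u16(uint16_t n) {
    uint16_t doubled = (uint16_t)(n + n);      /* add ax, ax           */
    uint32_t tripled = (uint32_t)doubled + n;  /* lea edi, [rax + rbp] */
    return (uint16_t)(tripled + tripled);      /* add di, di           */
}

/* What the size would be without the 16-bit truncation. */
size_t times_six_full(uint16_t n) {
    return (size_t)n * 6;
}
```

With n = 0x2aab (10923), times_six_full gives 65538 while times_six_u16 gives 2: a buffer sized with the 16-bit result is far smaller than what six bytes per unit would require.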

The issue was in fact CVE-2020-17087, discovered by Mateusz Jurczyk and Sergei Glazunov of Google Project Zero.

Conclusion

There are still some rough edges, like:

  • writes suddenly appearing in unrelated functions, because Reven doesn't show that it's the network card writing directly into physical addresses.
  • there is no way to collapse a loop, making it tedious to navigate out of large ones.
  • the lack of context when looking at the trace, like calls not being symbolized.
  • the UI is a bit confusing: it's not clear what is clickable and what happens upon a click.

But apart from those minor inconveniences, it's really neat to be able to tackle reverse engineering from a perspective other than a plain assembly listing, and to explore traces blazingly fast.

Tetrane is nice enough to provide documentation as well as some cool Reven playgrounds: I really recommend toying around with the BlueKeep one.