Title: Finding bugs in OpenMW with AFL++ and honggfuzz
Date: 2021-06-03 21:00

I write fuzzers at work, but since the internal tooling does a [lot of
magic](https://blog.bazel.build/2021/02/08/rules-fuzzing.html) fuelled by
incredible automation, it has little in common with how *regular* people
casually fuzz things. So I decided to try fuzzing [OpenMW](https://openmw.org) with
[AFL++](https://aflplus.plus) and [honggfuzz](https://honggfuzz.dev).

At first, I naïvely tried to fuzz the `openmw` binary by patching it to
[immediately exit](https://gitlab.com/OpenMW/openmw/-/issues/5936) after
loading all of its resources, but it is way too large and way too slow, and its
inputs are way too big, to be fuzzed in a meaningful way. On the bright side,
[esmtool](https://gitlab.com/OpenMW/openmw/-/wikis/development/architecture#apps),
[bsatool](https://gitlab.com/OpenMW/openmw/-/wikis/development/architecture#apps)
and [niftest](https://gitlab.com/OpenMW/openmw/-/wikis/development/architecture#apps)
are reasonably fast and small, making them suitable targets.

[Hristos](https://hristos.lol/), the person behind [Modding-OpenMW.com](https://modding-openmw.com/),
was kind enough to give me a sizeable dump of esp files for my starting corpus.
As for [NIF files](http://www.niftools.org/), I simply used all the ones from
Morrowind and Skyrim.

I tried to fuzz with [Address
Sanitizer](https://en.wikipedia.org/wiki/AddressSanitizer), but
OpenMW's codebase deals with too-large-to-be-true memory allocations by catching the
`std::bad_alloc` exception. This led to a ton of false positives,
since exceptions [aren't supported in ASAN yet](https://github.com/google/sanitizers/issues/295).
So I had to resort to using `AFL_HARDEN` instead.

Because we're in 2021,
[llvm12](https://releases.llvm.org/12.0.0/docs/ReleaseNotes.html) is packaged
in modern Linux distributions, meaning that it's possible to use AFL++' [LTO
instrumentation](https://aflplus.plus/docs/env_variables/#lto), as well as
[laf-intel](https://github.com/AFLplusplus/AFLplusplus/blob/stable/instrumentation/README.laf-intel.md)
passes without the hassle of having to compile LLVM/clang on my own.

```
$ export CC=~/dev/AFLplusplus/afl-clang-lto
$ export CXX=~/dev/AFLplusplus/afl-clang-lto++
$ export LD=~/dev/AFLplusplus/afl-ld-lto
$ export AFL_LLVM_LAF_ALL=1 
$ export AFL_HARDEN=1
$ cmake ..
$ make -j $(nproc) niftest esmtool bsatool
$ afl-fuzz -i ../fuzzin_nif -o ./out_nif -d -f /tmp/test.nif -- ./niftest --input-file /tmp/test.nif
$ afl-fuzz -i ../fuzzin_esm -o ./out_esm -d -x ../esp.dict -- ./esmtool dump -C -p -q @@
$ afl-fuzz -i ../fuzzin_bsa -o ./out_bsa -d -- ./bsatool list -l @@
```

I ran two AFL++ instances on my ageing
[i7-3520M](https://ark.intel.com/content/www/us/en/ark/products/64893/intel-core-i7-3520m-processor-4m-cache-up-to-3-60-ghz.html)
to check that everything was working, then moved to twelve instances on a
beefier [Xeon W-2135](https://ark.intel.com/content/www/us/en/ark/products/126709/intel-xeon-w-2135-processor-8-25m-cache-3-70-ghz.html).
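
The commands above each start a single fuzzer. A minimal sketch of how several instances can share one output directory, assuming AFL++'s `-M` (main) and `-S` (secondary) options are used; the helper name `afl_instance_cmds` and the instance names are hypothetical, and the function only prints the command lines instead of launching anything:

```shell
#!/bin/sh
# Print one afl-fuzz command line per instance: a single main (-M) instance
# plus secondaries (-S) that all share the same output directory.
# Sketch only: the helper and the instance names are made up for illustration.
afl_instance_cmds() {
    n="$1"; in_dir="$2"; out_dir="$3"; shift 3
    i=0
    while [ "$i" -lt "$n" ]; do
        if [ "$i" -eq 0 ]; then
            role="-M main"        # exactly one main instance
        else
            role="-S sec$i"       # every other instance is a secondary
        fi
        # "$*" is the target command line, e.g. ./esmtool dump -C -p -q @@
        echo "afl-fuzz -i $in_dir -o $out_dir $role -- $*"
        i=$((i + 1))
    done
}

# Dry run: show the commands for a 3-instance esmtool campaign.
afl_instance_cmds 3 ../fuzzin_esm ./out_esm ./esmtool dump -C -p -q @@
```

Each printed line can then be started in its own terminal or `tmux` window; extra options such as `-x ../esp.dict` would go before the `--`.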

After a couple of days of fuzzing, my *CPU provider* told me that I had to
reboot the machine as soon as possible, likely due to a kernel upgrade.
So I merged the output of all the AFL++ instances, ran
[fdupes](https://github.com/adrianlopezroche/fdupes) on it, and
tried to minimize the result with
[afl-cmin](https://github.com/AFLplusplus/AFLplusplus/blob/stable/afl-cmin),
but [it crashed](https://github.com/AFLplusplus/AFLplusplus/issues/919), so I
used honggfuzz instead. Unfortunately, honggfuzz doesn't like AFL++'
instrumentation, so I had to recompile my targets with
[`hfuzz-clang`](https://github.com/google/honggfuzz/tree/master/hfuzz_cc) and
[pc-guard](https://clang.llvm.org/docs/SanitizerCoverage.html) instrumentation to be able to run:

```
$ honggfuzz -M -i ../fuzz_in --output ../fuzz_in_minimized -- ./esmtool dump -C -p -q ___FILE___
```

Also, __always__ use `--output`, because if your
minimizer doesn't like your instrumentation for whatever reason, it may
consider all the files in your corpus to have a coverage of zero, and
will thus trash everything.
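
The merge-and-deduplicate step that precedes minimisation can be sketched as follows. This is a hedged illustration, not the exact commands used for this post: it assumes the usual AFL++ layout of one `queue/` directory per instance, and it names each copied file after its SHA-256 (via `sha256sum` from GNU coreutils) so that duplicates collapse on their own, which stands in for the separate `fdupes` pass. The helper name `merge_queues` and all paths are hypothetical.

```shell
#!/bin/sh
# Copy every queue entry from all instances under an AFL++ output directory
# into one flat directory, keeping a single copy of each distinct content.
# Files are named after their SHA-256, so identical inputs overwrite each
# other; this replaces running fdupes afterwards.
merge_queues() {
    out_dir="$1"; merged="$2"
    mkdir -p "$merged"
    for f in "$out_dir"/*/queue/id*; do
        [ -f "$f" ] || continue   # skip when the glob matched nothing
        sum=$(sha256sum "$f" | cut -d ' ' -f 1)
        cp "$f" "$merged/$sum"
    done
}

# Self-contained demo on a throwaway directory tree (paths are made up).
demo=$(mktemp -d)
mkdir -p "$demo/out/main/queue" "$demo/out/sec1/queue"
printf 'TES3' > "$demo/out/main/queue/id:000000,orig"
printf 'TES3' > "$demo/out/sec1/queue/id:000000,sync"   # same content
printf 'TES4' > "$demo/out/sec1/queue/id:000001,havoc"
merge_queues "$demo/out" "$demo/merged"
ls "$demo/merged" | wc -l   # two distinct inputs remain
rm -rf "$demo"
```

Because the merged directory is a copy, the original queues stay untouched if a later minimisation run goes wrong.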

I had around 20,000 files in my corpus, and since honggfuzz' minimisation
[doesn't take advantage of multiple
cores](https://github.com/google/honggfuzz/issues/401), it took around 4 hours
to minimize everything down to about 5,000 files.

After spending some time reading AFL++' [documentation](https://aflplus.plus/docs/)
and tuning [power-schedules](https://aflplus.plus/docs/power_schedules/),
I looked at [FuzzBench](https://www.fuzzbench.com/reports/index.html)
and switched to honggfuzz, since it performs roughly as well
without my having to manually launch a tuned fuzzer per core
to get everything rolling the way it should.

```
$ honggfuzz --threads $(nproc) -i ../fuzz_in_esm -x ../esp.dict -- ./esmtool dump -C -p -q ___FILE___
$ honggfuzz --threads $(nproc) -i ../fuzz_in_bsa -- ./bsatool list -l ___FILE___
$ honggfuzz --threads $(nproc) -i ../fuzz_in_nif -e nif -- ./niftest --input-file ___FILE___
```

In the end, I used a mixture of the two, to take advantage of honggfuzz'
high efficiency/complexity ratio as well as AFL++' interesting power schedules.
Moreover, honggfuzz ran at around 50 execs/s, while AFL++ reached around 175
execs/s.

All of this led to a handful of bugs:

- [an off-by-one in `esmtool`](https://gitlab.com/OpenMW/openmw/-/merge_requests/728)
- [a non-zero-terminated string in `bsatool`](https://gitlab.com/OpenMW/openmw/-/merge_requests/750)
- [a read heap-buffer overflow in `esmtool`](https://gitlab.com/OpenMW/openmw/-/merge_requests/751)
- [a read heap-buffer overflow in `esmtool`](https://gitlab.com/OpenMW/openmw/-/merge_requests/784)
- [a DoS in `niftest`](https://gitlab.com/OpenMW/openmw/-/merge_requests/814)
- [a crash in `esmtool`](https://gitlab.com/OpenMW/openmw/-/merge_requests/848)

Those are now fixed, mostly thanks to [elsid](https://gitlab.com/elsid)
handholding me into writing acceptable C++. My fuzzers' coverage hasn't
increased for a couple of days, so it's time to wrap up and publish this
blogpost.

The next steps are to make [esmtool compile MWScripts](https://gitlab.com/OpenMW/openmw/-/issues/5947)
to fuzz the compiler/interpreter, make `esmtool` faster, and maybe try [GitLab's
continuous fuzzing](https://docs.gitlab.com/ee/user/application_security/coverage_fuzzing/) at some point.

If you want to fuzz OpenMW on your own, I documented everything on the [wiki](https://gitlab.com/OpenMW/openmw/-/wikis/development/fuzzing)
and I would be happy to share my corpus.
