Title: Running a wargame and sleeping well at night
Date: 2022-01-16 20:00

In May 2016, [Mantis](https://twitter.com/MantisSTS) and I started
[websec.fr](https://websec.fr), with the help of 
[blotus](https://blot.me), [cutz](https://twitter.com/_cutz)
and [nurfed](https://twitter.com/nurfed1),
as previously announced and detailed [here]({filename}/ctf/websec.md).

Since skilled players are able to gain arbitrary PHP code execution on most of the levels,
sandboxing and isolation are in order, and they should be as simple as possible:
our free time is limited, and we'd rather spend it writing new challenges or doing interesting things instead
of maintaining custom brittle contraptions and debugging weird snowflake issues.
Most of the infrastructure is documented in various text files in `/root`: how
to deploy a level, how to access the backups, along with working exploits for all
the levels, so onboarding new admins is easy.

At the beginning, we were using [grsecurity](https://grsecurity.net), to take advantage of 
[RBAC](https://grsecurity.net/featureset/rbac),
[TPE](https://grsecurity.net/featureset/filesystem_hardening)
and [`PAX_MPROTECT`](https://en.wikibooks.org/wiki/Grsecurity/Appendix/Grsecurity_and_PaX_Configuration_Options#Restrict_mprotect.28.29)
to prevent players from introducing new native code and executing it.
But with grsecurity [going dark](https://grsecurity.net/passing_the_baton),
we had to spend some time trying to ghetto-replace™ the features we used.

Every challenge is running under a unique user, `levelXX`, 
via [php-fpm](https://www.php.net/manual/en/install.fpm.php), in an empty
read-only chroot. Unfortunately, some levels require sessions,
and the simplest way to store them is as files on the
filesystem, which thus has to be writable. Those levels are using a [bind mount
with `MS_NOEXEC`](https://man7.org/linux/man-pages/man2/mount.2.html), to
prevent the execution of new binaries in the chroot.
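
A quick way to make sure such a mount really ended up with `noexec` is to look at `/proc/mounts`. Here is a minimal Python sketch of that check; the function names and the example path are made up for illustration and aren't part of the actual setup:

```python
def mount_options(mounts_text, mount_point):
    """Return the set of mount options for mount_point, or None if absent.

    mounts_text uses the /proc/mounts layout:
    device mountpoint fstype options dump pass
    """
    options = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            # A later mount shadows an earlier one: keep the last match.
            options = set(fields[3].split(","))
    return options


def is_noexec(mounts_text, mount_point):
    """True only if mount_point is mounted and carries the noexec option."""
    options = mount_options(mounts_text, mount_point)
    return options is not None and "noexec" in options
```

On the box itself, this would be called with something like `is_noexec(open("/proc/mounts").read(), "/path/to/levelXX/sessions")`, where the path is a placeholder.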

There are also crontabs regularly deleting session/uploaded files, and everything else that could be lying there.
It's way simpler than trying to make use of [quotas](https://debian-handbook.info/browse/stable/sect.quotas.html).
PHP processes are short-lived and not re-used a lot; along with some
[disable\_functions](https://www.php.net/manual/en/ini.core.php#ini.disable-functions)/[disable\_classes](https://www.php.net/manual/en/ini.core.php#ini.disable-classes)
common sense, this prevents a lot of dumb resource-hungry mistakes.
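
The cleanup job doesn't need to be clever. As a rough idea of what such a crontab could run, here is a Python sketch that deletes everything older than a given age in a directory; the function name, the age threshold and the directory are assumptions, not what is actually deployed:

```python
import os
import time


def purge_old_files(directory, max_age_seconds, now=None):
    """Delete files under directory whose mtime is older than max_age_seconds.

    Symlinks are removed as links and never followed.
    Returns the list of removed paths.
    """
    now = time.time() if now is None else now
    removed = []
    for root, _dirs, files in os.walk(directory, followlinks=False):
        for name in files:
            path = os.path.join(root, name)
            try:
                # lstat: look at the entry itself, not at a symlink target.
                if now - os.lstat(path).st_mtime > max_age_seconds:
                    os.unlink(path)
                    removed.append(path)
            except FileNotFoundError:
                # Something else deleted it in the meantime; nothing to do.
                continue
    return removed
```

Running it every few minutes with a short `max_age_seconds` caps how much junk can pile up, without having to deal with quotas.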

We used to have [monit](https://mmonit.com/monit) solving the challenges every
hour, to make sure that everything worked, but in the end, it was brittle for
challenges requiring several requests, and completely useless for the rest:
nothing really broke, because PHP takes backward compatibility (too)
seriously and levels tend to be pretty immutable.

Since PHP is a shitshow prone to [local roots](https://www.ambionics.io/blog) and other assorted happy little accidents,
it's running via [systemd](https://systemd.io),
taking advantage of its [sandboxing features](https://www.freedesktop.org/software/systemd/man/systemd.exec.html#Sandboxing),
because unfortunately, running php-fpm with the master process under a non-root user is painful
and fragile. We also pass `USE_ZEND_ALLOC=0` in the environment to disable PHP's
*interesting* custom memory management subsystem, and put a custom allocator
like [mimalloc](https://github.com/microsoft/mimalloc) in
[`/etc/ld.so.conf`](https://man7.org/linux/man-pages/man8/ldconfig.8.html).
Amusingly, it's not really possible to use something like
[isoalloc](https://github.com/struct/isoalloc) or
[hardened\_malloc](https://github.com/GrapheneOS/hardened_malloc), since they
tend to crash PHP because of latent memory corruptions, at least last time we
tried. But this doesn't really matter, since the goal here is only to prevent
script-kiddies from copy-pasting PHP exploits to get some flags or to mess with
the master process. Unix sockets are used to communicate with the PHP workers,
so one level can't be used to talk to the others without going through the
reverse proxy.

Nothing fancy about it by the way: it's a simple [nginx](https://nginx.org), with some regex-fu to factorise the configuration
instead of copy-pasting a new block every time a new level is added. There is an administration interface
at a secret URL looking like https://websec.fr/fd9dcc7f049aaae76bd277955eb585554f840…, with an equally random
password, to publish levels, manage writeups, … There are also some mitigations in place
to detect, annoy and confuse automatic tooling like [sqlmap](https://sqlmap.org) and its friends.
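
The regex-fu boils down to capturing the level number from the request and deriving everything else from it, like which unix socket to talk to. Here is a Python sketch of that idea; the `/levelNN/` URL layout and the `/run/php/levelNN.sock` socket path are assumptions made for illustration, not the actual configuration:

```python
import re

# Hypothetical URL layout: /levelNN/..., with exactly two digits.
LEVEL_RE = re.compile(r"^/level(?P<num>[0-9]{2})/")


def socket_for(path):
    """Map a request path to the unix socket of the matching level.

    Returns None when the path doesn't belong to any level, so that
    nothing outside of the expected layout is ever routed to php-fpm.
    """
    match = LEVEL_RE.match(path)
    if match is None:
        return None
    # Hypothetical socket naming scheme, one socket per level.
    return "/run/php/level{}.sock".format(match.group("num"))
```

A single pattern covers every level, so adding a new one only requires a new php-fpm pool and no change to the proxy.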

Logging into the box is done via ssh, running on the standard port, as root.
The only hardening here is that the ciphers/KEX/HMAC/Key/… are [tweaked to only
allow modern ones](https://tls.imirhil.fr/ssh/websec.fr): this keeps a *ton* of bots at bay.

The scoreboard is also running in a chroot, under a unique user, sandboxed.
Since we allow and encourage players to publish writeups, we're processing them
as markdown, then parsing the resulting html using
[html5lib](https://github.com/html5lib/html5lib-python), and using an allow-list
approach for tags, which should™ prevent XSS.
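
To give an idea of what the allow-list approach looks like, here is a minimal Python sketch. It uses the standard library's `html.parser` as a stand-in for html5lib to stay self-contained (html5lib parses like a browser does, which is more robust), and the list of tags and attributes is an assumption, not the one used on the scoreboard:

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allow-list: tag name -> attributes allowed on that tag.
ALLOWED = {
    "p": set(), "em": set(), "strong": set(), "code": set(), "pre": set(),
    "ul": set(), "ol": set(), "li": set(), "blockquote": set(),
    "a": {"href"},
}
SAFE_SCHEMES = ("http://", "https://")


class AllowListSanitizer(HTMLParser):
    """Re-emit only allowed tags and attributes; escape all text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED:
            return  # unknown tag: dropped, its text content is kept escaped
        kept = []
        for name, value in attrs:
            if name not in ALLOWED[tag] or value is None:
                continue
            # Only plain web links, so no javascript: or data: URLs.
            if name == "href" and not value.strip().lower().startswith(SAFE_SCHEMES):
                continue
            kept.append(' {}="{}"'.format(name, escape(value, quote=True)))
        self.out.append("<{}{}>".format(tag, "".join(kept)))

    def handle_endtag(self, tag):
        if tag in ALLOWED:
            self.out.append("</{}>".format(tag))

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))


def sanitize(html):
    parser = AllowListSanitizer()
    parser.feed(html)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.out)
```

Everything that isn't explicitly allowed is dropped, which is the whole point: new and exotic tags or attributes don't need to be known in advance to be harmless.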

Upgrades are automatically installed via a crontab and the machine is ~automatically rebooted when a new kernel is installed.
To prevent copy-pasted kernel exploits, [LKRG](https://lkrg.org) is used,
because ~nobody bothers with bypassing it.

The levels are backed up in a git repository, along with the scoreboard and website,
on bitbucket, so should the box go down in a gigantic fire, ~nothing should be lost.
Because of laziness, the wargame is running on a [DigitalOcean droplet](https://www.digitalocean.com/products/droplets):
it's cheap, reliable, simple, …

Everything has been running *smoothly* for the last 6 years: I can't remember
the last time there was a significant issue infrastructure-wise. The next big
thing is going to be the migration of the
[PHP5](https://www.php.net/manual/php5.php) levels to PHP7 when possible, or
their archival, but this is a problem for future me.

