Ruiqi Gong and Xiu Jianfeng got their "Randomized slab caches for kmalloc()" patch series merged upstream, and I've had enough discussions about it to warrant summarising them in a small blogpost.
The main idea is to have multiple slab caches, and to pick one at random based on the address of the code calling `kmalloc()` and a per-boot seed, to make heap spraying harder.
It's a great idea, but comes with some shortcomings for now:
- Objects allocated via wrappers around `kmalloc()`, like `sock_kmalloc`, `f2fs_kmalloc`, `aligned_kmalloc`, … will all end up in the same slab cache, since the cache is picked from the wrapper's own call site into `kmalloc()`, not from the wrapper's callers.
- The slabs need to be pinned, otherwise an attacker could feng-shui their way into having the whole slab freed, garbage-collected, and a slab for another type allocated at the same VA. Jann Horn and Matteo Rizzo have a nice set of patches, discussed a bit in this Project Zero blogpost, for a feature called `SLAB_VIRTUAL` implementing precisely this.
- There are 16 slabs by default, so one chance out of 16 to end up in the same slab cache as the target.
- There are no guard pages between caches, so inter-caches overflows are possible.
- As pointed out by andreyknvl and minipli, fewer allocations hitting a given cache means less noise, so it might even help with some heap feng-shui.
- minipli also pointed out that "randomized caches still freely mix kernel allocations with user controlled ones (`xattr`, `keyctl`, `msg_msg`, …). So even though merging is disabled for these caches, i.e. no direct overlap with `cred_jar` etc., other object types can still be targeted (`struct pipe_buffer`, BPF maps, its verifier state objects, …). It's just a matter of probing which allocation index the targeted object falls into.", but I considered this out of scope, since it's much more involved; albeit something like Jann Horn's `CONFIG_KMALLOC_SPLIT_VARSIZE` wouldn't significantly increase complexity.
Also, while code addresses as a source of entropy have historically been a great way to provide KASLR bypasses, `hash_64(caller ^ random_kmalloc_seed, ilog2(RANDOM_KMALLOC_CACHES_NR + 1))` shouldn't trivially leak offsets.
The segregation technique is a bit like a weaker version of grsecurity's AUTOSLAB, or a weaker kernel-land version of PartitionAlloc. But to be fair, making use-after-free exploitation harder (and significantly harder once pinning lands) with only ~150 lines of code and negligible performance impact is amazing and should be praised. Moreover, I wouldn't be surprised if this was backported into Google's KernelCTF soon, so we should see if my analysis is correct.