Artificial truth

The more you see, the less you believe.

Firefox's own linker on Android
Tue 21 February 2017

Recently, someone on IRC was complaining about how insane the Firefox linker on Android is (this blog post is largely based on his rant). I thought it was a joke: why would an Android application have its own linker? Apparently, Firefox does have its very own.

The right™ way of doing it, relying on bionic's own linker, is to bundle your libraries into your APK and let Android decompress them and put them in the right place when it installs your application. As a nice bonus, the application doesn't have write access to its libraries.

Everything looked fine in Firefox's linker until around line 470, where it decompresses zip files in order to map them into memory. But why would you do that instead of using the default linker?

Things slowly start to get scary at line 965:

#if defined(ANDROID)
/* As some system libraries may be calling signal() or sigaction() to
 * set a SIGSEGV handler, effectively breaking MappableSeekableZStream,
 * or worse, restore our SIGSEGV handler with wrong flags (which using
 * signal() will do), we want to hook into the system's sigaction() to
 * replace it with our own wrapper instead, so that our handler is never
 * replaced. We used to only do that with libraries this linker loads,
 * but it turns out at least one system library does call signal() and
 * breaks us (libsc-a3xx.so on the Samsung Galaxy S4).
 * As libc's signal (bsd_signal/sysv_signal, really) calls sigaction
 * under the hood, instead of calling the signal system call directly,
 * we only need to hook sigaction. This is true for both bionic and
 * glibc.
 */

/* libc's sigaction */
extern "C" int
sigaction(int signum, const struct sigaction *act,
          struct sigaction *oldact);

/* Simple reimplementation of sigaction. This is roughly equivalent
 * to the assembly that comes in bionic, but not quite equivalent to
 * glibc's implementation, so we only use this on Android. */
int
sys_sigaction(int signum, const struct sigaction *act,
              struct sigaction *oldact)
{
  return syscall(__NR_sigaction, signum, act, oldact);
}

/* Replace the first instructions of the given function with a jump
 * to the given new function. */
template <typename T>
static bool
Divert(T func, T new_func)
{

Wait what!? The linker seems to set a handler on SIGSEGV at some point, provides a sigaction reimplementation, and does some hot-patching of functions?!
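To see what the comment is worried about, here is a minimal sketch (not Firefox's code) of a signal() call clobbering a handler that was installed with sigaction(). It assumes Linux, and uses SIGUSR1 instead of SIGSEGV to stay harmless; info_handler, plain_handler and signal_call_drops_siginfo are made-up names.

```cpp
#include <signal.h>
#include <cassert>

// Hypothetical handlers, only here so there is something to install.
static void info_handler(int, siginfo_t *, void *) {}
static void plain_handler(int) {}

// Returns true if a later signal() call replaced our handler and dropped
// the SA_SIGINFO flag we had asked for.
bool signal_call_drops_siginfo() {
  struct sigaction ours = {};
  ours.sa_sigaction = info_handler;
  sigemptyset(&ours.sa_mask);
  ours.sa_flags = SA_SIGINFO;  // we want siginfo_t (fault address and so on)

  struct sigaction saved;
  if (sigaction(SIGUSR1, &ours, &saved) != 0)
    return false;

  // Stand-in for "some system library" that installs its own handler.
  signal(SIGUSR1, plain_handler);

  // Ask the kernel what is installed now, without changing anything.
  struct sigaction now;
  int rc = sigaction(SIGUSR1, nullptr, &now);
  assert(rc == 0);
  (void)rc;
  bool dropped = (now.sa_flags & SA_SIGINFO) == 0 &&
                 now.sa_handler == plain_handler;

  // Put back whatever was there before the demo.
  sigaction(SIGUSR1, &saved, nullptr);
  return dropped;
}
```

Once the flag is gone, the handler no longer gets a siginfo_t, which is exactly the piece of information an on-demand decompressor needs.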

At line 1097, it becomes even clearer that we're getting closer to Hell:

/* Get the current segfault signal handler. */
struct sigaction old_action;
sys_sigaction(SIGSEGV, nullptr, &old_action);

/* Some devices don't provide useful information to their SIGSEGV handlers,
 * making it impossible for on-demand decompression to work. To check if
 * we're on such a device, setup a temporary handler and deliberately
 * trigger a segfault. The handler will set signalHandlingBroken if the
 * provided information is bogus.
 * Some other devices have a kernel option enabled that makes SIGSEGV handler
 * have an overhead so high that it affects how on-demand decompression
 * performs. The handler will also set signalHandlingSlow if the triggered
 * SIGSEGV took too much time. */
struct sigaction action;
action.sa_sigaction = &SEGVHandler::test_handler;
sigemptyset(&action.sa_mask);
action.sa_flags = SA_SIGINFO | SA_NODEFER;
action.sa_restorer = nullptr;
stackPtr.Assign(MemoryRange::mmap(nullptr, PageSize(),
                                  PROT_READ | PROT_WRITE,
                                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0));
if (stackPtr.get() == MAP_FAILED)
  return;
if (sys_sigaction(SIGSEGV, &action, nullptr))
  return;

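The linker is indeed setting a signal handler on SIGSEGV, and the comment says it then deliberately triggers a segfault to check various things. (The mmap call shown doesn't fault by itself: a nullptr address only lets the kernel pick where to put one anonymous read/write page, and the faulting access happens after this excerpt.) The whole point of this is apparently to decompress (and then likely map) libraries at runtime!?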
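The probe described in that comment could look roughly like the sketch below. This is my own approximation under Linux assumptions, not what Firefox does line for line: it maps a scratch page without access rights, touches it, and lets the handler check whether the reported fault address makes sense. The timing check is left out, and every name (probe_fault_handling, g_probe_page, ...) is hypothetical.

```cpp
#include <signal.h>
#include <sys/mman.h>
#include <unistd.h>
#include <cassert>
#include <cstddef>
#include <cstdint>

static char *g_probe_page = nullptr;
static size_t g_probe_size = 0;
static volatile sig_atomic_t g_info_broken = 0;

// Temporary SIGSEGV handler: records whether si_addr points into the probe
// page, then makes the page accessible so the faulting store can be retried.
static void probe_handler(int, siginfo_t *info, void *) {
  uintptr_t addr = reinterpret_cast<uintptr_t>(info->si_addr);
  uintptr_t base = reinterpret_cast<uintptr_t>(g_probe_page);
  g_info_broken = (addr < base || addr >= base + g_probe_size) ? 1 : 0;
  mprotect(g_probe_page, g_probe_size, PROT_READ | PROT_WRITE);
}

// Returns 0 if the handler got a usable fault address, 1 if the address was
// bogus, and -1 if the probe could not be set up.
int probe_fault_handling() {
  g_probe_size = static_cast<size_t>(sysconf(_SC_PAGESIZE));
  assert(g_probe_size > 0);
  void *p = mmap(nullptr, g_probe_size, PROT_NONE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED)
    return -1;
  g_probe_page = static_cast<char *>(p);

  struct sigaction action = {};
  struct sigaction old_action;
  action.sa_sigaction = probe_handler;
  sigemptyset(&action.sa_mask);
  action.sa_flags = SA_SIGINFO | SA_NODEFER;
  if (sigaction(SIGSEGV, &action, &old_action) != 0) {
    munmap(p, g_probe_size);
    return -1;
  }

  // Deliberate fault: the page is PROT_NONE, so this store traps, the
  // handler runs, and the store is restarted once the handler returns.
  volatile char *probe = g_probe_page;
  probe[0] = 42;

  // Restore the previous handler and clean up.
  sigaction(SIGSEGV, &old_action, nullptr);
  int result = (g_info_broken == 0 && probe[0] == 42) ? 0 : 1;
  munmap(p, g_probe_size);
  return result;
}
```

On a sane kernel probe_fault_handling() should return 0; a device of the kind the comment complains about would presumably end up with 1.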

Even better, it hooks sigaction to ensure that SIGSEGV always reaches its own handler first, and then redispatches the signal.
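The hook-and-redispatch idea can be sketched without any hot-patching. In this hypothetical version, callers go through wrapped_sigaction(), which never lets anyone replace the real SIGSEGV handler: it just remembers what they asked for, and the real handler forwards the signal afterwards. Firefox diverts libc's sigaction itself instead of asking callers nicely; all names here are mine, and raise() stands in for a real fault.

```cpp
#include <signal.h>
#include <cassert>

// What callers believe is installed for SIGSEGV (zeroed means SIG_DFL).
static struct sigaction g_client_action;
static volatile sig_atomic_t g_ours_ran = 0;
static volatile sig_atomic_t g_client_ran = 0;

// The handler that really stays installed. A real one would first try to
// resolve the fault itself; this sketch only forwards to the client.
static void our_handler(int signum, siginfo_t *info, void *context) {
  g_ours_ran = 1;
  if (g_client_action.sa_flags & SA_SIGINFO) {
    if (g_client_action.sa_sigaction)
      g_client_action.sa_sigaction(signum, info, context);
  } else if (g_client_action.sa_handler != SIG_DFL &&
             g_client_action.sa_handler != SIG_IGN) {
    g_client_action.sa_handler(signum);
  }
}

// Replacement for sigaction(): SIGSEGV requests are recorded, not installed.
int wrapped_sigaction(int signum, const struct sigaction *act,
                      struct sigaction *oldact) {
  if (signum != SIGSEGV)
    return sigaction(signum, act, oldact);
  if (oldact)
    *oldact = g_client_action;
  if (act)
    g_client_action = *act;
  return 0;
}

static void client_handler(int) { g_client_ran = 1; }

// Returns true if our handler survived the client's request and the client
// handler still got called.
bool demo_redispatch() {
  struct sigaction ours = {};
  ours.sa_sigaction = our_handler;
  sigemptyset(&ours.sa_mask);
  ours.sa_flags = SA_SIGINFO;
  struct sigaction saved;
  if (sigaction(SIGSEGV, &ours, &saved) != 0)
    return false;

  // A "library" installs its own SIGSEGV handler through the wrapper.
  struct sigaction theirs = {};
  theirs.sa_handler = client_handler;
  sigemptyset(&theirs.sa_mask);
  wrapped_sigaction(SIGSEGV, &theirs, nullptr);

  // The kernel still has our handler, with our flags.
  struct sigaction now;
  int rc = sigaction(SIGSEGV, nullptr, &now);
  assert(rc == 0);
  (void)rc;
  bool still_ours = (now.sa_flags & SA_SIGINFO) != 0 &&
                    now.sa_sigaction == our_handler;

  raise(SIGSEGV);  // delivered to our_handler, which forwards it

  sigaction(SIGSEGV, &saved, nullptr);
  return still_ours && g_ours_ran == 1 && g_client_ran == 1;
}
```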

Long story short, it seems that, for performance reasons, Firefox starts without fully loading its libraries, and catches SIGSEGV to decompress them directly into memory on demand (instead of having them paged in from storage).
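For the curious, the general trick (fill memory only when something touches it) fits in a few dozen lines. The sketch below is not Firefox's implementation: it assumes Linux, fills each page with a letter where a real linker would decompress a chunk of a library, and uses invented names throughout (lazy_fill_demo, g_region, kPages).

```cpp
#include <signal.h>
#include <sys/mman.h>
#include <unistd.h>
#include <cstddef>
#include <cstdint>

static const size_t kPages = 4;
static char *g_region = nullptr;
static size_t g_region_page_size = 0;
static volatile sig_atomic_t g_filled = 0;

// SIGSEGV handler: finds which page was touched, fills it, and makes it
// readable. The faulting load is retried when the handler returns.
static void fill_on_fault(int, siginfo_t *info, void *) {
  uintptr_t addr = reinterpret_cast<uintptr_t>(info->si_addr);
  uintptr_t base = reinterpret_cast<uintptr_t>(g_region);
  if (addr < base || addr >= base + kPages * g_region_page_size)
    _exit(1);  // a genuine crash somewhere else: give up
  size_t index = (addr - base) / g_region_page_size;
  char *page = g_region + index * g_region_page_size;
  mprotect(page, g_region_page_size, PROT_READ | PROT_WRITE);
  // Stand-in for decompression: page 0 gets 'A', page 1 gets 'B', ...
  char fill = static_cast<char>('A' + index);
  for (size_t i = 0; i < g_region_page_size; ++i)
    page[i] = fill;
  mprotect(page, g_region_page_size, PROT_READ);
  g_filled = g_filled + 1;
}

// Touches two of the four pages and returns how many pages were filled,
// or -1 if something went wrong.
int lazy_fill_demo() {
  g_region_page_size = static_cast<size_t>(sysconf(_SC_PAGESIZE));
  size_t length = kPages * g_region_page_size;
  void *p = mmap(nullptr, length, PROT_NONE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED)
    return -1;
  g_region = static_cast<char *>(p);

  struct sigaction action = {};
  struct sigaction saved;
  action.sa_sigaction = fill_on_fault;
  sigemptyset(&action.sa_mask);
  action.sa_flags = SA_SIGINFO;
  if (sigaction(SIGSEGV, &action, &saved) != 0) {
    munmap(p, length);
    return -1;
  }

  volatile const char *view = g_region;
  char c2 = view[2 * g_region_page_size];       // faults, page 2 is filled
  char c2b = view[2 * g_region_page_size + 1];  // same page, no fault
  char c0 = view[0];                            // faults, page 0 is filled

  sigaction(SIGSEGV, &saved, nullptr);
  munmap(p, length);
  bool ok = (c2 == 'C' && c2b == 'C' && c0 == 'A');
  return ok ? static_cast<int>(g_filled) : -1;
}
```

Pages 1 and 3 are never touched, so they are never filled, which is where the startup savings would come from.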