
Exploring 3 types of directory traversal vulnerabilities in C/C++

Written by:
Kirill Efimov


April 4, 2022


Directory traversal vulnerabilities (also known as path traversal vulnerabilities) allow bad actors to gain access to folders that they shouldn’t have access to. In this post, we are going to take a look at how directory traversal vulnerabilities work on web servers written in C/C++, as well as how to prevent them.

For us on the Security Research team, the C and C++ ecosystems have been out of scope for a long time, but that changed when FossID joined Snyk. FossID’s expertise was in C/C++, which meant we were able to start a couple of projects to gain some understanding of the overall information security posture for those languages. As a result, we were able to uncover many high severity vulnerabilities and are excited to start sharing our findings.

In this post, we are going to focus on the improper limitation of a pathname to a restricted directory (CWE-22). It has many variants and, according to the CWE Top 25 list of the most dangerous software weaknesses, remains widespread and dangerous in 2022. We’ll explore three different types of directory traversal and the corresponding vulnerabilities, with vulnerable code samples, fixes, and possible implications.

Arbitrary file read

Web servers quite often implement functionality to serve static assets like HTML, CSS, and JS files. Usually the assets are stored in dedicated folders (e.g. /static, /www or /assets) in the file system. An arbitrary file read vulnerability occurs when the web application doesn't properly sanitize the path to the static file, allowing the user to use “../” path segments to get outside of the intended folder and eventually read arbitrary files on the disk.
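To make the mechanism concrete, here is a minimal sketch of what happens when a server joins its static folder with an unvalidated request path. The helper name naive_static_path and the /var/www/static folder are hypothetical and only serve as an illustration; the sketch assumes a POSIX system and C++17.

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical, deliberately vulnerable helper: it joins the static folder
// with the requested path and normalizes the result without checking that
// the result is still inside the static folder.
fs::path naive_static_path(const fs::path& static_dir, const std::string& requested)
{
    // lexically_normal() collapses "." and ".." segments purely as text,
    // which mirrors what the operating system does when it opens the file.
    return (static_dir / requested).lexically_normal();
}
```

With this helper, a request for css/site.css maps to /var/www/static/css/site.css, while a request for ../../../etc/passwd maps to /etc/passwd, which is outside the intended folder.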

We are going to jump straight into the example of this vulnerability that we recently discovered in the Crow web framework. Crow is a C++ microframework for running web services.

For this example, we’ll create a simple “hello world” application that consists of a single main.cpp file:

#define CROW_MAIN
#include <crow_all.h>

int main()
{
    crow::SimpleApp app;

    CROW_ROUTE(app, "/")
    ([]() {
        return "Hello world!";
    });

    app.port(8000).run();
    return 0;
}

To compile the example on Ubuntu, we run g++ -pthread -o server main.cpp. The server can then be started by running ./server in the terminal.

According to the documentation, by default, Crow serves static files from the static folder located next to the server executable file. We can try this out by creating the static folder with a text file inside it: mkdir static && echo "test" > static/test.txt

Now from another terminal window, we can execute curl to check if it works:

curl http://localhost:8000/ gives us Hello world! as the response.

curl http://localhost:8000/static/test.txt shows test – which is indeed the contents of the static/test.txt file.

To exploit the vulnerability, we can execute curl --path-as-is "http://localhost:8000/static/../../etc/passwd". In the response, we will see the contents of the /etc/passwd file. The trick here is the --path-as-is flag, which stops curl from normalizing the path in the URL. For us it means the server will receive exactly the URL we provided, including the “../” segments.

Security implications

Leaking arbitrary files from the production server is usually a critical issue: SSH keys, various credentials and the server source code can give a malicious actor a way to escalate the attack and take over the server or confidential user data.

The same is true for embedded devices, where C++ servers are used most often. This directory traversal vulnerability shows up frequently in Wi-Fi routers from vendors such as NETGEAR, Belkin, and TP-Link. A possible implication in this case is an attacker stealing admin panel credentials and gaining full control over the local network.

In some cases, the directory traversal issue can be dangerous even if the vulnerable server is running locally on your machine. Local web servers are often used in hybrid web-desktop applications such as those built with Electron. To learn more about such attack vectors, see our research about vulnerable VS Code plugins.

Mitigation

Crow v0.3+4 fixed this directory traversal vulnerability. The problem was introduced in the app.h file, where the code simply concatenates part of the URL with the static folder path and calls the set_static_file_info method to serve the file:

res.set_static_file_info(CROW_STATIC_DIRECTORY + file_path_partial);

The fix in this case was to add a utility::sanitize_filename(path) call to the beginning of the set_static_file_info method. sanitize_filename replaces all occurrences of “..” with “_” which remediates the vulnerability. The fix is not perfect because the server will not be able to serve files which contain “..” in the file name, but this is probably rare and will likely not cause any major issues.
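The replacement described above can be sketched as follows. This is not Crow's actual sanitize_filename implementation; the helper name replace_dot_dot is hypothetical and the sketch only illustrates the idea of neutralizing “..” sequences before the path is used.

```cpp
#include <string>

// Hypothetical helper (not Crow's code): replaces every non-overlapping
// occurrence of ".." with "_" in a single left-to-right pass.
std::string replace_dot_dot(std::string path)
{
    const std::string needle = "..";
    std::string::size_type pos = 0;
    while ((pos = path.find(needle, pos)) != std::string::npos) {
        path.replace(pos, needle.size(), "_");
        pos += 1; // continue searching after the inserted "_"
    }
    return path;
}
```

For example, ../../etc/passwd becomes _/_/etc/passwd, which can no longer climb out of the static folder, while a legitimate name such as a..b.txt becomes a_b.txt and would no longer match the file on disk.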

Arbitrary file write

Arbitrary file write vulnerabilities are a very common issue in code bases where the application implements file uploading functionality. The root cause of the vulnerability is very similar to the previous case, but a bit more obscure. Imagine a simple HTML form to upload files:

<form method="post" enctype="multipart/form-data">
<input type="file" name="file">
<input type="submit" value="submit">
</form>

Let’s try to pick a text file with the name “test.txt” and the contents “test”. The browser will produce an HTTP request similar to the following:

POST /upload HTTP/1.1
Host: localhost:8000
Content-Length: 150
Content-Type: multipart/form-data; boundary=----xxx
Connection: close

------xxx
Content-Disposition: form-data; name="file"; filename="test.txt"
Content-Type: text/plain

test

------xxx--

If you take a deeper look into the request body, you can see it contains more information than just the file contents. Namely, it has a filename field which can contain “../” segments and has to be handled correctly by the server.

To showcase the issue, we are going to use a recently discovered arbitrary file write vulnerability in the very popular embedded web server framework Mongoose.

The file upload example from the Mongoose repository allows users to upload files in chunks, which is very convenient if you have a small amount of RAM on the device where you run the server. We can build and run the example by simply cloning the repository and running the make command in the example folder. The server immediately starts listening on port 8000, and sets the target folder for uploaded files to /tmp (as specified in line 13 of the main.c file).

At this point, we can use curl to upload files. The command curl -X POST --data-binary "test" "http://localhost:8000/upload?offset=0&name=test.txt" will create a /tmp/test.txt file containing the text “test” on the server. Note that the URL has to be quoted, otherwise the shell interprets the “&” character itself.

To exploit the vulnerability, we can add “../” path segments to the beginning of the name query parameter value: curl -X POST --data-binary "pwned" "http://localhost:8000/upload?offset=0&name=../malicious-file". The exploit will then create malicious-file at the root of the filesystem.

Security implications

The exploitation of arbitrary file write vulnerabilities is not as straightforward as with arbitrary file reads, but in many cases, it can still lead to remote code execution (RCE). RCE is often possible because a malicious actor is able to overwrite important system files, such as scripts in /etc/init.d/, or even the server executable files themselves.

Mitigation

Similarly to the previous example, Mongoose fixed the vulnerability by removing “..” from the user-provided file name. The fix was published as part of version 7.6.

As general advice, we recommend that you avoid using user-provided filenames and use random strings, timestamps or MD5 hashes instead of the original filenames. This applies to most applications with file uploading functionality and helps to avoid many other vulnerabilities, such as XSS.
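As an illustration of the random-string option, the sketch below generates a server-side file name that does not depend on anything the user sent. The helper name random_file_name and the default length are assumptions made for this example, and the quality of std::random_device is implementation-defined, so treat this as a starting point rather than a vetted generator.

```cpp
#include <cstddef>
#include <random>
#include <string>

// Hypothetical helper: builds a file name from random lowercase hex digits,
// so it can never contain "/", "\" or ".." regardless of the upload request.
std::string random_file_name(std::size_t length = 32)
{
    static const char hex_digits[] = "0123456789abcdef";
    std::random_device rd;                          // entropy source
    std::uniform_int_distribution<int> dist(0, 15); // index into hex_digits
    std::string name;
    name.reserve(length);
    for (std::size_t i = 0; i < length; ++i) {
        name.push_back(hex_digits[dist(rd)]);
    }
    return name;
}
```

The original filename can still be stored as metadata (for example in a database) and shown to the user, as long as it is never used to build a path on disk.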

Zip slip

The third type of directory traversal issue that we are going to cover today is zip slip. As you can tell from the name, it covers vulnerabilities related to archive extraction logic. This type is very similar to arbitrary file writes, but has a narrower attack surface because archive extraction logic is not that common.

To demonstrate the zip slip vulnerability, we are going to use a vulnerable version (6.1.4) of JUCE, an open source cross-platform C++ application framework.

#include <juce_core/juce_core.h>

int main()
{
    juce::File file("archive.zip");
    juce::ZipFile zipFile(file);
    zipFile.uncompressTo(juce::File("data"));
    return 0;
}

This is a very simple application which uses the JUCE API to extract archive.zip to the data folder.

To check out how it works we can create a sample archive with a test.txt file:

echo test > test.txt && zip archive.zip test.txt
  adding: test.txt (stored 0%)

If we compile and run our sample application, the data folder is going to be created and the test.txt file will be inside it. But what if we create an archive with the file one level up in the file hierarchy?

echo bad > ../bad.txt && zip archive.zip ../bad.txt

In this case, if we execute our program, the bad.txt file will end up outside of the target data folder. As you can see, the security concerns for zip slip are exactly the same as for the arbitrary file write case.

There are other variations of zip slip that exist (like symlink extraction), and we recommend reading our research paper to get a deeper understanding of this topic.

Mitigation

The vulnerability we showed was fixed in version 6.1.5 of the JUCE framework, together with another variation: zip slip via symlink. JUCE fixed the vulnerability by verifying that the fully resolved file path is located within the target directory.
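The idea of such a containment check can be sketched with the C++17 standard library. This is not the JUCE implementation; the helper name entry_stays_inside is hypothetical, and the check is purely lexical, so it does not cover the symlink variation, which requires resolving paths on the real filesystem.

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: returns true if extracting the archive entry
// "entry_name" under "target_dir" would stay inside "target_dir".
// Purely lexical: "." and ".." are collapsed as text, symlinks are ignored.
bool entry_stays_inside(const fs::path& target_dir, const std::string& entry_name)
{
    const fs::path base = target_dir.lexically_normal();
    const fs::path full = (base / entry_name).lexically_normal();
    // Express the entry relative to the target directory. An empty result
    // means the two paths cannot be related (e.g. absolute vs. relative),
    // and a leading ".." means the entry escapes the target directory.
    const fs::path rel = full.lexically_relative(base);
    return !rel.empty() && *rel.begin() != fs::path("..");
}
```

An extraction loop would call this for every entry name before writing anything and skip or reject the archive when it returns false. With a data target folder, test.txt and sub/../ok.txt are accepted, while ../bad.txt is rejected.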

General remediation advice

While researching this class of vulnerability, the Snyk Security Research team looked at tons of code samples implementing various filesystem path-related logic, and we believe that we found a general rule of thumb that you can use to protect yourself from directory traversal vulnerabilities.

If it’s possible to not use user-provided filepaths, don’t!

This rule applies to many cases of uploading logic and archive extraction logic.

But if you have to use user-provided file paths, we recommend following this logic (implemented using std):

#include <algorithm>
#include <filesystem>
#include <iostream>
#include <string>

int main()
{
    // This is the path to your static files folder. It has to exist,
    // otherwise "canonical" throws std::filesystem::filesystem_error.
    std::string base_path("/tmp/path_to_the_content_root_directory/");
    // User input, possibly containing "../" path segments.
    std::string user_input("user_provided_path");

    // The "canonical" function resolves ".", ".." and symlinks.
    std::filesystem::path base_resolved_path(
        std::filesystem::canonical(base_path));
    // The "weakly_canonical" function does the same as "canonical" but does
    // not require the file to exist.
    std::filesystem::path requested_file_path(
        std::filesystem::weakly_canonical(base_resolved_path / user_input));

    // Using "mismatch" we can check if "requested_file_path" starts
    // with "base_resolved_path". Passing both end iterators keeps the
    // comparison in bounds when the requested path is shorter than the base.
    // Because we previously canonicalized both paths they can't contain
    // any ".." segments, so this check is sufficient.
    const auto mismatch = std::mismatch(
        base_resolved_path.begin(), base_resolved_path.end(),
        requested_file_path.begin(), requested_file_path.end());
    if (mismatch.first == base_resolved_path.end()) {
        std::cout << "It is safe to work with the file: " << requested_file_path << "\n";
    } else {
        std::cout << "Not safe! We have to throw an exception here.\n";
    }
    return 0;
}

The code above requires at least C++17, but the same logic can be implemented with Boost. If neither std::filesystem nor Boost is available for your setup, replacing “..” with empty strings and then collapsing multiple consecutive “/” characters into one is also sufficient.
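A minimal sketch of that string-based fallback, using only std::string, could look like this. The helper name strip_traversal is hypothetical. Note that the result may start with “/”, so it is meant to be appended to the base folder as a plain string rather than passed to an API that treats a leading “/” as an absolute path.

```cpp
#include <string>

// Hypothetical helper: removes every ".." and then collapses runs of "/"
// into a single "/". Intended for plain string concatenation after a
// trusted base folder.
std::string strip_traversal(const std::string& input)
{
    // Step 1: drop all non-overlapping ".." sequences.
    std::string without_dots;
    for (std::string::size_type i = 0; i < input.size();) {
        if (input.compare(i, 2, "..") == 0) {
            i += 2; // skip the ".." pair
        } else {
            without_dots.push_back(input[i]);
            ++i;
        }
    }
    // Step 2: collapse consecutive "/" characters into one.
    std::string result;
    for (char c : without_dots) {
        if (c == '/' && !result.empty() && result.back() == '/') {
            continue;
        }
        result.push_back(c);
    }
    return result;
}
```

A single pass is enough for the first step because only dots are removed: runs of dots are separated by other characters, so removing pairs can never produce a new “..” sequence.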

Additionally, if your application is meant to be cross-platform and will also be running on Windows, consider normalizing directory separator characters. In most cases, Windows accepts both slash (“/”) and backslash (“\”) as a directory separator. It means the “..\” payload is equivalent to “../” and can also be used to carry out a directory traversal attack. The simplest way to make sure your code is safe on Windows is to replace “/” with “\” before handling paths, so that every later check only has to deal with one separator.
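The separator normalization can be as small as the following sketch. The helper name to_backslashes is hypothetical; the point is only that the conversion happens once, before any traversal checks run.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: converts every "/" to "\" so that later checks
// only need to look for the "..\" form of the payload.
std::string to_backslashes(std::string path)
{
    std::replace(path.begin(), path.end(), '/', '\\');
    return path;
}
```

After this step, a mixed payload such as ..\../secret.txt becomes ..\..\secret.txt and can be handled by a single set of checks.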

Conclusion and key findings

When information security people think about C and C++ vulnerabilities, they often focus on low-level issues such as the various types of memory corruption. Although memory corruption issues have a huge impact on the ecosystem, we should keep in mind the application context and prevent higher-level vulnerabilities as well as low-level ones. When we started the research, our assumption was that C/C++ web developers do not pay enough attention to common web issues, and indeed, we were able to uncover many directory traversal vulnerabilities:

Note that this list is still growing and we will update it as contributors push their fixes live.

Stay secure!
