Top 5 C++ security risks
August 16, 2022
C++ offers many powerful capabilities to developers, which is why it’s used in many industries and many core systems. But unlike some higher-level languages that offer less direct control over resources, C++ has a variety of security concerns that developers must be keenly aware of when writing code to avoid introducing vulnerabilities into projects.
As developers, we build applications with our end-users in mind. They trust us with their data, time, and device access. It’s our responsibility to ensure that our application and our users’ data are always safe and secure.
This article explores the top five C++ security concerns that affect code development and offers advice on mitigating them.
1. Buffer overflow
One of the most common security concerns with C++ is the buffer overflow. C++ doesn’t perform built-in bounds checking on raw arrays and pointers, so nothing stops a program from writing past the end of a buffer. Writing outside our allocated memory can cause program crashes and corrupt data. It can even lead to the execution of malicious code. Reflecting this, out-of-bounds write, the weakness behind many buffer overflows, ranked as the most dangerous software weakness in the 2021 CWE Top 25 Most Dangerous Software Weaknesses list.
The 2019 buffer overflow attack on the WhatsApp application’s VoIP stack highlights this vulnerability’s severe impact. In this attack, the attacker exploited a buffer overflow vulnerability in WhatsApp to inject spyware onto targeted users’ phones.
Let’s see what a buffer overflow vulnerability looks like in practice. In the following code snippet, we get user input using the gets function and check to see if it affected the important_data variable.
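#include <stdio.h>

// Deliberately vulnerable: gets() performs no bounds checking. It was removed
// from the standards in C11 and C++14, so modern toolchains may reject it.
int main(int argc, char **argv)
{
    volatile int important_data = 0;
    char user_input[10];

    gets(user_input); // writes past user_input if the input exceeds 9 characters

    if (important_data != 0) {
        printf("Warning !!!, the 'important_data' was changed\n");
    } else {
        printf("the 'important_data' was not changed\n");
    }
}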
While this code may look innocent, it’s vulnerable to a buffer overflow attack if the user enters a string longer than the user_input array can hold. On typical platforms the stack grows toward lower addresses, so important_data usually sits at a higher address than user_input, and anything written past the end of user_input can overwrite it. The exact layout depends on the compiler and its settings.
There are several ways to prevent this issue, some of which depend on our compiler or OS and kernel features. The main tools we can use to secure our data are stack canaries, address space layout randomization (ASLR), and data execution prevention (DEP).
Stack canaries add a randomly selected secret value to the stack, chosen anew every time a program starts. Before a function returns, this value is verified. If an overflow has changed it, the program aborts instead of continuing with corrupted data.
ASLR prevents an attacker from knowing the memory layout, making it challenging to perform the buffer overflow attack. If the attacker doesn’t know where code and data reside in memory, they can’t reliably predict the addresses they need to target.
DEP marks certain memory areas, like the stack, as non-executable memory, so data injected there can’t be run as code.
In addition to using OS and compiler features, we must implement good coding practices, including bounds checking. We should avoid using standard library functions vulnerable to buffer overflow attacks, such as gets, strcpy, strcat, sprintf, and scanf with an unbounded %s, as they don’t perform bounds checking. Instead, we should replace them with equivalents that take a buffer size, like fgets in place of gets and snprintf in place of sprintf.
2. Integer overflow and underflow
Integer overflow occurs when the value we try to store in an integer variable exceeds the maximum value its type can hold. Integer underflow occurs when the value is less than the minimum the type can hold. In both cases an unsigned value wraps around, while overflow of a signed integer is undefined behavior in C++.
The 2021 CWE Top 25 Most Dangerous Software Weaknesses list ranked this the 12th most dangerous software weakness. Additionally, integer overflow and underflow can lead to a buffer overflow vulnerability.
The following code is an actual bug found in OpenSSH v3.3 and is an example of how an integer overflow bug can lead to a buffer overflow attack:
nresp = packet_get_int();
if (nresp > 0) {
    response = xmalloc(nresp*sizeof(char*));
    for (i = 0; i < nresp; i++)
        response[i] = packet_get_string(NULL);
}
This code contains an integer overflow vulnerability. While the code checks that nresp is greater than zero, it’s still possible to end up with a zero-byte allocation if the input nresp equals 1073741824. On a 32-bit system, multiplying this value by four (the size of a char pointer) gives 4294967296, which doesn’t fit in 32 bits and wraps around to 0, so xmalloc(nresp*sizeof(char*)) allocates a buffer of size 0. The loop then writes nresp pointers past that empty buffer.
To mitigate this vulnerability, we should range check values before using them in arithmetic: reject anything below the minimum or above the maximum we expect, and confirm that a calculation can’t overflow or wrap before performing it.
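The range check can be sketched as follows. Both function names are hypothetical and are not taken from OpenSSH. The first function only reproduces the 32-bit arithmetic from the example, and the second shows one possible guard to run before an allocation.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>

// Illustration only: reproduces the size calculation with 32-bit wraparound.
// For count = 1073741824 and elem_size = 4 the product is 2^32, which
// truncates to 0 when stored in 32 bits.
std::uint32_t wrapped_size_32(std::uint32_t count, std::uint32_t elem_size)
{
    return static_cast<std::uint32_t>(static_cast<std::uint64_t>(count) * elem_size);
}

// Hypothetical guard: stores count * elem_size in `total` only when the
// product fits in std::size_t. Zero-sized requests are rejected as well.
bool checked_array_bytes(std::size_t count, std::size_t elem_size, std::size_t& total)
{
    if (count == 0 || elem_size == 0) {
        return false;
    }
    if (count > std::numeric_limits<std::size_t>::max() / elem_size) {
        return false;   // the multiplication would wrap around
    }
    total = count * elem_size;
    return true;
}
```

The division-based test runs before the multiplication, so the check itself can never overflow. In real code an application-specific upper bound on count is usually a sensible addition.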
3. Pointer initialization
Pointer initialization is critical. An uninitialized pointer holds whatever value happens to be in memory, and using it can expose a lot of data. If that value happens to be the address of a memory location, the program can read from or write to an unexpected place. If it happens to be the address of a function, it can cause the unintentional execution of an arbitrary function.
Pointers are also vulnerable to null pointer dereferencing attacks, which can lead to program reliability issues. A null pointer dereference means accessing memory through a pointer whose value is null, which is undefined behavior (UB) and, in most cases, crashes the program. An attacker can take advantage of this by forcing a null pointer dereference. Incorrectly initializing the pointer creates unexpected and unpredictable behavior, which an attacker can use to bypass some security checks or reveal debugging information for later use.
Let’s look at an example of pointer dereferencing caused by improper pointer initialization:
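int main()
{
    int *ptr; // deliberately left uninitialized
    if (nullptr != ptr) // reading ptr here is already undefined behavior
    {
        *ptr = 5; // writes to an unpredictable address
    }
}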
As we can see, the pointer is left uninitialized, which means it holds an indeterminate value. That value is very likely non-null, so the check passes and the program writes to an unpredictable address.
We shouldn’t depend only on checks or exception handling to avoid pointer dereferencing attacks. Where we can, we should avoid pointers and use references instead. If we do need pointers, we should prefer smart pointers over raw pointers.
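Another alternative is to adopt the Resource Acquisition Is Initialization (RAII) technique in our implementation. RAII ties a resource to the lifetime of an object: the resource is acquired when the object is constructed and released when it is destroyed, so any function that can access the object can rely on the resource being valid.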
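A minimal sketch of these recommendations follows. The helper names store_value and make_counter are hypothetical. The first shows a null check on a pointer that callers are expected to have initialized, and the second uses std::unique_ptr (std::make_unique requires C++14) as a small RAII example.

```cpp
#include <memory>

// Hypothetical helper: writes through the pointer only when it is non-null.
// This is only meaningful if callers initialize their pointers, for example
// to nullptr, instead of leaving them indeterminate.
bool store_value(int* target, int value)
{
    if (target == nullptr) {
        return false;
    }
    *target = value;
    return true;
}

// RAII with a smart pointer: the int is allocated and initialized in one
// step, and it is freed automatically when the owning unique_ptr is destroyed.
std::unique_ptr<int> make_counter(int initial)
{
    return std::make_unique<int>(initial);
}
```

With this approach a declaration such as `int* ptr = nullptr;` makes the null check reliable, and `auto counter = make_counter(0);` never exposes an uninitialized or dangling pointer to the caller.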
4. Incorrect type conversion
Another common vulnerability is incorrect type conversion. Most type conversion issues come from signed-to-unsigned conversions, which often occur when an argument of the wrong type is passed in a function call. Another common source is narrowing conversion from a wider type to a narrower one, such as double to float or long int to int, where data can be silently lost during the implicit conversion.
Here’s a simple example of what an incorrect type conversion vulnerability looks like:
#include <iostream>
#include <string>
using namespace std;

int main()
{
    string str;
    cout << "Please enter your string: \n";
    getline(cin, str);
    unsigned int len = str.length();
    if (len > -1) // bug: -1 is converted to unsigned int before the comparison
    {
        cout << "string length is " << len << " which is bigger than -1 " << std::endl;
    } else
    {
        cout << "string length is " << len << " which is less than -1 " << std::endl;
    }
    return 0;
}
While it seems impossible to hit the else statement, as any input string length is larger than -1, this code doesn’t work as intended and always hits the else statement. Under the usual arithmetic conversions defined by the C++ standard, when a signed int and an unsigned int are compared, the signed operand is converted to unsigned int.
In our case, the int value -1 is converted to unsigned int and becomes 4294967295 (assuming a 32-bit unsigned int), the largest value the type can hold. No length can be greater than that, so the program flow always goes to the else statement.
The same issue can happen if we use unsigned integers to subtract two values entered by the user, assuming that the user will never enter the smaller value first. When they do, the result can’t be negative, so it wraps around to a very large positive value.
According to the Google C++ Style Guide, avoiding math with unsigned integers can mitigate most type conversion problems (representing bitfields is the exception). The guide also recommends not mixing signed and unsigned types, and using iterators and containers instead of pointers and sizes.
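The two pitfalls above, comparing an unsigned length against a signed value and subtracting unsigned inputs, can be guarded against explicitly. The following is a sketch under the assumption that helper functions are acceptable in the codebase. The names length_exceeds and checked_subtract are hypothetical. C++20 also provides std::cmp_greater and related functions in <utility> for sign-safe comparisons.

```cpp
#include <cstddef>

// Hypothetical helper: true when an unsigned length is greater than a signed
// limit. The negative case is handled before any conversion takes place.
bool length_exceeds(std::size_t len, long long limit)
{
    if (limit < 0) {
        return true;   // every length is greater than a negative limit
    }
    return len > static_cast<unsigned long long>(limit);
}

// Hypothetical helper: computes a - b only when the result cannot wrap.
bool checked_subtract(unsigned int a, unsigned int b, unsigned int& result)
{
    if (b > a) {
        return false;   // a - b would wrap to a large positive value
    }
    result = a - b;
    return true;
}
```

With these helpers, `length_exceeds(len, -1)` returns true for every length, which matches what the comparison in the example was meant to express.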
5. Format string vulnerability
There are two components in a format string vulnerability: the format function and the format string. Before exploring format string vulnerabilities, let’s review what the format function and the format string do.
The format function is the function that converts variables of the programming language into a human-readable format. Examples of the format function are printf and fprintf.
The format string is the argument of the format function, which contains text and format string parameters. Let’s look at an example:
printf("This is a test text of number: %d ", 11);
"This is a test text of number: %d " is the format string.
"%d" is the format string parameter that defines the conversion format.
Format string attacks happen when we don’t check the parameters that pass to the format function. For example, assume we implemented an application like the following:
#include <stdio.h>

int main(int argc, char **argv)
{
    // vulnerable: user input is used directly as the format string
    printf(argv[1]);
    return 0;
}
This code is vulnerable to the format string attack because it passes unchecked user input to printf as the format string. An attacker can exploit this by including format string parameters in the input, like this:
./program "Snyk %p %p %p %p %p"
This input can get data from the stack, as the program output is something like the following:
>> ./program "Snyk %p %p %p"
Snyk 0x7ffcc40aafd8 0x7ffcc40aaff0 0x558581c18180
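This output is caused by printf treating each %p as a conversion that expects a void pointer argument. Since no such arguments were passed, printf reads whatever values happen to be in the places where the arguments would be and prints them as memory addresses.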
To prevent this vulnerability, we need to add a format argument to our code, as shown below:
#include <stdio.h>

int main(int argc, char **argv)
{
    if (argc < 2) {
        return 1; // nothing to print
    }
    // safe code: user input is passed as data, never as the format string
    printf("%s\n", argv[1]);
    return 0;
}
The code snippet above is safe because it won’t interpret the string. For example, if we tried to compile and run the code, the output would be as follows:
>> ./program "Snyk %p %p %p"
Snyk %p %p %p
The output contains only the characters passed to the program. Each %p is printed literally instead of being interpreted as a request for a void pointer argument.
An even better solution is to avoid printf unless we are forced to use it, and to use std::format or std::vformat (introduced in C++20) instead. std::format checks the format string against the argument types at compile time, and std::vformat checks it at run time and throws a std::format_error on a mismatch.
Reduce your C++ security risk
In this article, we explored the top five security risks of working with C++. We looked at how an attacker can exploit these vulnerabilities to access user data or retrieve information from our applications and what these vulnerabilities look like in our code. We also highlighted strategies to prevent these vulnerabilities from becoming full-blown security breaches.
Unlike higher-level languages that offer less direct control over resources, C++ has a variety of vulnerabilities that developers must be aware of when writing code to avoid introducing them into projects. Exploitation can take many forms, and malicious actors continuously expand their attack strategies. For more information about vulnerabilities in C++, check out this blog from Snyk’s security research team, and this article exploring directory traversal vulnerabilities in C/C++.
As developers, it’s our responsibility to implement security techniques to ensure that our software is reliable and immune to the security attacks that may expose user information. With the right strategies in place, all the vulnerabilities we explored in this article are avoidable.