
Code injection in Python: examples and prevention

Author: Lucien Chemaly

December 6, 2023

As software becomes increasingly integral to our professional and personal lives, the need to protect information and systems from malicious attacks grows proportionately. One of the critical threats that Python developers must grapple with is the risk of code injection, a sophisticated and often devastating form of cyberattack.

Code injection is a pervasive problem that transcends programming languages and platforms, yet its manifestation in Python applications can be remarkably subtle and dangerous.

As one of the most widely used languages for web development, data analysis, and automation, Python offers an extensive set of features and libraries that can be both a blessing and a curse. While it empowers developers to build robust and efficient systems, it also presents numerous opportunities for bad actors to exploit vulnerabilities if secure coding conventions are not strictly adhered to.

The challenge of preventing code injection in Python is further amplified by the rise and widespread usage of open source components and packages. These readily available resources speed up development; however, they can come with hidden security flaws that can be exploited for code injection.

In this article, you'll learn about the dangers of code injection and the importance of secure coding conventions, with a focus on how these vulnerabilities manifest in Python applications. By understanding the nature of code injection and embracing best practices in secure coding, you can contribute to a safer digital ecosystem and protect your applications from potential breaches.

What is code injection?

Code injection is a stealthy attack where malicious code is inserted into a software system, causing it to execute unintended commands. By exploiting vulnerabilities, an attacker can inject harmful code, leading to severe consequences, such as unauthorized data access, financial fraud, or total system takeover.

These vulnerabilities often occur when an application mishandles user input. For example, insecure use of functions like eval() in Python without proper validation can lead to code injection. So can creating code based on user input without adequate checks, using third-party code without security vetting, or having vulnerabilities in the configuration of web frameworks or databases.

Understanding code injection is essential for those involved in software development or security. To that end, let’s take an in-depth look at common vulnerabilities you may encounter.

Common vulnerabilities leading to code injection

Vulnerabilities leading to code injection are a significant concern in software development. Understanding and addressing these vulnerabilities is vital for creating secure systems. In the following sections, you'll explore some of the primary sources of code injection and how to guard against them.

Exploitation of user-controlled inputs

When user input is placed directly into a query or command without validation, an attacker can supply input that the application then executes as code. For instance, the following snippet builds a SQL query by string concatenation:

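# execute_query() stands in for whatever function sends SQL to your database
user_input = input("Enter your username: ")
# Unsafe: the input is concatenated straight into the SQL text
query = "SELECT * FROM users WHERE username = '" + user_input + "';"
execute_query(query)  # This can be exploited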

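If attackers wanted to exploit this code, they could enter something like ' OR '1'='1'; DROP TABLE users; --. The OR '1'='1' clause makes the WHERE condition true for every row. The semicolon (;) terminates the original query, and the SQL that follows becomes a new statement that could delete the users table if the database driver allows several statements in one call. The trailing -- comments out the leftover quote and semicolon so the injected text remains valid SQL.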
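To see the effect concretely, the following minimal sketch reproduces the vulnerable concatenation against an in-memory SQLite database. The users table, its two rows, and the find_user_unsafe helper are assumptions made up for illustration. Only the ' OR '1'='1 part of the payload is used, because sqlite3's execute() runs a single statement per call.

```python
import sqlite3

# Hypothetical table and rows, created only for this demonstration.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE users (username TEXT, email TEXT)")
connection.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "alice@example.com"), ("bob", "bob@example.com")],
)


def find_user_unsafe(username):
    # Unsafe: the input becomes part of the SQL text itself.
    query = "SELECT * FROM users WHERE username = '" + username + "'"
    return connection.execute(query).fetchall()


honest_rows = find_user_unsafe("alice")
# The payload closes the string literal and adds a condition that is always true.
injected_rows = find_user_unsafe("' OR '1'='1")

print(len(honest_rows))    # one matching row
print(len(injected_rows))  # every row in the table
```

The injected input never names a valid user, yet the query returns the whole table, which is the behavior an attacker relies on to bypass a login check or dump data.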

Insecure use of eval() and related functions

Functions like eval() can be exploited if a user is allowed to enter arbitrary expressions like this:

user_input = input("Enter expression: ")
result = eval(user_input)  # Unsafe

In this example, an attacker can enter __import__('os').system('any arbitrary command') to execute arbitrary OS commands. For instance, the attacker could pass rm -rf / as the command, which attempts to delete every file on the system running the code.
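The same mechanism can be shown with a harmless payload. In this sketch, the payload string is a stand-in chosen for illustration: it imports the os module and calls os.getcwd(), which shows that eval() will import modules and call functions rather than just compute values.

```python
import os

# Stand-in for attacker input; a real payload could call os.system() instead.
payload = "__import__('os').getcwd()"

# eval() treats the string as a full Python expression, including calls.
result = eval(payload)

print(result == os.getcwd())  # the injected call really ran
```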

Lack of input validation and sanitization

Without proper validation, an attacker can enter a path to any file, and the application reads it like this:

user_input = input("Enter filename: ")
with open(user_input, 'r') as file:  # Vulnerable to directory traversal
    content = file.read()

Here, an attacker can enter an absolute path like /etc/passwd, or a relative one like ../../etc/passwd, to read sensitive system files.

Risks associated with dynamic code construction

Dynamically constructing command strings without proper validation can lead to command injection vulnerabilities. For instance, take a look at this code:

import os

directory = input("Enter the directory to list: ")
command = f"ls {directory}"  # Vulnerable to command injection
os.system(command)

In this example, an attacker could enter a directory string like ; cat /etc/passwd to print the contents of the system's password file or, even worse, execute other malicious commands. The semicolon (;) lets the attacker chain commands, because os.system() hands the whole string to the shell, making it possible to execute arbitrary commands on the host system.

Insecure deserialization

Insecure deserialization can occur when untrusted data is deserialized without proper validation or sanitization. An attacker can exploit this to execute arbitrary code. For instance, take a look at this example:

import pickle

serialized_data = input("Enter serialized data: ")
deserialized_data = pickle.loads(serialized_data.encode('latin1'))  # Unsafe deserialization

If an attacker wanted to exploit this code, they could craft a serialized object that, when deserialized, runs arbitrary code, such as spawning a reverse shell. This type of attack can occur in everyday applications, like an online shopping cart. Imagine the application stores your cart as a serialized object and sends it back to the server on each request. If the app doesn't verify that the data is safe before deserializing it, an attacker can submit a payload that looks like a shopping cart but carries out malicious actions when loaded. Once the app runs this payload, the attacker can steal data, manipulate the app, or even gain control of the entire system.
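As a rough illustration of how such a payload works, the sketch below defines a hypothetical MaliciousCart class whose __reduce__ method tells pickle which callable to invoke during loading. The class name and the harmless 6 * 7 expression are assumptions for demonstration; an attacker would substitute a call that runs system commands.

```python
import pickle


class MaliciousCart:
    """Hypothetical attacker-controlled object posing as cart data."""

    def __reduce__(self):
        # pickle records this callable and its arguments, then calls
        # eval("6 * 7") when the data is loaded.
        return (eval, ("6 * 7",))


# The attacker serializes the object and sends the bytes to the application.
payload = pickle.dumps(MaliciousCart())

# The application only calls loads(), yet the embedded call executes.
loaded = pickle.loads(payload)
print(loaded)  # result of the injected expression, not a cart
```

The application never references MaliciousCart or eval() itself. The instruction to call eval() travels inside the serialized bytes, which is why pickle data from untrusted sources cannot be made safe by inspecting the result afterward.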

Mitigating code injection vulnerabilities

While understanding vulnerabilities is the first step, learning how to mitigate them effectively is crucial. The following are countermeasures for the vulnerabilities previously discussed:

Safeguard user-controlled inputs

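Avoid executing user input directly. If input must select an action, use a strict allowlist that maps each accepted value to predefined code, so the input itself is never executed:

# Placeholder handlers for illustration; map each allowed command to a function
COMMANDS = {
    "start": lambda: print("Starting service..."),
    "stop": lambda: print("Stopping service..."),
    "restart": lambda: print("Restarting service..."),
}
user_input = input("Enter your command: ").strip()
handler = COMMANDS.get(user_input)
if handler is not None:
    handler()  # Runs predefined code; the input is only used as a lookup key
else:
    print("Invalid command.")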

Use eval() and related functions securely

Use safer alternatives, like literal_eval(), or avoid the use of eval() entirely. If its usage is unavoidable, ensure inputs are sanitized:

from ast import literal_eval

user_input = input("Enter expression: ")
try:
    # literal_eval() only accepts Python literals (numbers, strings, lists, dicts, ...)
    result = literal_eval(user_input)
except (ValueError, SyntaxError):
    # Raised for anything that is not a plain literal
    print("Invalid expression.")

For example, with literal_eval(), the input [1, 2, 3] is evaluated as a list of three integers, while an input like __import__('os').system('rm -rf /') raises a ValueError instead of executing the command. Note that literal_eval() does not evaluate arithmetic either: on current Python 3 versions, the input 2 + 3 is rejected with a ValueError. This removes the code execution risk of eval() while still letting you parse input made up of literals.

Implement input validation and sanitization

Use regular expressions or other validation techniques to ensure only valid file names or paths are accepted:

import re

user_input = input("Enter filename: ")
# Allow only letters, digits, underscores, hyphens, and slashes, ending in .txt
if re.fullmatch(r"[a-zA-Z0-9_\-/]+\.txt", user_input):
    with open(user_input, 'r') as file:
        content = file.read()
else:
    print("Invalid filename.")
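A pattern check on the filename still accepts absolute paths such as /etc/app.txt, so it can help to also confirm that the resolved path stays inside a directory you control. The following is a minimal sketch of that idea; the resolve_inside and is_allowed helpers and the /srv/app/uploads base directory are hypothetical names, not part of the original example.

```python
from pathlib import Path


def resolve_inside(base_dir, user_path):
    """Return the resolved path if it stays inside base_dir, else raise ValueError."""
    base = Path(base_dir).resolve()
    # Joining with an absolute user_path discards base, which the check below catches.
    candidate = (base / user_path).resolve()
    # relative_to() raises ValueError when candidate is not under base.
    candidate.relative_to(base)
    return candidate


def is_allowed(base_dir, user_path):
    try:
        resolve_inside(base_dir, user_path)
        return True
    except ValueError:
        return False


print(is_allowed("/srv/app/uploads", "notes.txt"))
print(is_allowed("/srv/app/uploads", "../../../etc/passwd"))
print(is_allowed("/srv/app/uploads", "/etc/passwd"))
```

Resolving before comparing matters: checking the raw string for a prefix would miss ../ sequences that only escape the base directory once the path is normalized.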

Address dynamic code construction

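Avoid assembling queries or commands from user input through string concatenation. For SQL, use parameterized queries or prepared statements so that the input is always treated as data: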

import sqlite3

connection = sqlite3.connect('database.db')
cursor = connection.cursor()
input_username = input("Enter username: ")
# The ? placeholder makes the driver pass the value as data, not as SQL text
query = "SELECT * FROM users WHERE username = ?"
cursor.execute(query, (input_username,))
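For the directory-listing example from earlier, the equivalent fix is to skip the shell and pass arguments as a list. This is a sketch under the assumption of a Unix-like system with ls available; the build_ls_command and list_directory names are illustrative.

```python
import subprocess


def build_ls_command(directory):
    # Each list element is one argument; "--" stops ls from treating
    # the directory as an option if it starts with a dash.
    return ["ls", "--", directory]


def list_directory(directory):
    # No shell is involved, so characters like ";" have no special meaning.
    return subprocess.run(
        build_ls_command(directory),
        capture_output=True,
        text=True,
        check=False,
    )


# The earlier payload is now just an odd directory name that does not exist.
command = build_ls_command("; cat /etc/passwd")
print(command)
```

Calling list_directory() with that payload makes ls report a missing directory instead of running cat, because the whole string reaches ls as a single argument.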

Deserialize securely

Only deserialize trusted data. If you need to deserialize user input, use safer formats like JSON or ensure thorough validation before deserialization:

import json

serialized_data = input("Enter serialized data: ")
try:
    deserialized_data = json.loads(serialized_data)
except json.JSONDecodeError:
    print("Invalid data.")
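Parsing JSON avoids code execution, but the decoded value can still have an unexpected shape. The sketch below adds a structural check for the shopping cart scenario; the field names item and quantity and the parse_cart helper are assumptions, since the article does not define a cart format.

```python
import json


def parse_cart(serialized_data):
    """Decode a cart and reject anything that is not a list of valid entries."""
    try:
        cart = json.loads(serialized_data)
    except json.JSONDecodeError as error:
        raise ValueError("Cart is not valid JSON.") from error

    if not isinstance(cart, list):
        raise ValueError("Cart must be a JSON array.")

    for entry in cart:
        if not isinstance(entry, dict):
            raise ValueError("Each cart entry must be an object.")
        item = entry.get("item")
        quantity = entry.get("quantity")
        if not isinstance(item, str) or not item:
            raise ValueError("Each entry needs a non-empty item name.")
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(quantity, bool) or not isinstance(quantity, int) or quantity < 1:
            raise ValueError("Quantity must be a positive integer.")
    return cart


cart = parse_cart('[{"item": "book", "quantity": 2}]')
print(cart)
```

Rejecting malformed input at the boundary means the rest of the application can rely on the cart having the expected types.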

Enforce strong access controls

Applying the principle of least privilege ensures that users and processes have the minimal access (or permissions) needed to accomplish their tasks. This reduces the attack surface by limiting what attackers can do if they exploit a vulnerability.

Make sure you implement strong access controls to restrict unauthorized access to sensitive areas of the application. In the next section, you'll learn about additional measures that can enhance the security of your codebase.

Secure coding conventions in Python

So far, this article has covered the most effective ways to mitigate code injection vulnerabilities, but writing secure code involves more than that. It also means embracing a set of best practices and conventions that help create robust and secure applications. When writing Python code, you should consider the following:

Use developer security tools to find and fix issues

Snyk offers free Python security tools that assist in identifying and remediating vulnerabilities in your application code, open source dependencies, containers, IaC configurations, and more. You can plug Snyk into your Git repositories, CI/CD workflows, command-line tooling, and IDEs to help identify and fix vulnerabilities. 

For example, Snyk will identify vulnerable open source dependencies in your application and show you how to fix them. The Snyk Vulnerability Database lists known vulnerabilities in pip packages.

Incorporating tools like Snyk in your development workflow provides an extra layer of defense, enabling a more robust and secure application development process.

Regular security scanning with Snyk brings several advantages, such as access to an up-to-date vulnerability database and collaborative security tools designed to work within a team environment. The benefits extend beyond safeguarding your code; they also improve the development experience. With features tailored for Python and an emphasis on preventing code injection, Snyk can change the way you approach security in your coding practices.

Implement secure logging practices

Avoid logging sensitive user data or system information that might be useful to an attacker. Use a logging framework that lets you filter or redact sensitive fields before records are written.
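One way to approach this with Python's standard logging module is a filter that rewrites each record before it is emitted. The following is a sketch only: the RedactingFilter class, the list of field names, and the app.auth logger name are assumptions chosen for illustration.

```python
import logging
import re

# Hypothetical list of field names to mask; adjust to your own log format.
SENSITIVE_FIELD = re.compile(r"(password|token|secret)=\S+", re.IGNORECASE)


class RedactingFilter(logging.Filter):
    """Mask sensitive key=value pairs and neutralize line breaks."""

    def filter(self, record):
        message = record.getMessage()
        message = SENSITIVE_FIELD.sub(r"\1=[REDACTED]", message)
        # Escaping newlines keeps user input from forging extra log lines.
        message = message.replace("\r", "\\r").replace("\n", "\\n")
        record.msg = message
        record.args = ()
        return True  # keep the record, now sanitized


logger = logging.getLogger("app.auth")
logger.addFilter(RedactingFilter())

# Build a record by hand to show the effect without configuring handlers.
record = logging.LogRecord(
    "app.auth", logging.INFO, "example.py", 0, "login password=%s", ("hunter2",), None
)
RedactingFilter().filter(record)
print(record.getMessage())
```

Pattern-based redaction is a safety net rather than a guarantee, so the more reliable habit is still to avoid passing secrets to the logger in the first place.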

Cultivate a secure coding environment

The following are some additional strategies you can employ to create a secure coding environment:

  • Regular code reviews and security audits: Ensure that code is regularly reviewed by security experts so that you can catch vulnerabilities early.

  • Keep software and libraries up to date: Use the latest stable versions of libraries and frameworks, as they often include security patches.

  • Encourage a security-focused mindset: Promote a culture where security is a primary concern, ensuring that all team members are trained in secure coding practices.

Wrapping up

In this article, you learned about the risks of code injection in Python and why you need to follow secure coding rules. Code injection can happen in many forms, but when armed with the right knowledge and tools like Snyk, you can reduce those dangers. Don't wait for a problem to ruin your work. Add Snyk's security tools to your Python development process and make your code both safer and more effective.