
File encryption in Python: An in-depth exploration of symmetric and asymmetric techniques

Author:

Keshav Malik


November 22, 2023


In our modern world, we constantly share private, confidential, and sensitive information over digital channels. A fundamental component of this communication is file encryption — transforming data into an unreadable format using encryption algorithms.

There are two main types of encryption: symmetric and asymmetric. Symmetric encryption is like a lockbox with a single key. If two people have the key, they can pass the box between them, safe in the knowledge that only they can open it. Asymmetric encryption, on the other hand, uses a pair of keys. Imagine a box you can lock with one key but can only unlock with a second key. You could safely share the locking key (the public key) because nobody can open the box (decrypt the data) without the unlocking key (private key).

This article dives into the world of encryption in Python. For symmetric encryption, we’ll focus on Amazon’s Key Management Service (KMS) and PyNaCl SecretBox. Then, we’ll look at asymmetric encryption and PyNaCl’s public/private box.

Prerequisites

To follow along with this tutorial, you need:

  • Python installed (v3.7+)

  • pipenv installed. This Python dependency manager will help you avoid conflicts between package versions. Install it with pip install pipenv.

  • Installation of required packages (aws-encryption-sdk, PyNaCl, and cryptography) by running pipenv install aws-encryption-sdk pynacl cryptography

  • A working AWS account and a KMS key (keep the key ID and AWS region information handy — we’ll use it when we invoke the Python program).

  • The AWS command-line tool (awscli) installed. Run aws configure to set up AWS settings on your machine. 

  • A new project set up in an isolated environment, created using the following:

mkdir python_encryption && cd python_encryption
pipenv shell

To see the final product, check out the complete project code.

Symmetric encryption in Python

Symmetric encryption uses the same key to encrypt and decrypt data. It’s simple, fast, efficient, and ideal for securing large amounts of data.

For this tutorial, we’ll use Amazon KMS and PyNaCl SecretBox to implement symmetric encryption in Python. Amazon KMS helps us create, control, and manage encryption keys on the Amazon Web Services (AWS) cloud. Think of it as your keyring in the cloud, with an added layer of security. SecretBox is a class in the PyNaCl library that makes symmetric encryption easy and accessible. Think of it as your personal safe box.

Using Amazon KMS and encryption SDKs for symmetric encryption

Amazon KMS is a cloud-based service for managing cryptographic keys. It integrates tightly with other AWS services and encryption software development kits (SDKs) and builds on the security provided by AWS, which makes it a good fit for AWS-based applications.

The most compelling reason to use KMS for symmetric encryption lies in managing shared secret keys. Securely handling these keys can be the most challenging aspect of encryption. Amazon KMS addresses this with an envelope encryption pattern. In this approach, the data is encrypted with a data key, and that data key is itself encrypted (wrapped) under a master key and stored alongside the encrypted data. This means you primarily depend on a master key in a hardware security module, which remains inaccessible to unauthorized entities. This key wrapping mechanism streamlines the encryption process and takes much of the complexity out of key management, rotation, and related tasks.

However, Amazon KMS may be unnecessary if you only need a simple, small-scale solution. It’s also a paid service, so costs can spiral with heavy usage.

Now, let’s use Amazon KMS to encrypt a file in a Python application.

Import required modules

To interact with Amazon KMS, we first need to install and import aws-encryption-sdk — the AWS Encryption SDK for Python that simplifies encryption and decryption processes. 

Create a file called main.py and add the following code to it:

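import aws_encryption_sdk

# Configuration: ARNs of the KMS keys used to protect the data keys
KEY_ARNs = [
    'arn:aws:kms:<REGION>:xxx:key/<KMS-KEY-ID>'
]

# Client that performs the encryption and decryption
client = aws_encryption_sdk.EncryptionSDKClient()
# Strict provider: only the KMS keys listed above may be used
kms_key_provider = aws_encryption_sdk.StrictAwsKmsMasterKeyProvider(
    key_ids=KEY_ARNs)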

Note: The KMS key ARN can be copied from the AWS KMS Dashboard.

Encrypt the file

We'll use the AWS Encryption SDK to perform the encryption. The SDK integrates closely with AWS KMS and provides a seamless encryption experience.

Add the following code to main.py:

def encrypt_file(file_path, encrypted_path):
    # Stream the plaintext through the SDK and write the ciphertext chunk by chunk
    with open(file_path, 'rb') as pt_file, open(encrypted_path, 'wb') as ct_file:
        with client.stream(
            mode='e',
            source=pt_file,
            key_provider=kms_key_provider
        ) as encryptor:
            for chunk in encryptor:
                ct_file.write(chunk)

Decrypt the file

We'll again use the AWS Encryption SDK to decrypt the file. The decryption process reads the encrypted file and then uses the KMS keys to decrypt the data seamlessly.

Add the following code to main.py:

def decrypt_file(encrypted_path, decrypted_path):
    # Stream the ciphertext through the SDK and write the plaintext chunk by chunk
    with open(encrypted_path, 'rb') as ct_file, open(decrypted_path, 'wb') as pt_file:
        with client.stream(
            mode='d',
            source=ct_file,
            key_provider=kms_key_provider
        ) as decryptor:
            for chunk in decryptor:
                pt_file.write(chunk)

Call the functions

We’ve created the functions, and now we need to call them and run the Python program. First, create a file called myfile.txt and add some random content.

Then, within main.py, call the following functions:

encrypt_file('myfile.txt', 'encrypted_file.txt')
decrypt_file('encrypted_file.txt', 'decrypted_file.txt')

When we run the script with python main.py, it encrypts myfile.txt into encrypted_file.txt and then decrypts that file into decrypted_file.txt, whose content matches the original.
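One way to confirm the round trip is to compare hashes of the original and decrypted files. This is a small sketch using only the standard library; the helper names sha256_of and files_match are hypothetical and are not part of the original project.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, 'rb') as f:
        # iter() with a sentinel keeps reading until f.read returns b''
        for chunk in iter(lambda: f.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def files_match(path_a, path_b):
    """True when both files have identical contents."""
    return sha256_of(path_a) == sha256_of(path_b)
```

After running main.py, calling files_match('myfile.txt', 'decrypted_file.txt') should return True, while comparing myfile.txt with encrypted_file.txt should return False.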

Note: For the sake of simplicity, we’ve hard-coded the key ID (as part of the key ARN) in the code. In a real project, keep sensitive and environment-specific details like this out of the source, for example in a .env file.
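A minimal sketch of reading the key ARN from the environment instead of the source file might look like this. The variable name KMS_KEY_ARN and the helper load_key_arns are assumptions for illustration; a tool that loads a .env file into the environment would feed the same lookup.

```python
import os


def load_key_arns(env_var='KMS_KEY_ARN'):
    """Read one or more comma-separated KMS key ARNs from an environment variable."""
    value = os.environ.get(env_var, '')
    arns = [arn.strip() for arn in value.split(',') if arn.strip()]
    if not arns:
        # Fail early rather than calling KMS with an empty key list
        raise RuntimeError(f"Environment variable {env_var} is not set.")
    return arns
```

The returned list could then be passed as key_ids when constructing the key provider in main.py.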

Using PyNaCl’s SecretBox for symmetric encryption

PyNaCl is a Python binding to libsodium, a fork of the Networking and Cryptography library (NaCl). It contains a treasure trove of cryptography tools, including SecretBox, which provides symmetric encryption functionality. The beauty of SecretBox lies in its simplicity, with just one key to encrypt and decrypt our data.

Let’s look at encrypting a sensitive file within a Python application using SecretBox. 

First, create a nacl.txt file and add some placeholder text. Then, import the required modules into a new Python program file named sym.py:

from nacl import utils
from nacl.secret import SecretBox
import os

The next step is to generate and load the secret key. The generate_key function generates a random key, and the load_key function loads the generated key from the file path and returns it:

def generate_key(secret_key_file):
    # Create a random key of the size SecretBox expects and save it to disk
    secret_key = utils.random(SecretBox.KEY_SIZE)
    with open(secret_key_file, 'wb') as f:
        f.write(secret_key)
    return secret_key

def load_key(secret_key_file):
    # Stop with an error instead of continuing with a missing or invalid key
    if not os.path.exists(secret_key_file):
        raise FileNotFoundError(f"Key file {secret_key_file} not found.")
    with open(secret_key_file, 'rb') as f:
        key = f.read()
    if len(key) != SecretBox.KEY_SIZE:
        raise ValueError("Incorrect key length.")
    return key

Now, we encrypt the file. The encrypt_file function needs the file_path and key to encrypt the sensitive file. It finds the file by its file_path, then reads and encrypts the contents using the secret key, writing the encrypted text to a new .enc extension file:

def encrypt_file(file_path, key):
    if not os.path.exists(file_path):
        raise FileNotFoundError(f"File {file_path} not found.")
    secret_box = SecretBox(key)
    with open(file_path, 'rb') as f:
        plaintext = f.read()
    # encrypt() generates a random nonce and prepends it to the ciphertext
    ciphertext = secret_box.encrypt(plaintext)
    with open(file_path+'.enc', 'wb') as f:
        f.write(ciphertext)

Just as we have a function to encrypt, we also need a similar function to decrypt. The decrypt_file function takes the file_path and key to decrypt a file. First, it checks whether an encrypted .enc file exists (it does — we generated it in the last step). Then, it reads its content and decrypts it using the same secret key. Finally, it writes the decrypted contents to a new file with .dec extension:

def decrypt_file(file_path, key):
    if not os.path.exists(file_path+'.enc'):
        raise FileNotFoundError(f"Encrypted file {file_path}.enc not found.")
    secret_box = SecretBox(key)
    with open(file_path+'.enc', 'rb') as f:
        ciphertext = f.read()
    # decrypt() reads the nonce from the start of the ciphertext
    plaintext = secret_box.decrypt(ciphertext)
    with open(file_path+'.dec', 'wb') as f:
        f.write(plaintext)

The last step is to call the functions that we just created:

secret_key_file = 'secret.key'

# generate key for encryption
key = generate_key(secret_key_file)
encrypt_file('nacl.txt', key)

# load key for decryption
key = load_key(secret_key_file)
decrypt_file('nacl.txt', key)

You can now run the program with the python sym.py command in your terminal.

Note: This code isn’t ready for use in a real-life application or production environment. Securely handling encryption keys is a complex task that involves extra steps, such as rotating keys regularly. This code lacks those crucial security measures.
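As one small step in that direction, the key file can at least be created with owner-only permissions. This is a sketch using only the standard library, assuming a POSIX system where permission bits apply; the helper name write_key_file is hypothetical, and this does not replace proper key management.

```python
import os


def write_key_file(path, key_bytes):
    """Create a new key file readable and writable only by its owner."""
    # O_EXCL makes the call fail if the file exists, so a key is never silently overwritten
    flags = os.O_WRONLY | os.O_CREAT | os.O_EXCL
    fd = os.open(path, flags, 0o600)
    with os.fdopen(fd, 'wb') as f:
        f.write(key_bytes)
```

In sym.py, generate_key could call write_key_file(secret_key_file, secret_key) instead of opening the file directly.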

Asymmetric encryption in Python

Where symmetric encryption uses one key, asymmetric encryption uses two: a public key for encryption and a private key for decryption. It’s like a lockable mailbox. Anyone can deposit a letter through the slot (encryption), but only someone with a unique key can open it to read the letter (decryption).

Asymmetric encryption is especially suited to secure communication over the internet. If you want to send a secret message to a friend, you can encrypt it with their public key, and only they can decrypt it with their private key. Asymmetric cryptography also underpins digital signatures and certificates, the foundation of authentication in the digital world.

Using PyNaCl’s public/private box for asymmetric encryption

So, how do we perform asymmetric encryption in Python? Let’s look at another tool from PyNaCl — the public/private box. 

PyNaCl’s public/private box is similar to a personal mailbox. It’s an interface for encrypting and decrypting messages with two keys: public and private. We can openly share the public key used to encrypt messages. But we must keep the private key, used for decryption, secret.

First, create a file, asym_nacl.txt, and add some dummy text. We also need a way to store and retrieve encryption keys. The save_key function writes the key to the file_path, while load_key fetches and returns the key from the file_path.

Add the following code to a new file, asym.py:

from nacl.public import PrivateKey, Box

def save_key(file_path, key):
    # Store the raw key bytes rather than a pickled object
    with open(file_path, 'wb') as f:
        f.write(key.encode())

def load_key(file_path):
    # Rebuild the private key from its raw bytes
    with open(file_path, 'rb') as f:
        key = PrivateKey(f.read())
    return key

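Now, we need to implement functions to encrypt and decrypt our sensitive file. PyNaCl’s Box combines one party’s private key with the other party’s public key. The encrypt_file function generates a fresh private key for the sender, pairs it with the recipient’s public key to encrypt the file, stores the result in a .enc file, and returns the sender’s private key. The decrypt_file function pairs the recipient’s private key with the sender’s public key to decrypt the file and stores the result in a .dec file.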

Add the following functions to asym.py:

def encrypt_file(file_path, public_key):
    # Generate a one-off key pair for the sender
    private_key = PrivateKey.generate()
    # Box pairs the sender's private key with the recipient's public key
    box = Box(private_key, public_key)

    with open(file_path, 'rb') as f:
        plaintext = f.read()

    ciphertext = box.encrypt(plaintext)

    with open(file_path+'.enc', 'wb') as f:
        f.write(ciphertext)

    # The public half of this key is needed later for decryption
    return private_key

def decrypt_file(file_path, private_key, public_key):
    # Box pairs the recipient's private key with the sender's public key
    box = Box(private_key, public_key)

    with open(file_path+'.enc', 'rb') as f:
        ciphertext = f.read()

    plaintext = box.decrypt(ciphertext)

    with open(file_path+'.dec', 'wb') as f:
        f.write(plaintext)

Finally, it’s time to see the results of all our hard work. We generate the recipient’s key pair and pass its public key to encrypt_file, which locks our file and returns the sender’s private key, which we save. When we need to access the file, we load the saved key and pass its public key, together with the recipient’s private key, to decrypt_file.

Add the final function calls to the asym.py file:

# Recipient's key pair
private_key = PrivateKey.generate()
public_key = private_key.public_key

# Encrypt for the recipient and keep the sender's key for later
encryption_private_key = encrypt_file('asym_nacl.txt', public_key)
save_key('private.key', encryption_private_key)

# Decrypt with the recipient's private key and the sender's public key
encryption_private_key = load_key('private.key')
decrypt_file('asym_nacl.txt', private_key, encryption_private_key.public_key)

Run the program using python asym.py from your local terminal.

Note: The code above demonstrates how encryption and decryption work. However, it’s not safe for use in real-world applications. The key file is not protected, so if someone gains access to it, they can unlock your files.

Pros and cons of PyNaCl’s public/private box

PyNaCl’s public/private box provides strong security. It also allows secure communication, even over insecure channels, which makes it ideal for internet communication.

On the other hand, key management can be tricky with the public/private box. We must store private keys securely and ensure the right public keys are used for encryption. Asymmetric encryption can also be slower than symmetric encryption because it involves more complex operations.

Wrapping up our exploration

In this article, we explored some powerful Python tools for implementing encryption methods. Amazon KMS and PyNaCl’s SecretBox are great for symmetric encryption, while PyNaCl’s public/private box excels at asymmetric encryption.

These tools offer a range of benefits for secure data handling. Amazon KMS offers managed, scalable key handling, PyNaCl’s SecretBox has high performance and broad compatibility, and PyNaCl’s public/private box enables secure communication over untrusted channels. Together, these tools make up a comprehensive encryption toolkit.

When selecting a method of encryption, we must consider how sensitive our data is, the environment in which we’ll transmit the data, and how we’ll manage the keys. There’s no perfect encryption method, but we can find the one that best meets our needs.
