Generating fake security data with Python and faker-security
Snyk recently open sourced our faker-security Python package to help anyone working with security data. In this blog post, we’ll briefly go over what this Python package is and how to use it. But first, we’ll get some context for how the
factory_boy Python package can be used in combination with
faker-security to improve your test-writing experience during development.
Note: Some knowledge of Python is helpful for getting the most out of this post.
Testing with Faker and factory_boy
Snyk believes strongly in the ability of automated tests to make our code maintainable. Tests allow us to iterate and develop features quickly, and confidently make changes to our code without fearing we may inadvertently break existing features in the process.
Our commitment to testing drives us to find new ways to simplify the testing experience for the test writers and readers within our teams.
Faker and factory_boy are two of our favorite packages for testing Python projects. Together, they generate fake instances of the models we use in testing.
Faker is a Python package that allows you to generate fake data for many different kinds of fields, like usernames, dates, and URLs.
factory_boy is another Python package that helps integrate
Faker’s data generation into your code by defining factory classes.
What we love about
factory_boy, in particular, is that it allows a test author to focus on pinning the data they care about within their tests, while leaving
Faker to generate all the other data that the test does not care about. This greatly improves test readability by reducing the required lines of code and removing noise from fields you do not need to worry about.
To see the difference in action, compare a test that’s written with
factory_boy to one that isn’t in the following examples.
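# Without factory_boy: every field is spelled out by hand.
# has_valid_email is the function under test, assumed to be imported from the project.
from django.contrib.auth.models import User


def test_correct_email_address():
    user = User(
        first_name="Sherlock",
        last_name="Holmes",
        username="sherlock.holmes",
        email="firstname.lastname@example.org",
        is_admin=False,
    )
    assert has_valid_email(user) is True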
# With factory_boy: only the field the test cares about is pinned.
from tests.factories import UserFactory


def test_correct_email_address():
    user = UserFactory(email="email@example.com")
    assert has_valid_email(user) is True
The test importing
UserFactory is exactly equivalent to the one which does not. However, it is shorter, easier to read, and clearly displays the fields that matter to the test. In comparison, the non-factory test is longer and makes it difficult to understand which fields actually matter for the purposes of the test. This is a fairly simple example, but the difference becomes even more pronounced as test complexity increases.
The UserFactory class can be defined in tests/factories.py once and re-used in all of your tests:
import factory
from django.contrib.auth.models import User
from factory.django import DjangoModelFactory


class UserFactory(DjangoModelFactory):
    class Meta:
        model = User

    username = factory.Faker("slug")
    first_name = factory.Faker("first_name")
    last_name = factory.Faker("last_name")
    email = factory.Faker("email")
    is_admin = False
When dealing with security data, we often need to generate data for security fields like CVSSv3 vectors and CVE identifiers.
Faker does not have a direct way of providing this data by default, but it does allow you to add your own providers, which is exactly where
faker-security comes into play.
What is faker-security?
faker-security is a Python package that acts as a
Faker provider, allowing you to randomly generate security-related data for your projects. Currently,
faker-security supports data generation for:
- CVSSv3 vectors
- CVSSv2 vectors
- semver versions
- NPM semver version ranges
In the future, we hope to cover more generation methods and types of version ranges — like the Maven
Building on our previous examples, if we want to create a
VulnerabilityFactory of some kind to generate fake data, we would define it as follows:
import factory
from factory.django import DjangoModelFactory
from faker_security.providers import SecurityProvider

from myproject.models import Vulnerability

# Register the security provider so factory.Faker can resolve its generators.
factory.Faker.add_provider(SecurityProvider)


class VulnerabilityFactory(DjangoModelFactory):
    class Meta:
        model = Vulnerability

    cvss_v3_vector = factory.Faker("cvssv3")
    cve_id = factory.Faker("cve")
    cwe_id = factory.Faker("cwe")
How to use faker-security
faker-security can be installed via pip:
pip install faker-security
If you want to use it within your project, add it to your dependency file of choice. This is typically your project’s
requirements.txt file. If you are using a higher-level package manager like
pipenv, follow their instructions for adding new packages.
Once installed, you just need to configure
factory_boy to make use of faker-security.
If you are running tests with pytest, we recommend setting up
factory_boy in your
conftest.py file as follows:
import factory
from faker_security.providers import SecurityProvider


def pytest_configure():
    factory.Faker.add_provider(SecurityProvider)
Moving forward with faker-security
Faker is a great way to simplify your tests, and with faker-security, you now have a quick and easy way to generate fake security data for all your projects!
We hope you find this package as useful as we do and would love to have you contribute! Please star our GitHub repo and send pull requests and contributions. Happy testing!