Data loss prevention for developers
May 24, 2023
A security violation in the form of a data breach can cause costly damage to a company's reputation. But what exactly is a data breach? The European Commission divides data breaches into three distinct categories (confidentiality, integrity, and availability breaches):
A confidentiality breach takes place when there is an unauthorized or accidental disclosure of, or access to, personal data.
An integrity breach occurs when there is an unauthorized or accidental alteration of personal data.
An availability breach occurs when there is an accidental or unauthorized loss of access to, or destruction of, personal data.
In this article, you'll learn more about what a data breach is and how you can prevent data breaches when designing and developing your software.
What is data loss?
Before you can learn how to prevent data loss, it's important to define what data loss actually is, since it can take several forms, such as ransomware or a hacking attack.
In a ransomware attack, malware downloaded onto a company's systems encrypts its data, making the data unavailable unless a large sum of money is paid to the attacker to decrypt it. The attacker can also threaten to publish the data unless the sum is paid, which would cause a confidentiality breach.
How a data leak can happen
Not every data breach is carried out by a competitor or by someone seeking financial gain from harming a company. A data breach can also occur through human error. According to the GRC eLearning website, 82 percent of all data breaches involve a human element, meaning a manual action caused the breach in the first place. Even without malice, data loss can occur if an employee loses a physical device, such as a company laptop. This is why regular security training for employees is imperative to reduce such incidents.
Under European law, fines of up to €20 million or 4 percent of annual global turnover (whichever is higher) can be imposed. Tessian published an article documenting the largest data breach fines issued this year. Amazon tops the list with a fine of €746 million ($877 million USD), followed by WhatsApp, which was fined €225 million ($255 million USD).
Data sensitivity and data loss severity
The severity of each data breach is also determined by the type of data leaked or lost. This includes personal data, such as biometric, health, genetic, political, and religious information. For instance, if a company has fingerprint readers that let employees in and out of office doors, these fingerprints would be stored on company servers. If the company experiences a hack in which all the data residing on its servers ends up in the hands of a malicious third party, the biometric data of all employees is now at risk. The more sensitive the leaked data, the larger the fine incurred.
How to prevent data loss
With the amount of data growing and fines increasing for corporations that experience data loss, data loss prevention is clearly a growing concern. Let's take a look at how you can prevent data loss and leaks, or at least reduce how often they occur, when developing software.
Choose the right architecture
Preventing data leakage requires much more than implementing good security policies. You'll need to choose the right architecture to minimize data loss. Enterprises can opt for a domain-driven design (DDD) approach that allows proper separation of the system components.
Separating the presentation layer from business logic and domain entities makes it easier for developers to achieve loose coupling and high cohesion. You can achieve this by using data transfer objects (DTOs). These are objects that transport data between processes while minimizing the number of method calls between the client and server.
Since most data transfers involve remote communication, retrieving information often requires multiple requests. For example, a client application may make the following remote calls to the server to retrieve someone's name:
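A minimal Python sketch of such chatty calls might look like the following. The `PersonService` class and its methods are hypothetical stand-ins for a remote server, and the `round_trips` counter simply models network requests.

```python
class PersonService:
    """Hypothetical stand-in for a remote server; each call models one round trip."""

    def __init__(self):
        self.round_trips = 0
        self._people = {
            1: {"title": "Dr.", "first_name": "Ada", "last_name": "Lovelace"},
        }

    def _fetch(self, person_id, field):
        self.round_trips += 1  # count every simulated remote call
        return self._people[person_id][field]

    def get_title(self, person_id):
        return self._fetch(person_id, "title")

    def get_first_name(self, person_id):
        return self._fetch(person_id, "first_name")

    def get_last_name(self, person_id):
        return self._fetch(person_id, "last_name")


def fetch_full_name(service, person_id):
    # Three separate requests just to assemble one name.
    title = service.get_title(person_id)
    first = service.get_first_name(person_id)
    last = service.get_last_name(person_id)
    return f"{title} {first} {last}"
```

Each piece of the name crosses the network separately, so every additional field costs another request and another opportunity for the data to be observed in transit.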
You can see that you need multiple remote calls to get all the information you need. However, you can use a DTO to hold all the different values and send this DTO as a response to the client's request. The client can then interact with the DTO locally and retrieve the values as needed. This method saves you several remote calls and minimizes data exposure to the network:
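A sketch of the DTO approach, again with hypothetical names (`PersonNameDTO`, `PersonService`, and the sample record are assumptions for illustration), could look like this:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PersonNameDTO:
    """Carries only the fields the client needs, nothing else from the record."""
    title: str
    first_name: str
    last_name: str


class PersonService:
    """Hypothetical stand-in for a remote server; each call models one round trip."""

    def __init__(self):
        self.round_trips = 0
        # The stored record holds more than the client should ever see.
        self._people = {
            1: {"title": "Dr.", "first_name": "Ada", "last_name": "Lovelace",
                "national_id": "REDACTED-IN-EXAMPLE"},
        }

    def get_person_name(self, person_id):
        self.round_trips += 1
        record = self._people[person_id]
        # Copy only the allowed fields into the DTO; national_id stays on the server.
        return PersonNameDTO(record["title"], record["first_name"], record["last_name"])


service = PersonService()
dto = service.get_person_name(1)  # a single request
full_name = f"{dto.title} {dto.first_name} {dto.last_name}"  # local access only
```

Because the DTO is built from an explicit list of fields, adding a sensitive column to the underlying record does not automatically expose it to clients.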
Identify sensitive data
Securing sensitive data should be a top priority from the beginning of the software development life cycle (SDLC). You need to properly plan how to store data related to finances, business objectives, health records, etc.
One key consideration is how to store personally identifiable information (PII). Any data that uniquely identifies a person, whether inside or outside the enterprise, counts as PII. Legislation on how to store and use PII varies across regions, but any loss or leakage of such data can result in hefty financial penalties and damage your reputation.
A good practice is to collect as little data as possible. If your application doesn't require some data in the first place, don't collect it. Less data means less time spent securing it and, at the very least, less severe breaches. Implementing obfuscation techniques such as data masking can also be helpful.
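As a rough illustration of data masking, the sketch below hides most of an email address or card number before it is logged or displayed. The function names and masking rules are assumptions chosen for the example, not a standard.

```python
def mask_email(email: str) -> str:
    """Keep the first character of the local part and the domain; mask the rest."""
    local, sep, domain = email.partition("@")
    if not sep or not local:
        return "*" * len(email)  # not a recognizable address: mask everything
    return local[0] + "*" * (len(local) - 1) + "@" + domain


def mask_card_number(card_number: str, visible: int = 4) -> str:
    """Show only the last `visible` digits of a card number."""
    digits = [c for c in card_number if c.isdigit()]
    if len(digits) <= visible:
        return "*" * len(digits)
    return "*" * (len(digits) - visible) + "".join(digits[-visible:])
```

Masking of this kind is a display-layer measure. It reduces accidental exposure in logs and user interfaces but does not replace encryption of the stored value.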
Encrypt and hash data
Encryption is one of the oldest methods of safeguarding sensitive information. You can protect data in transit and at rest with different encryption techniques. If you're using SQL tables to store a large amount of data, consider using transparent data encryption. The United States government recommends the Advanced Encryption Standard (AES) algorithm for securing stored data.
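Encryption is reversible by design. For values that only ever need to be checked and never read back, such as passwords, hashing is the usual complement. The sketch below uses PBKDF2 from Python's standard library with a random salt per password. The iteration count is an assumption and should be tuned to your hardware and current guidance.

```python
import hashlib
import hmac
import os
from typing import Tuple

ITERATIONS = 600_000  # assumption: adjust to your own performance budget


def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, digest); store both, never the plaintext password."""
    salt = os.urandom(16)  # a fresh random salt for every password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, expected)
```

If the table holding these values leaks, an attacker gets salted digests rather than passwords, which makes each guess expensive.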
When developing your application, focus on transferring data over secure channels. Data in transit should use the Transport Layer Security (TLS) protocol to prevent eavesdropping and tampering. You should also restrict the use of public services when moving enterprise data.
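For a Python client, one way to enforce this is to build a TLS context that verifies certificates and refuses old protocol versions. This is a sketch using the standard `ssl` module; the TLS 1.2 floor and the function name are assumptions for the example.

```python
import ssl


def make_client_tls_context() -> ssl.SSLContext:
    """Build a client-side TLS context with verification enabled."""
    # create_default_context() enables certificate and hostname verification.
    context = ssl.create_default_context()
    # Refuse anything older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context
```

The resulting context can be passed to standard-library clients, for example `http.client.HTTPSConnection(host, context=context)` or `urllib.request.urlopen(url, context=context)`.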
Utilize secret management tools
Modern applications deal with many configuration secrets, such as database credentials, API and encryption keys, and environment variables. You need proper planning to safeguard these secrets. A robust secret management tool, like HashiCorp Vault, provides a secure way to store them.
When choosing such a tool, look for one that makes retrieving the values as seamless as possible. Storing confidential data this way reduces the size of your configuration files while minimizing the risk of data leakage via repositories.
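On the application side, the key habit is to read secrets at runtime instead of committing them to the repository. The sketch below assumes the secret manager or deployment platform injects values as environment variables; `DB_PASSWORD` and the helper names are hypothetical.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when a required secret has not been provided to the process."""


def get_secret(name: str) -> str:
    """Read a secret from the environment and fail loudly if it is absent."""
    value = os.environ.get(name)
    if not value:
        # Failing fast avoids silently falling back to a hardcoded default.
        raise MissingSecretError(f"Secret {name!r} is not set")
    return value
```

A call such as `get_secret("DB_PASSWORD")` keeps the value out of source control, and the error message names the missing secret without ever printing a secret value.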
Access limitations
Putting access limitations in place can help prevent potential leaks and data loss events. With access limitations, access is given only to the people who need it. It goes without saying that entry-level employees should not have the same visibility rights as the CEO. This is often referred to as role-based access control (RBAC), where employees only have access to the type of information needed to complete their day-to-day tasks, and all other network access is limited.
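In code, RBAC often reduces to a mapping from roles to permissions plus a check that denies by default. The roles and permission names below are hypothetical and only illustrate the shape of such a check.

```python
# Hypothetical roles and permissions for illustration.
ROLE_PERMISSIONS = {
    "support_agent": {"tickets:read", "tickets:write"},
    "engineer": {"tickets:read", "logs:read"},
    "admin": {"tickets:read", "tickets:write", "logs:read", "users:manage"},
}


def is_allowed(role: str, permission: str) -> bool:
    """Return True only if the role explicitly grants the permission."""
    # Unknown roles get an empty permission set, so the default is to deny.
    return permission in ROLE_PERMISSIONS.get(role, set())
```

Keeping the mapping in one place makes it straightforward to audit who can do what, and the deny-by-default lookup means a misspelled or newly created role grants nothing until someone assigns it permissions.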
Least privilege
The principle of least privilege grants each person the minimum amount of access required to complete their work. It is considered one of the most effective ways to prevent data leaks, particularly when all employees' access is managed centrally. When you limit superuser and admin access, the chances of sensitive data being leaked are much lower if an employee's credentials are leaked or accessed by a company outsider, because that employee's access is limited.
There are a few key steps to successfully applying least privilege. A digital vault recording all users and their access rights needs to be created and managed, together with an audit of all company employees, especially those with privileged accounts. In addition, care should be taken to protect passwords, SSH keys, and endpoints.
Two-factor authentication
Another common practice that can help prevent data leaks is two-factor authentication. Two-factor authentication can help prevent brute-force attacks and can even protect against situations where weak passwords would otherwise be all that stands in the way of an attacker and vast amounts of data.
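One common second factor is a time-based one-time password (TOTP), as specified in RFC 6238 and used by most authenticator apps. The article does not prescribe a particular method, so the sketch below is just one option, implemented with the standard library. The function names and the 30-second, six-digit defaults are assumptions that match typical authenticator settings.

```python
import hashlib
import hmac
import struct
import time


def totp(secret: bytes, for_time=None, step: int = 30, digits: int = 6) -> str:
    """Compute an RFC 6238 TOTP code (HMAC-SHA1) for the given Unix time."""
    if for_time is None:
        for_time = time.time()
    counter = int(for_time) // step  # number of time steps since the epoch
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation, as in RFC 4226
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def verify_totp(secret: bytes, submitted: str, for_time=None) -> bool:
    """Compare the submitted code with the expected one in constant time."""
    expected = totp(secret, for_time)
    return hmac.compare_digest(expected.encode(), submitted.encode())
```

A production check would typically also accept the adjacent time step to tolerate clock drift and would rate-limit attempts, since a six-digit code is easy to guess without a limit.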
Data storage (cloud or on-premise)
Data can be stored via the cloud using several cloud services, such as Microsoft Azure and Amazon Web Services (AWS), or on a company's property via physical servers. Having the right procedures (both technical and organizational) in place — such as using encryption, setting up different user permissions, and avoiding uploading confidential data — can help you prevent a data breach.
Before implementing these procedures, keep in mind the principle of privacy by design, which companies like Snyk implement through various company-wide standards. Some companies, however, choose to store their data on-premise, keeping very sensitive data offline and accessible on an as-needed basis.
Privacy by design
The term privacy by design means that privacy needs to be kept in mind from the beginning of software development, whether you're working on a patch or an entirely new project. There are several guides and certifications companies can pursue to make sure they're doing their utmost to prevent attacks and data leaks. For instance, you can pursue ISO 27001 certification. ISO 27001 is a standard for managing the security of information wherever it lives: in the cloud, in forms (digital and paper-based), and on virtual machines or company devices.
Data loss prevention software
Many companies use security software to prevent data loss and leaks. Most data loss prevention software, such as BetterCloud and Symantec, focuses on blocking actions, which often involves prohibiting users from using USB storage devices or stopping an accidental malware download. Other companies use firewalls to protect the company's entire technological ecosystem, encrypt all their repositories to prevent malicious leaks, or require two-factor authentication for employees to log into their work accounts. All these measures limit the chance of sensitive or critical data and information being shared outside of the company environment.
Mitigating the severity of a data leak
Companies often look to reduce the amount of data they store in order to mitigate the severity of a data leak. Article 5(1)(c) of the European General Data Protection Regulation (GDPR) sets out the principle of data minimization, which companies can support with procedures such as the automatic deletion of old data. For instance, if company XYZ Ltd. has customers who have not returned in the span of, say, ten years, all these customers' data can and should be erased.
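An automated retention job can be as simple as filtering records by their last activity date. The sketch below uses the ten-year window from the XYZ Ltd. example; the `last_active` field and the function name are hypothetical, and a real job would delete the rows from the data store rather than filter a list.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=365 * 10)  # hypothetical ten-year retention policy


def purge_inactive(customers, now=None):
    """Return only the customers active within the retention window."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - RETENTION
    # Anything last active before the cutoff is dropped.
    return [c for c in customers if c["last_active"] >= cutoff]
```

Running such a job on a schedule means stale records are removed without anyone having to remember to do it.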
How Snyk combats data loss
Snyk is a developer security platform that integrates directly with developer tools, providing the security a company needs across the entire tech stack and development lifecycle by implementing the privacy-by-design concept discussed previously. Snyk takes a developer-first approach, meaning that it integrates directly with the pipeline the developer is working on in order to add security at the very base layer of development, where the code is built and regularly maintained. This means that the code is developed securely from the very beginning, and so is the entire environment, through Snyk Container, which lets you package your applications and detects any vulnerabilities in them.
Passing all commits via Snyk Cloud provides an additional layer of security on top of what was already discussed. Snyk Cloud integrates with Amazon Web Services (AWS), Azure, Google Cloud, Terraform, AWS CloudFormation, Kubernetes, Docker, and Open Policy Agent (OPA). It takes snapshots intermittently to prevent data loss and keeps your implementation up to date with the latest policies by syncing automatically with the most recently updated standards.
In the case of open source code implementation, Snyk scans and monitors any possible vulnerabilities at each stage of the pipeline and release management process. Through real-time and historical reporting, Snyk will be able to detect any vulnerabilities or security risks from the initial installation of the open source code, during the running of the code, and during the deployment process.
Reduce the risk of data leaks
Data breaches compromise your data and can have severe consequences, ranging from financial loss, identity theft, and humiliation to loss of reputation and breaches of contract. Leaks can have even more serious repercussions, including death. Putting the right procedures in place, including the ISO 27000 standards, RBAC, and privacy by design, helps you and your company avoid a catastrophic event.
You can also employ tools from companies like Snyk so that you can safeguard your company, employees, and their families. Snyk offers a number of products that can help prevent data loss regardless of what data pipeline or infrastructure you have in place. Try Snyk for free today!