How to avoid web cache poisoning attacks

Written by:
Najia Gul

September 11, 2023

15 mins read

Web cache poisoning is a cyber attack that wreaks havoc on unsuspecting websites. It exploits vulnerabilities in the caching mechanisms that web servers, proxies, and content delivery networks (CDNs) use, compromising data integrity. Malicious actors can use cache poisoning to deliver malicious payloads, tamper with sensitive information, or redirect users to fraudulent websites.

In this article, we’ll comprehensively explore web cache poisoning attacks and how they work. We’ll also discuss the most effective mitigation strategies to help safeguard our web applications.

Understanding web caching and its importance

Web caching is a crucial process for improving a website’s performance. It involves storing web content, such as HTML files, images, scripts, and stylesheets, in temporary locations called caches. These caches make it quicker and easier to access that content again for subsequent user requests.

There are several types of caching mechanisms based on where we store the web caches. These include:

  • Browser caching: Browser caching involves web browsers storing copies of static resources on the user’s device. When that user revisits a website, the browser checks its cache for previously downloaded files. If those files are still valid, the browser retrieves them instead of sending a new request to the server.

  • Server caching: While browser caching occurs on the client side, server caching stores content on the server itself for later reuse. When a user sends a request to the origin server, the cache saves a copy of the generated response. Then, for any subsequent requests for the same content, the cache delivers the copy instead of the origin server.

  • Content delivery network (CDN) caching: CDNs are geographically distributed networks of servers that cache and deliver static web content. When a user wants to access that content again, the edge server closest to their location handles the request. This relieves the origin server from handling all requests.

Caching reduces the time and resources necessary to retrieve web content, leading to faster load times and a better user experience. It also reduces the server load, allowing it to handle a higher volume of requests.

However, caching also introduces several potential risks, especially when not implemented or managed properly. Cached content can become outdated or incorrect if not refreshed promptly, leading to inconsistencies or security vulnerabilities that attackers can exploit.

Caching sensitive information can also pose privacy risks. For example, if the cache contains a user’s personal details or account credentials, unauthorized access can lead to data breaches and privacy violations. 
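One way to reduce this risk is to choose the Cache-Control value based on the kind of response being sent. The sketch below is a minimal illustration rather than a complete policy: the helper name cacheControlFor and the /account and /static/ path prefixes are hypothetical and would need to match a real application’s routes.

```javascript
// Hypothetical helper: pick a Cache-Control value for a response.
// The path prefixes below are assumptions for illustration only.
function cacheControlFor(path, { authenticated = false } = {}) {
    // Personal or session-specific responses should not be stored by any cache.
    if (authenticated || path.startsWith('/account')) {
        return 'no-store';
    }
    // Static assets can be shared by browsers, proxies, and CDNs.
    if (path.startsWith('/static/')) {
        return 'public, max-age=86400'; // one day
    }
    // Everything else may be stored, but must be revalidated before reuse.
    return 'no-cache';
}

console.log(cacheControlFor('/account/profile'));               // no-store
console.log(cacheControlFor('/static/app.css'));                // public, max-age=86400
console.log(cacheControlFor('/news', { authenticated: true })); // no-store
```

In an Express handler, the returned value could be applied with res.set('Cache-Control', value) before the response is sent.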

How web cache poisoning attacks work

Web cache poisoning aims to deceive the caching infrastructure into serving compromised or unauthorized content to users. This process usually follows four steps:

  1. Identifying a vulnerable target: The attacker searches for a website with potential flaws in its caching mechanisms, such as misconfigurations or improper input validation.

  2. Manipulating HTTP requests: The attacker sends HTTP requests to the target website designed to exploit those vulnerabilities.

  3. Exploiting cache mechanisms: The attacker leverages various techniques to either inject their own malicious content or manipulate the cache’s existing content.

  4. Serving compromised content: Once the attack is complete, subsequent requests to the cache retrieve the poisoned content instead of the legitimate content.

Successful web cache poisoning attacks can expose sensitive or confidential information. Attackers can then manipulate the cached content to deface the target website, spread false information, or distribute malicious links.

Worse still, attackers often combine web cache poisoning with other cyber attacks to increase the damage they can do. A tampered script that reaches every visitor can be very costly: in the 2018 British Airways breach, widely attributed to a Magecart group, a modified JavaScript file on the airline’s website sent customers’ payment details to an attacker-controlled domain. That incident is generally described as script injection rather than cache poisoning, but a poisoned cache can deliver the same kind of payload at the same scale. The breach affected hundreds of thousands of customers.

Common web cache poisoning attack vectors

To safeguard against cache poisoning attacks, we need to identify and mitigate any potential website vulnerabilities. Here are some of the most common attack vectors and how malicious actors can exploit them:

  • HTTP header manipulation: Attackers often modify HTTP request headers so that their malicious content is stored under a legitimate URL or cache key. The usual targets are headers that influence the response but are not part of the cache key, such as X-Forwarded-Host. Once the cache stores the resulting response, it serves the attacker’s payload to subsequent users who request the same resource, even if the legitimate content has changed.

  • Query string manipulation: Attackers can exploit how web applications handle query strings. They manipulate the query parameters appended to the URL to create multiple cache variations for the same resource, which can potentially include a malicious payload.

  • Cookie manipulation: Attackers can also poison the cache by modifying the values and attributes of cookies. They do this to trick the caching system into storing the attacker’s content in the cache under the context of a legitimate user session.

Keep in mind, however, that these are only a few examples. Attackers are constantly exploring new techniques and vulnerabilities to achieve their goals. Keeping an eye on these common vectors alone isn’t enough to safeguard web data.

Cache poisoning attack through HTTP header manipulation

Let’s review an instance of web cache poisoning where an attacker exploits a vulnerability in a website using HTTP headers. The X-Forwarded-Host (XFH) header is a common target. This de facto standard header is used for identifying the original host requested by the client in the HTTP/HTTPS request to your application or API. The value of this header is typically a hostname, representing the original target of the client’s request. 

Although this header is designed to be helpful in environments using a reverse proxy or load balancer, it could be exploited by an attacker if not validated correctly. However, it’s important to note that an attacker could potentially use many other headers to perform a similar type of attack. 

Now, let’s turn to our example, highlighting a potential vulnerability in JavaScript code that uses a caching service. For this demonstration, we'll use Redis.

const express = require('express');
const redis = require('redis');

const app = express();
// Assumes the callback-style API of node-redis v3; v4 and later use promises and require client.connect()
const client = redis.createClient();

// Caching middleware: serve a stored response when one exists for this URL
app.use((req, res, next) => {
    const key = req.originalUrl; // the URL is the only part of the request in the key

    client.get(key, (err, data) => {
        if (err) return next(err); // pass errors to Express instead of throwing inside the callback

        if (data !== null) {
            res.send(data);
        } else {
            next();
        }
    });
});

app.get('/user', (req, res) => {
    // Intentionally vulnerable: the header is attacker-controlled and reflected without validation
    const host = req.headers['x-forwarded-host'] || 'unknown host';
    const response = `Welcome, user ${req.query.userId}! You're accessing from ${host}`;

    // Cache the response using the URL as a key
    client.set(req.originalUrl, response, 'EX', 3600); // Cache for 1 hour

    res.send(response); // the response is complete, so next() is not called
});

app.listen(3000, () => {
    console.log('Server running on port 3000');
});

The caching middleware uses the original URL as the key to cache responses in Redis. In Express, req.originalUrl is everything after the hostname (for example, /user?userId=1 in http://example.com/user?userId=1). If a response for this URL exists in the cache, it’s sent immediately. Otherwise, the request continues to the appropriate route handler, which, in our case, generates a user-specific greeting and caches it.

However, this implementation can lead to cache poisoning attacks. The cache key is built from the URL alone, but the response also depends on the X-Forwarded-Host header, which is user-controlled input. The route handler inserts the header’s value into the greeting without validating it. An attacker can therefore place a malicious script in this header, and the resulting response is cached and served to other users who request the same URL.

Here’s how this could play out:

  • The attacker sends a GET request to /user?userId=1 with X-Forwarded-Host header set as foo."><script>alert(document.cookie)</script>.

  • The server generates a greeting that includes the header’s value, caches it with the key /user?userId=1, and sends it back to the attacker.

  • Now, every subsequent request to /user?userId=1 serves a malicious greeting from the cache.
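The sequence above can be reproduced without Express or Redis. The following sketch is a simplified simulation, assuming an in-memory Map in place of Redis and a hypothetical handleRequest function in place of the middleware and route handler. It only illustrates why a victim who sends no special header still receives the attacker’s payload.

```javascript
// In-memory stand-in for Redis. The key is the URL only, as in the example above.
const cache = new Map();

// Simplified version of the /user handler: the header value is reflected
// into the response but is NOT part of the cache key.
function handleRequest(url, headers = {}) {
    if (cache.has(url)) {
        return { body: cache.get(url), fromCache: true };
    }
    const host = headers['x-forwarded-host'] || 'unknown host';
    // The URL is relative, so a placeholder base is needed to parse the query string.
    const userId = new URL(url, 'http://localhost').searchParams.get('userId');
    const body = `Welcome, user ${userId}! You're accessing from ${host}`;
    cache.set(url, body);
    return { body, fromCache: false };
}

// 1. The attacker primes the cache with a crafted header.
const attacker = handleRequest('/user?userId=1', {
    'x-forwarded-host': 'foo."><script>alert(document.cookie)</script>',
});

// 2. A victim requests the same URL without any special header.
const victim = handleRequest('/user?userId=1');

console.log(victim.fromCache);              // true
console.log(victim.body === attacker.body); // true
```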

One strategy for strengthening the security of the above code is to sanitize the URL used for the cache key. Here’s how you might implement a basic sanitize function that uses the escape method from the validator module:

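const validator = require('validator');

function sanitize(url) {
    let parsedUrl;

    try {
        // req.originalUrl is relative (path and query only), so a placeholder base is needed to parse it
        parsedUrl = new URL(url, 'http://localhost');
    } catch (err) {
        throw new Error('Invalid URL');
    }

    // Copy the entries first so the parameters are not modified while being iterated
    for (const [param, val] of Array.from(parsedUrl.searchParams)) {
        // Escaping potentially harmful characters in the query parameters
        parsedUrl.searchParams.set(param, validator.escape(val));
    }

    // Return only the path and query string, matching the shape of req.originalUrl
    return parsedUrl.pathname + parsedUrl.search;
}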

We can then use the sanitize function to escape potentially harmful characters in the query parameters of req.originalUrl before using it as the cache key:

const key = sanitize(req.originalUrl);

Sanitizing the cache key doesn’t address the header itself, because the header never appears in the URL. We can also validate user input in a middleware function within our Express server. The example below demonstrates a basic approach to validating the headers:

const validator = require('validator');

// Register this before the caching middleware so invalid requests never reach the cache
app.use((req, res, next) => {
    const forwardedHostHeader = req.headers['x-forwarded-host'];
    if (forwardedHostHeader) {
        // isFQDN expects a bare hostname, so a value with a port or a comma-separated list is rejected
        const isFQDN = validator.isFQDN(forwardedHostHeader);
        if (!isFQDN) {
            // If the header does not represent a fully qualified domain name, the request is rejected
            return res.status(400).send('Invalid X-Forwarded-Host header');
        }
    }

    next();
});

Here, we use the isFQDN function from the validator module to check whether the X-Forwarded-Host header contains a valid, fully qualified domain name. If it doesn’t, the middleware rejects the request with a 400 Bad Request status.

Implementing this validation step is vital to stop the injection of harmful scripts via the X-Forwarded-Host header. As we discussed earlier, such a scenario could lead to a cache poisoning attack.
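Format validation still accepts any well-formed hostname, including one the attacker owns. A stricter option is to compare the header against a list of expected hosts and to escape the value wherever it is written into a page. The sketch below uses no external modules. The ALLOWED_HOSTS entries and the function names trustedForwardedHost and escapeHtml are hypothetical, and taking the first value of a comma-separated header is an assumption that depends on the proxy chain in front of the application.

```javascript
// Hypothetical allowlist of hostnames this application is served from.
const ALLOWED_HOSTS = new Set(['example.com', 'www.example.com']);

// Return the forwarded host only if it is on the allowlist, otherwise null.
function trustedForwardedHost(headerValue) {
    if (typeof headerValue !== 'string') return null;
    // Proxies may append values as a comma-separated list.
    // Using the first entry is an assumption about the proxy setup.
    const first = headerValue.split(',')[0].trim().toLowerCase();
    // Drop an optional port such as ":8080".
    const hostname = first.replace(/:\d+$/, '');
    return ALLOWED_HOSTS.has(hostname) ? hostname : null;
}

// Minimal HTML escaping for values that are written into a page.
function escapeHtml(value) {
    return String(value)
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        .replace(/"/g, '&quot;')
        .replace(/'/g, '&#x27;');
}

console.log(trustedForwardedHost('WWW.Example.com:8080'));  // www.example.com
console.log(trustedForwardedHost('foo."><script>'));        // null
console.log(escapeHtml('<script>'));                        // &lt;script&gt;
```

The allowlist decides whether the header is used at all, and the escaping acts as a second layer in case an unexpected value is ever reflected.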

Best practices for mitigating web cache poisoning

Establishing a strong caching policy is the most effective way to prevent web cache poisoning. This policy clearly defines what content to cache, for how long, and under what conditions. When defining a caching policy, we must consider factors such as the data type, authentication requirements, and dynamic content that we shouldn’t cache.
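A caching policy of this kind can be written down as data and enforced in one place. The following is a minimal sketch under stated assumptions: the route patterns, TTL values, and the cacheDecision function are hypothetical, and the default of not caching anything unlisted is one possible design choice.

```javascript
// Hypothetical caching policy: each rule says what to cache and for how long.
// Route patterns and TTL values are illustrative assumptions.
const CACHE_POLICY = [
    { pattern: /^\/static\//, cacheable: true, ttlSeconds: 86400 },
    { pattern: /^\/products\//, cacheable: true, ttlSeconds: 300 },
    { pattern: /^\/(account|checkout)\b/, cacheable: false, ttlSeconds: 0 },
];

// Decide whether a response may be cached. Anything that is not a GET,
// belongs to a logged-in user, or matches no rule is not cached.
function cacheDecision({ method, path, authenticated }) {
    if (method !== 'GET' || authenticated) {
        return { cacheable: false, ttlSeconds: 0 };
    }
    const rule = CACHE_POLICY.find((r) => r.pattern.test(path));
    return rule
        ? { cacheable: rule.cacheable, ttlSeconds: rule.ttlSeconds }
        : { cacheable: false, ttlSeconds: 0 };
}

console.log(cacheDecision({ method: 'GET', path: '/static/app.js', authenticated: false }));
console.log(cacheDecision({ method: 'GET', path: '/account/profile', authenticated: false }));
```

Defaulting to "do not cache" means a newly added route is never cached by accident. It has to be added to the policy explicitly.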

Other techniques that can help secure caching mechanisms include:

  • Cache key normalization: Normalizing cache keys can help prevent variations due to input formatting or case sensitivity.

  • Validate user input: Implementing strict input validation and sanitization techniques can prevent injection attacks that can lead to cache poisoning. These techniques include input filtering, parameter safelisting, and regular expression checks.

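  • Cache-control headers: Cache-Control headers help enforce caching behavior and mitigate risks. For example, the “no-store” directive prevents caches from storing sensitive data, “private” limits storage to the user’s browser, and “no-cache” allows storage but requires the cache to revalidate with the origin server before reusing the response.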

  • Don’t trust third-party inputs: Be cautious when relying on third-party inputs like headers, cookies, or query strings. Validate and sanitize all external inputs thoroughly before using them in cache-related operations. Treat all user-supplied data as potentially malicious, and apply strict input validation and filtering to prevent attackers from injecting crafted content into the cache.

  • Use web application firewalls (WAF): Deploying a robust WAF can help detect and block cache poisoning attempts. WAFs analyze incoming requests and identify suspicious patterns that indicate cache poisoning. We can configure the WAF to alert or block these requests to provide an additional layer of defense against such attacks.
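As an illustration of cache key normalization, the sketch below builds a key from only the request parts that influence the response. It assumes a hypothetical allowlist, KEYED_PARAMS, and a hypothetical function, normalizeCacheKey. Including the X-Forwarded-Host value in the key applies only when that header is reflected in the response, as in the earlier example.

```javascript
// Hypothetical allowlist of query parameters that actually change the response.
const KEYED_PARAMS = ['userId'];

// Build a cache key from the parts of the request that influence the response.
function normalizeCacheKey(originalUrl, headers = {}) {
    // originalUrl is relative, so a placeholder base is needed for parsing.
    const url = new URL(originalUrl, 'http://localhost');

    // Keep only allowlisted parameters, in a fixed order.
    const params = KEYED_PARAMS
        .filter((name) => url.searchParams.has(name))
        .map((name) => `${name}=${encodeURIComponent(url.searchParams.get(name))}`);

    // If a header is reflected in the response, it must be part of the key.
    const host = (headers['x-forwarded-host'] || '').trim().toLowerCase();

    return [url.pathname, params.join('&'), encodeURIComponent(host)].join('|');
}

// Parameter order and unrelated parameters no longer create separate entries.
console.log(normalizeCacheKey('/user?utm=x&userId=1')); // /user|userId=1|
console.log(normalizeCacheKey('/user?userId=1&utm=y')); // /user|userId=1|

// A request with a different header value gets its own entry.
console.log(normalizeCacheKey('/user?userId=1', { 'x-forwarded-host': 'Evil.example' })); // /user|userId=1|evil.example
```

With this key, a response generated from a crafted header is stored separately and is not served to users who did not send that header.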

Monitoring and detecting web cache poisoning attacks

Even with the right mitigation strategies in place, web cache poisoning can still occur. So, it’s essential to regularly monitor web traffic and analyze logs to detect potential attacks. Look for unusual or suspicious activity, such as anomalies in request patterns, unexpected cache variations, or unusual content that the cache is serving.
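One simple check for unexpected cache variations is to look for cache keys that have served more than one distinct response body. The sketch below assumes hypothetical log records with cacheKey and bodyHash fields. This is not the log format of any particular cache or CDN, so real logs would need to be mapped to this shape first.

```javascript
// Hypothetical log records: one entry per response served for a cache key.
// The field names (cacheKey, bodyHash) are assumptions, not a real log format.
function findInconsistentKeys(entries) {
    const hashesByKey = new Map();
    for (const { cacheKey, bodyHash } of entries) {
        if (!hashesByKey.has(cacheKey)) hashesByKey.set(cacheKey, new Set());
        hashesByKey.get(cacheKey).add(bodyHash);
    }
    // A key that served more than one distinct body deserves a closer look.
    return [...hashesByKey]
        .filter(([, hashes]) => hashes.size > 1)
        .map(([cacheKey]) => cacheKey);
}

const sampleLog = [
    { cacheKey: '/user?userId=1', bodyHash: 'a1' },
    { cacheKey: '/user?userId=2', bodyHash: 'b2' },
    { cacheKey: '/user?userId=1', bodyHash: 'ff' },
    { cacheKey: '/user?userId=2', bodyHash: 'b2' },
];

console.log(findInconsistentKeys(sampleLog)); // [ '/user?userId=1' ]
```

Legitimate content updates also change the body for a key, so a result from this check is a prompt for investigation rather than proof of an attack.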

Another best practice is to conduct regular security and penetration testing. These assessments help identify weaknesses that can lead to cache poisoning attacks. Security testing includes vulnerability scans and security audits. Penetration testing goes a step further by simulating real-world attack scenarios. It emulates the techniques and mindset of an attacker to uncover specific weaknesses that regular security tests might miss.

Many tools and resources can help detect and prevent web cache poisoning attacks. For example, we can implement:

  • Vulnerability scanners, such as Snyk, that we can use to automatically identify common cache poisoning vulnerabilities.

  • Caching diagnostics and analysis tools to help analyze caching configurations, monitor cache behavior, and detect anomalies.

Summary 

Web cache poisoning attacks force caches to serve outdated or manipulated content without the user’s knowledge. They can lead to data leaks and privacy violations, ultimately damaging a website’s reputation.

Our best defense against these attacks is to establish a strong caching policy and stay informed about the latest security trends for web applications. We also need to follow security best practices, like implementing proper input validation and normalizing our cache keys.

Following these secure development practices can help us reap the benefits of web caching while minimizing the risk of a poisoned cache.
