Negative Caching

Recently while investigating caching, I came across the idea of “Negative Caching”. According to Wikipedia, a negative cache is a cache that also stores “negative” responses or failures. Put simply, you can store the knowledge that something does not exist, as opposed to a positive cache that stores information about something that does exist. (In the rest of this article we will indicate a “normal” cache as a “positive cache”.)

But Why?

A positive cache is useful if success – retrieving an object or resource – is more expensive than storing information about what you want. In contrast, a negative cache is useful if failure – determining that something does not exist – is very expensive. However, this is a niche scenario because for the majority of modern software, failure in a resource lookup is usually not that expensive.

Ok… But Why?

The only applications of negative caching that I could find were for DNS caching.

A DNS lookup for a record that does not exist can be expensive since the query travels through a large network of DNS servers if the record has not yet propagated. Propagation can take hours or days, so if a request for a DNS record returns a negative result, you (the DNS server) can likely save yourself a lot of trouble by remembering that result for a few hours.

If a DNS server is doing negative caching, and it does a lookup for a record but gets no result, the system can remember this failed lookup and remember that the record does not exist. This way the server can use this information to return a negative result immediately instead of trying to (expensively) find it again the next time somebody asks for it. The negative cache records can expire over time so that once the actual DNS record has been propagated, the search will find it.

Once again: Cache Invalidation Is Hard

If a positive cache contains an entry and the resource underlying that entry is updated, the positive cache entry must be invalidated, or risk returning incorrect results. Application level errors resulting from inappropriately greedy caches are notoriously difficult to diagnose because usually caching fades into the fabric of your system. Clearing the cache at any level is usually the first line of attack when you are getting strange results that you don’t expect, whether at the database or at the browser.

By the same token, if a negative cache contains an entry for something that doesn’t exist, and that underlying thing appears, you can miss out on something that actually does exist until the cache entry expires. Imagine your system telling you that something doesn’t exist when you can verify by other means that it does! Expiration by timeout or by providing cache clearing directions is therefore very important.

Conclusion

You can store the knowledge that something does not exist if determining non-existence is expensive. This solution is pretty rare, but does have its uses in the right situations.

Advertisements

Leave a comment

Filed under Software Engineering

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s