Caching — Work

Caching is the process of storing copies of files in a cache, or a temporary storage location, so that they can be accessed more quickly.

In simpler words, caching is a technique to preserve data in a temporary—faster—storage, like memory. But why do we need to preserve it in temporary storage? In short, we do that so the data can be accessed quickly and delivered to users faster.

Why is Caching Important?

Perhaps, when our application doesn't have that many users accessing it daily, caching may seem less useful. That's because the database queries aren't that many yet, and the system is still responsive enough to deliver results quickly.

But imagine we have an app that is used by a great many people, say something like IMDB (Internet Movie Database), which many websites use as a reference for movies, or a social media platform like Letterboxd. Then there's this one movie that gets accessed quite often by many people.

GET /movies/the-perks-of-being-a-wallflower

{
    "slug": "the-perks-of-being-a-wallflower",
    "title": "The Perks of Being a Wallflower",
    "year": 2012,
    "genre": "Drama, Romance",
    "director": "Stephen Chbosky",
    "synopsis": "A socially awkward teenager finds solace in two charismatic seniors who welcome him to the real world.",
    "rating": "PG13",
    "duration": "103 min",
    "poster": "https://www.imdb.com/images/the-perks-of-being-a-wallflower.jpg",
    "release_date": "2012-09-21",
    "language": "English",
    "country": "USA"
}

That particular endpoint, even though it is accessed by thousands or millions of people at once, will always return the same result—unless the movie details are changed. It would be unfortunate for our system to hit the database with thousands of queries, only to return the same result, but slower.

When an API call is made with similar parameters, the system retrieves the response from the cache and does not need to make a trip to the server, nor does it have to repeat the same operations to deliver the same result.

By implementing caching, we don't need to touch "less useful" data in the database, thus reducing the latency and load, which in the end improves overall system performance. "Less useful" data in this context means data that holds the same value for almost every user, such as movie details.

The information we cache usually contains the response data from endpoints accessed by users. So if there's a user that sends a request to that particular endpoint, we can directly return the cached data without the need to access the database.
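To make "the same request" concrete, here is a minimal sketch of how a cache key could be derived from a request. The function name build_cache_key and the key layout are assumptions for illustration, not something a particular framework prescribes. Sorting the query parameters means the same parameters in a different order still map to the same cached entry.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit


def build_cache_key(method: str, url: str) -> str:
    """Build a deterministic cache key from an HTTP method and a URL.

    Query parameters are sorted so that parameter order does not
    produce different keys for what is effectively the same request.
    """
    parts = urlsplit(url)
    query = urlencode(sorted(parse_qsl(parts.query)))
    key = f"{method.upper()}:{parts.path}"
    return f"{key}?{query}" if query else key


# Two requests that differ only in parameter order share one key.
key_a = build_cache_key("GET", "/movies?year=2012&genre=drama")
key_b = build_cache_key("get", "/movies?genre=drama&year=2012")

# The movie detail endpoint from the example above.
detail_key = build_cache_key("GET", "/movies/the-perks-of-being-a-wallflower")
```

Whatever scheme we pick, the important part is that it is deterministic: the key used when saving a response must be the same one computed when the next request arrives.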

Where is Cache Stored?

As mentioned earlier, cache can be saved in temporary storage, whatever it may be, which is generally faster than a regular disk-based database. One example of this is Redis. Other options include memcached, node-cache, and many others that you can find yourself.

On the other hand, if we're talking about client-side caching, the cache is stored on the client's own device, typically in the browser's HTTP cache. Even though it isn't necessarily kept in memory, client-side caching tends to be faster because the client doesn't have to reach the server. The local browser can handle the request and retrieve the response from its cache.

About Redis

Redis is the world's fastest in-memory database. It provides cloud and on-prem solutions for caching, vector search, and NoSQL databases that seamlessly fit into any tech stack.

Redis itself is very confident that its in-memory database is the fastest and can provide both cloud and on-premise services for caching. Not only caching, it also supports vector search, NoSQL database functionality, and many other features. Redis can be implemented in many tech stacks, including ExpressJS or Golang, with ease. Redis also provides many types of data structures, such as strings, hashes, lists, sets, sorted sets, and many others. Redis can also be used as a message broker with its publish-subscribe protocol.

As mentioned earlier, Redis is an in-memory data store, meaning that all data is kept in RAM (Random Access Memory), which is volatile. If the server somehow crashes, all the data saved there will be lost, including the cache, which is expected behavior. However, if you want your data to persist, Redis also provides persistence options (RDB snapshots and an append-only file, AOF) that write the data to disk as well, so it can be reloaded after a restart.

It's actually pretty easy to use Redis. We can enter the Redis terminal by running the redis-cli command in a terminal. To test the connection, we can simply send PING, and the server will reply with PONG if the connection is established. To save data to Redis, we can use SET [key] [value], with the key and value you desire. For example, if I want to set the username to my name, I can simply use:

SET username yogarn

We can also add an expiry time to the data so it will be deleted automatically. To do that, we can use the EX flag. For example:

SET username yogarn EX 60

This command sets my username to yogarn and automatically deletes it after 60 seconds.

The KEYS command followed by a pattern can be used to get a list of keys. For example, using an asterisk like KEYS * will give us a list of all keys we have set. Keep in mind that KEYS walks through the whole keyspace, so it's better suited to debugging than to production use. To get the data based on a key, we can simply use GET [key], such as:

GET username

And we will get yogarn in return.
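To see what SET with EX, GET, and KEYS do conceptually, here is a small in-memory stand-in written in Python. It is only a sketch: TTLStore is a hypothetical class, not a Redis client, and its pattern matching only approximates Redis glob patterns. The clock is injectable so we can simulate time passing without waiting.

```python
import time
from fnmatch import fnmatchcase


class TTLStore:
    """In-memory stand-in for a few Redis commands (not a Redis client)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # key -> (value, expires_at or None)

    def set(self, key, value, ex=None):
        """Store a value; `ex` is an optional lifetime in seconds."""
        expires_at = self._clock() + ex if ex is not None else None
        self._data[key] = (value, expires_at)

    def get(self, key):
        """Return the value, or None if the key is missing or expired."""
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and self._clock() >= expires_at:
            del self._data[key]  # drop expired keys lazily, on access
            return None
        return value

    def keys(self, pattern="*"):
        """List the keys that match the pattern and have not expired."""
        candidates = list(self._data)  # copy: get() may delete entries
        return [k for k in candidates
                if fnmatchcase(k, pattern) and self.get(k) is not None]


# Simulated clock so the example does not need to sleep for 60 seconds.
now = [0.0]
store = TTLStore(clock=lambda: now[0])

store.set("username", "yogarn", ex=60)
before_expiry = store.get("username")  # still within the 60 seconds

now[0] = 61.0                          # pretend 61 seconds have passed
after_expiry = store.get("username")   # the key has expired by now
```

A real Redis server also removes expired keys in the background, so this lazy-on-access deletion is a simplification.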

Types of Cache and How They Work

Caching can be classified into three categories: client cache on the browser level, server cache, and a hybrid approach combining the client and server.

Client-Side Caching

Client-side caching is usually done automatically by the browser, which keeps copies of responses in its own cache on the user's device. As a result, clients can sometimes load previously visited resources without contacting the server at all, because the data is already stored on their own devices.

To implement that, we can set the Cache-Control header for each server response that we want the browser to cache. We can also set the cache expiry time using the desired max-age value. There are a few directives we can use, such as public, private, no-cache, and no-store.

public means that the response can be cached by anyone, including proxies, CDNs, and browsers.
private means that the response can only be stored by a private cache, such as the user's browser.
no-cache doesn't mean that the browser shouldn't cache the response, but rather that the cached copy must be validated with the server before each reuse.
no-store means the response should never be cached at all, whether by the browser, proxies, or any other storage.
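As a rough illustration of the directives above, here is a tiny helper that assembles Cache-Control header values. The helper name cache_control, the header dictionaries, and the chosen max-age values are assumptions for this sketch. In a real application we would set the resulting string on the response through whatever web framework we use.

```python
from typing import Optional


def cache_control(*directives: str, max_age: Optional[int] = None) -> str:
    """Join Cache-Control directives into a single header value."""
    parts = list(directives)
    if max_age is not None:
        parts.append(f"max-age={max_age}")
    return ", ".join(parts)


# Browsers and shared caches (proxies, CDNs) may reuse this for one hour.
movie_headers = {"Cache-Control": cache_control("public", max_age=3600)}

# Only the user's own browser may store this, for five minutes.
profile_headers = {"Cache-Control": cache_control("private", max_age=300)}

# May be stored, but must be revalidated with the server before reuse.
feed_headers = {"Cache-Control": cache_control("no-cache")}

# Must not be stored by any cache.
secret_headers = {"Cache-Control": cache_control("no-store")}
```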

When we don't want the browser to reuse a cached response without checking it first, we need to say so with the no-cache directive. That way the client, especially the browser, will validate the response with the server before using any cached version. (If nothing should be stored at all, no-store is the directive to use.) If we don't set this explicitly, some unexpected behavior might occur, because by default caches are allowed to apply caching heuristics. Unfortunately, many HTTP/1.0 caches don't natively support the no-cache directive. As an alternative, we can use max-age=0 with must-revalidate.

Server-Side Caching

It's a different world from client-side caching—server-side caching stores its cache on the server. More specifically, it's most likely stored in the server's memory. Its main purpose is to prevent repeated access to the database for similar queries. In a RESTful API architecture, when a request reaches the service layer, it should first check for the requested data in its in-memory database before accessing the actual database. If the requested resource is already in the cache, then the service should return that data directly.

However, if the resource isn't available in the cache, the service layer may proceed to query the database. Once the data is retrieved, the service should store a copy in the in-memory database, so that future similar requests can be served directly from the cache—without needing to touch the database at all.
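The read path described above can be sketched roughly like this. Everything here is a stand-in: a plain dictionary plays the role of the in-memory cache, another dictionary plays the database, and the function names are hypothetical. Values are serialized to JSON strings, on the assumption that the cache stores strings the way a Redis string key would.

```python
import json

# Hypothetical stand-ins for the in-memory cache and the real database.
cache = {}
database = {
    "the-perks-of-being-a-wallflower": {
        "title": "The Perks of Being a Wallflower",
        "year": 2012,
    }
}
db_queries = 0  # counts how often we actually touch the database


def query_movie_from_db(slug):
    """Pretend database query; increments a counter so we can observe it."""
    global db_queries
    db_queries += 1
    return database.get(slug)


def get_movie(slug):
    """Check the cache first; on a miss, query the database and store a copy."""
    key = f"movie:{slug}"
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)        # cache hit: no database access
    movie = query_movie_from_db(slug)    # cache miss: go to the database
    if movie is not None:
        cache[key] = json.dumps(movie)   # save a copy for future requests
    return movie


first = get_movie("the-perks-of-being-a-wallflower")   # miss, hits the database
second = get_movie("the-perks-of-being-a-wallflower")  # hit, served from cache
```

After both calls, the database has been queried only once, even though the service answered two requests.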

Inside server-side caching, there are several implementations that developers often use. They are as follows:

Cache Aside (Lazy Loading). This is the simplest one. The server checks the cache first; if it results in a miss, the server retrieves the resource from the database and then saves it in the cache. This is the type of cache we discussed earlier.
Write-Through. In this approach, data is written to both the cache and the database simultaneously. This ensures that the cache is always consistent with the database. But one thing's for sure: this kind of implementation takes more time and uses more storage. So, why not just use the entire cache as a database instead?
Write-Back (Write-Behind). Here, data is saved to the cache first, and then asynchronously written to the database. But what happens if the cache fails before the data is synchronized?
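The difference between the last two strategies is easier to see side by side. Below is a minimal sketch in which dictionaries stand in for the cache and the database. The class names, the dirty set, and the flush method are assumptions for illustration. A real write-back setup would flush in the background on a schedule.

```python
class WriteThroughStore:
    """Every write goes to the database and the cache in the same call."""

    def __init__(self):
        self.cache = {}
        self.database = {}

    def write(self, key, value):
        self.database[key] = value  # durable copy first
        self.cache[key] = value     # then the cache, so both always agree


class WriteBackStore:
    """Writes land in the cache first and reach the database later."""

    def __init__(self):
        self.cache = {}
        self.database = {}
        self.dirty = set()          # keys not yet written to the database

    def write(self, key, value):
        self.cache[key] = value
        self.dirty.add(key)

    def flush(self):
        """Copy pending writes to the database (would normally run async)."""
        for key in sorted(self.dirty):
            self.database[key] = self.cache[key]
        self.dirty.clear()


through = WriteThroughStore()
through.write("movie:1:rating", 4.5)

back = WriteBackStore()
back.write("movie:1:rating", 4.5)
pending = "movie:1:rating" not in back.database  # not in the database yet
back.flush()
```

The window between write and flush is exactly where the risk in write-back lives: if the cache is lost during that window, the pending writes are lost with it.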

What Should We Cache?

Generally, we only need to cache static data or responses that rarely change and, more importantly, are accessed by many users. As mentioned earlier, caching only benefits the server if the resources are accessed frequently and can be served without contacting the database. It's totally fine if we want to implement caching for all the endpoints we have, but the benefit might not feel much different compared to caching only the important ones. Let's say we have a website that displays movie details, like IMDB or Letterboxd. The most important endpoints to cache would be the responses for movie details, since that data rarely changes but is accessed by many users.

Cache Invalidation

There are only two hard things in Computer Science: cache invalidation and naming things.

When we implement caching, there will be a problem when the data saved in the cache is expired or not the same as the database. If we take the previous example of IMDB, we may cache movie details, including the rating. However, if the rating of that movie changes in the database, but not in the cache, then we might deliver wrong data to the clients. This is called data inconsistency.

Of course, we never want that to happen. Cache invalidation is the way out of this problem: it is a technique for removing or refreshing cached data once it is no longer relevant. By doing that, we can be much more confident that the data we send to the client is correct and in sync with the database. However, the implementation isn't as easy as the theory. You may find the reasons why further below.

Redis itself actually provides an expiry time, which we tried earlier using redis-cli. With an expiry time, any cache entry that exceeds the time limit is deleted automatically. This is very useful for caching. However, as a cache invalidation method it is best used only as a last resort, because the cache can become stale, or differ from what we have in the database, long before the expiry time is reached.

To overcome this, we can do some tricks in the services. If there's a request that may alter the data state in the database, such as POST, PUT, PATCH, DELETE, or others, we can manually delete the cache of that particular resource. That is actually the main reason why cache invalidation is hard: it requires patience and accuracy to determine which resources should be invalidated when one piece of data changes. It might not seem that complicated in the current example. However, let's imagine a more complex scenario.

Each movie detail now includes reviews from various users. Each review may also provide a rating, which affects the overall movie rating as well. Now, which actions should invalidate the movie details cache?

Update Movie
Delete Movie
Post Review
Update Review
Delete Review
Update Users
Delete Users

The example above is actually pretty simple, but it's good enough to show how complex cache invalidation could be. And keep in mind, the example only involves two resources: movie details and user reviews.
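Here is a minimal sketch of invalidating on write for the review scenario. The data, the ratings, and the function names are all made up for illustration. The cached movie details include an average rating derived from reviews, so posting a review has to drop the cached entry even though the movie record itself was never edited.

```python
# Hypothetical in-memory "database" and cache.
movies = {
    "the-perks-of-being-a-wallflower": {
        "title": "The Perks of Being a Wallflower",
        "ratings": [4],  # made-up review ratings for the example
    }
}
cache = {}


def movie_key(slug):
    return f"movie:{slug}"


def get_movie_details(slug):
    """Cache-aside read; the response includes a rating derived from reviews."""
    key = movie_key(slug)
    if key not in cache:
        movie = movies[slug]
        ratings = movie["ratings"]
        cache[key] = {
            "title": movie["title"],
            "rating": sum(ratings) / len(ratings),
        }
    return cache[key]


def post_review(slug, rating):
    """Write to the database, then drop the cached details that depend on it."""
    movies[slug]["ratings"].append(rating)
    cache.pop(movie_key(slug), None)  # next read recomputes the rating


slug = "the-perks-of-being-a-wallflower"
rating_before = get_movie_details(slug)["rating"]  # cached from here on
post_review(slug, 2)                               # invalidates the entry
rating_after = get_movie_details(slug)["rating"]   # rebuilt from the database
```

If post_review forgot the cache.pop line, readers would keep seeing the old rating until the entry expired or was evicted, which is the inconsistency described earlier.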

Cache Miss

A cache miss refers to a state where data requested by a component (such as a processor) is not found in the cache, a hardware or software component that stores data for future requests.

If we're implementing the cache-aside method, a cache miss will occur at least once when the request first comes. This is normal and cannot be avoided—thus, it's called a compulsory miss.

This type of miss is unavoidable as it is inherent in the first reference to the data. The only way to eliminate compulsory misses would be to have an infinite prefetch of data, which is not feasible in real-world systems.

So, why do we need to set an expiry time in the cache? Doesn't that just create another cache miss once the data exceeds the expiry time?

Correct. If we use an expiry time, the data inside the cache, even if it's not stale, will be deleted once it exceeds the expiry limit. One reason we do that is to save space (the other, as discussed in the cache invalidation section, is to limit how long stale data can survive). It's very possible that we might run out of storage, especially memory, which is kind of expensive. If we run out of memory space, we'll eventually face capacity misses: requests that miss because the cache is too small to hold everything we would like to keep.

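However, there's a solution we can implement: key eviction. We can use several techniques such as Least Recently Used (LRU), Most Recently Used (MRU), Least Frequently Used (LFU), First-In First-Out (FIFO), and many others. In Redis, this is controlled by the maxmemory-policy setting. The default in current versions is noeviction, which rejects new writes once the memory limit is reached. A common choice for caches is volatile-lru, which evicts the least recently used keys among those that have an expiry time.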
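To illustrate the LRU idea, here is a small sketch built on Python's OrderedDict. The class name LRUCache and the capacity of two are assumptions for the example. This is an exact LRU, whereas Redis uses an approximated LRU based on sampling, so treat it as a picture of the policy rather than of how Redis implements it.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used key."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._items = OrderedDict()  # oldest use first, newest use last

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)  # reading counts as a recent use
        return self._items[key]

    def set(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used

    def keys(self):
        return list(self._items)


lru = LRUCache(capacity=2)
lru.set("movie:a", "A")
lru.set("movie:b", "B")
lru.get("movie:a")       # "movie:a" is now the most recently used key
lru.set("movie:c", "C")  # over capacity: "movie:b" is evicted
remaining = lru.keys()
```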

Where to Implement Caching Logic?

If we're talking about REST API, especially in the context of clean architecture, it usually consists of three main layers: handler, service, and repository. The handler usually does the job of parsing the request, sanitizing input, handling authentication, validation, and so on, before passing the request to the service layer. The service layer handles the business logic. Whatever your business needs to do, you should do it here. Then, the service layer will pass the data to the repository layer. The repository layer usually has only one responsibility—communicating with the database. It will eventually return the data back to the service layer, and the service will pass it again to the handler, which finally delivers it to the client as a response.

So, where should we implement the caching logic?

The handler might seem like a relevant place, since we can stop the request early and directly return the data before it even reaches the service layer. This would certainly improve performance, as the request doesn't need to be processed further. But what if we want to implement logging in our app, which is usually executed at the end of the request? In that case, this is certainly not the best place to put caching logic, since some middleware may be bypassed.

How about the repository layer? Its main function is to communicate with the database, so in a way, it's not that different from a memory database, right? Well, yes and no. If you're not planning to fetch external resources that need to be cached, then the repository layer might seem like a good place. You can even use the raw database query as the cache key, so the cached data would be a direct representation of the DB response.

However, the service layer is also an ideal place. Its main job is to handle business logic—including caching. You can also cache external resources if needed. The cached data can reflect the actual response, so we can return it directly without needing to map or transform it further.

In the end, it all depends on your specific needs. But if I had to choose, I would go with the service layer, as it's the easiest and most flexible place to implement caching.
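As a sketch of the service-layer option, the three layers could look roughly like this. All class names and the log format are hypothetical, and a dictionary stands in for the cache. The point to notice is that the handler still runs its logging on a cache hit, because the short-circuit happens one layer below it.

```python
class MovieRepository:
    """Talks to the 'database' (a dictionary in this sketch)."""

    def __init__(self, rows):
        self._rows = rows
        self.queries = 0

    def find_by_slug(self, slug):
        self.queries += 1
        return self._rows.get(slug)


class MovieService:
    """Business logic, including the caching decision."""

    def __init__(self, repository, cache):
        self._repository = repository
        self._cache = cache

    def get_movie(self, slug):
        key = f"movie:{slug}"
        if key in self._cache:
            return self._cache[key]              # repository is skipped
        movie = self._repository.find_by_slug(slug)
        if movie is not None:
            self._cache[key] = movie
        return movie


class MovieHandler:
    """Parses the request, calls the service, and logs every request."""

    def __init__(self, service, log):
        self._service = service
        self._log = log

    def get(self, slug):
        movie = self._service.get_movie(slug)
        status = 200 if movie is not None else 404
        self._log.append(f"GET /movies/{slug} -> {status}")  # runs on hits too
        return status, movie


request_log = []
repository = MovieRepository({"the-perks-of-being-a-wallflower": {"year": 2012}})
handler = MovieHandler(MovieService(repository, cache={}), request_log)

first = handler.get("the-perks-of-being-a-wallflower")   # miss: one query
second = handler.get("the-perks-of-being-a-wallflower")  # hit: no query
```

Both requests are logged, yet the repository is queried only once.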

Cache Trade-offs

Speed v. Accuracy

Caching will definitely improve overall server performance, since it uses memory instead of accessing the regular database. However, the data we cache can become stale—or simply put, inaccurate. It requires extra effort to handle cache invalidation manually for each resource. The examples shown earlier might seem simple, but things get much harder with more complex data structures.

Memory Usage v. Cache Size

Cache isn't a free space. It needs memory to store its data. That means we must sacrifice part of our server's memory for caching. The more storage we dedicate to cache, the more memory it consumes—this could affect other processes and reduce overall system performance. Not only that, but frequent cache misses and low hit ratios may make caching practically useless.
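To judge whether the memory we give to the cache is paying off, we can track the hit ratio. Here is a rough sketch. The function names and the latency figures (1 ms for a cache read, 20 ms for a database query) are made-up assumptions, and the counters would come from our own metrics or from the keyspace_hits and keyspace_misses fields that Redis reports in INFO.

```python
def hit_ratio(hits: int, misses: int) -> float:
    """Fraction of lookups served from the cache (0.0 if there were none)."""
    total = hits + misses
    return hits / total if total else 0.0


def average_latency_ms(ratio: float, cache_ms: float, db_ms: float) -> float:
    """Expected latency per request, weighting hits and misses by the ratio."""
    return ratio * cache_ms + (1.0 - ratio) * db_ms


# Hypothetical counters: 900 hits and 100 misses.
ratio = hit_ratio(hits=900, misses=100)

# Hypothetical latencies: 1 ms from the cache, 20 ms from the database.
estimate = average_latency_ms(ratio, cache_ms=1.0, db_ms=20.0)
```

With these made-up numbers the estimate comes out at about 2.9 ms per request. With a hit ratio near zero it would stay close to the 20 ms database figure, which is the "practically useless" case described above.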

Data Freshness v. Caching Duration

Storing cache for a long time sounds like a reasonable strategy to increase hit ratio and reduce cache misses. However, it always comes back to cache invalidation. How confident are you that your app handles cache invalidation properly? If there's just one spot where you forget to invalidate the cache, then the system will rely entirely on the expiry time. The longer the expiry time, the longer stale data might live in the cache.

Conclusion

Caching is indeed a brilliant solution to boost overall system performance significantly. It's especially effective for high-traffic endpoints that serve static data. That said, you're still free to implement caching on any endpoint you like. Just remember that you might not feel the benefits as much if the traffic is low or the data changes frequently. There are also some drawbacks worth considering before jumping into caching.

Last but not least, the decision is always yours—whether or not to implement caching. Caching isn't a must if your daily user count is still relatively low. And that's totally fine—explore when you're ready.