Note
It is recommended to read Part 4 first.
Introduction
In this part we will talk about reducing the latency of the response sent by the server. When the server sends a response back to the client, one of the main latency factors is the expensive read/write operations on the database. Reading from disk is always slower than reading from memory. The best strategy to reduce the latency caused by these database operations is caching the data in memory.
Caching
Caching means temporarily storing data that is either frequently accessed or the result of an expensive operation, so that it can be served quickly on subsequent requests. Caching greatly improves the performance of a service by decreasing the latency of data access.

How does a cache work?
The web server calls the cache to retrieve the data. If the data is present in the cache (a cache hit), it is returned. If the data is not present in the cache (a cache miss), it is fetched from the database, saved into the cache first, and then returned. This is called ‘read-through caching,’ because we always read through the cache, even when the data ultimately comes from the database (on a cache miss).
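To make this concrete, here is a minimal read-through sketch in Python. The dictionaries standing in for the cache (think Redis or Memcached) and the database, and the key user:42, are purely illustrative assumptions, not any specific library's API.

```python
# Minimal read-through cache sketch. The dicts stand in for a real cache
# (e.g. Redis) and a real database; the key "user:42" is made up for the demo.
cache = {}
database = {"user:42": {"name": "Alice"}}

def read_through(key):
    value = cache.get(key)
    if value is not None:
        return value                 # cache hit: served straight from memory

    value = database.get(key)        # cache miss: expensive database read
    if value is not None:
        cache[key] = value           # populate the cache for the next request
    return value

print(read_through("user:42"))       # miss -> reads the database, fills the cache
print(read_through("user:42"))       # hit  -> served from the cache
```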

Content Delivery Network (CDN)
A CDN is a network of geographically dispersed servers across multiple locations which is responsible for faster delivery of static content like JavaScript files, images, videos, HTML, etc.
CDN servers cache static content like images and videos and serve it quickly, for example for uninterrupted streaming. CDNs improve the performance of an application by decreasing latency in the following ways (a short sketch follows this list):
- Caching widely requested content on the CDN servers.
- Decreasing the distance between where the content is stored and where it needs to go. The closer the CDN server is to the client making the request, the faster the delivery.
- Compressing files to speed up the transfer.
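To illustrate the caching and compression points, here is a rough sketch of an origin server preparing a static asset so that a CDN (and browsers) can cache it and receive it compressed; the header values and payload are assumptions, not any particular CDN's configuration.

```python
# Sketch of an origin preparing a static asset so that a CDN (and browsers)
# can cache it and receive it compressed. Header values are assumptions,
# not any particular CDN's configuration.
import gzip

def build_static_response(body: bytes):
    compressed = gzip.compress(body)                 # smaller payload on the wire
    headers = {
        "Cache-Control": "public, max-age=86400",    # cacheable for one day
        "Content-Encoding": "gzip",                  # body is gzip-compressed
        "Content-Length": str(len(compressed)),
    }
    return headers, compressed

headers, payload = build_static_response(b"<html>static landing page</html>")
print(headers, len(payload))
```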

One example of a CDN is AWS CloudFront. (Other examples include Azure CDN, Akamai, etc.)
Let’s take the example of YouTube for a second. Every day thousands of new videos are uploaded to YouTube. It stores these videos in a cloud object store on Google’s infrastructure (built on systems like the Google File System). These videos are static content: once uploaded they are not modified, only deleted.
Similarly, Instagram stores videos and images, which are all static content. A viral video with millions of views needs to be served to as many users as possible, so a CDN caches it and quickly serves it to every requesting device.
In case a CDN does not have the content, then, very much like a cache miss, the content is fetched from its origin (where it is actually stored, such as Google Cloud Storage or Amazon S3).
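Conceptually the edge behaves like the read-through cache shown earlier, with the origin store in place of the database. Here is a minimal origin-pull sketch; the origin_fetch stand-in and the one-hour TTL are illustrative assumptions.

```python
# Sketch of "origin pull" at a CDN edge: serve from the edge cache when
# possible, otherwise fetch from the origin store and cache it with a TTL.
# origin_fetch and the one-hour TTL are illustrative assumptions.
import time

edge_cache = {}          # {path: (expires_at, content)}
TTL_SECONDS = 3600

def origin_fetch(path):
    # Stand-in for an HTTP request to the origin (e.g. Google Cloud Storage or Amazon S3).
    return f"<bytes of {path}>"

def serve_from_edge(path):
    entry = edge_cache.get(path)
    if entry and entry[0] > time.time():
        return entry[1]                              # edge hit: no trip to the origin
    content = origin_fetch(path)                     # edge miss: pull from the origin
    edge_cache[path] = (time.time() + TTL_SECONDS, content)
    return content

print(serve_from_edge("/videos/viral.mp4"))          # miss -> origin pull
print(serve_from_edge("/videos/viral.mp4"))          # hit  -> served from the edge
```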
CDNs also improve reliability. In case of outages, network congestion, a server going down, etc., CDNs can still serve the cached content seamlessly, without the users ever noticing.
Note that the web servers do not store static content; it is fetched and served by the CDNs for faster delivery.
Finally, it looks like...
Now let’s see how our design architecture looks with the cache and the CDN in the picture:

In the above diagram:
- The user enters the website amazinglyme.com and presses enter.
- The request goes to the DNS to resolve the domain name to an IP address.
- DNS might resolve to the IP of either the load balancer or the CDN.
- Which one depends on the use case and how the architecture is designed. In some cases all the traffic goes through the CDN and then on to the load balancer; in other cases, depending on how DNS is configured, the IP of either the CDN or the load balancer is returned directly.
- Eventually the request reaches the load balancer, either directly from the client or via the CDN.
- The load balancer receives the request and, based on how it is configured, forwards it to server N (a round-robin sketch follows this list).
- The web server receives the request and first tries to retrieve the data from the cache. If the data is not present in the cache, it reads from the database, and then sends the response back to the client via the load balancer.
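For the load-balancing step, here is a minimal round-robin routing sketch, one of several possible strategies; the server addresses are made up for illustration, and a real load balancer would also handle health checks, retries, and so on.

```python
# Minimal round-robin load-balancer sketch for the routing step above.
# The server addresses are made up; a real load balancer would also handle
# health checks, retries, sticky sessions, etc.
import itertools

servers = ["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"]
rotation = itertools.cycle(servers)      # endless round-robin over the servers

def route(request_id):
    target = next(rotation)              # pick the next server in turn
    return f"request {request_id} -> server {target}"

for i in range(5):
    print(route(i))
```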
Next, let’s see how we can enhance the user experience by storing the user session, comparing stateful vs. stateless architecture. See you in Part 6.