Your website amazinglyme.com is now getting a lot of traffic. You recently added a new feature: when a user signs up with an email address, you send them a ‘Welcome’ email. To achieve this, you registered with a third-party notification service. Every time a user signs up, your main service calls the third-party email notification API (let’s call it the 3p-notify API). See below –
This feature worked well until traffic on your website suddenly spiked to almost 10k requests per second, while the 3p-notify API can handle only 2k requests per second. In other words, around 10k users are signing up every second, but the third party supports sending only 2k emails per second. Because of this bottleneck, your server crashes and your website goes down. Oh, snap!
What’s the root-cause?
The main problem is that our main service calls the 3p-notify API ‘synchronously’ and waits for the response from the third party. Our service therefore depends on the third party to complete before it can proceed with further execution. When the third party cannot keep up, requests waiting on it pile up inside our service. So if the third party is down or slow, it breaks our execution as well. See below –
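In code, the synchronous dependency looks roughly like the sketch below. The names call_3p_notify, sign_up_sync and ThirdPartyDown are made up for illustration, and the real API call is replaced by a stand-in function, so treat this as a picture of the coupling rather than the actual service code.

```python
class ThirdPartyDown(Exception):
    """Raised by the stand-in 3p-notify call when the provider is unavailable."""


def call_3p_notify(email: str, provider_up: bool) -> str:
    # Hypothetical stand-in for the real network call to the 3p-notify API.
    if not provider_up:
        raise ThirdPartyDown("3p-notify API unavailable")
    return f"welcome email sent to {email}"


def sign_up_sync(email: str, provider_up: bool) -> str:
    # Step 1: the tier-1 work (creating the account) succeeds on its own.
    account = f"account created for {email}"
    # Step 2: the handler blocks here until the third party answers.
    # If the call fails, the exception propagates and the whole sign-up fails,
    # even though the account itself was fine.
    call_3p_notify(email, provider_up)
    return account
```

With provider_up=False the sign-up raises ThirdPartyDown, which is exactly the coupling we want to get rid of.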
What to do?
After some discussion, you realise that sending the welcome notification is not the primary goal of your main service, i.e. the email notification service is not Tier-1.
Tier-1 services are those which fulfil the main purpose of an application. They are critical for the business: if they go down, customer experience suffers significantly and the company loses money. For example, the main goal of any online payment application (like PhonePe or GooglePay) is to ensure that users can seamlessly perform online transactions. Thus, the transaction processing service is a tier-1 service. On the other hand, sending exclusive offers or discount emails to customers is still relevant but not tier-1. Read more about various tiers of an application.
Now that we know email notification is not tier-1, we can afford to have signed-up users receive the ‘Welcome’ email with some delay.
If we can find a way to cut the dependency on the third party and call it asynchronously, that would solve our problem. Hence, we extract the logic of calling the 3p-notify API into a micro-service (let’s call it amazingme-notification-service).
Messages to the Res’Queue’
Our plan is to decouple our main service from the amazingme-notification-service that calls the 3p-notify API.
Now, if we can have a temporary data store where we ‘publish or push’ the signed-up user’s email data as a request, and our amazingme-notification-service ‘asynchronously’ picks up that data and calls the 3p-notify API without affecting the performance of our main service, that’d be great!
That’s exactly what a message queue does.
A message queue is a component, usually durable, that stores requests in a buffer and supports asynchronous communication between two services. The basic architecture is simple. Input services (called producers) publish data (messages or requests) into the message queue. Other services (called consumers) consume the data from the message queue and perform the desired action with each message. See a standard message queue –
image from – https://www.cloudamqp.com/blog/what-is-message-queuing.html
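The producer and consumer roles can be sketched in a few lines of Python. This is only an illustration: the in-process queue.Queue from the standard library stands in for a real message broker, the consumer appends to a list instead of calling the 3p-notify API, and the names producer, consumer, run_demo and SENTINEL are made up for this example.

```python
import queue
import threading

SENTINEL = None  # special message that tells the consumer to stop


def producer(q: queue.Queue, emails: list) -> None:
    # The main service only publishes the message and moves on.
    for email in emails:
        q.put(email)
    q.put(SENTINEL)


def consumer(q: queue.Queue, sent: list) -> None:
    # The notification service picks up messages at its own pace.
    while True:
        email = q.get()  # blocks until a message is available
        if email is SENTINEL:
            break
        sent.append(f"welcome:{email}")  # stand-in for calling 3p-notify


def run_demo(emails: list) -> list:
    q = queue.Queue()
    sent = []
    worker = threading.Thread(target=consumer, args=(q, sent))
    worker.start()
    producer(q, emails)
    worker.join()
    return sent
```

Calling run_demo(["a@example.com", "b@example.com"]) returns the two welcome entries in the order they were published, because a single consumer reads the queue first-in, first-out.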
See how we implement the solution below –
Phew! Message queues saved the day.
Using a message queue, we have solved our bottleneck issue. Users who sign up will now receive the email notification without the main site going down. The notification will be delayed, because each message waits in the queue until amazingme-notification-service picks it up. Under normal load that is a small added latency, but during a traffic spike the delay grows with the backlog.
The key thing to understand is that the message queue acts as a buffer: amazingme-notification-service can pick requests from the queue one by one and let the 3p-notify API finish each call, instead of the API being bombarded with all the incoming requests at once.
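To get a feel for the buffering, here is a small back-of-the-envelope simulation using the numbers from our story (10k sign-ups per second coming in, 2k emails per second going out). The function name simulate_backlog and the per-second model are simplifications assumed for this example, not measurements from a real system.

```python
def simulate_backlog(arrivals_per_sec: list, capacity_per_sec: int) -> list:
    """Return the number of messages left in the queue after each second."""
    backlog = 0
    history = []
    for arrivals in arrivals_per_sec:
        backlog += arrivals  # producers publish into the queue
        processed = min(backlog, capacity_per_sec)  # consumer drains at most its capacity
        backlog -= processed
        history.append(backlog)
    return history


# A 3-second spike of 10k sign-ups per second, followed by 12 quiet seconds.
history = simulate_backlog([10_000] * 3 + [0] * 12, capacity_per_sec=2_000)
print(history[:3])   # backlog while the spike lasts
print(history[-1])   # backlog once the queue has drained
```

During the spike the backlog grows by 8,000 messages every second (8,000, then 16,000, then 24,000), and it takes another 12 seconds at 2k per second to drain. The main site stays up the whole time, but the last users of the spike get their email several seconds late.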
Benefits of Message Queues
Decoupling – As we saw in our example, we can decouple tier-1 services from tier-2 or tier-N services and still have them communicate asynchronously.
Resiliency – Message queues are often used in “backup and retry” mechanisms. Sometimes an online transaction goes into a pending or uncertain state. Such transactions are put in a queue, and a retry service on the other end consumes each transaction and confirms with the third party (Mastercard or Visa) whether the transaction succeeded or failed. If it failed, the refund process is initiated.
Batch processing – Another use case of message queues is batch processing. Sometimes a service needs to perform an operation on a large amount of data. It is much more efficient to insert 1000 records into a database in one bulk load than to insert 1 record at a time, 1000 times. A message queue helps us achieve that: records accumulate in the queue and a consumer processes them in batches.
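The retry and batching ideas above can be sketched as follows. Everything here is illustrative: retry_pending and drain_in_batches are made-up names, check_status is a hypothetical callback standing in for the lookup with the card network, and a plain in-process queue replaces a real broker.

```python
import queue
from collections import deque


def retry_pending(pending_ids, check_status, max_attempts=3):
    """Re-check pending transactions until each is resolved or attempts run out."""
    work = deque((txn_id, 1) for txn_id in pending_ids)
    confirmed, to_refund, unresolved = [], [], []
    while work:
        txn_id, attempt = work.popleft()
        status = check_status(txn_id, attempt)  # hypothetical third-party lookup
        if status == "success":
            confirmed.append(txn_id)
        elif status == "failed":
            to_refund.append(txn_id)  # the refund process would start here
        elif attempt < max_attempts:
            work.append((txn_id, attempt + 1))  # still pending: put it back in the queue
        else:
            unresolved.append(txn_id)  # assumption: hand over for manual follow-up
    return confirmed, to_refund, unresolved


def drain_in_batches(q: queue.Queue, batch_size: int) -> list:
    """Group everything currently in the queue into lists of at most batch_size."""
    batches, batch = [], []
    while True:
        try:
            batch.append(q.get_nowait())
        except queue.Empty:
            break  # nothing left to consume right now
        if len(batch) == batch_size:
            batches.append(batch)  # each full batch would become one bulk insert
            batch = []
    if batch:
        batches.append(batch)  # final, partially filled batch
    return batches
```

In a real system the retry service would usually wait between attempts, and each batch would be written to the database with a single bulk statement. Both details are left out to keep the sketch short.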