A cloud to on-premise journey by Dukaan

Nov 15, 2023

In the previous two articles, I covered two interesting topics:

Single Cloud to Multi-Cloud
Cloud journey adopted by companies
1. Using fully cloud-managed (PaaS) services
2. Self-managed services using the cloud just as an IaaS

Today, I bring you the third article in this series, about Dukaan's migration from the cloud back to on-premise (for the most part).

Disclaimer: This post uses facts presented by Dukaan’s CTO in multiple videos (see references section) available on YouTube. This aims to help readers understand how to make technology choices based on their business requirements. Thus, this is in no way a means to point out the correctness or incorrectness of something. This is a summary and review post where I want to add my thoughts on top of Dukaan’s view.

What is Dukaan?

Dukaan helps businesses create online stores for desktop and mobile. Their mission “is to reshape the digital retail landscape by defining the future of commerce.” They compete with the likes of Shopify but aim to provide better performance at a much lower cost as they cater to developing countries.

What is Dukaan’s overall approach to Technology?

Solving for low latency page load across the world: Dukaan aims to have a higher level of control over its content delivery network (CDN) and origin servers. They use an anycast network to reduce page load time across the world. A significant portion of their mobile traffic comes from in-app browser page loads, where users quickly drop off if the page does not load fast enough. If you do not believe me, see this article about how a 100ms increase in page load time costs a significant portion of sales.
Being operationally nimble: In their opinion, the IaaS services of the cloud offer significantly worse cost-performance and support too few regions to deliver a Time to First Byte (TTFB) of less than 50ms from anywhere in the world. For example, AWS EBS has lower throughput and IOPS limits than native SSDs because it replicates data to multiple servers in an availability zone to ensure high availability. In a container world, however, a case could be made for running the web applications on bare metal servers with directly attached NVMe SSDs, since the persistent database runs separately from the web applications that use it.
1. Overall, they saw significant cost savings by moving the majority of their workload to on-premise data centers (they claim a cost reduction of over 90%, but I am not sure if that accounts for everything). They rely on AWS RDS only for disaster recovery of their Postgres database, which replicates to AWS RDS in real time.
2. This ensures that they can rely on AWS RDS’s robust service for backup and recovery while still taking risks with their self-managed database. They claim a recovery time objective (RTO) of under 10 minutes, which is very impressive.
Use the best of all services: They do seem to use many cloud providers and services for various purposes.
1. AWS RDS for disaster recovery
2. Google Storage as an origin server for static content
3. Various CDN providers
4. On-premise and fully managed data centers

My View

It does seem like they have cracked the game of low cost, operational nimbleness, and low global latency.

They rely on a large bare metal server in their chosen fully managed private data center as opposed to using public cloud providers for IaaS. I think the reason could be the limited number of regions supported by each cloud provider.
If you want low latency in all parts of a country, you literally need to cover more ground. For example, AWS has two regions in India: Mumbai and Hyderabad. A general rule of thumb is that every 100 km of network distance adds about 1 ms of latency.
- India - North to South has a round trip of ~6,000 km, which adds ~60ms from distance alone.
- India - East to West has a round trip of ~6,000 km, which adds ~60ms from distance alone.
- Thus, using just the Mumbai and Hyderabad data centers for user access everywhere in India could add significant latency, making it unlikely that you can serve dynamic content in under 50ms.
They need to be extremely cost-efficient if they want to win in India and other developing nations.
They make ~50 deployments a day through a fully automated CI/CD process, using Kubernetes across the board.
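
To make the distance arithmetic above concrete, here is a minimal Python sketch. It assumes the rule of thumb quoted in this post (about 1 ms per 100 km of network distance) and the 50ms budget mentioned earlier. The function names are my own, and the numbers are rough estimates, not measurements from Dukaan or AWS.

```python
# Rule of thumb quoted in this post: ~1 ms of latency per 100 km of
# network distance. This is an assumption, not a measured value.
MS_PER_100_KM = 1.0


def distance_overhead_ms(round_trip_km: float) -> float:
    """Estimate the latency added by distance alone for a round-trip path."""
    return round_trip_km / 100.0 * MS_PER_100_KM


def fits_budget(round_trip_km: float, budget_ms: float = 50.0) -> bool:
    """Return True if the distance overhead alone stays under the budget."""
    return distance_overhead_ms(round_trip_km) < budget_ms


def max_round_trip_km(budget_ms: float = 50.0) -> float:
    """Longest round-trip path whose distance overhead uses the whole budget."""
    return budget_ms / MS_PER_100_KM * 100.0


if __name__ == "__main__":
    # Approximate round-trip distances used in this post.
    routes = {"India North to South": 6000, "India East to West": 6000}
    for name, km in routes.items():
        overhead = distance_overhead_ms(km)
        print(f"{name}: ~{overhead:.0f} ms overhead, "
              f"fits 50 ms budget: {fits_budget(km)}")
    print(f"Max round trip for a 50 ms budget: ~{max_round_trip_km():.0f} km")
```

Under these assumptions, a 50ms budget is used up entirely by a ~5,000 km round trip, before the server spends any time generating the page. That is why serving a whole country from one or two locations makes sub-50ms dynamic content hard.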

While I think we should take their claims with a pinch of salt, there is a case to be made for tailoring your engineering choices to your business model and economics. If an organization's engineering choices align with its unit economics and business expectations, it is on a path to success. Thus, the takeaway is to be clear about your business needs and make conscious decisions even when they challenge the status quo.

References

Podcasts from Dukaan’s CTO
1. October 2023
2. August 2022
Dukaan - https://mydukaan.io/about-us
Observability at Dukaan - https://last9.io/blog/how-last9-won-dukaan-over/

The Never Ending Sprint
