WhatsApp Architecture Case Study

How WhatsApp handles 100+ billion messages daily with remarkable efficiency.

Architecture Overview

WhatsApp is known for an efficient architecture that handles massive scale with a small engineering team: roughly 50 engineers and 550 servers in 2014 (see Performance Metrics).

┌─────────────────────────────────────────────────────────────┐
│                       Mobile Clients                        │
│                (iOS, Android, Web, Desktop)                 │
└──────────────────────────────┬──────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│                       Load Balancers                        │
│                  (Geographic Distribution)                  │
└──────────────────────────────┬──────────────────────────────┘
                               │
         ┌─────────────────────┼─────────────────────┐
         │                     │                     │
         ▼                     ▼                     ▼
┌─────────────────┐   ┌─────────────────┐   ┌─────────────────┐
│   Connection    │   │     Message     │   │  Media Storage  │
│     Servers     │   │     Routing     │   │    (S3/CDN)     │
│  (XMPP/Noise)   │   │     Servers     │   │                 │
└─────────────────┘   └─────────────────┘   └─────────────────┘

Core Technology Stack

Erlang/OTP

WhatsApp's backend is primarily built on Erlang, chosen for:

  1. Concurrency: Lightweight processes (millions per server)
  2. Fault Tolerance: "Let it crash" philosophy
  3. Hot Code Swapping: Update without downtime
  4. Distributed Computing: Built-in distribution
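
%% Example: one lightweight Erlang process acting as a message handler
-module(message_handler).
-export([start/0]).

%% Spawn the handler process and return its pid.
start() ->
    spawn(fun loop/0).

%% Receive loop: route each message, then wait for the next one.
loop() ->
    receive
        {send, Message, To} ->
            route_message(Message, To),
            loop();
        stop ->
            ok
    end.

%% Placeholder routing: forward the message to the recipient's process.
route_message(Message, To) ->
    To ! {message, Message}.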
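
The "let it crash" idea can also be illustrated outside Erlang. The following is a minimal Python sketch, not OTP's supervisor behaviour: `supervise`, `make_flaky_worker`, and `max_restarts` are hypothetical names chosen for illustration. The worker contains no defensive error handling; a separate supervisor restarts it when it fails and gives up once the restart budget is spent.

```python
from typing import Callable


def supervise(worker: Callable[[], str], max_restarts: int = 3) -> tuple[str, int]:
    """Run worker; if it raises, restart it up to max_restarts times.

    Returns (result, restarts_used). Re-raises once the restart budget is
    spent, which mirrors a supervisor giving up and escalating the failure.
    """
    restarts = 0
    while True:
        try:
            return worker(), restarts
        except Exception:
            if restarts >= max_restarts:
                raise
            restarts += 1


def make_flaky_worker(failures_before_success: int) -> Callable[[], str]:
    """Build a worker that crashes a fixed number of times, then succeeds."""
    state = {"remaining": failures_before_success}

    def worker() -> str:
        if state["remaining"] > 0:
            state["remaining"] -= 1
            raise RuntimeError("worker crashed")
        return "delivered"

    return worker


result, restarts = supervise(make_flaky_worker(2))
print(result, restarts)
```

Keeping recovery in the supervisor rather than in the worker is the design point: the worker stays small, and the restart policy lives in one place.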

FreeBSD Operating System

  • Highly tuned for networking
  • Better performance than Linux for their workload
  • Custom kernel optimizations

Key Components

1. Connection Management

  • Protocol: Custom protocol based on XMPP (simplified)
  • Encryption: Signal Protocol (end-to-end)
  • Connections: Long-lived TCP connections
  • Compression: Efficient binary protocol
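
To make the benefit of a binary protocol concrete, here is a small sketch that packs a text message into a fixed header plus body and compares its size with an equivalent JSON encoding. The frame layout, `encode_frame`, `decode_frame`, and the field sizes are assumptions made for illustration; this is not WhatsApp's actual wire format.

```python
import json
import struct

# Hypothetical frame layout (an assumption for illustration only):
#   1 byte message type | 8 bytes recipient id | 2 bytes body length | body
HEADER = struct.Struct("!BQH")
TYPE_TEXT = 1


def encode_frame(recipient_id: int, body: str) -> bytes:
    """Pack one text message; the body is limited to 65535 bytes by the 2-byte length."""
    payload = body.encode("utf-8")
    return HEADER.pack(TYPE_TEXT, recipient_id, len(payload)) + payload


def decode_frame(frame: bytes) -> tuple[int, int, str]:
    """Return (message_type, recipient_id, body) from a single frame."""
    msg_type, recipient_id, length = HEADER.unpack_from(frame)
    body = frame[HEADER.size:HEADER.size + length].decode("utf-8")
    return msg_type, recipient_id, body


frame = encode_frame(15551234567, "hi")
as_json = json.dumps({"type": "text", "to": 15551234567, "body": "hi"}).encode("utf-8")
print(len(frame), len(as_json))
```

For short messages the fixed-size header dominates, so the saving over a text encoding with field names is largest exactly where chat traffic is most common.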

2. Message Flow

┌──────────┐    ┌──────────┐    ┌──────────┐    ┌──────────┐
│  Sender  │───▶│  Server  │───▶│  Server  │───▶│ Receiver │
│  Client  │    │  (Home)  │    │  (Dest)  │    │  Client  │
└──────────┘    └──────────┘    └──────────┘    └──────────┘
                     │
                     ▼
              ┌──────────────┐
              │ Mnesia/MySQL │
              │  (Offline)   │
              └──────────────┘

Message States:

  • Single checkmark: Delivered to server
  • Double checkmark: Delivered to recipient
  • Blue checkmarks: Read by recipient
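
The checkmark states form a simple forward-only progression. The sketch below models that as an ordered enum; the `PENDING` state and the `advance` function are assumptions added for illustration and are not taken from WhatsApp's client code.

```python
from enum import IntEnum


class MessageState(IntEnum):
    PENDING = 0      # assumed initial state: not yet acknowledged by the server
    SENT = 1         # single checkmark: delivered to server
    DELIVERED = 2    # double checkmark: delivered to recipient
    READ = 3         # blue checkmarks: read by recipient


def advance(current: MessageState, ack: MessageState) -> MessageState:
    """Apply an acknowledgement; states only move forward.

    Acknowledgements can arrive late or out of order, so a stale one is ignored.
    """
    return max(current, ack)


state = MessageState.PENDING
for ack in (MessageState.SENT, MessageState.READ, MessageState.DELIVERED):
    state = advance(state, ack)
print(state.name)
```

Treating the states as ordered values means a late "delivered" acknowledgement cannot downgrade a message that is already marked as read.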

3. Data Storage

Component            Storage           Purpose
Messages (offline)   Mnesia → MySQL    Store until delivered
User profiles        MySQL             Account data
Media files          Amazon S3         Images, videos, documents
Keys                 Local device      End-to-end encryption keys
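
The "store until delivered" behaviour for offline messages can be sketched as a per-recipient queue that drops an entry as soon as the recipient acknowledges it. `OfflineStore` and its methods are hypothetical; an in-memory dictionary stands in for the real storage layer.

```python
from collections import defaultdict, deque


class OfflineStore:
    """In-memory stand-in for the offline message store (hypothetical API)."""

    def __init__(self) -> None:
        self._pending = defaultdict(deque)   # recipient -> deque of (id, body)
        self._next_id = 0

    def enqueue(self, recipient: str, body: str) -> int:
        """Hold a message for a recipient who is currently offline."""
        self._next_id += 1
        self._pending[recipient].append((self._next_id, body))
        return self._next_id

    def pending_for(self, recipient: str) -> list:
        """Messages to push when the recipient reconnects, oldest first."""
        return list(self._pending[recipient])

    def ack(self, recipient: str, message_id: int) -> None:
        """Delete a message once the recipient confirms receipt."""
        queue = self._pending[recipient]
        self._pending[recipient] = deque(m for m in queue if m[0] != message_id)


store = OfflineStore()
first = store.enqueue("bob", "hello")
store.enqueue("bob", "are you there?")
store.ack("bob", first)
print(store.pending_for("bob"))
```

Deleting on acknowledgement rather than on send keeps the server's storage bounded by the backlog of undelivered messages, not by total message history.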

4. Media Handling

┌────────────┐    ┌────────────┐    ┌────────────┐
│   Client   │───▶│   Upload   │───▶│    S3      │
│  Uploads   │    │   Server   │    │  Storage   │
└────────────┘    └────────────┘    └────────────┘
                                          │
                                          ▼
┌────────────┐    ┌────────────┐    ┌────────────┐
│   Client   │◀───│    CDN     │◀───│  Generate  │
│  Downloads │    │            │    │    URL     │
└────────────┘    └────────────┘    └────────────┘
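
The flow in the diagram above can be sketched as follows: the media bytes go to blob storage, and the chat message itself carries only a URL. `MediaStore`, the example CDN host, and the choice to key blobs by their SHA-256 digest are all assumptions for illustration; a dictionary stands in for S3 and the CDN.

```python
import hashlib


class MediaStore:
    """In-memory stand-in for blob storage plus CDN (hypothetical API)."""

    def __init__(self, cdn_base: str) -> None:
        self._blobs = {}                      # digest -> bytes
        self._cdn_base = cdn_base.rstrip("/")

    def upload(self, data: bytes) -> str:
        """Store the blob under its SHA-256 digest and return a download URL."""
        key = hashlib.sha256(data).hexdigest()
        self._blobs[key] = data               # identical uploads share one copy
        return f"{self._cdn_base}/{key}"

    def download(self, url: str) -> bytes:
        """Resolve a URL produced by upload() back to the stored bytes."""
        key = url.rsplit("/", 1)[-1]
        return self._blobs[key]


media = MediaStore("https://cdn.example.invalid/media")
url = media.upload(b"...image bytes...")
message = {"to": "bob", "type": "image", "url": url}   # the message carries only the URL
print(media.download(message["url"]) == b"...image bytes...")
```

Separating the media path from the message path keeps the messaging servers free of large payloads, which fits the small per-message footprint described elsewhere in this article.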

Scalability Strategies

1. Server Efficiency

  • 2 million connections per server (Erlang's strength)
  • Custom memory management
  • Optimized garbage collection

2. Database Optimization

  • Read replicas for scaling reads
  • Sharding by user ID
  • Minimal data storage (messages deleted after delivery)
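
Sharding by user ID usually comes down to a stable mapping from ID to shard. A minimal sketch, assuming string user IDs and a fixed shard count (`shard_for` and the value 16 are illustrative, not WhatsApp's scheme):

```python
import hashlib


def shard_for(user_id: str, shard_count: int) -> int:
    """Map a user ID to a shard index with a stable hash.

    hashlib is used instead of the built-in hash(), which is salted per
    process for strings and could route the same user to different shards.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


shards = [shard_for(f"user-{n}", 16) for n in range(10_000)]
counts = [shards.count(i) for i in range(16)]
print(min(counts), max(counts))
```

Plain modulo hashing moves most users when the shard count changes; a consistent-hashing ring is a common alternative when shards are added often.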

3. Caching

┌─────────────┐     ┌─────────────┐
│   Request   │────▶│  Memcached  │ (Hit: Return)
└─────────────┘     └──────┬──────┘
                           │ (Miss)
                           ▼
                    ┌─────────────┐
                    │    MySQL    │
                    └─────────────┘
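
The hit and miss paths in the diagram above correspond to the cache-aside pattern. In this sketch plain dictionaries stand in for Memcached and MySQL, and `CacheAside` with its methods is a hypothetical helper rather than a real client library.

```python
from typing import Callable, Optional


class CacheAside:
    """Read-through helper: dicts stand in for Memcached and MySQL."""

    def __init__(self, load_from_db: Callable[[str], Optional[dict]]) -> None:
        self._cache = {}
        self._load_from_db = load_from_db
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> Optional[dict]:
        if key in self._cache:            # hit: return without touching the database
            self.hits += 1
            return self._cache[key]
        self.misses += 1                  # miss: fall through to the database
        row = self._load_from_db(key)
        if row is not None:
            self._cache[key] = row        # populate so the next read is a hit
        return row

    def invalidate(self, key: str) -> None:
        """Call after a write so readers do not see a stale row."""
        self._cache.pop(key, None)


database = {"alice": {"name": "Alice", "status": "Available"}}
profiles = CacheAside(database.get)
profiles.get("alice")
profiles.get("alice")
print(profiles.hits, profiles.misses)
```

The first read misses and loads from the database; the second is served from the cache. Invalidating on write is the simplest consistency policy and is assumed here for brevity.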

End-to-End Encryption

Signal Protocol Implementation

┌──────────────────────────────────────────────────────┐
│                 Key Exchange                         │
├──────────────────────────────────────────────────────┤
│  1. Identity Key (long-term)                         │
│  2. Signed Pre-Key (medium-term)                     │
│  3. One-Time Pre-Keys (single use)                   │
└──────────────────────────────────────────────────────┘
                          │
                          ▼
┌──────────────────────────────────────────────────────┐
│              Double Ratchet Algorithm                │
├──────────────────────────────────────────────────────┤
│  - Forward secrecy                                   │
│  - Break-in recovery                                 │
│  - Per-message keys                                  │
└──────────────────────────────────────────────────────┘
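
Per-message keys and forward secrecy both come from a one-way key derivation chain. The sketch below shows only that symmetric chain, using HMAC-SHA256 with two distinct constants as the derivation inputs. It leaves out the Diffie-Hellman ratchet that provides break-in recovery, uses a placeholder starting secret, and is meant for understanding rather than for use in real systems; `ratchet_step` is an illustrative name.

```python
import hashlib
import hmac


def ratchet_step(chain_key: bytes) -> tuple[bytes, bytes]:
    """Advance a symmetric KDF chain by one step.

    Returns (next_chain_key, message_key). Both are HMAC-SHA256 outputs keyed
    with the current chain key, using different constant inputs so that one
    output cannot be computed from the other.
    """
    message_key = hmac.new(chain_key, b"\x01", hashlib.sha256).digest()
    next_chain_key = hmac.new(chain_key, b"\x02", hashlib.sha256).digest()
    return next_chain_key, message_key


# Placeholder starting value; in the real protocol this comes from the key exchange.
chain_key = hashlib.sha256(b"demo shared secret").digest()

message_keys = []
for _ in range(3):
    chain_key, message_key = ratchet_step(chain_key)   # the old chain key is discarded
    message_keys.append(message_key)

print(len(set(message_keys)))
```

Because HMAC is one-way and each old chain key is overwritten, someone who obtains the current chain key cannot work backwards to the keys of earlier messages.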

Group Messaging

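  • Sender Keys for efficiency: each message is encrypted once for the whole group
  • Each member has their own sender key, shared with the other members over pairwise encrypted sessions
  • Server cannot decrypt messages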
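
A rough count of encryption operations shows why Sender Keys are more efficient than encrypting every group message separately for each member. The two functions below are a simplified cost model with hypothetical names; they ignore re-keying when members leave and count only the sender's work.

```python
def pairwise_encryptions(group_size: int, messages: int) -> int:
    """Every message is encrypted separately for each other member."""
    return messages * (group_size - 1)


def sender_key_encryptions(group_size: int, messages: int) -> int:
    """The sender key is shared once over pairwise sessions, then each
    message is encrypted a single time for the whole group."""
    key_distribution = group_size - 1
    return key_distribution + messages


print(pairwise_encryptions(50, 100), sender_key_encryptions(50, 100))
```

For a 50-member group and 100 messages this prints 4900 and 149: the one-time cost of distributing the key is quickly outweighed by the per-message saving.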

Performance Metrics

Metric                  Value
Daily Messages          100+ billion
Monthly Active Users    2+ billion
Engineers (2014)        ~50
Servers (2014)          ~550
Messages/second         1+ million
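
The daily and per-second figures above are consistent with each other, which a quick calculation confirms. The inputs are the lower bounds from the table; the per-user figure is only a rough average, since it divides by monthly rather than daily active users.

```python
SECONDS_PER_DAY = 24 * 60 * 60

daily_messages = 100e9          # "100+ billion" from the table
monthly_active_users = 2e9      # "2+ billion" from the table

average_per_second = daily_messages / SECONDS_PER_DAY
per_user_per_day = daily_messages / monthly_active_users

print(f"{average_per_second:,.0f} messages/second on average")
print(f"{per_user_per_day:.0f} messages per user per day")
```

This gives about 1.16 million messages per second on average and about 50 messages per user per day. Peak load will be higher than the average, so capacity planning needs headroom beyond these numbers.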

Design Principles

1. Simplicity

  • Focus on core messaging functionality
  • Minimal features, maximum reliability
  • Simple user experience

2. Efficiency

  • Binary protocol (not JSON/XML)
  • Minimal server storage
  • Optimized network usage

3. Privacy

  • End-to-end encryption by default
  • Minimal data collection
  • Messages not retained on servers after delivery

4. Reliability

  • Messages held until delivery is confirmed
  • Offline message queuing
  • Automatic reconnection
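
Automatic reconnection is commonly implemented with exponential backoff plus jitter, so that many clients that lose their connection at the same moment do not all reconnect at once. This is a generic sketch of that technique; `backoff_delays`, the 1-second base, and the 60-second cap are assumptions, not WhatsApp's actual client settings.

```python
import random
from typing import Optional


def backoff_delays(attempts: int, base: float = 1.0, cap: float = 60.0,
                   rng: Optional[random.Random] = None) -> list:
    """Full-jitter exponential backoff.

    The delay before attempt n is drawn uniformly from
    [0, min(cap, base * 2**n)], so the upper bound doubles until it hits cap.
    """
    if rng is None:
        rng = random.Random()
    return [rng.uniform(0.0, min(cap, base * 2 ** n)) for n in range(attempts)]


delays = backoff_delays(8, rng=random.Random(42))
print([round(d, 2) for d in delays])
```

Passing a seeded `random.Random` makes the schedule reproducible in tests, while production code would use the default unseeded generator.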

Lessons for Architects

1. Choose the Right Technology

Erlang was perfect for WhatsApp's needs:

  • Concurrent connections
  • Fault tolerance
  • Low latency

2. Optimize Ruthlessly

  • Every byte counts
  • Profile and measure
  • Custom solutions when needed

3. Keep It Simple

  • Fewer features, done well
  • Minimal dependencies
  • Clear architecture

4. Plan for Scale

  • Design for millions from day one
  • Horizontal scaling capability
  • Efficient resource usage

C# Equivalent Patterns

Connection Handling (SignalR)

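using System.Threading.Tasks;
using Microsoft.AspNetCore.SignalR;

public class ChatHub : Hub
{
    // Push a message to every connection that belongs to the target user.
    public async Task SendMessage(string user, string message)
    {
        await Clients.User(user).SendAsync("ReceiveMessage", message);
    }

    // Track presence by adding each new connection to an "Online" group.
    public override async Task OnConnectedAsync()
    {
        await Groups.AddToGroupAsync(Context.ConnectionId, "Online");
        await base.OnConnectedAsync();
    }
}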

Message Queue Pattern

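using System.Threading.Tasks;

// Message, IMessageQueue, and IPresence are illustrative placeholders.
public record Message(string RecipientId, string Body);
public interface IMessageQueue { Task EnqueueForDelivery(Message message); }
public interface IPresence
{
    Task<bool> IsUserOnline(string userId);
    Task DeliverDirectly(Message message);
}

public class MessageService
{
    private readonly IMessageQueue _queue;
    private readonly IPresence _presence;

    public MessageService(IMessageQueue queue, IPresence presence) =>
        (_queue, _presence) = (queue, presence);

    // Deliver immediately if the recipient is connected; otherwise queue for later.
    public async Task SendMessageAsync(Message message)
    {
        if (await _presence.IsUserOnline(message.RecipientId))
        {
            await _presence.DeliverDirectly(message);
        }
        else
        {
            await _queue.EnqueueForDelivery(message);
        }
    }
}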

