System Design Fundamentals

A comprehensive guide to core system design concepts essential for building scalable, reliable systems and for technical interviews.

Scalability

What Is Scalability?

Scalability describes a system's elasticity: its ability to keep performing as load and requirements change, by adding or removing resources. A system that scales well avoids downtime under load and keeps service quality steady.

Horizontal vs Vertical Scaling

Aspect         Horizontal Scaling                   Vertical Scaling
Definition     Adding more machines/nodes           Adding resources to an existing machine
Also Known As  Scaling Out                          Scaling Up
Example        Add 3 more servers                   Upgrade CPU, RAM, SSD
Cost           Roughly linear (commodity hardware)  Rises steeply (high-end hardware)
Complexity     Higher (distributed systems)         Lower (single system)
Limit          Virtually unlimited                  Hardware limits
Downtime       Usually none (nodes added online)    Possible (during upgrades)

Horizontal Scaling:
[User] → [Load Balancer] → [Server 1]
                         → [Server 2]
                         → [Server 3]

Vertical Scaling:
[User] → [Beefier Server (more CPU, RAM, etc.)]

Caching

What is Caching?

Caching acts as a local store for data - retrieving from this temporary storage is faster than retrieving from the database. Think of it as short-term memory: limited space but fast, containing recently/frequently accessed items.

How Cache Works

First Request (Cache Miss):
[Client] → [App Server] → [Cache] ✗ → [Database]
                              ↓
                         Store Result

Second Request (Cache Hit):
[Client] → [App Server] → [Cache] ✓ → Return immediately
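
The miss-then-hit flow above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not production code: `ReadThroughCache`, the `loadFromDatabase` delegate, and the `Hits`/`Misses` counters are hypothetical names introduced here, and a plain `Dictionary` stands in for a real cache.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: a Dictionary stands in for the cache, and the
// loadFromDatabase delegate stands in for a real database query.
public class ReadThroughCache<TKey, TValue> where TKey : notnull
{
    private readonly Dictionary<TKey, TValue> _cache = new();
    private readonly Func<TKey, TValue> _loadFromDatabase;

    public int Hits { get; private set; }
    public int Misses { get; private set; }

    public ReadThroughCache(Func<TKey, TValue> loadFromDatabase)
    {
        _loadFromDatabase = loadFromDatabase;
    }

    public TValue Get(TKey key)
    {
        // Cache hit: return immediately without touching the database.
        if (_cache.TryGetValue(key, out var value))
        {
            Hits++;
            return value;
        }

        // Cache miss: load from the database, then store the result
        // so the next request for the same key is a hit.
        Misses++;
        value = _loadFromDatabase(key);
        _cache[key] = value;
        return value;
    }
}
```

Calling `Get` twice with the same key should invoke the delegate only once: the first call is the miss, the second is the hit. The sketch is not thread-safe and never evicts anything.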

Cache Levels

┌─────────────────────────────────────┐
│     L1 CPU Cache (Fastest)          │
├─────────────────────────────────────┤
│     L2 CPU Cache                    │
├─────────────────────────────────────┤
│     L3 CPU Cache                    │
├─────────────────────────────────────┤
│     RAM (Primary Memory)            │
├─────────────────────────────────────┤
│     Application Cache (Redis)       │
├─────────────────────────────────────┤
│     Browser Cache                   │
├─────────────────────────────────────┤
│     CDN Cache                       │
├─────────────────────────────────────┤
│     Disk (Secondary Memory)         │
└─────────────────────────────────────┘

Types of Cache

1. Application Server Cache

In-memory cache alongside the application server.

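// Simple in-memory cache with IMemoryCache
using Microsoft.Extensions.Caching.Memory;

public class ProductService
{
    private readonly IMemoryCache _cache;
    private readonly IProductRepository _repo;

    // Both dependencies are supplied through constructor injection.
    public ProductService(IMemoryCache cache, IProductRepository repo)
    {
        _cache = cache;
        _repo = repo;
    }

    public async Task<Product> GetProductAsync(int id)
    {
        // On a miss the factory runs and its result is cached for 10 minutes.
        return await _cache.GetOrCreateAsync($"product:{id}", async entry =>
        {
            entry.AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(10);
            return await _repo.GetByIdAsync(id);
        });
    }
}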

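Drawback: Doesn't work well with multiple servers. Each server keeps its own cache, so a load balancer can route a request to a server that has not cached the item yet (a cache miss), and the copies on different servers can drift apart.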

2. Distributed Cache

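The cache is partitioned across multiple nodes, typically using consistent hashing, so each key maps to one node and adding or removing a node remaps only a fraction of the keys.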

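// Distributed cache with Redis (accessed through IDistributedCache)
using System.Text.Json;
using Microsoft.Extensions.Caching.Distributed;

public class ProductService
{
    private readonly IDistributedCache _cache;
    private readonly IProductRepository _repo;

    public ProductService(IDistributedCache cache, IProductRepository repo)
    {
        _cache = cache;
        _repo = repo;
    }

    public async Task<Product> GetProductAsync(int id)
    {
        // Hit: the serialized product is already in the shared cache.
        var cached = await _cache.GetStringAsync($"product:{id}");
        if (cached != null)
            return JsonSerializer.Deserialize<Product>(cached);

        // Miss: load from the repository and cache it for 10 minutes.
        var product = await _repo.GetByIdAsync(id);
        await _cache.SetStringAsync($"product:{id}",
            JsonSerializer.Serialize(product),
            new DistributedCacheEntryOptions
            {
                AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(10)
            });
        return product;
    }
}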
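
To make the consistent hashing mentioned above concrete, here is a minimal sketch of a hash ring. Everything in it is an assumption made for illustration: the `ConsistentHashRing` class, the default of 100 virtual nodes per server, and the use of the first four bytes of an MD5 digest as the ring position. Real cache clients usually ship their own, more refined implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Hypothetical sketch of a consistent hash ring with virtual nodes.
public class ConsistentHashRing
{
    // Ring position -> node name, kept sorted by position.
    private readonly SortedDictionary<uint, string> _ring = new();
    private readonly int _virtualNodes;

    public ConsistentHashRing(IEnumerable<string> nodes, int virtualNodes = 100)
    {
        _virtualNodes = virtualNodes;
        foreach (var node in nodes) AddNode(node);
    }

    public void AddNode(string node)
    {
        // Several positions per node spread its keys more evenly around the ring.
        for (int i = 0; i < _virtualNodes; i++)
            _ring[Hash($"{node}#{i}")] = node;
    }

    public void RemoveNode(string node)
    {
        for (int i = 0; i < _virtualNodes; i++)
            _ring.Remove(Hash($"{node}#{i}"));
    }

    public string GetNode(string key)
    {
        if (_ring.Count == 0) throw new InvalidOperationException("Ring is empty.");
        uint hash = Hash(key);

        // Walk clockwise to the first position at or after the key's hash.
        // A linear scan keeps the sketch short; a binary search would scale better.
        foreach (var entry in _ring)
            if (entry.Key >= hash) return entry.Value;

        // Past the last position: wrap around to the start of the ring.
        return _ring.First().Value;
    }

    private static uint Hash(string value)
    {
        using var md5 = MD5.Create();
        byte[] digest = md5.ComputeHash(Encoding.UTF8.GetBytes(value));
        return BitConverter.ToUInt32(digest, 0);
    }
}
```

The property worth noticing is what happens on `RemoveNode`: only the keys that lived on the removed node move to a new owner, while every other key keeps its node. With a plain `hash % nodeCount` scheme, most keys would move.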

3. Global Cache

Single shared cache space for all nodes.

4. CDN (Content Delivery Network)

Geographically distributed servers caching static content (HTML, CSS, JS, images, videos).

User in Europe → CDN Edge Server (Europe) → Origin Server (if cache miss)
User in Asia   → CDN Edge Server (Asia)   → Origin Server (if cache miss)

Cache Eviction Policies

Policy  Description                                                    Use Case
LRU     Least Recently Used - evicts the entry accessed longest ago    General purpose
LFU     Least Frequently Used - evicts the entry accessed least often  Varying access patterns
FIFO    First In First Out - evicts the entry added earliest           Simple scenarios
TTL     Time To Live - expires an entry after a set time               Time-sensitive data
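
As an illustration of the LRU policy from the table, the sketch below pairs a dictionary (for lookups) with a linked list (for recency order). The `LruCache` class and its `TryGet`/`Set` methods are hypothetical names chosen for this example; it is a single-threaded sketch, not a drop-in cache.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics.CodeAnalysis;

// Hypothetical sketch of an LRU cache: Dictionary for O(1) lookup,
// LinkedList for recency order (front = most recently used).
public class LruCache<TKey, TValue> where TKey : notnull
{
    private readonly int _capacity;
    private readonly Dictionary<TKey, LinkedListNode<(TKey Key, TValue Value)>> _map = new();
    private readonly LinkedList<(TKey Key, TValue Value)> _order = new();

    public LruCache(int capacity)
    {
        if (capacity <= 0) throw new ArgumentOutOfRangeException(nameof(capacity));
        _capacity = capacity;
    }

    public bool TryGet(TKey key, [MaybeNullWhen(false)] out TValue value)
    {
        if (_map.TryGetValue(key, out var node))
        {
            // A read counts as a use: move the entry to the front.
            _order.Remove(node);
            _order.AddFirst(node);
            value = node.Value.Value;
            return true;
        }
        value = default;
        return false;
    }

    public void Set(TKey key, TValue value)
    {
        if (_map.TryGetValue(key, out var existing))
        {
            // Overwrite: drop the old node; the new one is added at the front below.
            _order.Remove(existing);
            _map.Remove(key);
        }
        else if (_map.Count >= _capacity)
        {
            // Full: evict the entry at the back, which was used longest ago.
            var oldest = _order.Last!;
            _order.RemoveLast();
            _map.Remove(oldest.Value.Key);
        }

        _map[key] = _order.AddFirst((key, value));
    }
}
```

With a capacity of 2, setting `a` and `b`, reading `a`, and then setting `c` evicts `b`, because `a` was touched more recently.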

Cache Invalidation Strategies

// Write-Through: update the database and the cache in the same operation,
// so reads see the new value right away (at the cost of slower writes).
public async Task UpdateProductAsync(Product product)
{
    await _repo.UpdateAsync(product);
    await _cache.SetStringAsync($"product:{product.Id}",
        JsonSerializer.Serialize(product));
}

// Write-Behind (an alternative to the method above): update the cache now
// and persist to the database asynchronously. Writes are fast, but queued
// updates are lost if the process stops before they are flushed.
public async Task UpdateProductAsync(Product product)
{
    await _cache.SetStringAsync($"product:{product.Id}",
        JsonSerializer.Serialize(product));
    _backgroundQueue.Enqueue(() => _repo.UpdateAsync(product));
}

// Cache-Aside: the application checks the cache first and fills it on a miss
public async Task<Product> GetProductAsync(int id)
{
    var cached = await _cache.GetStringAsync($"product:{id}");
    if (cached != null)
        return JsonSerializer.Deserialize<Product>(cached);

    var product = await _repo.GetByIdAsync(id);
    if (product != null)
        await _cache.SetStringAsync($"product:{id}",
            JsonSerializer.Serialize(product));
    return product;
}
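
The snippets above keep the cache current by rewriting entries. A common companion to cache-aside is to delete the entry on write and let the next read reload it. The sketch below shows that idea with in-memory stand-ins; `InvalidatingProductStore`, its two dictionaries, and the `DatabaseReads` counter are hypothetical and exist only to make the behavior observable.

```csharp
using System.Collections.Generic;

// Hypothetical sketch of invalidate-on-write with in-memory stand-ins.
public class InvalidatingProductStore
{
    private readonly Dictionary<int, string> _database = new(); // stand-in for the real database
    private readonly Dictionary<int, string> _cache = new();    // stand-in for Redis or similar

    public int DatabaseReads { get; private set; }

    public void Update(int id, string name)
    {
        _database[id] = name; // 1. Write the source of truth.
        _cache.Remove(id);    // 2. Drop the stale entry; the next read reloads it.
    }

    public string Get(int id)
    {
        if (_cache.TryGetValue(id, out var cached)) return cached;

        DatabaseReads++;
        string name = _database[id]; // throws KeyNotFoundException for unknown ids
        _cache[id] = name;
        return name;
    }
}
```

Deleting instead of rewriting avoids caching values that are never read again, at the price of one extra miss after each write.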

Load Balancing

What is a Load Balancer?

A load balancer distributes incoming traffic among servers to provide:

  • High availability - if one server fails, others handle traffic
  • Efficient utilization - no single server is overloaded
  • High performance - optimized response times

Without Load Balancer (Problems)

[Users] → [Single Server] ← Single Point of Failure!
                          ← Gets Overloaded!

With Load Balancer

[Users] → [Load Balancer] → [Server 1] ✓
                          → [Server 2] ✓
                          → [Server 3] ✓
          Health checks ensure only healthy servers receive traffic
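
The health-check behavior in the diagram can be sketched as a round-robin rotation that skips servers marked unhealthy. The `HealthAwareBalancer` class and its `MarkHealthy`/`MarkUnhealthy` methods are hypothetical; in a real load balancer a background probe would update the health state, which this sketch leaves to the caller.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch: round robin that skips servers marked unhealthy.
// Not thread-safe; a real balancer would guard this state.
public class HealthAwareBalancer
{
    private readonly List<string> _servers;
    private readonly HashSet<string> _unhealthy = new();
    private int _next;

    public HealthAwareBalancer(IEnumerable<string> servers) => _servers = servers.ToList();

    // A periodic health probe is assumed to call these.
    public void MarkUnhealthy(string server) => _unhealthy.Add(server);
    public void MarkHealthy(string server) => _unhealthy.Remove(server);

    public string GetNextServer()
    {
        // Try each server at most once, starting from the current position.
        for (int i = 0; i < _servers.Count; i++)
        {
            string candidate = _servers[_next];
            _next = (_next + 1) % _servers.Count;
            if (!_unhealthy.Contains(candidate)) return candidate;
        }
        throw new InvalidOperationException("No healthy servers available.");
    }
}
```

With three servers and the second one marked unhealthy, traffic alternates between the first and third until the second is marked healthy again.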

Load Balancer Placement

[Client] → [LB] → [Web Servers]
                  [LB] → [App Servers]
                         [LB] → [Cache Servers]
                                [LB] → [Database Servers]

Types of Load Balancers

By Layer

Type  OSI Layer    Routing Based On
L4    Transport    IP, Port, Protocol
L7    Application  URL, Headers, Cookies, Content
GSLB  Geographic   Location, Server Health, Proximity

By Implementation

  • Hardware: Physical appliances (F5, Citrix) - expensive but powerful
  • Software: Applications (NGINX, HAProxy) - flexible and cost-effective
  • Virtual: VMs in cloud environments

Load Balancing Algorithms

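using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

// 1. Round Robin - Sequential distribution
public class RoundRobinBalancer
{
    private int _current = -1;
    private readonly List<string> _servers;

    public RoundRobinBalancer(IEnumerable<string> servers) => _servers = servers.ToList();

    public string GetNextServer()
    {
        // Interlocked keeps the counter correct under concurrent requests; the
        // uint cast keeps the index non-negative after the counter wraps around.
        uint next = (uint)Interlocked.Increment(ref _current);
        return _servers[(int)(next % (uint)_servers.Count)];
    }
}

// 2. Weighted Round Robin - Based on server capacity
public class WeightedRoundRobinBalancer
{
    // Each server appears Weight times in the rotation, so a server with
    // weight 3 gets 3x as many requests as one with weight 1.
    private readonly List<string> _rotation;
    private int _current = -1;

    public WeightedRoundRobinBalancer(IEnumerable<(string Server, int Weight)> servers) =>
        _rotation = servers.SelectMany(s => Enumerable.Repeat(s.Server, s.Weight)).ToList();

    public string GetNextServer()
    {
        uint next = (uint)Interlocked.Increment(ref _current);
        return _rotation[(int)(next % (uint)_rotation.Count)];
    }
}

// 3. Least Connections - To server with fewest active connections
public class LeastConnectionsBalancer
{
    // Counts are assumed to be updated elsewhere as connections open and close.
    private readonly Dictionary<string, int> _connections;

    public LeastConnectionsBalancer(IEnumerable<string> servers) =>
        _connections = servers.ToDictionary(s => s, _ => 0);

    public string GetNextServer()
    {
        return _connections.OrderBy(c => c.Value).First().Key;
    }
}

// 4. IP Hash - Same client always goes to same server
public class IpHashBalancer
{
    private readonly List<string> _servers;

    public IpHashBalancer(IEnumerable<string> servers) => _servers = servers.ToList();

    public string GetServer(string clientIp)
    {
        // string.GetHashCode() is randomized per process on modern .NET, and
        // Math.Abs(int.MinValue) throws, so use a stable FNV-1a-style hash instead.
        uint hash = 2166136261;
        foreach (char c in clientIp)
            hash = unchecked((hash ^ c) * 16777619);
        return _servers[(int)(hash % (uint)_servers.Count)];
    }
}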

// 5. Least Response Time - To fastest responding server
// Combines response time + active connections
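
Algorithm 5 has no code above, so here is one possible sketch. The scoring rule, `(active connections + 1) * average response time`, is an assumption chosen for illustration, as are the `LeastResponseTimeBalancer` class, its `Acquire`/`Release` methods, and the 0.8/0.2 moving-average weights; real load balancers each define their own formula.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch: pick the server with the lowest
// (active connections + 1) * average response time.
public class LeastResponseTimeBalancer
{
    private class ServerStats
    {
        public int ActiveConnections;
        public double AverageResponseMs; // 0 until the first response is recorded
    }

    private readonly Dictionary<string, ServerStats> _stats;
    private readonly object _lock = new();

    public LeastResponseTimeBalancer(IEnumerable<string> servers) =>
        _stats = servers.ToDictionary(s => s, _ => new ServerStats());

    // Choose a server and count the new connection against it.
    public string Acquire()
    {
        lock (_lock)
        {
            var best = _stats
                .OrderBy(s => (s.Value.ActiveConnections + 1) * s.Value.AverageResponseMs)
                .ThenBy(s => s.Value.ActiveConnections) // tie-break for servers with no data yet
                .First();
            best.Value.ActiveConnections++;
            return best.Key;
        }
    }

    // Report that a request finished and how long it took.
    public void Release(string server, double responseMs)
    {
        lock (_lock)
        {
            var stats = _stats[server];
            stats.ActiveConnections--;
            // Exponential moving average, so recent responses weigh more.
            stats.AverageResponseMs = stats.AverageResponseMs == 0
                ? responseMs
                : 0.8 * stats.AverageResponseMs + 0.2 * responseMs;
        }
    }
}
```

Callers wrap each request in `Acquire` and `Release`. Once response times have been recorded, a server that answers in 10 ms keeps winning over one that answers in 200 ms until its open connections push its score above the slower server's.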

Database Replication

What is Database Replication?

Copying data from a primary database to replica databases to improve:

  • Availability - system continues if primary fails
  • Performance - read queries distributed across replicas
  • Reliability - data redundancy

Replication Topologies

1. Master-Slave (Primary-Replica)
   [Primary] → [Replica 1]
             → [Replica 2]
             → [Replica 3]
   Writes: Primary only
   Reads: Any node

2. Master-Master (Multi-Primary)
   [Primary 1] ↔ [Primary 2]
   Writes: Any primary
   Reads: Any node
   Conflict resolution needed

3. Chain Replication
   [Primary] → [Replica 1] → [Replica 2]
   Sequential propagation
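
For the primary-replica topology, the "writes to primary, reads from any node" rule is often applied in the application's data access layer. The sketch below shows the routing decision only; `ReplicatedRouter` and the connection names passed to it are hypothetical, and it assumes reads are spread over the replicas round-robin.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch: route writes to the primary and spread reads
// across replicas. Not thread-safe.
public class ReplicatedRouter
{
    private readonly string _primary;
    private readonly List<string> _replicas;
    private int _nextReplica = -1;

    public ReplicatedRouter(string primary, IEnumerable<string> replicas)
    {
        _primary = primary;
        _replicas = replicas.ToList();
    }

    // Writes always go to the primary.
    public string RouteWrite() => _primary;

    // Reads rotate over the replicas; with no replicas, fall back to the primary.
    public string RouteRead()
    {
        if (_replicas.Count == 0) return _primary;
        _nextReplica = (_nextReplica + 1) % _replicas.Count;
        return _replicas[_nextReplica];
    }
}
```

When replication is asynchronous, a replica can lag behind the primary, so a read that must see the caller's own latest write is usually sent to the primary instead.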

Benefits of Replication

Benefit                  Description
High Availability        System continues if one database fails
Load Distribution        Read queries spread across replicas
Geographic Distribution  Data closer to users
Analytics Separation     Run heavy queries on a replica
Disaster Recovery        A replica can take over after a failure (not a substitute for backups)

System Design Interview Tips

Key Concepts to Demonstrate

  1. Scalability: How to handle 10x, 100x traffic
  2. Availability: What happens when components fail
  3. Performance: Caching, CDN, load balancing strategies
  4. Data Management: Replication, sharding, consistency
  5. Trade-offs: CAP theorem, consistency vs availability

Interview Approach

1. Clarify Requirements (5 min)
   - Functional: What should the system do?
   - Non-functional: Scale, latency, availability targets

2. High-Level Design (10-15 min)
   - Components: API, services, database, cache
   - Data flow: How requests move through system

3. Deep Dive (15-20 min)
   - Database schema
   - API design
   - Caching strategy
   - Load balancing

4. Address Bottlenecks (5 min)
   - Single points of failure
   - Scaling limitations
   - Trade-offs made

Common Questions

  • Design a URL shortener
  • Design a rate limiter
  • Design Twitter/Instagram feed
  • Design a notification system
  • Design a distributed cache
