System Design Fundamentals
A comprehensive guide to core system design concepts essential for building scalable, reliable systems and for technical interviews.
Scalability
What Is Scalability?
Scalability describes a system's elasticity - its ability to adapt to changes in load and demand. Good scalability protects against downtime and keeps service quality steady as traffic grows.
Horizontal vs Vertical Scaling
| Aspect | Horizontal Scaling | Vertical Scaling |
|---|---|---|
| Definition | Adding more machines/nodes | Adding resources to existing machine |
| Also Known As | Scaling Out | Scaling Up |
| Example | Add 3 more servers | Upgrade CPU, RAM, SSD |
| Cost | Linear (more commodity hardware) | Exponential (high-end hardware) |
| Complexity | Higher (distributed systems) | Lower (single system) |
| Limit | Virtually unlimited | Hardware limits |
| Downtime | Zero (add nodes online) | Possible (during upgrades) |
Horizontal Scaling:
[User] → [Load Balancer] → [Server 1]
                         → [Server 2]
                         → [Server 3]
Vertical Scaling:
[User] → [Beefier Server (more CPU, RAM, etc.)]
Caching
What is Caching?
Caching acts as a local store for data - retrieving from this temporary storage is faster than retrieving from the database. Think of it as short-term memory: limited space but fast, containing recently/frequently accessed items.
How Cache Works
First Request (Cache Miss):
[Client] → [App Server] → [Cache] (miss) → [Database]
                             ↑                  │
                             └── Store Result ──┘
Second Request (Cache Hit):
[Client] → [App Server] → [Cache] (hit) → Return immediately
Cache Levels
┌───────────────────────────────────────┐
│ L1 CPU Cache (Fastest)                │
├───────────────────────────────────────┤
│ L2 CPU Cache                          │
├───────────────────────────────────────┤
│ L3 CPU Cache                          │
├───────────────────────────────────────┤
│ RAM (Primary Memory)                  │
├───────────────────────────────────────┤
│ Application Cache (Redis)             │
├───────────────────────────────────────┤
│ Browser Cache                         │
├───────────────────────────────────────┤
│ CDN Cache                             │
├───────────────────────────────────────┤
│ Disk (Secondary Memory)               │
└───────────────────────────────────────┘
Types of Cache
1. Application Server Cache
In-memory cache alongside the application server.
```csharp
// Simple in-memory cache with IMemoryCache
public class ProductService
{
    private readonly IMemoryCache _cache;
    private readonly IProductRepository _repo;

    public ProductService(IMemoryCache cache, IProductRepository repo)
    {
        _cache = cache;
        _repo = repo;
    }

    public async Task<Product> GetProductAsync(int id)
    {
        return await _cache.GetOrCreateAsync($"product:{id}", async entry =>
        {
            entry.AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(10);
            return await _repo.GetByIdAsync(id);
        });
    }
}
```
Drawback: Doesn't work well with multiple servers - each server holds its own copy of the cache, so a load balancer that routes a request to a different server causes a cache miss.
2. Distributed Cache
Cache is distributed across multiple nodes using consistent hashing.
```csharp
// Distributed cache with Redis (IDistributedCache)
public class ProductService
{
    private readonly IDistributedCache _cache;
    private readonly IProductRepository _repo;

    public ProductService(IDistributedCache cache, IProductRepository repo)
    {
        _cache = cache;
        _repo = repo;
    }

    public async Task<Product> GetProductAsync(int id)
    {
        var cached = await _cache.GetStringAsync($"product:{id}");
        if (cached != null)
            return JsonSerializer.Deserialize<Product>(cached);

        var product = await _repo.GetByIdAsync(id);
        await _cache.SetStringAsync($"product:{id}",
            JsonSerializer.Serialize(product),
            new DistributedCacheEntryOptions
            {
                AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(10)
            });
        return product;
    }
}
```
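The consistent hashing mentioned above decides which cache node owns each key, so that adding or removing a node only remaps the keys in one segment of the ring. A minimal sketch (the `HashRing` class, the MD5-based hash, and the virtual-node count are assumptions for illustration, not a production implementation):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Minimal consistent-hash ring: a key maps to the first node clockwise
// from its hash. Virtual nodes smooth out the key distribution.
public class HashRing
{
    private readonly SortedDictionary<uint, string> _ring = new();
    private const int VirtualNodes = 100;

    public void AddNode(string node)
    {
        for (int i = 0; i < VirtualNodes; i++)
            _ring[Hash($"{node}#{i}")] = node;
    }

    public void RemoveNode(string node)
    {
        for (int i = 0; i < VirtualNodes; i++)
            _ring.Remove(Hash($"{node}#{i}"));
    }

    // Assumes at least one node has been added.
    public string GetNode(string key)
    {
        uint h = Hash(key);
        // First virtual node at or after the key's hash, wrapping around.
        foreach (var kv in _ring)
            if (kv.Key >= h) return kv.Value;
        return _ring.First().Value;
    }

    private static uint Hash(string s)
    {
        byte[] digest = MD5.HashData(Encoding.UTF8.GetBytes(s));
        return BitConverter.ToUInt32(digest, 0);
    }
}
```

With plain modulo hashing (`hash % nodeCount`), adding one node remaps almost every key; with the ring, a key keeps its node unless a new node lands between the key and its old owner.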
3. Global Cache
Single shared cache space for all nodes.
4. CDN (Content Delivery Network)
Geographically distributed servers caching static content (HTML, CSS, JS, images, videos).
User in Europe → CDN Edge Server (Europe) → Origin Server (if cache miss)
User in Asia   → CDN Edge Server (Asia)   → Origin Server (if cache miss)
Cache Eviction Policies
| Policy | Description | Use Case |
|---|---|---|
| LRU | Least Recently Used - evicts the entry that has gone unaccessed the longest | General purpose |
| LFU | Least Frequently Used - evicts the entry with the fewest accesses | Stable popularity patterns |
| FIFO | First In First Out - evicts the oldest entry added | Simple scenarios |
| TTL | Time To Live - entries expire after a set duration | Time-sensitive data |
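LRU, the most common policy in the table above, can be implemented in O(1) per operation with a dictionary plus a doubly linked list. A minimal, non-thread-safe sketch (the `LruCache` class is an assumption for this example):

```csharp
using System;
using System.Collections.Generic;

// Minimal LRU cache: dictionary for O(1) lookup, linked list for recency
// order. The least recently used entry sits at the tail and is evicted first.
public class LruCache<TKey, TValue> where TKey : notnull
{
    private readonly int _capacity;
    private readonly Dictionary<TKey, LinkedListNode<(TKey Key, TValue Value)>> _map = new();
    private readonly LinkedList<(TKey Key, TValue Value)> _order = new();

    public LruCache(int capacity) => _capacity = capacity;

    public bool TryGet(TKey key, out TValue value)
    {
        if (_map.TryGetValue(key, out var node))
        {
            // Accessing an entry makes it most recently used.
            _order.Remove(node);
            _order.AddFirst(node);
            value = node.Value.Value;
            return true;
        }
        value = default!;
        return false;
    }

    public void Put(TKey key, TValue value)
    {
        if (_map.TryGetValue(key, out var existing))
        {
            _order.Remove(existing);
            _map.Remove(key);
        }
        else if (_map.Count >= _capacity)
        {
            // Evict the least recently used entry (tail of the list).
            var lru = _order.Last!;
            _map.Remove(lru.Value.Key);
            _order.RemoveLast();
        }
        _map[key] = _order.AddFirst((key, value));
    }
}
```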
Cache Invalidation Strategies
```csharp
// Write-Through: update cache and DB together
public async Task UpdateProductAsync(Product product)
{
    await _repo.UpdateAsync(product);
    await _cache.SetStringAsync($"product:{product.Id}",
        JsonSerializer.Serialize(product));
}

// Write-Behind: update the cache now, queue the DB update
public async Task UpdateProductAsync(Product product)
{
    await _cache.SetStringAsync($"product:{product.Id}",
        JsonSerializer.Serialize(product));
    _backgroundQueue.Enqueue(() => _repo.UpdateAsync(product));
}

// Cache-Aside: the application manages the cache explicitly
public async Task<Product> GetProductAsync(int id)
{
    var cached = await _cache.GetStringAsync($"product:{id}");
    if (cached != null)
        return JsonSerializer.Deserialize<Product>(cached);

    var product = await _repo.GetByIdAsync(id);
    if (product != null)
        await _cache.SetStringAsync($"product:{id}",
            JsonSerializer.Serialize(product));
    return product;
}
```
Load Balancing
What is a Load Balancer?
A load balancer distributes incoming traffic among servers to provide:
- High availability - if one server fails, others handle traffic
- Efficient utilization - no single server is overloaded
- High performance - optimized response times
Without Load Balancer (Problems)
[Users] → [Single Server]  ✗ Single Point of Failure!
                           ✗ Gets Overloaded!
With Load Balancer
[Users] → [Load Balancer] → [Server 1] ✓
                          → [Server 2] ✓
                          → [Server 3] ✓
Health checks ensure only healthy servers receive traffic
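Those health checks are typically periodic probes against each backend; servers that fail a probe are taken out of rotation until they recover. A minimal sketch (the `/health` endpoint path, the 2-second timeout, and the `HealthChecker` class are assumptions):

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Threading.Tasks;

// Periodically probes each server's health endpoint; the balancer
// should only route traffic to servers currently marked healthy.
public class HealthChecker
{
    private readonly HttpClient _http = new() { Timeout = TimeSpan.FromSeconds(2) };
    private readonly ConcurrentDictionary<string, bool> _healthy = new();

    public IReadOnlyCollection<string> HealthyServers =>
        _healthy.Where(kv => kv.Value).Select(kv => kv.Key).ToList();

    // Probe every server once; run this on a timer (e.g. every 5 seconds).
    public async Task ProbeAllAsync(IEnumerable<string> servers)
    {
        foreach (var server in servers)
        {
            try
            {
                var response = await _http.GetAsync($"{server}/health");
                _healthy[server] = response.IsSuccessStatusCode;
            }
            catch (HttpRequestException)
            {
                _healthy[server] = false; // unreachable → out of rotation
            }
            catch (TaskCanceledException)
            {
                _healthy[server] = false; // timed out → out of rotation
            }
        }
    }
}
```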
Load Balancer Placement
[Client] → [LB] → [Web Servers]
                  [LB] → [App Servers]
                  [LB] → [Cache Servers]
                  [LB] → [Database Servers]
Types of Load Balancers
By Layer
| Type | OSI Layer | Routing Based On |
|---|---|---|
| L4 | Transport | IP, Port, Protocol |
| L7 | Application | URL, Headers, Cookies, Content |
| GSLB | Global (DNS-based, not an OSI layer) | Location, Server Health, Proximity |
By Implementation
- Hardware: Physical appliances (F5, Citrix) - expensive but powerful
- Software: Applications (NGINX, HAProxy) - flexible and cost-effective
- Virtual: VMs in cloud environments
Load Balancing Algorithms
```csharp
// 1. Round Robin - Sequential distribution
public class RoundRobinBalancer
{
    private int _current = -1;
    private readonly List<string> _servers;

    public string GetNextServer()
    {
        _current = (_current + 1) % _servers.Count;
        return _servers[_current];
    }
}

// 2. Weighted Round Robin - Based on server capacity.
// A server with weight 3 gets 3x more requests than one with weight 1.
public class WeightedRoundRobinBalancer
{
    private readonly List<(string Server, int Weight)> _servers;
    private int _index = -1, _remaining;

    public string GetNextServer()
    {
        if (_remaining == 0)
        {
            _index = (_index + 1) % _servers.Count;
            _remaining = _servers[_index].Weight;
        }
        _remaining--;
        return _servers[_index].Server;
    }
}

// 3. Least Connections - To the server with the fewest active connections
public class LeastConnectionsBalancer
{
    private readonly Dictionary<string, int> _connections;

    public string GetNextServer()
    {
        return _connections.OrderBy(c => c.Value).First().Key;
    }
}

// 4. IP Hash - The same client always goes to the same server.
// Note: string.GetHashCode() is not stable across processes in .NET;
// a real balancer would use a deterministic hash instead.
public class IpHashBalancer
{
    private readonly List<string> _servers;

    public string GetServer(string clientIp)
    {
        int hash = clientIp.GetHashCode() & int.MaxValue; // force non-negative
        return _servers[hash % _servers.Count];
    }
}

// 5. Least Response Time - To the fastest responding server,
// typically combining measured response time with active connections.
```
Database Replication
What is Database Replication?
Copying data from a primary database to replica databases to improve:
- Availability - system continues if primary fails
- Performance - read queries distributed across replicas
- Reliability - data redundancy
Replication Topologies
1. Master-Slave (Primary-Replica)
[Primary] → [Replica 1]
          → [Replica 2]
          → [Replica 3]
Writes: Primary only
Reads: Any node
2. Master-Master (Multi-Primary)
[Primary 1] ↔ [Primary 2]
Writes: Any primary
Reads: Any node
Conflict resolution needed
3. Chain Replication
[Primary] → [Replica 1] → [Replica 2]
Sequential propagation
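The write-to-primary / read-from-replicas split common to these topologies can be sketched as a small connection router. A minimal illustration (the `ReplicaRouter` class and connection-string names are assumptions; it also assumes asynchronous replication, so replica reads may be slightly stale):

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Routes writes to the primary and spreads reads across replicas
// with a simple round-robin rotation.
public class ReplicaRouter
{
    private readonly string _primary;
    private readonly List<string> _replicas;
    private int _next = -1;

    public ReplicaRouter(string primary, List<string> replicas)
    {
        _primary = primary;
        _replicas = replicas;
    }

    // All writes must go to the primary.
    public string GetWriteConnection() => _primary;

    // Reads rotate across replicas; fall back to the primary if none exist.
    public string GetReadConnection()
    {
        if (_replicas.Count == 0) return _primary;
        int i = Interlocked.Increment(ref _next) & int.MaxValue;
        return _replicas[i % _replicas.Count];
    }
}
```

In practice, reads that must see their own just-written data (read-your-writes) are often sent to the primary as well.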
Benefits of Replication
| Benefit | Description |
|---|---|
| High Availability | System continues if one database fails |
| Load Distribution | Read queries spread across replicas |
| Geographic Distribution | Data closer to users |
| Analytics Separation | Run heavy queries on replica |
| Disaster Recovery | Built-in backup |
System Design Interview Tips
Key Concepts to Demonstrate
- Scalability: How to handle 10x, 100x traffic
- Availability: What happens when components fail
- Performance: Caching, CDN, load balancing strategies
- Data Management: Replication, sharding, consistency
- Trade-offs: CAP theorem, consistency vs availability
Interview Approach
1. Clarify Requirements (5 min)
- Functional: What should the system do?
- Non-functional: Scale, latency, availability targets
2. High-Level Design (10-15 min)
- Components: API, services, database, cache
- Data flow: How requests move through system
3. Deep Dive (15-20 min)
- Database schema
- API design
- Caching strategy
- Load balancing
4. Address Bottlenecks (5 min)
- Single points of failure
- Scaling limitations
- Trade-offs made
Common Questions
- Design a URL shortener
- Design a rate limiter
- Design Twitter/Instagram feed
- Design a notification system
- Design a distributed cache
Sources
- Reference: System Design Primer