<?xml version="1.0" encoding="UTF-8"?>
<rss  xmlns:atom="http://www.w3.org/2005/Atom" 
      xmlns:media="http://search.yahoo.com/mrss/" 
      xmlns:content="http://purl.org/rss/1.0/modules/content/" 
      xmlns:dc="http://purl.org/dc/elements/1.1/" 
      version="2.0">
<channel>
<title>Vectoring AI</title>
<link>https://vectoringai.com/pages/system-design.html</link>
<atom:link href="https://vectoringai.com/pages/system-design.xml" rel="self" type="application/rss+xml"/>
<description>System design interview questions covering scalability, distributed systems, caching, load balancing, databases, and real-world system architecture for FAANG+ interviews.</description>
<generator>quarto-1.9.36</generator>
<lastBuildDate>Thu, 21 May 2026 00:00:00 GMT</lastBuildDate>
<item>
  <title>System Design Interview QA - 1</title>
  <dc:creator>Vectoring AI</dc:creator>
  <link>https://vectoringai.com/posts/system-design/System-Design-Interview-QA-1.html</link>
  <description><![CDATA[ 




<section id="introduction" class="level2">
<h2 class="anchored" data-anchor-id="introduction">Introduction</h2>
<p>This is <strong>Part 1</strong> of our System Design Interview QA series, focusing on the <strong>foundational concepts</strong> that underpin every system design interview. System design covers the entire architecture of a software system: how components fit together at scale, how failures are handled, and how trade-offs are made across scalability, reliability, performance, databases, APIs, networking, and security.</p>
<blockquote class="blockquote">
<p>For infrastructure deep dives (load balancing, caching, Kubernetes, etc.), see <a href="../../posts/system-design/System-Design-Interview-QA-2.html">System Design Interview QA - 2</a>. For hands-on design problems (URL shortener, chat system, etc.), see <a href="../../posts/system-design/System-Design-Interview-QA-3.html">System Design Interview QA - 3</a>.</p>
</blockquote>
<hr>
</section>
<section id="q1-what-is-scalability-and-how-do-you-scale-a-system" class="level2">
<h2 class="anchored" data-anchor-id="q1-what-is-scalability-and-how-do-you-scale-a-system">Q1: What Is Scalability and How Do You Scale a System?</h2>
<p><strong>Answer:</strong></p>
<p>Scalability is the ability of a system to handle growing amounts of work by adding resources. There are two fundamental approaches: <strong>vertical scaling</strong> (bigger machines) and <strong>horizontal scaling</strong> (more machines).</p>
<div class="cell" data-layout-align="default">
<div class="cell-output-display">
<div>
<p></p><figure class="figure"><p></p>
<div>
<pre class="mermaid mermaid-js">graph TD
    subgraph Vertical["Vertical Scaling (Scale Up)"]
        V1["Small Server&lt;br/&gt;4 CPU, 16GB RAM"]
        V1 --&gt;|"Upgrade"| V2["Large Server&lt;br/&gt;64 CPU, 512GB RAM"]
    end

    subgraph Horizontal["Horizontal Scaling (Scale Out)"]
        LB["Load Balancer"]
        LB --&gt; S1["Server 1"]
        LB --&gt; S2["Server 2"]
        LB --&gt; S3["Server 3"]
        LB --&gt; S4["Server N..."]
    end

    style Vertical fill:#6cc3d5,stroke:#333,color:#fff
    style Horizontal fill:#56cc9d,stroke:#333,color:#fff
</pre>
</div>
<p></p></figure><p></p>
</div>
</div>
</div>
<section id="vertical-vs-horizontal-scaling" class="level3">
<h3 class="anchored" data-anchor-id="vertical-vs-horizontal-scaling">Vertical vs Horizontal Scaling</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 12%">
<col style="width: 40%">
<col style="width: 46%">
</colgroup>
<thead>
<tr class="header">
<th>Aspect</th>
<th>Vertical Scaling (Scale Up)</th>
<th>Horizontal Scaling (Scale Out)</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Approach</strong></td>
<td>Add more CPU/RAM/disk to one machine</td>
<td>Add more machines behind a load balancer</td>
</tr>
<tr class="even">
<td><strong>Complexity</strong></td>
<td>Simple — no code changes</td>
<td>Complex — need stateless design, data partitioning</td>
</tr>
<tr class="odd">
<td><strong>Cost</strong></td>
<td>Cost grows faster than linearly for high-end hardware</td>
<td>Roughly linear cost on commodity hardware</td>
</tr>
<tr class="even">
<td><strong>Limit</strong></td>
<td>Hard ceiling (largest machine available)</td>
<td>No hard ceiling, though coordination overhead grows with node count</td>
</tr>
<tr class="odd">
<td><strong>Downtime</strong></td>
<td>Requires restart to upgrade</td>
<td>Zero downtime — add/remove nodes</td>
</tr>
<tr class="even">
<td><strong>Failure</strong></td>
<td>Single point of failure</td>
<td>Fault tolerant — one node fails, others serve</td>
</tr>
<tr class="odd">
<td><strong>Example</strong></td>
<td>Upgrade PostgreSQL server from 32GB to 256GB RAM</td>
<td>Shard PostgreSQL across 8 nodes</td>
</tr>
</tbody>
</table>
</section>
<section id="key-scaling-strategies" class="level3">
<h3 class="anchored" data-anchor-id="key-scaling-strategies">Key Scaling Strategies</h3>
<div class="cell" data-layout-align="default">
<div class="cell-output-display">
<div>
<p></p><figure class="figure"><p></p>
<div>
<pre class="mermaid mermaid-js">graph TD
    SCALE["Scaling Strategies"]
    SCALE --&gt; LB["Load Balancing&lt;br/&gt;(distribute traffic)"]
    SCALE --&gt; CACHE["Caching&lt;br/&gt;(reduce DB load)"]
    SCALE --&gt; SHARD["Database Sharding&lt;br/&gt;(split data across DBs)"]
    SCALE --&gt; ASYNC["Async Processing&lt;br/&gt;(message queues)"]
    SCALE --&gt; CDN["CDN&lt;br/&gt;(serve static content at edge)"]
    SCALE --&gt; MS["Microservices&lt;br/&gt;(scale services independently)"]

    style SCALE fill:#56cc9d,stroke:#333,color:#fff
    style LB fill:#ffce67,stroke:#333
    style CACHE fill:#6cc3d5,stroke:#333,color:#fff
</pre>
</div>
<p></p></figure><p></p>
</div>
</div>
</div>
<table class="caption-top table">
<colgroup>
<col style="width: 27%">
<col style="width: 36%">
<col style="width: 36%">
</colgroup>
<thead>
<tr class="header">
<th>Strategy</th>
<th>What It Does</th>
<th>When to Use</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Load balancing</strong></td>
<td>Distribute requests across servers</td>
<td>Always, for any multi-server setup</td>
</tr>
<tr class="even">
<td><strong>Caching</strong></td>
<td>Store frequently accessed data in memory</td>
<td>Read-heavy workloads where a small set of hot keys serves most reads (80/20 rule)</td>
</tr>
<tr class="odd">
<td><strong>Database sharding</strong></td>
<td>Split data across multiple databases</td>
<td>Data too large for single DB, or write throughput limit hit</td>
</tr>
<tr class="even">
<td><strong>Async processing</strong></td>
<td>Offload work to background queues</td>
<td>Long-running tasks (email, video transcoding)</td>
</tr>
<tr class="odd">
<td><strong>CDN</strong></td>
<td>Cache static assets at edge locations</td>
<td>Global user base, static content</td>
</tr>
<tr class="even">
<td><strong>Microservices</strong></td>
<td>Break monolith into independently scalable services</td>
<td>When different components have different scaling needs</td>
</tr>
</tbody>
</table>
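<p>To make the sharding row concrete, here is a minimal sketch of hash-based shard routing. The function name <code>shard_for</code>, the key format, and the shard count of 8 are illustrative assumptions, not part of any particular database.</p>

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index with a stable hash.

    Python's built-in hash() is salted per process, so a cryptographic
    digest is used here to get the same answer on every server.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Hypothetical user IDs routed across 8 shards
for user_id in ["user:1", "user:2", "user:3"]:
    print(user_id, "-> shard", shard_for(user_id, 8))
```

<p>One caveat with plain modulo routing: changing <code>num_shards</code> remaps most keys, which is why resharding usually needs a migration plan or a scheme such as consistent hashing.</p>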
</section>
<section id="stateless-vs-stateful-services" class="level3">
<h3 class="anchored" data-anchor-id="stateless-vs-stateful-services">Stateless vs Stateful Services</h3>
<pre><code>Stateless (preferred for horizontal scaling):
  - Server holds NO user session data
  - Any server can handle any request
  - Session stored in external store (Redis, DB)
  - Easy to add/remove servers

Stateful (harder to scale):
  - Server holds user session in memory
  - Requests must be routed to same server (sticky sessions)
  - Server failure loses user state
  - Scaling requires state migration</code></pre>
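<p>The stateless case can be sketched as follows. <code>SessionStore</code> is an in-memory stand-in for an external store such as Redis, and <code>AppServer</code> is a hypothetical request handler. Both names are assumptions made for illustration.</p>

```python
from dataclasses import dataclass, field


@dataclass
class SessionStore:
    """In-memory stand-in for an external session store (assumption)."""
    _data: dict = field(default_factory=dict)

    def get(self, session_id: str) -> dict:
        return self._data.setdefault(session_id, {})


@dataclass
class AppServer:
    name: str
    store: SessionStore  # shared and external: the server keeps no session state

    def add_to_cart(self, session_id: str, item: str) -> list:
        session = self.store.get(session_id)
        cart = session.setdefault("cart", [])
        cart.append(item)
        return list(cart)


store = SessionStore()
server_a = AppServer("a", store)
server_b = AppServer("b", store)

server_a.add_to_cart("s1", "book")
print(server_b.add_to_cart("s1", "pen"))  # ['book', 'pen']
```

<p>Because the cart lives in the shared store, the second request can land on a different server and still see the first item, so no sticky sessions are needed.</p>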
<hr>
</section>
</section>
<section id="q2-how-do-you-ensure-reliability-and-fault-tolerance" class="level2">
<h2 class="anchored" data-anchor-id="q2-how-do-you-ensure-reliability-and-fault-tolerance">Q2: How Do You Ensure Reliability and Fault Tolerance?</h2>
<p><strong>Answer:</strong></p>
<p>Reliability means a system continues to work correctly even when things go wrong — hardware failures, software bugs, network issues, or traffic spikes. Fault tolerance is achieved through <strong>redundancy</strong>, <strong>replication</strong>, <strong>graceful degradation</strong>, and <strong>automatic recovery</strong>.</p>
<div class="cell" data-layout-align="default">
<div class="cell-output-display">
<div>
<p></p><figure class="figure"><p></p>
<div>
<pre class="mermaid mermaid-js">graph TD
    subgraph Redundancy["Redundancy at Every Layer"]
        LB1["Load Balancer&lt;br/&gt;(Active)"]
        LB2["Load Balancer&lt;br/&gt;(Standby)"]
        LB1 --&gt; APP1["App Server 1"]
        LB1 --&gt; APP2["App Server 2"]
        LB1 --&gt; APP3["App Server 3"]
        APP1 --&gt; DB_P["DB Primary"]
        APP2 --&gt; DB_P
        APP3 --&gt; DB_P
        DB_P --&gt;|"Replication"| DB_R1["DB Replica 1"]
        DB_P --&gt;|"Replication"| DB_R2["DB Replica 2"]
    end

    style LB1 fill:#56cc9d,stroke:#333,color:#fff
    style DB_P fill:#6cc3d5,stroke:#333,color:#fff
    style LB2 fill:#ffce67,stroke:#333
</pre>
</div>
<p></p></figure><p></p>
</div>
</div>
</div>
<section id="reliability-patterns" class="level3">
<h3 class="anchored" data-anchor-id="reliability-patterns">Reliability Patterns</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 29%">
<col style="width: 41%">
<col style="width: 29%">
</colgroup>
<thead>
<tr class="header">
<th>Pattern</th>
<th>Description</th>
<th>Example</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Replication</strong></td>
<td>Keep multiple copies of data/services</td>
<td>3 DB replicas across availability zones</td>
</tr>
<tr class="even">
<td><strong>Failover</strong></td>
<td>Automatically switch to backup when primary fails</td>
<td>Primary DB fails → promote replica</td>
</tr>
<tr class="odd">
<td><strong>Health checks</strong></td>
<td>Monitor component health, remove unhealthy nodes</td>
<td>Load balancer pings <code>/health</code> every 5s</td>
</tr>
<tr class="even">
<td><strong>Circuit breaker</strong></td>
<td>Stop calling a failing service, fail fast</td>
<td>If payment API errors &gt;50%, stop calling for 30s</td>
</tr>
<tr class="odd">
<td><strong>Retry with backoff</strong></td>
<td>Retry failed requests with increasing delay</td>
<td>Retry after 1s, 2s, 4s, 8s (exponential backoff)</td>
</tr>
<tr class="even">
<td><strong>Bulkhead</strong></td>
<td>Isolate failures to prevent cascading</td>
<td>Separate thread pools for payment vs catalog</td>
</tr>
<tr class="odd">
<td><strong>Graceful degradation</strong></td>
<td>Serve partial functionality when subsystems fail</td>
<td>Show cached feed if recommendation service is down</td>
</tr>
</tbody>
</table>
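<p>The retry-with-backoff row can be sketched as a small helper. The function name, the defaults, and the added jitter are assumptions for illustration. The delays of 1s, 2s, 4s, 8s follow the example in the table.</p>

```python
import random
import time


def retry_with_backoff(operation, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call operation(); on failure wait base_delay * 2**attempt and retry.

    With the defaults the waits are roughly 1s, 2s, 4s, 8s. A small random
    jitter (an addition, not in the table) spreads out simultaneous retries.
    Real code should catch only errors known to be transient.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))
```

<p>Passing <code>sleep</code> as a parameter keeps the helper testable without real waiting. Retries are only safe when the operation is idempotent.</p>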
</section>
<section id="availability-levels" class="level3">
<h3 class="anchored" data-anchor-id="availability-levels">Availability Levels</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 14%">
<col style="width: 29%">
<col style="width: 34%">
<col style="width: 21%">
</colgroup>
<thead>
<tr class="header">
<th>Level</th>
<th>Downtime/Year</th>
<th>Downtime/Month</th>
<th>Use Case</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>99%</strong> (two 9s)</td>
<td>3.65 days</td>
<td>7.3 hours</td>
<td>Internal tools</td>
</tr>
<tr class="even">
<td><strong>99.9%</strong> (three 9s)</td>
<td>8.76 hours</td>
<td>43.8 minutes</td>
<td>SaaS applications</td>
</tr>
<tr class="odd">
<td><strong>99.99%</strong> (four 9s)</td>
<td>52.6 minutes</td>
<td>4.4 minutes</td>
<td>E-commerce, banking</td>
</tr>
<tr class="even">
<td><strong>99.999%</strong> (five 9s)</td>
<td>5.26 minutes</td>
<td>26.3 seconds</td>
<td>DNS, payment processing</td>
</tr>
</tbody>
</table>
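<p>The downtime figures in the table follow directly from the availability percentage. A minimal sketch of the arithmetic, assuming a 365-day year and a month of one twelfth of a year (730 hours):</p>

```python
HOURS_PER_YEAR = 365 * 24              # 8760
HOURS_PER_MONTH = HOURS_PER_YEAR / 12  # 730


def downtime_seconds(availability_percent: float, period_hours: float) -> float:
    """Allowed downtime in seconds for a given availability over a period."""
    return (1 - availability_percent / 100) * period_hours * 3600


for level in (99.0, 99.9, 99.99, 99.999):
    per_year_h = downtime_seconds(level, HOURS_PER_YEAR) / 3600
    per_month_min = downtime_seconds(level, HOURS_PER_MONTH) / 60
    print(f"{level}%: {per_year_h:.2f} h/year, {per_month_min:.2f} min/month")
```

<p>For example, 99.9% leaves 0.1% of 8760 hours, which is 8.76 hours per year.</p>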
</section>
<section id="failure-handling-flow" class="level3">
<h3 class="anchored" data-anchor-id="failure-handling-flow">Failure Handling Flow</h3>
<pre><code>Request comes in:
  1. Load balancer routes to healthy server
     - If server unreachable → try next server
  2. Server processes request
     - If downstream service fails → circuit breaker
       - Circuit CLOSED: forward request normally
       - Circuit OPEN: return cached/fallback response immediately
       - Circuit HALF-OPEN: try one request; success → close, failure → re-open
  3. Database write
     - Write to primary → replicate to replicas
     - If primary fails → promote replica (automatic failover)
  4. Return response
     - If timeout → client retries with exponential backoff
     - If persistent failure → graceful degradation (partial response)</code></pre>
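<p>The circuit breaker states in step 2 can be sketched as a small class. This is a simplified illustration, not a production implementation: it trips on a count of consecutive failures rather than an error percentage, it is not thread-safe, and the class name and parameters are assumptions.</p>

```python
import time


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is CLOSED

    @property
    def state(self) -> str:
        if self.opened_at is None:
            return "CLOSED"
        if self.clock() - self.opened_at >= self.reset_timeout:
            return "HALF_OPEN"
        return "OPEN"

    def call(self, operation, fallback):
        if self.state == "OPEN":
            return fallback()  # fail fast without touching the downstream service
        try:
            result = operation()  # CLOSED: normal call; HALF_OPEN: trial call
        except Exception:
            self.failures += 1
            if self.state == "HALF_OPEN" or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            return fallback()
        self.failures = 0
        self.opened_at = None  # a success closes the circuit
        return result
```

<p>The <code>fallback</code> callable is where a cached or degraded response would be returned, matching the graceful degradation step in the flow.</p>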
<hr>
</section>
</section>
<section id="q3-what-are-the-key-performance-optimization-strategies" class="level2">
<h2 class="anchored" data-anchor-id="q3-what-are-the-key-performance-optimization-strategies">Q3: What Are the Key Performance Optimization Strategies?</h2>
<p><strong>Answer:</strong></p>
<p>Performance optimization reduces <strong>latency</strong> (time to respond) and increases <strong>throughput</strong> (requests handled per second). The key principle is to identify and eliminate bottlenecks at each layer: network, application, and database.</p>
<div class="cell" data-layout-align="default">
<div class="cell-output-display">
<div>
<p></p><figure class="figure"><p></p>
<div>
<pre class="mermaid mermaid-js">graph LR
    subgraph Latency["Latency Numbers Every Engineer Should Know"]
        L1["L1 cache: 0.5 ns"]
        L2["L2 cache: 7 ns"]
        L3["RAM access: 100 ns"]
        L4["SSD read: 150 μs"]
        L5["HDD seek: 10 ms"]
        L6["Same datacenter roundtrip: 0.5 ms"]
        L7["Cross-region roundtrip: 150 ms"]
    end

    style Latency fill:#56cc9d,stroke:#333,color:#fff
</pre>
</div>
<p></p></figure><p></p>
</div>
</div>
</div>
<section id="performance-optimization-by-layer" class="level3">
<h3 class="anchored" data-anchor-id="performance-optimization-by-layer">Performance Optimization by Layer</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 28%">
<col style="width: 40%">
<col style="width: 32%">
</colgroup>
<thead>
<tr class="header">
<th>Layer</th>
<th>Strategy</th>
<th>Impact</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Network</strong></td>
<td>CDN for static assets</td>
<td>Serves assets from a nearby edge instead of a cross-region round trip (~150 ms)</td>
</tr>
<tr class="even">
<td><strong>Network</strong></td>
<td>HTTP/2 multiplexing, gzip compression</td>
<td>Fewer connections, smaller payloads</td>
</tr>
<tr class="odd">
<td><strong>Network</strong></td>
<td>Connection pooling</td>
<td>Avoid TCP handshake overhead per request</td>
</tr>
<tr class="even">
<td><strong>Application</strong></td>
<td>Caching (Redis/Memcached)</td>
<td>Sub-millisecond reads vs 10-100ms DB queries</td>
</tr>
<tr class="odd">
<td><strong>Application</strong></td>
<td>Async processing (message queues)</td>
<td>Don’t block user on slow operations</td>
</tr>
<tr class="even">
<td><strong>Application</strong></td>
<td>Pagination and lazy loading</td>
<td>Return only what user needs now</td>
</tr>
<tr class="odd">
<td><strong>Database</strong></td>
<td>Indexing</td>
<td>Lookups go from a full scan, O(n), to a B-tree search, O(log n)</td>
</tr>
<tr class="even">
<td><strong>Database</strong></td>
<td>Read replicas</td>
<td>Distribute read load across multiple DBs</td>
</tr>
<tr class="odd">
<td><strong>Database</strong></td>
<td>Query optimization</td>
<td>Avoid N+1 queries, use JOINs efficiently</td>
</tr>
<tr class="even">
<td><strong>Database</strong></td>
<td>Denormalization</td>
<td>Trade storage for faster reads (avoid JOINs)</td>
</tr>
</tbody>
</table>
</section>
<section id="caching-strategy-deep-dive" class="level3">
<h3 class="anchored" data-anchor-id="caching-strategy-deep-dive">Caching Strategy Deep Dive</h3>
<div class="cell" data-layout-align="default">
<div class="cell-output-display">
<div>
<p></p><figure class="figure"><p></p>
<div>
<pre class="mermaid mermaid-js">graph TD
    REQ["Request"]
    REQ --&gt; APP["Application"]
    APP --&gt; CHECK{"Cache&lt;br/&gt;hit?"}
    CHECK --&gt;|"Hit"| RETURN["Return cached data&lt;br/&gt;(&lt; 1ms)"]
    CHECK --&gt;|"Miss"| DB["Query Database&lt;br/&gt;(10-100ms)"]
    DB --&gt; UPDATE["Update Cache"]
    UPDATE --&gt; RETURN2["Return data"]

    style CHECK fill:#ffce67,stroke:#333
    style RETURN fill:#56cc9d,stroke:#333,color:#fff
</pre>
</div>
<p></p></figure><p></p>
</div>
</div>
</div>
<table class="caption-top table">
<colgroup>
<col style="width: 41%">
<col style="width: 33%">
<col style="width: 25%">
</colgroup>
<thead>
<tr class="header">
<th>Caching Pattern</th>
<th>How It Works</th>
<th>Best For</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Cache-aside</strong> (lazy loading)</td>
<td>App checks cache → if miss, query DB → write to cache</td>
<td>General purpose, read-heavy</td>
</tr>
<tr class="even">
<td><strong>Write-through</strong></td>
<td>Write to cache and DB synchronously; the write completes only when both succeed</td>
<td>When reads immediately follow writes</td>
</tr>
<tr class="odd">
<td><strong>Write-behind</strong> (write-back)</td>
<td>Write to cache first, async write to DB</td>
<td>Write-heavy, can tolerate brief inconsistency</td>
</tr>
<tr class="even">
<td><strong>Read-through</strong></td>
<td>Cache itself fetches from DB on miss</td>
<td>Simpler app code, cache acts as primary interface</td>
</tr>
</tbody>
</table>
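<p>As an illustration of the cache-aside row, here is a minimal in-process sketch with a TTL. The <code>load_from_db</code> callable is a hypothetical loader standing in for a real database query, and a plain dict stands in for Redis or Memcached.</p>

```python
import time


class CacheAside:
    def __init__(self, load_from_db, ttl_seconds=60.0, clock=time.monotonic):
        self.load_from_db = load_from_db  # hypothetical loader: key -> value
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._entries = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is not None and entry[1] > self.clock():
            return entry[0]  # cache hit
        value = self.load_from_db(key)  # cache miss: query the database
        self._entries[key] = (value, self.clock() + self.ttl_seconds)
        return value

    def invalidate(self, key):
        """Drop an entry, for example after the underlying row is updated."""
        self._entries.pop(key, None)
```

<p>The application owns both the read path and the cache write, which is what distinguishes cache-aside from read-through.</p>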
</section>
<section id="cache-invalidation-strategies" class="level3">
<h3 class="anchored" data-anchor-id="cache-invalidation-strategies">Cache Invalidation Strategies</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 29%">
<col style="width: 38%">
<col style="width: 32%">
</colgroup>
<thead>
<tr class="header">
<th>Strategy</th>
<th>Description</th>
<th>Trade-off</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>TTL (Time-To-Live)</strong></td>
<td>Cache expires after N seconds</td>
<td>Simple but may serve stale data</td>
</tr>
<tr class="even">
<td><strong>Event-driven invalidation</strong></td>
<td>Invalidate on write/update event</td>
<td>Fresh data but more complex</td>
</tr>
<tr class="odd">
<td><strong>Version-based</strong></td>
<td>Key includes version number, bump on update</td>
<td>No stale data, slight overhead</td>
</tr>
</tbody>
</table>
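<p>Version-based invalidation can be sketched as follows. The class name and the <code>entity:vN</code> key format are assumptions for illustration. In a real deployment the version counter would live in the database or the cache itself.</p>

```python
class VersionedCache:
    def __init__(self):
        self._versions = {}  # entity id -> current version number
        self._cache = {}     # versioned key -> cached value

    def _key(self, entity_id: str) -> str:
        return f"{entity_id}:v{self._versions.get(entity_id, 1)}"

    def get(self, entity_id: str):
        return self._cache.get(self._key(entity_id))

    def put(self, entity_id: str, value) -> None:
        self._cache[self._key(entity_id)] = value

    def bump(self, entity_id: str) -> None:
        """Call on every update: entries under the old key become unreachable."""
        self._versions[entity_id] = self._versions.get(entity_id, 1) + 1
```

<p>Old entries are never read again after a bump, but they stay in memory until evicted, which is the overhead noted in the table.</p>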
<hr>
</section>
</section>
<section id="q4-how-do-distributed-systems-work-and-what-are-the-key-challenges" class="level2">
<h2 class="anchored" data-anchor-id="q4-how-do-distributed-systems-work-and-what-are-the-key-challenges">Q4: How Do Distributed Systems Work and What Are the Key Challenges?</h2>
<p><strong>Answer:</strong></p>
<p>A distributed system is a collection of independent computers that appears to its users as a single coherent system. Distributed systems become necessary when a single machine cannot meet the load, data, or availability requirements.</p>
<div class="cell" data-layout-align="default">
<div class="cell-output-display">
<div>
<p></p><figure class="figure"><p></p>
<div>
<pre class="mermaid mermaid-js">graph TD
    subgraph Challenges["8 Fallacies of Distributed Computing (Stated as Corrections)"]
        F1["1. The network is NOT reliable"]
        F2["2. Latency is NOT zero"]
        F3["3. Bandwidth is NOT infinite"]
        F4["4. The network is NOT secure"]
        F5["5. Topology DOES change"]
        F6["6. There is NOT one administrator"]
        F7["7. Transport cost is NOT zero"]
        F8["8. The network is NOT homogeneous"]
    end

    style Challenges fill:#ff7851,stroke:#333,color:#fff
</pre>
</div>
<p></p></figure><p></p>
</div>
</div>
</div>
<section id="cap-theorem" class="level3">
<h3 class="anchored" data-anchor-id="cap-theorem">CAP Theorem</h3>
<p>Every distributed data store can provide at most <strong>two of three</strong> guarantees simultaneously:</p>
<table class="caption-top table">
<colgroup>
<col style="width: 33%">
<col style="width: 36%">
<col style="width: 30%">
</colgroup>
<thead>
<tr class="header">
<th>Property</th>
<th>Definition</th>
<th>Example</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Consistency</strong></td>
<td>Every read receives the most recent write</td>
<td>All replicas return the same value</td>
</tr>
<tr class="even">
<td><strong>Availability</strong></td>
<td>Every request to a non-failing node receives a non-error response, though not necessarily the latest write</td>
<td>System never refuses a request</td>
</tr>
<tr class="odd">
<td><strong>Partition Tolerance</strong></td>
<td>System operates despite network failures between nodes</td>
<td>Nodes can’t communicate but keep serving</td>
</tr>
</tbody>
</table>
<pre><code>Since network partitions WILL happen, you must choose:

  CP (Consistency + Partition Tolerance):
    → During partition: reject requests rather than return stale data
    → Examples: MongoDB (majority writes), HBase, ZooKeeper, etcd
    → Use for: Banking, inventory, leader election

  AP (Availability + Partition Tolerance):
    → During partition: serve requests even if data might be stale
    → Examples: Cassandra, DynamoDB, CouchDB
    → Use for: Social media feeds, shopping carts, analytics</code></pre>
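<p>The CP versus AP choice can be illustrated with a toy replica that is cut off from its peers. This is a deliberately simplified model with hypothetical names. Real systems make this decision through quorums and leader leases rather than a boolean flag.</p>

```python
from dataclasses import dataclass


class Unavailable(Exception):
    """Raised by a CP replica that cannot confirm it holds the latest write."""


@dataclass
class Replica:
    mode: str                  # "CP" or "AP"
    value: str                 # last value this replica has seen
    partitioned: bool = False  # True when cut off from the rest of the cluster

    def read(self) -> str:
        if self.partitioned and self.mode == "CP":
            # Consistency first: refuse rather than risk returning stale data
            raise Unavailable("cannot reach a quorum")
        # AP replica, or a healthy CP replica: answer from local state
        return self.value
```

<p>A partitioned AP replica keeps answering with whatever it last saw, while a partitioned CP replica returns an error that the client has to handle.</p>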
</section>
<section id="consensus-algorithms" class="level3">
<h3 class="anchored" data-anchor-id="consensus-algorithms">Consensus Algorithms</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 26%">
<col style="width: 21%">
<col style="width: 30%">
<col style="width: 21%">
</colgroup>
<thead>
<tr class="header">
<th>Algorithm</th>
<th>Purpose</th>
<th>How It Works</th>
<th>Used In</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Paxos</strong></td>
<td>Agreement among nodes</td>
<td>Proposer → Acceptors → Learners; majority quorum</td>
<td>Google Chubby</td>
</tr>
<tr class="even">
<td><strong>Raft</strong></td>
<td>Leader-based consensus</td>
<td>Elect leader → leader replicates log entries; easier to understand</td>
<td>etcd, CockroachDB</td>
</tr>
<tr class="odd">
<td><strong>Gossip Protocol</strong></td>
<td>Information dissemination (membership and state, not consensus)</td>
<td>Nodes periodically exchange state with random peers</td>
<td>Cassandra, Amazon Dynamo, Consul</td>
</tr>
</tbody>
</table>
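<p>Paxos and Raft both rely on a majority quorum. The arithmetic is short enough to sketch. The function names are illustrative.</p>

```python
def quorum_size(cluster_size: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    return cluster_size // 2 + 1


def tolerated_failures(cluster_size: int) -> int:
    """Nodes that can fail while a majority is still reachable."""
    return cluster_size - quorum_size(cluster_size)


for n in (3, 4, 5, 7):
    print(f"{n} nodes: quorum {quorum_size(n)}, tolerates {tolerated_failures(n)} failures")
```

<p>A 5-node cluster needs 3 votes and tolerates 2 failures. A 4-node cluster tolerates only 1, the same as 3 nodes, which is why consensus clusters are usually sized with an odd number of nodes.</p>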
</section>
<section id="consistency-models" class="level3">
<h3 class="anchored" data-anchor-id="consistency-models">Consistency Models</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 19%">
<col style="width: 27%">
<col style="width: 25%">
<col style="width: 27%">
</colgroup>
<thead>
<tr class="header">
<th>Model</th>
<th>Guarantee</th>
<th>Latency</th>
<th>Use Case</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Strong consistency</strong></td>
<td>Read always returns latest write</td>
<td>High (synchronous replication)</td>
<td>Banking transactions</td>
</tr>
<tr class="even">
<td><strong>Eventual consistency</strong></td>
<td>Reads converge to latest write over time</td>
<td>Low (async replication)</td>
<td>Social media likes/counts</td>
</tr>
<tr class="odd">
<td><strong>Causal consistency</strong></td>
<td>Preserves cause-and-effect ordering</td>
<td>Medium</td>
<td>Comment threads, chat</td>
</tr>
<tr class="even">
<td><strong>Read-your-writes</strong></td>
<td>User sees their own writes immediately</td>
<td>Medium</td>
<td>User profile updates</td>
</tr>
</tbody>
</table>
<hr>
</section>
</section>
<section id="q5-what-infrastructure-components-make-up-a-production-system" class="level2">
<h2 class="anchored" data-anchor-id="q5-what-infrastructure-components-make-up-a-production-system">Q5: What Infrastructure Components Make Up a Production System?</h2>
<p><strong>Answer:</strong></p>
<p>A production system is composed of multiple infrastructure layers working together. Understanding each component’s role and how they interact is essential for system design.</p>
<div class="cell" data-layout-align="default">
<div class="cell-output-display">
<div>
<p></p><figure class="figure"><p></p>
<div>
<pre class="mermaid mermaid-js">graph TD
    USERS["Users"]
    USERS --&gt; DNS["DNS&lt;br/&gt;(Route53, Cloudflare)"]
    DNS --&gt; CDN["CDN&lt;br/&gt;(CloudFront, Akamai)"]
    CDN --&gt; LB["Load Balancer&lt;br/&gt;(ALB, Nginx)"]
    LB --&gt; API["API Gateway"]
    API --&gt; SVC1["Service A"]
    API --&gt; SVC2["Service B"]
    API --&gt; SVC3["Service C"]
    SVC1 --&gt; CACHE["Cache&lt;br/&gt;(Redis / Memcached)"]
    SVC1 --&gt; DB["Database&lt;br/&gt;(PostgreSQL / MySQL)"]
    SVC2 --&gt; MQ["Message Queue&lt;br/&gt;(Kafka / RabbitMQ)"]
    MQ --&gt; WORKER["Background Workers"]
    WORKER --&gt; STORE["Object Storage&lt;br/&gt;(S3)"]
    SVC3 --&gt; SEARCH["Search Engine&lt;br/&gt;(Elasticsearch)"]

    subgraph Observability
        LOG["Logging&lt;br/&gt;(ELK Stack)"]
        METRIC["Metrics&lt;br/&gt;(Prometheus / Grafana)"]
        TRACE["Tracing&lt;br/&gt;(Jaeger / Zipkin)"]
    end

    style LB fill:#56cc9d,stroke:#333,color:#fff
    style CACHE fill:#ffce67,stroke:#333
    style DB fill:#6cc3d5,stroke:#333,color:#fff
</pre>
</div>
<p></p></figure><p></p>
</div>
</div>
</div>
<section id="infrastructure-components-reference" class="level3">
<h3 class="anchored" data-anchor-id="infrastructure-components-reference">Infrastructure Components Reference</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 25%">
<col style="width: 20%">
<col style="width: 23%">
<col style="width: 30%">
</colgroup>
<thead>
<tr class="header">
<th>Component</th>
<th>Purpose</th>
<th>Examples</th>
<th>When to Use</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>DNS</strong></td>
<td>Domain → IP resolution, geographic routing</td>
<td>Route53, Cloudflare DNS</td>
<td>Always — entry point for all traffic</td>
</tr>
<tr class="even">
<td><strong>CDN</strong></td>
<td>Cache static content at edge locations globally</td>
<td>CloudFront, Akamai, Fastly</td>
<td>Static assets, global user base</td>
</tr>
<tr class="odd">
<td><strong>Load Balancer</strong></td>
<td>Distribute traffic across servers</td>
<td>ALB/NLB (AWS), Nginx, HAProxy</td>
<td>Multiple app servers</td>
</tr>
<tr class="even">
<td><strong>API Gateway</strong></td>
<td>Routing, auth, rate limiting, protocol translation</td>
<td>Kong, AWS API Gateway, Envoy</td>
<td>Microservices architecture</td>
</tr>
<tr class="odd">
<td><strong>Cache</strong></td>
<td>In-memory store for frequently accessed data</td>
<td>Redis, Memcached</td>
<td>Read-heavy workloads</td>
</tr>
<tr class="even">
<td><strong>Message Queue</strong></td>
<td>Async communication, decouple producers/consumers</td>
<td>Kafka, RabbitMQ, SQS</td>
<td>Background processing, event-driven</td>
</tr>
<tr class="odd">
<td><strong>Object Storage</strong></td>
<td>Store blobs (images, videos, backups)</td>
<td>S3, GCS, Azure Blob</td>
<td>Media files, backups, data lake</td>
</tr>
<tr class="even">
<td><strong>Search Engine</strong></td>
<td>Full-text search, analytics</td>
<td>Elasticsearch, OpenSearch</td>
<td>Product search, log analysis</td>
</tr>
<tr class="odd">
<td><strong>Container Orchestration</strong></td>
<td>Deploy, scale, manage containerized services</td>
<td>Kubernetes, ECS</td>
<td>Microservices deployment</td>
</tr>
</tbody>
</table>
</section>
<section id="monolith-vs-microservices" class="level3">
<h3 class="anchored" data-anchor-id="monolith-vs-microservices">Monolith vs Microservices</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 24%">
<col style="width: 30%">
<col style="width: 45%">
</colgroup>
<thead>
<tr class="header">
<th>Aspect</th>
<th>Monolith</th>
<th>Microservices</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Deployment</strong></td>
<td>Single deployable unit</td>
<td>Independent services, independent deployments</td>
</tr>
<tr class="even">
<td><strong>Scaling</strong></td>
<td>Scale everything together</td>
<td>Scale each service independently</td>
</tr>
<tr class="odd">
<td><strong>Complexity</strong></td>
<td>Simple to develop and deploy initially</td>
<td>Complex: service discovery, distributed tracing</td>
</tr>
<tr class="even">
<td><strong>Data</strong></td>
<td>Single shared database</td>
<td>Database per service (data isolation)</td>
</tr>
<tr class="odd">
<td><strong>Team</strong></td>
<td>Single team, tight coupling</td>
<td>Small teams own individual services</td>
</tr>
<tr class="even">
<td><strong>Failure</strong></td>
<td>One bug can crash entire system</td>
<td>Failure isolated to one service (with proper design)</td>
</tr>
<tr class="odd">
<td><strong>Best for</strong></td>
<td>Small teams, early-stage products</td>
<td>Large teams, complex domains, different scaling needs</td>
</tr>
</tbody>
</table>
</section>
<section id="when-to-move-from-monolith-to-microservices" class="level3">
<h3 class="anchored" data-anchor-id="when-to-move-from-monolith-to-microservices">When to Move from Monolith to Microservices</h3>
<pre><code>Start with a monolith. Split when:
  1. Team size &gt; 10-15 engineers (coordination overhead)
  2. Different components have vastly different scaling needs
  3. Deployment of one feature blocks another team
  4. Different services need different tech stacks
  5. You need independent failure isolation

Do NOT split prematurely — microservices add operational complexity:
  - Service discovery
  - Distributed transactions (Saga pattern)
  - Network latency between services
  - Distributed debugging and tracing
  - Data consistency across service boundaries</code></pre>
<hr>
</section>
</section>
<section id="q6-how-do-you-design-restful-apis-and-choose-between-api-styles" class="level2">
<h2 class="anchored" data-anchor-id="q6-how-do-you-design-restful-apis-and-choose-between-api-styles">Q6: How Do You Design RESTful APIs and Choose Between API Styles?</h2>
<p><strong>Answer:</strong></p>
<p>APIs are the contracts between system components. Choosing the right API style and designing clean, consistent interfaces is a core system design skill.</p>
<section id="api-style-comparison" class="level3">
<h3 class="anchored" data-anchor-id="api-style-comparison">API Style Comparison</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 20%">
<col style="width: 28%">
<col style="width: 22%">
<col style="width: 28%">
</colgroup>
<thead>
<tr class="header">
<th>Style</th>
<th>Protocol</th>
<th>Format</th>
<th>Best For</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>REST</strong></td>
<td>HTTP</td>
<td>JSON</td>
<td>CRUD web services, public APIs</td>
</tr>
<tr class="even">
<td><strong>GraphQL</strong></td>
<td>HTTP</td>
<td>JSON</td>
<td>Complex queries, frontend-driven data needs</td>
</tr>
<tr class="odd">
<td><strong>gRPC</strong></td>
<td>HTTP/2</td>
<td>Protobuf (binary)</td>
<td>Low-latency microservice communication</td>
</tr>
<tr class="even">
<td><strong>WebSocket</strong></td>
<td>TCP (upgraded HTTP)</td>
<td>Any</td>
<td>Real-time bidirectional (chat, gaming)</td>
</tr>
<tr class="odd">
<td><strong>Webhook</strong></td>
<td>HTTP (push)</td>
<td>JSON</td>
<td>Event notifications (payment processed, build complete)</td>
</tr>
</tbody>
</table>
</section>
<section id="rest-api-design-principles" class="level3">
<h3 class="anchored" data-anchor-id="rest-api-design-principles">REST API Design Principles</h3>
<pre><code>Good REST API design:

  Resources (nouns, not verbs):
    ✅ GET    /api/v1/users              → List users
    ✅ GET    /api/v1/users/123          → Get user 123
    ✅ POST   /api/v1/users              → Create user
    ✅ PUT    /api/v1/users/123          → Replace user 123 (PATCH for partial updates)
    ✅ DELETE /api/v1/users/123          → Delete user 123
    ❌ GET    /api/v1/getUser?id=123     → Verb in URL (bad)

  Nested resources:
    GET  /api/v1/users/123/orders       → Orders for user 123
    GET  /api/v1/users/123/orders/456   → Specific order

  Pagination:
    GET /api/v1/users?page=2&amp;limit=20
    GET /api/v1/users?cursor=abc123&amp;limit=20  (cursor-based, preferred)

  Filtering and sorting:
    GET /api/v1/users?role=admin&amp;sort=-created_at

  Versioning:
    /api/v1/users  → URL path versioning (most common)
    Accept: application/vnd.api.v1+json  → Header versioning</code></pre>
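<p>Cursor-based pagination can be sketched with an opaque cursor that encodes the last ID seen. The <code>USERS</code> list, the cursor format, and the response shape below are assumptions for illustration. A real endpoint would run a query such as "id greater than last_id, ordered by id" against the database.</p>

```python
import base64
import json

# Hypothetical data, already sorted by id
USERS = [{"id": i, "name": f"user{i}"} for i in range(1, 8)]


def encode_cursor(last_id: int) -> str:
    payload = json.dumps({"last_id": last_id}).encode("utf-8")
    return base64.urlsafe_b64encode(payload).decode("ascii")


def decode_cursor(cursor: str) -> int:
    return json.loads(base64.urlsafe_b64decode(cursor.encode("ascii")))["last_id"]


def list_users(cursor=None, limit=3):
    """Return one page plus an opaque cursor for the next page (None at the end)."""
    last_id = decode_cursor(cursor) if cursor else 0
    page = [u for u in USERS if u["id"] > last_id][:limit]
    has_more = bool(page) and page[-1]["id"] != USERS[-1]["id"]
    next_cursor = encode_cursor(page[-1]["id"]) if has_more else None
    return {"data": page, "next_cursor": next_cursor}
```

<p>Unlike <code>page=2</code>, a cursor stays stable when rows are inserted or deleted between requests, and the database can seek directly to the position instead of skipping an offset.</p>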
</section>
<section id="http-status-codes" class="level3">
<h3 class="anchored" data-anchor-id="http-status-codes">HTTP Status Codes</h3>
<table class="caption-top table">
<thead>
<tr class="header">
<th>Code</th>
<th>Meaning</th>
<th>When to Use</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>200</strong></td>
<td>OK</td>
<td>Successful GET, PUT</td>
</tr>
<tr class="even">
<td><strong>201</strong></td>
<td>Created</td>
<td>Successful POST (resource created)</td>
</tr>
<tr class="odd">
<td><strong>204</strong></td>
<td>No Content</td>
<td>Successful DELETE</td>
</tr>
<tr class="even">
<td><strong>400</strong></td>
<td>Bad Request</td>
<td>Invalid input, validation error</td>
</tr>
<tr class="odd">
<td><strong>401</strong></td>
<td>Unauthorized</td>
<td>Missing or invalid authentication</td>
</tr>
<tr class="even">
<td><strong>403</strong></td>
<td>Forbidden</td>
<td>Authenticated but insufficient permissions</td>
</tr>
<tr class="odd">
<td><strong>404</strong></td>
<td>Not Found</td>
<td>Resource doesn’t exist</td>
</tr>
<tr class="even">
<td><strong>409</strong></td>
<td>Conflict</td>
<td>Duplicate resource, version conflict</td>
</tr>
<tr class="odd">
<td><strong>429</strong></td>
<td>Too Many Requests</td>
<td>Rate limit exceeded</td>
</tr>
<tr class="even">
<td><strong>500</strong></td>
<td>Internal Server Error</td>
<td>Unhandled server error</td>
</tr>
<tr class="odd">
<td><strong>503</strong></td>
<td>Service Unavailable</td>
<td>Server overloaded or in maintenance</td>
</tr>
</tbody>
</table>
</section>
<section id="idempotency" class="level3">
<h3 class="anchored" data-anchor-id="idempotency">Idempotency</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 23%">
<col style="width: 35%">
<col style="width: 20%">
<col style="width: 20%">
</colgroup>
<thead>
<tr class="header">
<th>Method</th>
<th>Idempotent?</th>
<th>Safe?</th>
<th>Notes</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>GET</strong></td>
<td>Yes</td>
<td>Yes</td>
<td>Retrieves data, no side effects</td>
</tr>
<tr class="even">
<td><strong>PUT</strong></td>
<td>Yes</td>
<td>No</td>
<td>Same request produces same result</td>
</tr>
<tr class="odd">
<td><strong>DELETE</strong></td>
<td>Yes</td>
<td>No</td>
<td>Deleting same resource twice = same outcome</td>
</tr>
<tr class="even">
<td><strong>POST</strong></td>
<td><strong>No</strong></td>
<td>No</td>
<td>Use idempotency keys (e.g., <code>Idempotency-Key: uuid</code>)</td>
</tr>
<tr class="odd">
<td><strong>PATCH</strong></td>
<td>Not guaranteed</td>
<td>No</td>
<td>Partial update; idempotent only when the patch sets absolute values (an increment is not)</td>
</tr>
</tbody>
</table>
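<p>The idempotency-key idea from the POST row can be sketched as follows. This is a minimal in-memory illustration, not a production design: <code>PaymentService</code>, <code>create_charge</code>, and the response shape are hypothetical names, and a real service would keep the keys in a shared store with an expiry.</p>

```python
import uuid


class PaymentService:
    """Toy in-memory service (hypothetical); real systems persist keys with a TTL."""

    def __init__(self):
        self._responses = {}  # idempotency key -> response already returned
        self.charges = []     # side effects that were actually performed

    def create_charge(self, idempotency_key, amount_cents):
        # Replay: return the stored response without repeating the side effect.
        if idempotency_key in self._responses:
            return self._responses[idempotency_key]
        charge = {"id": str(uuid.uuid4()), "amount_cents": amount_cents}
        self.charges.append(charge)
        response = {"status": 201, "body": charge}
        self._responses[idempotency_key] = response
        return response


service = PaymentService()
key = str(uuid.uuid4())  # generated once by the client, reused on every retry
first = service.create_charge(key, 500)
retry = service.create_charge(key, 500)  # e.g. the client timed out and retried
```

<p>The retry receives the original response and only one charge is recorded, which is what makes a retried POST safe.</p>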
</section>
<section id="pagination-offset-vs-cursor" class="level3">
<h3 class="anchored" data-anchor-id="pagination-offset-vs-cursor">Pagination: Offset vs Cursor</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 45%">
<col style="width: 27%">
<col style="width: 27%">
</colgroup>
<thead>
<tr class="header">
<th>Approach</th>
<th>Pros</th>
<th>Cons</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Offset</strong> (<code>?page=5&amp;limit=20</code>)</td>
<td>Simple, can jump to any page</td>
<td>Slow at deep offsets (the database scans and discards the skipped rows); pages shift when rows are inserted or deleted</td>
</tr>
<tr class="even">
<td><strong>Cursor</strong> (<code>?cursor=abc&amp;limit=20</code>)</td>
<td>Consistent, fast (indexed seek); handles real-time inserts</td>
<td>Can’t jump to arbitrary page</td>
</tr>
</tbody>
</table>
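<p>The difference between the two approaches is easiest to see in the queries themselves. The sketch below uses an in-memory SQLite table; the <code>users</code> table, the helper names, and the use of the raw <code>id</code> as the cursor are assumptions made for illustration (a public API would usually return an opaque, encoded cursor).</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO users (id, name) VALUES (?, ?)",
    [(i, f"user{i}") for i in range(1, 101)],
)


def offset_page(page, limit):
    # The database still walks past (page - 1) * limit rows before returning any.
    return conn.execute(
        "SELECT id, name FROM users ORDER BY id LIMIT ? OFFSET ?",
        (limit, (page - 1) * limit),
    ).fetchall()


def cursor_page(after_id, limit):
    # Seeks straight to the position through the primary-key index.
    rows = conn.execute(
        "SELECT id, name FROM users WHERE id > ? ORDER BY id LIMIT ?",
        (after_id, limit),
    ).fetchall()
    next_cursor = rows[-1][0] if rows else None
    return rows, next_cursor


page2 = offset_page(page=2, limit=20)
first, cursor = cursor_page(after_id=0, limit=20)
second, cursor2 = cursor_page(after_id=cursor, limit=20)
```

<p>Both calls return the same second page here, but only the cursor query keeps returning a stable page when rows are inserted ahead of the current position.</p>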
<hr>
</section>
</section>
<section id="q7-how-do-you-choose-the-right-database" class="level2">
<h2 class="anchored" data-anchor-id="q7-how-do-you-choose-the-right-database">Q7: How Do You Choose the Right Database?</h2>
<p><strong>Answer:</strong></p>
<p>Database selection is one of the most impactful decisions in system design. The choice depends on data structure, access patterns, consistency requirements, and scale.</p>
<div class="cell" data-layout-align="default">
<div class="cell-output-display">
<div>
<p></p><figure class="figure"><p></p>
<div>
<pre class="mermaid mermaid-js">graph TD
    CHOOSE["Choose Your Database"]
    CHOOSE --&gt; REL["Relational (SQL)"]
    CHOOSE --&gt; DOC["Document Store"]
    CHOOSE --&gt; KV["Key-Value Store"]
    CHOOSE --&gt; COL["Wide-Column Store"]
    CHOOSE --&gt; GRAPH["Graph Database"]
    CHOOSE --&gt; TS["Time-Series DB"]
    CHOOSE --&gt; SEARCH["Search Engine"]

    REL --&gt; REL_EX["PostgreSQL, MySQL&lt;br/&gt;ACID, complex queries, JOINs"]
    DOC --&gt; DOC_EX["MongoDB, CouchDB&lt;br/&gt;Flexible schema, nested data"]
    KV --&gt; KV_EX["Redis, DynamoDB&lt;br/&gt;Cache, session, simple lookups"]
    COL --&gt; COL_EX["Cassandra, HBase&lt;br/&gt;Write-heavy, time-series-like"]
    GRAPH --&gt; GRAPH_EX["Neo4j, Neptune&lt;br/&gt;Relationships, social networks"]
    TS --&gt; TS_EX["InfluxDB, TimescaleDB&lt;br/&gt;Metrics, IoT, monitoring"]
    SEARCH --&gt; SEARCH_EX["Elasticsearch&lt;br/&gt;Full-text search, analytics"]

    style CHOOSE fill:#56cc9d,stroke:#333,color:#fff
    style REL fill:#6cc3d5,stroke:#333,color:#fff
    style DOC fill:#ffce67,stroke:#333
</pre>
</div>
<p></p></figure><p></p>
</div>
</div>
</div>
<section id="database-selection-guide" class="level3">
<h3 class="anchored" data-anchor-id="database-selection-guide">Database Selection Guide</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 43%">
<col style="width: 40%">
<col style="width: 16%">
</colgroup>
<thead>
<tr class="header">
<th>Requirement</th>
<th>Best Choice</th>
<th>Why</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td>Complex relationships, ACID transactions</td>
<td><strong>PostgreSQL / MySQL</strong></td>
<td>Strong consistency, JOINs, mature tooling</td>
</tr>
<tr class="even">
<td>Flexible schema, nested documents</td>
<td><strong>MongoDB</strong></td>
<td>Schema-less, easy horizontal scaling</td>
</tr>
<tr class="odd">
<td>Ultra-fast key-value lookups, caching</td>
<td><strong>Redis</strong></td>
<td>In-memory, sub-millisecond latency</td>
</tr>
<tr class="even">
<td>Massive write throughput, append-only</td>
<td><strong>Cassandra</strong></td>
<td>Distributed, tunable consistency, linear scaling</td>
</tr>
<tr class="odd">
<td>Social graph, recommendations</td>
<td><strong>Neo4j</strong></td>
<td>Optimized for traversing relationships</td>
</tr>
<tr class="even">
<td>Full-text search, log analytics</td>
<td><strong>Elasticsearch</strong></td>
<td>Inverted index, near real-time search</td>
</tr>
<tr class="odd">
<td>Time-series data (metrics, IoT)</td>
<td><strong>TimescaleDB / InfluxDB</strong></td>
<td>Optimized for time-bucketed queries</td>
</tr>
<tr class="even">
<td>Globally distributed, strong consistency</td>
<td><strong>CockroachDB / Spanner</strong></td>
<td>Distributed SQL, serializable isolation</td>
</tr>
</tbody>
</table>
</section>
<section id="sql-vs-nosql-trade-offs" class="level3">
<h3 class="anchored" data-anchor-id="sql-vs-nosql-trade-offs">SQL vs NoSQL Trade-offs</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 25%">
<col style="width: 53%">
<col style="width: 21%">
</colgroup>
<thead>
<tr class="header">
<th>Aspect</th>
<th>SQL (Relational)</th>
<th>NoSQL</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Schema</strong></td>
<td>Fixed schema, migrations required</td>
<td>Flexible / schema-less</td>
</tr>
<tr class="even">
<td><strong>Consistency</strong></td>
<td>ACID (strong by default)</td>
<td>BASE (eventual, tunable)</td>
</tr>
<tr class="odd">
<td><strong>Scaling</strong></td>
<td>Vertical (primarily), read replicas</td>
<td>Horizontal (built-in sharding)</td>
</tr>
<tr class="even">
<td><strong>Queries</strong></td>
<td>Complex JOINs, aggregations, SQL</td>
<td>Simple lookups, limited JOINs</td>
</tr>
<tr class="odd">
<td><strong>Transactions</strong></td>
<td>Multi-table transactions native</td>
<td>Limited (single-partition or Saga pattern)</td>
</tr>
<tr class="even">
<td><strong>Best for</strong></td>
<td>Financial, e-commerce, complex relationships</td>
<td>High-scale, simple access patterns, flexible data</td>
</tr>
</tbody>
</table>
</section>
<section id="database-scaling-strategies" class="level3">
<h3 class="anchored" data-anchor-id="database-scaling-strategies">Database Scaling Strategies</h3>
<div class="cell" data-layout-align="default">
<div class="cell-output-display">
<div>
<p></p><figure class="figure"><p></p>
<div>
<pre class="mermaid mermaid-js">graph TD
    DBSCALE["Database Scaling"]
    DBSCALE --&gt; RR["Read Replicas&lt;br/&gt;(scale reads)"]
    DBSCALE --&gt; SHARD["Sharding&lt;br/&gt;(scale writes + storage)"]
    DBSCALE --&gt; PART["Partitioning&lt;br/&gt;(split tables)"]
    DBSCALE --&gt; POOL["Connection Pooling&lt;br/&gt;(scale connections)"]

    RR --&gt; RR_D["Primary handles writes&lt;br/&gt;Replicas handle reads&lt;br/&gt;Async replication"]
    SHARD --&gt; SHARD_D["Split data by key&lt;br/&gt;(user_id % N shards)&lt;br/&gt;Each shard is a full DB"]

    style DBSCALE fill:#56cc9d,stroke:#333,color:#fff
</pre>
</div>
<p></p></figure><p></p>
</div>
</div>
</div>
</section>
<section id="sharding-strategies" class="level3">
<h3 class="anchored" data-anchor-id="sharding-strategies">Sharding Strategies</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 28%">
<col style="width: 37%">
<col style="width: 17%">
<col style="width: 17%">
</colgroup>
<thead>
<tr class="header">
<th>Strategy</th>
<th>How It Works</th>
<th>Pros</th>
<th>Cons</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Hash-based</strong></td>
<td><code>shard = hash(key) % N</code></td>
<td>Even distribution</td>
<td>Adding shards requires reshuffling</td>
</tr>
<tr class="even">
<td><strong>Range-based</strong></td>
<td><code>shard 1: A-M, shard 2: N-Z</code></td>
<td>Range queries efficient</td>
<td>Hotspots if data is skewed</td>
</tr>
<tr class="odd">
<td><strong>Directory-based</strong></td>
<td>Lookup table maps key → shard</td>
<td>Flexible, no reshuffling</td>
<td>Lookup table is single point of failure</td>
</tr>
<tr class="even">
<td><strong>Consistent hashing</strong></td>
<td>Hash ring, minimal key movement on changes</td>
<td>Add/remove nodes easily</td>
<td>Slightly uneven with few nodes (use virtual nodes)</td>
</tr>
</tbody>
</table>
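<p>To make the last row concrete, here is a small consistent-hash ring with virtual nodes. It is a sketch under stated assumptions: the <code>HashRing</code> class, the shard names, and the choice of SHA-256 for ring positions are illustrative, not taken from any particular database.</p>

```python
import bisect
import hashlib


def ring_position(value):
    # Stable 64-bit position (Python's built-in hash() is salted per process).
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes, vnodes=100):
        self.vnodes = vnodes
        self._positions = []  # sorted ring positions
        self._owner = {}      # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node):
        # Each physical node is placed on the ring many times (virtual nodes).
        for i in range(self.vnodes):
            pos = ring_position(f"{node}#{i}")
            self._owner[pos] = node
            bisect.insort(self._positions, pos)

    def lookup(self, key):
        # First virtual node clockwise from the key, wrapping around the ring.
        idx = bisect.bisect_right(self._positions, ring_position(key))
        pos = self._positions[idx % len(self._positions)]
        return self._owner[pos]


keys = [f"user:{i}" for i in range(10_000)]
ring = HashRing(["shard-a", "shard-b", "shard-c"])
before = {k: ring.lookup(k) for k in keys}
ring.add("shard-d")
after = {k: ring.lookup(k) for k in keys}

moved = sum(1 for k in keys if before[k] != after[k])
# For comparison: keys that change shard under plain hash(key) % N when N goes 3 -> 4.
modulo_moved = sum(1 for k in keys if ring_position(k) % 3 != ring_position(k) % 4)
```

<p>Every key that moves lands on the new shard, and far fewer keys move than with <code>hash(key) % N</code>, where changing <code>N</code> remaps most of the keyspace.</p>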
<hr>
</section>
</section>
<section id="q8-how-does-networking-work-in-distributed-systems" class="level2">
<h2 class="anchored" data-anchor-id="q8-how-does-networking-work-in-distributed-systems">Q8: How Does Networking Work in Distributed Systems?</h2>
<p><strong>Answer:</strong></p>
<p>Understanding networking fundamentals is essential for system design — from how a request reaches your server to how services communicate internally.</p>
<section id="how-a-web-request-works-end-to-end" class="level3">
<h3 class="anchored" data-anchor-id="how-a-web-request-works-end-to-end">How a Web Request Works (End-to-End)</h3>
<div class="cell" data-layout-align="default">
<div class="cell-output-display">
<div>
<p></p><figure class="figure"><p></p>
<div>
<pre class="mermaid mermaid-js">graph LR
    BROWSER["Browser"]
    BROWSER --&gt;|"1. DNS Lookup"| DNS["DNS Server&lt;br/&gt;→ IP address"]
    DNS --&gt;|"2. TCP Handshake"| LB["Load Balancer"]
    LB --&gt;|"3. TLS Handshake&lt;br/&gt;(HTTPS)"| APP["App Server"]
    APP --&gt;|"4. Process Request"| DB["Database"]
    DB --&gt;|"5. Response"| APP
    APP --&gt;|"6. HTTP Response"| BROWSER

    style BROWSER fill:#6cc3d5,stroke:#333,color:#fff
    style LB fill:#56cc9d,stroke:#333,color:#fff
    style APP fill:#ffce67,stroke:#333
</pre>
</div>
<p></p></figure><p></p>
</div>
</div>
</div>
<pre><code>Step-by-step breakdown:
  1. DNS resolution: example.com → 93.184.216.34  (~50ms first time, cached after)
  2. TCP handshake: SYN → SYN-ACK → ACK           (~1 RTT = 0.5ms same DC, 150ms cross-region)
  3. TLS handshake: Certificate exchange, key setup (~1-2 RTT additional for HTTPS)
  4. HTTP request: GET /api/users                   (headers + body)
  5. Server processes, queries DB, builds response
  6. HTTP response: 200 OK + JSON payload
  7. Browser renders response</code></pre>
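<p>The round trips above add up quickly on a cold connection. The following back-of-the-envelope sketch uses the RTT and DNS figures from the breakdown; the 20 ms server processing time and the function names are assumptions added for illustration.</p>

```python
def cold_request_ms(rtt_ms, dns_ms=50.0, tls_round_trips=1, server_ms=20.0):
    """First request on a new connection: DNS + TCP + TLS + the HTTP exchange."""
    tcp_ms = rtt_ms                    # SYN, SYN-ACK; the final ACK rides with the next flight
    tls_ms = tls_round_trips * rtt_ms  # 1 round trip for TLS 1.3, 2 for TLS 1.2
    http_ms = rtt_ms + server_ms       # request out, response back, plus processing
    return dns_ms + tcp_ms + tls_ms + http_ms


def warm_request_ms(rtt_ms, server_ms=20.0):
    """Later request on a reused connection with DNS already cached."""
    return rtt_ms + server_ms


same_dc_cold = cold_request_ms(rtt_ms=0.5)
cross_region_cold = cold_request_ms(rtt_ms=150)
cross_region_warm = warm_request_ms(rtt_ms=150)

print(f"same DC, cold:      {same_dc_cold} ms")
print(f"cross-region, cold: {cross_region_cold} ms")
print(f"cross-region, warm: {cross_region_warm} ms")
```

<p>The gap between the cold and warm cross-region numbers is the reason connection reuse (keep-alive, connection pools) and terminating TLS close to the user matter so much.</p>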
</section>
<section id="communication-protocols" class="level3">
<h3 class="anchored" data-anchor-id="communication-protocols">Communication Protocols</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 25%">
<col style="width: 17%">
<col style="width: 25%">
<col style="width: 32%">
</colgroup>
<thead>
<tr class="header">
<th>Protocol</th>
<th>Layer</th>
<th>Use Case</th>
<th>Key Property</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>TCP</strong></td>
<td>Transport</td>
<td>Most web traffic, databases</td>
<td>Reliable, ordered delivery</td>
</tr>
<tr class="even">
<td><strong>UDP</strong></td>
<td>Transport</td>
<td>Video streaming, gaming, DNS</td>
<td>Fast, no handshake, unreliable</td>
</tr>
<tr class="odd">
<td><strong>HTTP/1.1</strong></td>
<td>Application</td>
<td>Traditional web APIs</td>
<td>Text-based, persistent connections (keep-alive), one in-flight request per connection</td>
</tr>
<tr class="even">
<td><strong>HTTP/2</strong></td>
<td>Application</td>
<td>Modern web APIs</td>
<td>Multiplexing, header compression, binary</td>
</tr>
<tr class="odd">
<td><strong>HTTP/3 (QUIC)</strong></td>
<td>Application</td>
<td>Next-gen web</td>
<td>Runs over QUIC (UDP), 1-RTT handshake with 0-RTT resumption, no TCP head-of-line blocking</td>
</tr>
<tr class="even">
<td><strong>WebSocket</strong></td>
<td>Application</td>
<td>Real-time communication</td>
<td>Full-duplex, persistent connection</td>
</tr>
<tr class="odd">
<td><strong>gRPC</strong></td>
<td>Application</td>
<td>Microservice calls</td>
<td>HTTP/2 + Protobuf, streaming support</td>
</tr>
</tbody>
</table>
</section>
<section id="real-time-communication-patterns" class="level3">
<h3 class="anchored" data-anchor-id="real-time-communication-patterns">Real-Time Communication Patterns</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 16%">
<col style="width: 24%">
<col style="width: 16%">
<col style="width: 24%">
<col style="width: 18%">
</colgroup>
<thead>
<tr class="header">
<th>Pattern</th>
<th>How It Works</th>
<th>Latency</th>
<th>Server Load</th>
<th>Best For</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Short polling</strong></td>
<td>Client sends HTTP request every N seconds</td>
<td>High (N sec delay)</td>
<td>High (many requests)</td>
<td>Simple status checks</td>
</tr>
<tr class="even">
<td><strong>Long polling</strong></td>
<td>Client sends request, server holds until data available</td>
<td>Medium</td>
<td>Medium</td>
<td>Notifications, chat fallback</td>
</tr>
<tr class="odd">
<td><strong>Server-Sent Events (SSE)</strong></td>
<td>Server pushes events over single HTTP connection</td>
<td>Low</td>
<td>Low</td>
<td>Live feeds, dashboards</td>
</tr>
<tr class="even">
<td><strong>WebSocket</strong></td>
<td>Full-duplex persistent TCP connection</td>
<td>Very low</td>
<td>Low</td>
<td>Chat, gaming, real-time collaboration</td>
</tr>
</tbody>
</table>
</section>
<section id="dns-and-load-balancing-at-network-level" class="level3">
<h3 class="anchored" data-anchor-id="dns-and-load-balancing-at-network-level">DNS and Load Balancing at Network Level</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 25%">
<col style="width: 40%">
<col style="width: 33%">
</colgroup>
<thead>
<tr class="header">
<th>Level</th>
<th>Technology</th>
<th>Purpose</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>DNS-level</strong></td>
<td>Route53, Cloudflare</td>
<td>Geographic routing, failover between data centers</td>
</tr>
<tr class="even">
<td><strong>L4 (Transport)</strong></td>
<td>NLB, HAProxy (TCP mode)</td>
<td>Route based on IP/port, very fast, no content inspection</td>
</tr>
<tr class="odd">
<td><strong>L7 (Application)</strong></td>
<td>ALB, Nginx, Envoy</td>
<td>Route based on URL path, headers, content; SSL termination</td>
</tr>
</tbody>
</table>
<hr>
</section>
</section>
<section id="q9-how-do-you-design-for-security-in-distributed-systems" class="level2">
<h2 class="anchored" data-anchor-id="q9-how-do-you-design-for-security-in-distributed-systems">Q9: How Do You Design for Security in Distributed Systems?</h2>
<p><strong>Answer:</strong></p>
<p>Security must be designed into every layer of a system — from network perimeter to data at rest. In system design interviews, demonstrating security awareness distinguishes senior candidates.</p>
<div class="cell" data-layout-align="default">
<div class="cell-output-display">
<div>
<p></p><figure class="figure"><p></p>
<div>
<pre class="mermaid mermaid-js">graph TD
    subgraph Perimeter["Perimeter Security"]
        FW["Firewall / WAF"]
        DDOS["DDoS Protection&lt;br/&gt;(Cloudflare, Shield)"]
    end

    subgraph Network["Network Security"]
        TLS["TLS / HTTPS&lt;br/&gt;(encryption in transit)"]
        VPC["VPC / Private Subnets"]
        SG["Security Groups"]
    end

    subgraph Application["Application Security"]
        AUTH["Authentication&lt;br/&gt;(OAuth 2.0, JWT)"]
        AUTHZ["Authorization&lt;br/&gt;(RBAC, ABAC)"]
        VALID["Input Validation&lt;br/&gt;(prevent injection)"]
        RL["Rate Limiting"]
    end

    subgraph Data["Data Security"]
        ENC["Encryption at Rest&lt;br/&gt;(AES-256)"]
        HASH["Password Hashing&lt;br/&gt;(bcrypt, argon2)"]
        MASK["Data Masking&lt;br/&gt;(PII protection)"]
    end

    Perimeter --&gt; Network --&gt; Application --&gt; Data

    style Perimeter fill:#ff7851,stroke:#333,color:#fff
    style Network fill:#ffce67,stroke:#333
    style Application fill:#56cc9d,stroke:#333,color:#fff
    style Data fill:#6cc3d5,stroke:#333,color:#fff
</pre>
</div>
<p></p></figure><p></p>
</div>
</div>
</div>
<section id="authentication-vs-authorization" class="level3">
<h3 class="anchored" data-anchor-id="authentication-vs-authorization">Authentication vs Authorization</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 23%">
<col style="width: 25%">
<col style="width: 28%">
<col style="width: 23%">
</colgroup>
<thead>
<tr class="header">
<th>Concept</th>
<th>Question</th>
<th>Mechanism</th>
<th>Example</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Authentication (AuthN)</strong></td>
<td>“Who are you?”</td>
<td>Username/password, OAuth, SSO, MFA</td>
<td>Login with Google</td>
</tr>
<tr class="even">
<td><strong>Authorization (AuthZ)</strong></td>
<td>“What can you do?”</td>
<td>RBAC, ABAC, ACL, policy engines</td>
<td>Admin can delete users, viewer cannot</td>
</tr>
</tbody>
</table>
</section>
<section id="token-based-authentication-flow-oauth-2.0-jwt" class="level3">
<h3 class="anchored" data-anchor-id="token-based-authentication-flow-oauth-2.0-jwt">Token-Based Authentication Flow (OAuth 2.0 + JWT)</h3>
<pre><code>1. User logs in → Auth Server validates credentials
2. Auth Server issues:
   - Access token (JWT, short-lived: 15-60 min)
   - Refresh token (opaque, long-lived: 7-30 days)
3. Client sends Access token in header: Authorization: Bearer &lt;token&gt;
4. API Gateway / Service validates JWT:
   - Verify signature (no DB call needed)
   - Check expiration
   - Extract user ID, roles from claims
5. Token expired → client uses Refresh token to get new Access token
6. Refresh token expired → user must log in again</code></pre>
</section>
<section id="jwt-structure" class="level3">
<h3 class="anchored" data-anchor-id="jwt-structure">JWT Structure</h3>
<pre><code>Header.Payload.Signature

Header:  {"alg": "RS256", "typ": "JWT"}
Payload: {"sub": "user123", "role": "admin", "exp": 1716300000, "iat": 1716296400}
Signature: RSASHA256(base64url(header) + "." + base64url(payload), private_key)

Key design decisions:
  - Use RS256 (asymmetric) for microservices (public key verification, no shared secret)
  - Keep payload small (don't put entire user profile)
  - Set short expiration (15 min) + use refresh tokens
  - Never store sensitive data in JWT (it's base64, not encrypted)</code></pre>
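<p>The three-part structure and the stateless validation steps can be sketched with the standard library alone. Note the assumption: this example signs with <strong>HS256</strong> (a shared secret) because Python's standard library has no RSA support; an RS256 setup, as recommended above for microservices, would normally use a maintained JWT library instead. The helper names and the secret are hypothetical.</p>

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    # JWTs use URL-safe base64 without padding.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign(payload: dict, secret: bytes) -> str:
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = b64url(json.dumps(header).encode()) + "." + b64url(json.dumps(payload).encode())
    signature = hmac.new(secret, signing_input.encode(), hashlib.sha256).digest()
    return signing_input + "." + b64url(signature)


def verify(token: str, secret: bytes, now: float) -> dict:
    header_b64, payload_b64, signature_b64 = token.split(".")
    header = json.loads(b64url_decode(header_b64))
    if header.get("alg") != "HS256":  # never let the token choose the algorithm
        raise ValueError("unexpected algorithm")
    expected = hmac.new(secret, f"{header_b64}.{payload_b64}".encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(b64url_decode(signature_b64), expected):
        raise ValueError("bad signature")
    claims = json.loads(b64url_decode(payload_b64))
    if claims["exp"] <= now:
        raise ValueError("token expired")
    return claims


secret = b"demo-secret"  # hypothetical; real secrets belong in a vault
issued_at = 1716296400
token = sign({"sub": "user123", "role": "admin", "iat": issued_at, "exp": issued_at + 900}, secret)
claims = verify(token, secret, now=issued_at + 60)
```

<p>Verification needs only the key and the clock, with no database lookup, which is why a gateway can validate tokens on every request cheaply.</p>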
</section>
<section id="common-security-threats-and-mitigations" class="level3">
<h3 class="anchored" data-anchor-id="common-security-threats-and-mitigations">Common Security Threats and Mitigations</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 24%">
<col style="width: 39%">
<col style="width: 36%">
</colgroup>
<thead>
<tr class="header">
<th>Threat</th>
<th>Description</th>
<th>Mitigation</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>SQL injection</strong></td>
<td>Malicious SQL in user input</td>
<td>Parameterized queries, ORM</td>
</tr>
<tr class="even">
<td><strong>XSS</strong></td>
<td>Injecting scripts into web pages</td>
<td>Context-aware output encoding, input sanitization, CSP headers</td>
</tr>
<tr class="odd">
<td><strong>CSRF</strong></td>
<td>Forged requests from authenticated browser</td>
<td>CSRF tokens, SameSite cookies</td>
</tr>
<tr class="even">
<td><strong>DDoS</strong></td>
<td>Overwhelming system with traffic</td>
<td>Rate limiting, WAF, CDN, auto-scaling</td>
</tr>
<tr class="odd">
<td><strong>Man-in-the-middle</strong></td>
<td>Intercepting network traffic</td>
<td>TLS everywhere, certificate pinning</td>
</tr>
<tr class="even">
<td><strong>Broken authentication</strong></td>
<td>Weak passwords, no MFA</td>
<td>bcrypt/argon2 hashing, MFA, account lockout</td>
</tr>
<tr class="odd">
<td><strong>Data breach</strong></td>
<td>Unauthorized data access</td>
<td>Encryption at rest, principle of least privilege</td>
</tr>
<tr class="even">
<td><strong>API abuse</strong></td>
<td>Scraping, brute force</td>
<td>Rate limiting, API keys, OAuth scopes</td>
</tr>
</tbody>
</table>
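<p>As an illustration of the first row, the sketch below runs the same lookup twice against an in-memory SQLite database: once by splicing user input into the SQL string and once with a parameterized query. The table and the input string are made up for the example.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, role TEXT)")
conn.executemany(
    "INSERT INTO users (name, role) VALUES (?, ?)",
    [("alice", "admin"), ("bob", "viewer")],
)

malicious = "nobody' OR '1'='1"

# Vulnerable: the input becomes part of the SQL text and rewrites the WHERE clause.
unsafe_sql = f"SELECT name FROM users WHERE name = '{malicious}'"
leaked = conn.execute(unsafe_sql).fetchall()

# Safe: the value is bound as a parameter, so it is only ever compared as data.
safe = conn.execute("SELECT name FROM users WHERE name = ?", (malicious,)).fetchall()

print("string concatenation returned", len(leaked), "rows")
print("parameterized query returned", len(safe), "rows")
```

<p>The concatenated query returns every user, while the parameterized one returns nothing because no user has that literal name.</p>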
</section>
<section id="security-checklist-for-system-design" class="level3">
<h3 class="anchored" data-anchor-id="security-checklist-for-system-design">Security Checklist for System Design</h3>
<pre><code>✅ HTTPS/TLS for all communication (internal and external)
✅ Authentication at the API gateway layer
✅ Authorization checks at the service level
✅ Input validation and sanitization at system boundaries
✅ Rate limiting per client/IP/API key
✅ Encryption at rest for sensitive data (AES-256)
✅ Password hashing with bcrypt or argon2 (never plain text or MD5)
✅ Secrets in vault (HashiCorp Vault, AWS Secrets Manager) — not in code
✅ Audit logging for security-relevant events
✅ Principle of least privilege for service accounts
✅ Network segmentation (private subnets for DBs, no public access)</code></pre>
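<p>For the rate-limiting item, one common approach is a token bucket. The sketch below is a single-process illustration with assumed limits (5 requests per second, bursts of 10) and an injectable clock so the behavior is deterministic; a distributed deployment would typically keep one bucket per client in a shared store such as Redis.</p>

```python
import time


class TokenBucket:
    def __init__(self, rate_per_sec, capacity, clock=time.monotonic):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)  # start full so short bursts are allowed
        self.updated = clock()

    def allow(self):
        # Refill in proportion to elapsed time, capped at the bucket capacity.
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # the caller would answer with HTTP 429


fake_now = [0.0]  # controllable clock for the demonstration
bucket = TokenBucket(rate_per_sec=5, capacity=10, clock=lambda: fake_now[0])

burst = [bucket.allow() for _ in range(12)]      # 12 requests at the same instant
fake_now[0] += 1.0                               # one second passes
after_wait = [bucket.allow() for _ in range(6)]  # 5 tokens have been refilled
```

<p>The burst is capped at the bucket capacity, and after one second exactly five more requests pass before the limiter rejects again.</p>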
<hr>
</section>
</section>
<section id="q10-what-is-back-of-the-envelope-estimation-and-how-do-you-do-it" class="level2">
<h2 class="anchored" data-anchor-id="q10-what-is-back-of-the-envelope-estimation-and-how-do-you-do-it">Q10: What Is Back-of-the-Envelope Estimation and How Do You Do It?</h2>
<p><strong>Answer:</strong></p>
<p>Back-of-the-envelope estimation is a quick calculation technique to estimate system capacity and requirements. Interviewers use it to test whether you can reason about scale and make informed design decisions.</p>
<section id="power-of-2-reference" class="level3">
<h3 class="anchored" data-anchor-id="power-of-2-reference">Power of 2 Reference</h3>
<table class="caption-top table">
<thead>
<tr class="header">
<th>Power</th>
<th>Exact Value</th>
<th>Approximate</th>
<th>Name</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td>2^10</td>
<td>1,024</td>
<td>~1 Thousand</td>
<td>1 KB</td>
</tr>
<tr class="even">
<td>2^20</td>
<td>1,048,576</td>
<td>~1 Million</td>
<td>1 MB</td>
</tr>
<tr class="odd">
<td>2^30</td>
<td>1,073,741,824</td>
<td>~1 Billion</td>
<td>1 GB</td>
</tr>
<tr class="even">
<td>2^40</td>
<td>~1.1 × 10^12</td>
<td>~1 Trillion</td>
<td>1 TB</td>
</tr>
<tr class="odd">
<td>2^50</td>
<td>~1.1 × 10^15</td>
<td>~1 Quadrillion</td>
<td>1 PB</td>
</tr>
</tbody>
</table>
</section>
<section id="common-data-sizes" class="level3">
<h3 class="anchored" data-anchor-id="common-data-sizes">Common Data Sizes</h3>
<table class="caption-top table">
<thead>
<tr class="header">
<th>Data Type</th>
<th>Typical Size</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td>Character (ASCII)</td>
<td>1 byte</td>
</tr>
<tr class="even">
<td>Character (UTF-8)</td>
<td>1-4 bytes</td>
</tr>
<tr class="odd">
<td>Integer</td>
<td>4-8 bytes</td>
</tr>
<tr class="even">
<td>UUID</td>
<td>16 bytes</td>
</tr>
<tr class="odd">
<td>Timestamp</td>
<td>8 bytes</td>
</tr>
<tr class="even">
<td>Short string (name)</td>
<td>~50 bytes</td>
</tr>
<tr class="odd">
<td>URL</td>
<td>~100 bytes</td>
</tr>
<tr class="even">
<td>Tweet / SMS</td>
<td>~200 bytes</td>
</tr>
<tr class="odd">
<td>JSON API response</td>
<td>~1-10 KB</td>
</tr>
<tr class="even">
<td>Compressed image thumbnail</td>
<td>~10-50 KB</td>
</tr>
<tr class="odd">
<td>Photo (high quality)</td>
<td>~2-5 MB</td>
</tr>
<tr class="even">
<td>Short video (1 min)</td>
<td>~50-100 MB</td>
</tr>
<tr class="odd">
<td>Database row (typical)</td>
<td>~500 bytes - 2 KB</td>
</tr>
</tbody>
</table>
</section>
<section id="qps-queries-per-second-estimation" class="level3">
<h3 class="anchored" data-anchor-id="qps-queries-per-second-estimation">QPS (Queries Per Second) Estimation</h3>
<pre><code>Formula: QPS = DAU × queries_per_user / seconds_per_day

Example: Twitter
  - 500M DAU
  - Each user views feed 5 times/day, each feed = 10 API calls
  - Total queries/day = 500M × 50 = 25B
  - QPS = 25B / 86,400 ≈ 290,000 QPS
  - Peak QPS ≈ 2 × average ≈ 580,000 QPS

Quick shortcut:
  - Seconds in a day ≈ 100,000 (actual: 86,400)
  - 1M requests/day ≈ 10 QPS
  - 100M requests/day ≈ 1,000 QPS
  - 1B requests/day ≈ 10,000 QPS</code></pre>
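<p>The same arithmetic as a small script, which is handy for checking how far the "100,000 seconds per day" shortcut drifts from the exact figure. The function name is illustrative; the inputs are the ones from the example above.</p>

```python
SECONDS_PER_DAY = 86_400


def average_qps(dau, queries_per_user_per_day):
    return dau * queries_per_user_per_day / SECONDS_PER_DAY


avg = average_qps(dau=500_000_000, queries_per_user_per_day=5 * 10)
peak = 2 * avg                         # peak factor of 2, as assumed in the example
shortcut = 500_000_000 * 50 / 100_000  # rounding a day up to 100,000 seconds

print(f"average  {avg:,.0f} QPS")
print(f"peak     {peak:,.0f} QPS")
print(f"shortcut {shortcut:,.0f} QPS")
```

<p>The shortcut underestimates by roughly 14%, which is well within the precision an interview estimate needs.</p>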
</section>
<section id="storage-estimation" class="level3">
<h3 class="anchored" data-anchor-id="storage-estimation">Storage Estimation</h3>
<pre><code>Formula: Storage = records_per_day × record_size × retention_period

Example: Chat application
  - 500M DAU, 100 messages/user/day
  - Message size: ~100 bytes (text) + ~100 bytes (metadata) = 200 bytes
  - Daily: 500M × 100 × 200 bytes = 10TB/day
  - Yearly: 10TB × 365 = 3.65 PB/year
  - 5 years with replication (3x): ~55 PB total</code></pre>
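<p>A short script reproducing the chat example, using decimal units (1 TB = 10^12 bytes) as the estimate above does. The function name and parameters are illustrative.</p>

```python
TB = 10**12
PB = 10**15


def storage_bytes(dau, messages_per_user, bytes_per_message, days, replication=1):
    return dau * messages_per_user * bytes_per_message * days * replication


daily = storage_bytes(500_000_000, 100, 200, days=1)
yearly = storage_bytes(500_000_000, 100, 200, days=365)
five_years_replicated = storage_bytes(500_000_000, 100, 200, days=5 * 365, replication=3)

print(f"daily:                {daily / TB:.2f} TB")
print(f"yearly:               {yearly / PB:.2f} PB")
print(f"5 years, 3x replicas: {five_years_replicated / PB:.2f} PB")
```

<p>Keeping replication and retention as explicit parameters makes it easy to show an interviewer how the total changes when either assumption is relaxed.</p>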
</section>
<section id="bandwidth-estimation" class="level3">
<h3 class="anchored" data-anchor-id="bandwidth-estimation">Bandwidth Estimation</h3>
<pre><code>Formula: Bandwidth = QPS × avg_response_size

Example: Image serving
  - 100K QPS, average image = 200KB
  - Bandwidth = 100,000 × 200KB = 20GB/s = 160 Gbps
  - With CDN absorbing 90%: origin bandwidth ≈ 16 Gbps</code></pre>
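<p>The byte-to-bit conversion is the step most often missed in this calculation, so here it is spelled out. The sketch treats 200 KB as 200,000 bytes; the function name is illustrative.</p>

```python
def bandwidth_gbps(qps, avg_response_bytes):
    # bytes per second -> bits per second -> gigabits per second
    return qps * avg_response_bytes * 8 / 1e9


total_gbps = bandwidth_gbps(qps=100_000, avg_response_bytes=200_000)
cdn_hit_ratio = 0.90
origin_gbps = total_gbps * (1 - cdn_hit_ratio)

print(f"total:  {total_gbps:.0f} Gbps")
print(f"origin: {origin_gbps:.0f} Gbps")
```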
</section>
<section id="server-estimation" class="level3">
<h3 class="anchored" data-anchor-id="server-estimation">Server Estimation</h3>
<pre><code>Rule of thumb:
  - 1 web server handles ~1,000-10,000 QPS (depends on complexity)
  - 1 DB server handles ~1,000-5,000 QPS (depends on query complexity)
  - 1 cache server (Redis): ~100,000-500,000 QPS

Example: 500K QPS API
  - App servers: 500K / 5,000 = 100 servers (with headroom: 150)
  - DB (with read replicas): 1 primary + 10 read replicas
  - Cache: 500K / 200K = 3 Redis nodes (with replication: 6)</code></pre>
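<p>The server counts above follow from one division plus rounding up. A minimal sketch, assuming the per-server throughput figures from the rule of thumb and a 1.5x headroom factor for the app tier:</p>

```python
import math


def servers_needed(target_qps, qps_per_server, headroom=1.0):
    # Round up: a fractional server still means provisioning a whole one.
    return math.ceil(target_qps / qps_per_server * headroom)


app = servers_needed(500_000, 5_000)
app_with_headroom = servers_needed(500_000, 5_000, headroom=1.5)
cache = servers_needed(500_000, 200_000)
cache_with_replicas = cache * 2  # one replica per cache node

print(app, app_with_headroom, cache, cache_with_replicas)
```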
<hr>
</section>
</section>
<section id="summary-table" class="level2">
<h2 class="anchored" data-anchor-id="summary-table">Summary Table</h2>
<table class="caption-top table">
<colgroup>
<col style="width: 13%">
<col style="width: 30%">
<col style="width: 56%">
</colgroup>
<thead>
<tr class="header">
<th>#</th>
<th>Topic</th>
<th>Key Concepts</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td>1</td>
<td><strong>Scalability</strong></td>
<td>Vertical vs horizontal scaling, stateless design, load balancing, caching, sharding</td>
</tr>
<tr class="even">
<td>2</td>
<td><strong>Reliability</strong></td>
<td>Replication, failover, circuit breaker, bulkhead, graceful degradation, 99.99% availability</td>
</tr>
<tr class="odd">
<td>3</td>
<td><strong>Performance</strong></td>
<td>Caching strategies, latency numbers, CDN, indexing, read replicas, denormalization</td>
</tr>
<tr class="even">
<td>4</td>
<td><strong>Distributed Systems</strong></td>
<td>CAP theorem, consistency models, consensus (Raft/Paxos), gossip protocol</td>
</tr>
<tr class="odd">
<td>5</td>
<td><strong>Infrastructure</strong></td>
<td>DNS → CDN → LB → API Gateway → Services → DB; monolith vs microservices</td>
</tr>
<tr class="even">
<td>6</td>
<td><strong>APIs</strong></td>
<td>REST vs GraphQL vs gRPC, HTTP status codes, pagination, idempotency, versioning</td>
</tr>
<tr class="odd">
<td>7</td>
<td><strong>Databases</strong></td>
<td>SQL vs NoSQL, sharding strategies, read replicas, choosing the right DB</td>
</tr>
<tr class="even">
<td>8</td>
<td><strong>Networking</strong></td>
<td>TCP/UDP, HTTP/2/3, WebSocket, SSE, DNS, L4 vs L7 load balancing</td>
</tr>
<tr class="odd">
<td>9</td>
<td><strong>Security</strong></td>
<td>AuthN/AuthZ, JWT/OAuth, TLS, encryption at rest, OWASP threats, zero trust</td>
</tr>
<tr class="even">
<td>10</td>
<td><strong>Estimation</strong></td>
<td>QPS, storage, bandwidth, server count, powers of 2, latency numbers</td>
</tr>
</tbody>
</table>
<hr>
</section>
<section id="whats-next" class="level2">
<h2 class="anchored" data-anchor-id="whats-next">What’s Next?</h2>
<p>This article covered foundational system design concepts. Continue with:</p>
<ul>
<li><strong>Infrastructure deep dives:</strong> <a href="../../posts/system-design/System-Design-Interview-QA-2.html">System Design Interview QA - 2</a> — load balancing, caching, message queues, Kubernetes, CI/CD, monitoring</li>
<li><strong>Hands-on design problems:</strong> <a href="../../posts/system-design/System-Design-Interview-QA-3.html">System Design Interview QA - 3</a> — URL shortener, chat system, news feed, video streaming, and more</li>
<li><strong>Design patterns:</strong> <a href="../../posts/design-pattern/Design-Pattern-Interview-QA-1.html">Design Pattern Interview QA - 1</a></li>
<li><strong>Enterprise patterns (Spring, CQRS):</strong> <a href="../../posts/design-pattern/Design-Pattern-Interview-QA-2.html">Design Pattern Interview QA - 2</a></li>
</ul>


</section>

 ]]></description>
  <guid>https://vectoringai.com/posts/system-design/System-Design-Interview-QA-1.html</guid>
  <pubDate>Thu, 21 May 2026 00:00:00 GMT</pubDate>
  <media:content url="https://vectoringai.com/images/system-design/thumb_system_design_interview_qa_300.png" medium="image" type="image/png" height="96" width="144"/>
</item>
<item>
  <title>System Design Interview QA - 2</title>
  <dc:creator>Vectoring AI</dc:creator>
  <link>https://vectoringai.com/posts/system-design/System-Design-Interview-QA-2.html</link>
  <description><![CDATA[ 




<section id="introduction" class="level2">
<h2 class="anchored" data-anchor-id="introduction">Introduction</h2>
<p>This is <strong>Part 2</strong> of our System Design Interview QA series, focusing on <strong>infrastructure components and operational systems</strong> that power production-grade architectures. While Part 1 covered foundational concepts (scalability, CAP theorem, etc.), this article dives deep into <strong>how specific infrastructure components work</strong> and how to design them.</p>
<blockquote class="blockquote">
<p>For foundational concepts (scalability, CAP theorem, APIs), see <a href="../../posts/system-design/System-Design-Interview-QA-1.html">System Design Interview QA - 1</a>. For hands-on design problems (URL shortener, chat system), see <a href="../../posts/system-design/System-Design-Interview-QA-3.html">System Design Interview QA - 3</a>.</p>
</blockquote>
<hr>
</section>
<section id="q1-how-does-load-balancing-work-and-how-do-you-design-a-load-balancer" class="level2">
<h2 class="anchored" data-anchor-id="q1-how-does-load-balancing-work-and-how-do-you-design-a-load-balancer">Q1: How Does Load Balancing Work and How Do You Design a Load Balancer?</h2>
<p><strong>Answer:</strong></p>
<p>A load balancer distributes incoming network traffic across multiple backend servers to ensure no single server is overwhelmed, improving availability, throughput, and fault tolerance.</p>
<div class="cell" data-layout-align="default">
<div class="cell-output-display">
<div>
<p></p><figure class="figure"><p></p>
<div>
<pre class="mermaid mermaid-js">graph TD
    CLIENTS["Clients"]
    CLIENTS --&gt; DNS["DNS (Round Robin)&lt;br/&gt;→ multiple LB IPs"]
    DNS --&gt; LB_A["Load Balancer (Active)"]
    DNS --&gt; LB_S["Load Balancer (Standby)&lt;br/&gt;heartbeat monitoring"]

    LB_A --&gt; S1["Server 1 ✅"]
    LB_A --&gt; S2["Server 2 ✅"]
    LB_A --&gt; S3["Server 3 ❌ (unhealthy)"]
    LB_A --&gt; S4["Server 4 ✅"]

    LB_A -.-&gt;|"Health check fails"| S3

    style LB_A fill:#56cc9d,stroke:#333,color:#fff
    style LB_S fill:#ffce67,stroke:#333
    style S3 fill:#ff7851,stroke:#333,color:#fff
</pre>
</div>
<p></p></figure><p></p>
</div>
</div>
</div>
<section id="layer-4-vs-layer-7-load-balancing" class="level3">
<h3 class="anchored" data-anchor-id="layer-4-vs-layer-7-load-balancing">Layer 4 vs Layer 7 Load Balancing</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 15%">
<col style="width: 39%">
<col style="width: 45%">
</colgroup>
<thead>
<tr class="header">
<th>Aspect</th>
<th>Layer 4 (Transport)</th>
<th>Layer 7 (Application)</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Operates on</strong></td>
<td>TCP/UDP packets (IP + port)</td>
<td>HTTP headers, URL path, cookies</td>
</tr>
<tr class="even">
<td><strong>Speed</strong></td>
<td>Very fast (no content inspection)</td>
<td>Slower (must parse HTTP)</td>
</tr>
<tr class="odd">
<td><strong>Routing decisions</strong></td>
<td>IP hash, round robin, least connections</td>
<td>URL path, headers, content type</td>
</tr>
<tr class="even">
<td><strong>SSL termination</strong></td>
<td>Passes through (or terminates)</td>
<td>Terminates SSL, inspects content</td>
</tr>
<tr class="odd">
<td><strong>Use case</strong></td>
<td>TCP services, databases, gaming</td>
<td>Web APIs, microservice routing</td>
</tr>
<tr class="even">
<td><strong>Examples</strong></td>
<td>AWS NLB, HAProxy (TCP mode)</td>
<td>AWS ALB, Nginx, Envoy</td>
</tr>
</tbody>
</table>
</section>
<section id="load-balancing-algorithms-deep-dive" class="level3">
<h3 class="anchored" data-anchor-id="load-balancing-algorithms-deep-dive">Load Balancing Algorithms Deep Dive</h3>
<div class="cell" data-layout-align="default">
<div class="cell-output-display">
<div>
<p></p><figure class="figure"><p></p>
<div>
<pre class="mermaid mermaid-js">graph TD
    subgraph RR["Round Robin"]
        RR1["Request 1 → Server A"]
        RR2["Request 2 → Server B"]
        RR3["Request 3 → Server C"]
        RR4["Request 4 → Server A"]
    end

    subgraph LC["Least Connections"]
        LC1["Server A: 5 active"]
        LC2["Server B: 2 active ← next request"]
        LC3["Server C: 8 active"]
    end

    subgraph WRR["Weighted Round Robin"]
        WRR1["Server A (weight 5): gets 5 of every 8"]
        WRR2["Server B (weight 2): gets 2 of every 8"]
        WRR3["Server C (weight 1): gets 1 of every 8"]
    end

    style RR fill:#56cc9d,stroke:#333,color:#fff
    style LC fill:#ffce67,stroke:#333
    style WRR fill:#6cc3d5,stroke:#333,color:#fff
</pre>
</div>
<p></p></figure><p></p>
</div>
</div>
</div>
<table class="caption-top table">
<colgroup>
<col style="width: 35%">
<col style="width: 32%">
<col style="width: 32%">
</colgroup>
<thead>
<tr class="header">
<th>Algorithm</th>
<th>Best For</th>
<th>Weakness</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Round Robin</strong></td>
<td>Equal-capacity servers, stateless services</td>
<td>Ignores server load</td>
</tr>
<tr class="even">
<td><strong>Weighted Round Robin</strong></td>
<td>Mixed hardware capacities</td>
<td>Static weights, doesn’t adapt</td>
</tr>
<tr class="odd">
<td><strong>Least Connections</strong></td>
<td>Long-lived connections (WebSocket, DB)</td>
<td>May route to slow servers</td>
</tr>
<tr class="even">
<td><strong>Least Response Time</strong></td>
<td>Latency-sensitive services</td>
<td>Requires constant measurement</td>
</tr>
<tr class="odd">
<td><strong>IP Hash</strong></td>
<td>Session affinity without sticky cookies</td>
<td>Uneven with few clients</td>
</tr>
<tr class="even">
<td><strong>Consistent Hashing</strong></td>
<td>Distributed caches and key-value stores (e.g., Memcached clients, Cassandra)</td>
<td>Complex implementation</td>
</tr>
<tr class="odd">
<td><strong>Random</strong></td>
<td>Large server pools, simplicity</td>
<td>Variance with few servers</td>
</tr>
</tbody>
</table>
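<p>The first three algorithms in the table can be sketched in a few lines each. This is a minimal illustration rather than how any particular load balancer implements them: the class names are hypothetical, and the weighted variant simply repeats servers in the rotation, whereas production balancers typically interleave picks more smoothly.</p>

```python
import itertools


class RoundRobin:
    """Cycle through servers in a fixed order."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)


class WeightedRoundRobin:
    """Repeat each server in the rotation according to its static weight."""

    def __init__(self, weights):
        # weights: dict of server name -> positive integer weight
        expanded = [name for name, w in weights.items() for _ in range(w)]
        self._cycle = itertools.cycle(expanded)

    def pick(self):
        return next(self._cycle)


class LeastConnections:
    """Pick the server with the fewest active connections."""

    def __init__(self, servers):
        self.active = {name: 0 for name in servers}

    def pick(self):
        name = min(self.active, key=self.active.get)
        self.active[name] += 1  # caller must call release() when done
        return name

    def release(self, name):
        self.active[name] -= 1
```

<p>With weights <code>{"A": 5, "B": 2, "C": 1}</code>, eight consecutive picks send five requests to A, two to B, and one to C, matching the diagram above.</p>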
</section>
<section id="session-persistence-sticky-sessions" class="level3">
<h3 class="anchored" data-anchor-id="session-persistence-sticky-sessions">Session Persistence (Sticky Sessions)</h3>
<pre><code>Problem: User state (shopping cart, login session) lives on one server.
         If next request goes to different server → state lost.

Solutions (from worst to best):
  1. Sticky sessions (cookie/IP-based routing to same server)
     - Simple but defeats load balancing purpose
     - Server failure = lost sessions

  2. Session replication (broadcast sessions to all servers)
     - Network overhead grows O(n²)
     - Memory wasted on every server

  3. Centralized session store (Redis/Memcached) ← RECOMMENDED
     - Any server can handle any request
     - Session stored in Redis with TTL
     - Server failure has zero impact on sessions
     - Scales independently</code></pre>
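<p>To make option 3 concrete, here is a small sketch in which an in-memory class stands in for Redis. <code>SessionStore</code> and <code>handle_request</code> are hypothetical names chosen for illustration; a real deployment would replace the dictionary with calls to a shared store and rely on its native key expiry.</p>

```python
import time
import uuid


class SessionStore:
    """In-memory stand-in for a shared store such as Redis (hypothetical class)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._data = {}  # session_id -> (expires_at, payload)

    def create(self, payload):
        session_id = uuid.uuid4().hex
        self._data[session_id] = (self._clock() + self._ttl, payload)
        return session_id

    def get(self, session_id):
        entry = self._data.get(session_id)
        if entry is None:
            return None
        expires_at, payload = entry
        if self._clock() >= expires_at:
            del self._data[session_id]  # expired: behave like a TTL eviction
            return None
        return payload


def handle_request(server_name, store, session_id):
    """Any app server can serve the request because state lives in the store."""
    session = store.get(session_id)
    if session is None:
        return f"{server_name}: please log in"
    return f"{server_name}: cart has {len(session['cart'])} items"
```

<p>Because neither server holds the session itself, <code>handle_request("server-1", ...)</code> and <code>handle_request("server-2", ...)</code> return the same cart for the same session ID.</p>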
</section>
<section id="health-check-design" class="level3">
<h3 class="anchored" data-anchor-id="health-check-design">Health Check Design</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 16%">
<col style="width: 29%">
<col style="width: 27%">
<col style="width: 27%">
</colgroup>
<thead>
<tr class="header">
<th>Type</th>
<th>Mechanism</th>
<th>Interval</th>
<th>Use Case</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>TCP check</strong></td>
<td>Can connect to port?</td>
<td>5-10s</td>
<td>Basic availability</td>
</tr>
<tr class="even">
<td><strong>HTTP check</strong></td>
<td><code>GET /health</code> returns 200?</td>
<td>5-10s</td>
<td>Application-level health</td>
</tr>
<tr class="odd">
<td><strong>Deep health check</strong></td>
<td>Checks DB connectivity, disk space, dependencies</td>
<td>30s</td>
<td>Comprehensive readiness</td>
</tr>
</tbody>
</table>
<pre><code>Health check state machine:
  HEALTHY → 3 consecutive failures → UNHEALTHY (remove from pool)
  UNHEALTHY → 2 consecutive successes → HEALTHY (add back to pool)
  
  Drain mode: stop sending new requests, wait for active to complete</code></pre>
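<p>The state machine above can be sketched as a small tracker that counts consecutive results contradicting the current state. The class name and default thresholds (3 failures, 2 successes) mirror the notes above but are otherwise illustrative; drain mode is left out.</p>

```python
class HealthTracker:
    """Track one backend's health from a stream of check results."""

    def __init__(self, fail_threshold=3, pass_threshold=2):
        self.fail_threshold = fail_threshold
        self.pass_threshold = pass_threshold
        self.healthy = True
        self._streak = 0  # consecutive results that contradict the current state

    def record(self, check_passed):
        """Record one check result and return the resulting health state."""
        if check_passed == self.healthy:
            self._streak = 0  # result agrees with current state: reset the streak
            return self.healthy
        self._streak += 1
        needed = self.fail_threshold if self.healthy else self.pass_threshold
        if self._streak >= needed:
            self.healthy = not self.healthy
            self._streak = 0
        return self.healthy
```

<p>Requiring several consecutive results in each direction keeps a single dropped probe from removing a server and a single lucky probe from restoring one.</p>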
<hr>
</section>
</section>
<section id="q2-how-do-you-design-a-caching-system-and-what-caching-strategies-exist" class="level2">
<h2 class="anchored" data-anchor-id="q2-how-do-you-design-a-caching-system-and-what-caching-strategies-exist">Q2: How Do You Design a Caching System and What Caching Strategies Exist?</h2>
<p><strong>Answer:</strong></p>
<p>Caching stores frequently accessed data in fast storage (memory) to reduce latency and database load. For cache hits, a well-designed caching strategy can cut read latency from roughly 100ms (a database query) to under 1ms (an in-memory lookup).</p>
<div class="cell" data-layout-align="default">
<div class="cell-output-display">
<div>
<p></p><figure class="figure"><p></p>
<div>
<pre class="mermaid mermaid-js">graph TD
    subgraph Layers["Multi-Layer Caching"]
        CLIENT["Browser Cache&lt;br/&gt;(HTTP cache headers)"]
        CDN["CDN Cache&lt;br/&gt;(static assets, edge)"]
        APP["Application Cache&lt;br/&gt;(Redis / Memcached)"]
        DB_CACHE["Database Cache&lt;br/&gt;(query cache, buffer pool)"]
    end

    CLIENT --&gt; CDN --&gt; APP --&gt; DB_CACHE --&gt; DB["Database"]

    style CLIENT fill:#56cc9d,stroke:#333,color:#fff
    style CDN fill:#ffce67,stroke:#333
    style APP fill:#6cc3d5,stroke:#333,color:#fff
</pre>
</div>
<p></p></figure><p></p>
</div>
</div>
</div>
<section id="caching-patterns-implementation-detail" class="level3">
<h3 class="anchored" data-anchor-id="caching-patterns-implementation-detail">Caching Patterns (Implementation Detail)</h3>
<div class="cell" data-layout-align="default">
<div class="cell-output-display">
<div>
<p></p><figure class="figure"><p></p>
<div>
<pre class="mermaid mermaid-js">graph LR
    subgraph CacheAside["Cache-Aside (Lazy Loading)"]
        A1["App checks cache"]
        A1 --&gt;|"miss"| A2["App queries DB"]
        A2 --&gt; A3["App writes to cache"]
    end

    subgraph WriteThrough["Write-Through"]
        B1["App writes to cache"]
        B1 --&gt; B2["Cache writes to DB"]
        B2 --&gt; B3["Confirm to app"]
    end

    subgraph WriteBehind["Write-Behind (Write-Back)"]
        C1["App writes to cache"]
        C1 --&gt; C2["Return immediately"]
        C2 -.-&gt;|"async"| C3["Cache writes to DB later"]
    end

    style CacheAside fill:#56cc9d,stroke:#333,color:#fff
    style WriteThrough fill:#ffce67,stroke:#333
    style WriteBehind fill:#6cc3d5,stroke:#333,color:#fff
</pre>
</div>
<p></p></figure><p></p>
</div>
</div>
</div>
<table class="caption-top table">
<colgroup>
<col style="width: 20%">
<col style="width: 29%">
<col style="width: 13%">
<col style="width: 13%">
<col style="width: 22%">
</colgroup>
<thead>
<tr class="header">
<th>Pattern</th>
<th>How It Works</th>
<th>Pros</th>
<th>Cons</th>
<th>Best For</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Cache-aside</strong></td>
<td>App manages cache manually; check cache → miss → query DB → populate cache</td>
<td>Only caches hot data; cache failure non-fatal</td>
<td>Initial requests are slow (cold cache); possible stale data</td>
<td>General purpose, read-heavy</td>
</tr>
<tr class="even">
<td><strong>Write-through</strong></td>
<td>Every write goes to cache AND DB synchronously</td>
<td>Cache always has latest data</td>
<td>Write latency increases (2 writes); caches data that may never be read</td>
<td>Read-after-write consistency</td>
</tr>
<tr class="odd">
<td><strong>Write-behind</strong></td>
<td>Write to cache, async flush to DB</td>
<td>Very fast writes; batch DB writes</td>
<td>Data loss risk if cache crashes before flush</td>
<td>Write-heavy workloads</td>
</tr>
<tr class="even">
<td><strong>Read-through</strong></td>
<td>Cache fetches from DB on miss (cache is the data interface)</td>
<td>Simpler app code</td>
<td>Cache library must support it</td>
<td>When using cache frameworks</td>
</tr>
<tr class="odd">
<td><strong>Refresh-ahead</strong></td>
<td>Proactively refresh cache before TTL expires</td>
<td>No cache miss latency</td>
<td>Wastes resources on rarely accessed keys</td>
<td>Predictable access patterns</td>
</tr>
</tbody>
</table>
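<p>As a rough sketch of the first two rows, the classes below use plain dictionaries in place of a real cache and database. The names are hypothetical, and invalidating the cache entry on update is one common cache-aside choice, assumed here rather than taken from the table.</p>

```python
class CacheAsideRepo:
    """Cache-aside: the application checks the cache, then falls back to the DB."""

    def __init__(self, cache, db):
        self.cache = cache  # dict standing in for Redis/Memcached
        self.db = db        # dict standing in for the database
        self.db_reads = 0

    def get(self, key):
        if key in self.cache:
            return self.cache[key]          # hit
        self.db_reads += 1
        value = self.db.get(key)            # miss: query the DB
        if value is not None:
            self.cache[key] = value         # populate for later reads
        return value

    def update(self, key, value):
        self.db[key] = value
        self.cache.pop(key, None)           # invalidate so the next read reloads


class WriteThroughRepo(CacheAsideRepo):
    """Write-through: every write updates the DB and the cache together."""

    def update(self, key, value):
        self.db[key] = value
        self.cache[key] = value
```

<p>The difference shows up after a write: the cache-aside variant pays one extra database read on the next <code>get</code>, while the write-through variant serves it straight from the cache.</p>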
</section>
<section id="cache-eviction-policies" class="level3">
<h3 class="anchored" data-anchor-id="cache-eviction-policies">Cache Eviction Policies</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 30%">
<col style="width: 30%">
<col style="width: 38%">
</colgroup>
<thead>
<tr class="header">
<th>Policy</th>
<th>Evicts</th>
<th>Best For</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>LRU</strong> (Least Recently Used)</td>
<td>Item not accessed longest</td>
<td>General purpose (most common)</td>
</tr>
<tr class="even">
<td><strong>LFU</strong> (Least Frequently Used)</td>
<td>Item accessed fewest times</td>
<td>Frequency-based workloads</td>
</tr>
<tr class="odd">
<td><strong>FIFO</strong> (First In First Out)</td>
<td>Oldest inserted item</td>
<td>Simple, time-based freshness</td>
</tr>
<tr class="even">
<td><strong>TTL</strong> (Time-To-Live)</td>
<td>Items past expiration time</td>
<td>Data with known freshness window</td>
</tr>
<tr class="odd">
<td><strong>Random</strong></td>
<td>Random item</td>
<td>When access patterns are uniform</td>
</tr>
</tbody>
</table>
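<p>LRU is the policy interviewers most often ask candidates to implement. One compact way to sketch it in Python, assuming a fixed item capacity rather than a memory limit, is with <code>collections.OrderedDict</code>:</p>

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used item."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()  # oldest access first, newest last

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict least recently used
```

<p>Both operations run in O(1) time because the ordered dictionary combines a hash map with a doubly linked list, which is the classic hand-written LRU design.</p>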
</section>
<section id="redis-vs-memcached" class="level3">
<h3 class="anchored" data-anchor-id="redis-vs-memcached">Redis vs Memcached</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 33%">
<col style="width: 25%">
<col style="width: 40%">
</colgroup>
<thead>
<tr class="header">
<th>Feature</th>
<th>Redis</th>
<th>Memcached</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Data structures</strong></td>
<td>Strings, hashes, lists, sets, sorted sets, streams</td>
<td>Strings only</td>
</tr>
<tr class="even">
<td><strong>Persistence</strong></td>
<td>RDB snapshots + AOF (append-only file)</td>
<td>None (pure cache)</td>
</tr>
<tr class="odd">
<td><strong>Replication</strong></td>
<td>Built-in master-replica</td>
<td>None (client-side)</td>
</tr>
<tr class="even">
<td><strong>Clustering</strong></td>
<td>Redis Cluster (automatic sharding)</td>
<td>Client-side sharding</td>
</tr>
<tr class="odd">
<td><strong>Pub/Sub</strong></td>
<td>Yes</td>
<td>No</td>
</tr>
<tr class="even">
<td><strong>Lua scripting</strong></td>
<td>Yes (atomic operations)</td>
<td>No</td>
</tr>
<tr class="odd">
<td><strong>Memory efficiency</strong></td>
<td>Moderate (overhead per key)</td>
<td>Slab allocator (efficient for uniform sizes)</td>
</tr>
<tr class="even">
<td><strong>Threads</strong></td>
<td>Single-threaded (6.0+ has I/O threads)</td>
<td>Multi-threaded</td>
</tr>
<tr class="odd">
<td><strong>Best for</strong></td>
<td>Complex data, pub/sub, leaderboards, sessions</td>
<td>Simple high-throughput caching</td>
</tr>
</tbody>
</table>
</section>
<section id="cache-stampede-prevention" class="level3">
<h3 class="anchored" data-anchor-id="cache-stampede-prevention">Cache Stampede Prevention</h3>
<pre><code>Problem: Cache key expires → hundreds of requests simultaneously hit DB → DB overload

Solutions:
  1. Lock/mutex: Only one request fetches from DB, others wait
     cache_key = "user:123"
     lock_key = f"lock:{cache_key}"
     data = redis.get(cache_key)
     if data is None:
         if redis.set(lock_key, "1", nx=True, ex=5):  # acquire lock (auto-expires after 5s)
             try:
                 data = db.query(...)
                 redis.set(cache_key, data, ex=300)
             finally:
                 redis.delete(lock_key)  # release even if the query fails
         else:
             data = wait_for_cache()  # poll with a short sleep until cache is populated

  2. Probabilistic early recomputation:
     - Each read checks: should I refresh? (probability increases near TTL)
     - Spreads refresh across time window

  3. Background refresh (refresh-ahead):
     - Background job refreshes popular keys before expiry
     - No stampede possible</code></pre>
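<p>Option 2 is easier to see in code. The sketch below follows one commonly described formulation, in which each reader adds a random, exponentially distributed head start to the current time and refreshes if the result passes the expiry. The function name, the <code>beta</code> tuning knob, and the injected <code>now</code> and <code>rng</code> parameters are assumptions made for illustration and testing.</p>

```python
import math
import random
import time


def should_refresh_early(expires_at, recompute_seconds, beta=1.0,
                         now=None, rng=random.random):
    """Return True if this reader should recompute the value before it expires.

    expires_at: absolute expiry time of the cached value (seconds)
    recompute_seconds: how long the last recomputation took
    beta: values above 1.0 refresh earlier, below 1.0 refresh later
    """
    now = time.time() if now is None else now
    # 1 - rng() lies in (0, 1], so log() is defined and is zero or negative.
    jitter = -recompute_seconds * beta * math.log(1.0 - rng())
    return now + jitter >= expires_at
```

<p>Far from expiry the jitter is almost never large enough to trigger a refresh; close to expiry a growing share of readers trigger one, so recomputation is spread out instead of arriving all at once.</p>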
<hr>
</section>
</section>
<section id="q3-how-do-message-queues-work-and-when-should-you-use-them" class="level2">
<h2 class="anchored" data-anchor-id="q3-how-do-message-queues-work-and-when-should-you-use-them">Q3: How Do Message Queues Work and When Should You Use Them?</h2>
<p><strong>Answer:</strong></p>
<p>Message queues enable asynchronous communication between services by decoupling producers (senders) from consumers (receivers). They provide buffering, load leveling, and durable delivery with retries.</p>
<div class="cell" data-layout-align="default">
<div class="cell-output-display">
<div>
<p></p><figure class="figure"><p></p>
<div>
<pre class="mermaid mermaid-js">graph LR
    P1["Producer A&lt;br/&gt;(Order Service)"]
    P2["Producer B&lt;br/&gt;(Payment Service)"]
    P1 --&gt; Q["Message Queue&lt;br/&gt;(Kafka / RabbitMQ / SQS)"]
    P2 --&gt; Q
    Q --&gt; C1["Consumer 1&lt;br/&gt;(Email Service)"]
    Q --&gt; C2["Consumer 2&lt;br/&gt;(Analytics Service)"]
    Q --&gt; C3["Consumer 3&lt;br/&gt;(Inventory Service)"]

    style Q fill:#56cc9d,stroke:#333,color:#fff
</pre>
</div>
<p></p></figure><p></p>
</div>
</div>
</div>
<section id="when-to-use-a-message-queue" class="level3">
<h3 class="anchored" data-anchor-id="when-to-use-a-message-queue">When to Use a Message Queue</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 29%">
<col style="width: 38%">
<col style="width: 32%">
</colgroup>
<thead>
<tr class="header">
<th>Use Case</th>
<th>Without Queue</th>
<th>With Queue</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Async processing</strong></td>
<td>User waits for email to send (slow)</td>
<td>Return immediately, email sends in background</td>
</tr>
<tr class="even">
<td><strong>Load leveling</strong></td>
<td>Traffic spike crashes service</td>
<td>Queue absorbs spike, consumers process at their pace</td>
</tr>
<tr class="odd">
<td><strong>Decoupling</strong></td>
<td>Service A calls Service B directly (tight coupling)</td>
<td>Service A publishes event, B consumes when ready</td>
</tr>
<tr class="even">
<td><strong>Retry/DLQ</strong></td>
<td>Failed requests are lost</td>
<td>Failed messages retry with backoff, go to dead-letter queue</td>
</tr>
<tr class="odd">
<td><strong>Fan-out</strong></td>
<td>One service calls 5 downstream services</td>
<td>Publish once, 5 consumers process independently</td>
</tr>
</tbody>
</table>
</section>
<section id="kafka-vs-rabbitmq-vs-sqs" class="level3">
<h3 class="anchored" data-anchor-id="kafka-vs-rabbitmq-vs-sqs">Kafka vs RabbitMQ vs SQS</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 21%">
<col style="width: 31%">
<col style="width: 24%">
<col style="width: 21%">
</colgroup>
<thead>
<tr class="header">
<th>Feature</th>
<th>Apache Kafka</th>
<th>RabbitMQ</th>
<th>AWS SQS</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Model</strong></td>
<td>Distributed log (pub/sub + streaming)</td>
<td>Message broker (queues + exchanges)</td>
<td>Managed queue service</td>
</tr>
<tr class="even">
<td><strong>Ordering</strong></td>
<td>Per partition (guaranteed)</td>
<td>Per queue (FIFO; redelivery or competing consumers can reorder)</td>
<td>FIFO queues (limited throughput)</td>
</tr>
<tr class="odd">
<td><strong>Throughput</strong></td>
<td>Millions msgs/sec</td>
<td>Tens of thousands msgs/sec</td>
<td>Nearly unlimited (standard queues); capped for FIFO queues</td>
</tr>
<tr class="even">
<td><strong>Retention</strong></td>
<td>Configurable (days/weeks/forever)</td>
<td>Until consumed/TTL</td>
<td>14 days max</td>
</tr>
<tr class="odd">
<td><strong>Consumer model</strong></td>
<td>Pull (consumers poll partitions)</td>
<td>Push (broker delivers to consumers)</td>
<td>Pull (long polling)</td>
</tr>
<tr class="even">
<td><strong>Replay</strong></td>
<td>Yes (consumers can re-read from any offset)</td>
<td>No for classic queues (message gone after ACK); streams support replay</td>
<td>No</td>
</tr>
<tr class="odd">
<td><strong>Use case</strong></td>
<td>Event streaming, logs, analytics pipeline</td>
<td>Task queues, RPC, routing</td>
<td>Simple async tasks, serverless</td>
</tr>
<tr class="even">
<td><strong>Complexity</strong></td>
<td>High (ZooKeeper/KRaft, partitions, offsets)</td>
<td>Medium (exchanges, bindings)</td>
<td>Low (fully managed)</td>
</tr>
</tbody>
</table>
</section>
<section id="kafka-architecture-deep-dive" class="level3">
<h3 class="anchored" data-anchor-id="kafka-architecture-deep-dive">Kafka Architecture Deep Dive</h3>
<div class="cell" data-layout-align="default">
<div class="cell-output-display">
<div>
<p></p><figure class="figure"><p></p>
<div>
<pre class="mermaid mermaid-js">graph TD
    subgraph Producers
        P1["Producer 1"]
        P2["Producer 2"]
    end

    subgraph Kafka["Kafka Cluster"]
        subgraph Topic["Topic: orders (3 partitions)"]
            PART0["Partition 0&lt;br/&gt;[msg1, msg4, msg7...]"]
            PART1["Partition 1&lt;br/&gt;[msg2, msg5, msg8...]"]
            PART2["Partition 2&lt;br/&gt;[msg3, msg6, msg9...]"]
        end
    end

    subgraph ConsumerGroup["Consumer Group: order-processors"]
        C1["Consumer 1&lt;br/&gt;← Partition 0"]
        C2["Consumer 2&lt;br/&gt;← Partition 1"]
        C3["Consumer 3&lt;br/&gt;← Partition 2"]
    end

    P1 --&gt; PART0
    P2 --&gt; PART1
    PART0 --&gt; C1
    PART1 --&gt; C2
    PART2 --&gt; C3

    style Kafka fill:#56cc9d,stroke:#333,color:#fff
    style ConsumerGroup fill:#ffce67,stroke:#333
</pre>
</div>
<p></p></figure><p></p>
</div>
</div>
</div>
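<p>Two ideas in the diagram can be sketched directly: a message key maps to a partition through a stable hash, and within a consumer group each partition has exactly one owner. The functions below are illustrative only. CRC32 stands in for the hash (it is not what Kafka's default partitioner uses), and the round-robin assignment is a simplification of Kafka's pluggable assignment strategies.</p>

```python
import zlib


def partition_for(key, num_partitions):
    """Map a message key to a partition with a stable hash.

    CRC32 is a stand-in here; it is not the hash Kafka's default partitioner uses.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def assign_partitions(partitions, consumers):
    """Spread partitions over consumers round-robin; each partition has one owner."""
    assignment = {consumer: [] for consumer in consumers}
    for i, partition in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(partition)
    return assignment
```

<p>Messages with the same key always land in the same partition, which is what preserves per-key ordering. With three partitions and four consumers, the fourth consumer receives an empty list, so adding consumers beyond the partition count adds no parallelism.</p>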
</section>
<section id="delivery-guarantees" class="level3">
<h3 class="anchored" data-anchor-id="delivery-guarantees">Delivery Guarantees</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 22%">
<col style="width: 24%">
<col style="width: 32%">
<col style="width: 22%">
</colgroup>
<thead>
<tr class="header">
<th>Guarantee</th>
<th>Description</th>
<th>How to Achieve</th>
<th>Trade-off</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>At-most-once</strong></td>
<td>Message delivered 0 or 1 times</td>
<td>No retries, fire and forget</td>
<td>May lose messages</td>
</tr>
<tr class="even">
<td><strong>At-least-once</strong></td>
<td>Message delivered 1 or more times</td>
<td>Retry on failure, ACK after processing</td>
<td>May have duplicates</td>
</tr>
<tr class="odd">
<td><strong>Exactly-once</strong></td>
<td>Message delivered exactly 1 time</td>
<td>Idempotent consumers + transactional writes</td>
<td>Complex, slower</td>
</tr>
</tbody>
</table>
<pre><code>Exactly-once in practice:
  - Kafka: Idempotent producer + transactions (consumer offsets committed as part of the transaction)
  - Application-level: Idempotency key in each message
    → Consumer checks: "Have I processed message with ID X?"
    → If yes → skip (dedup)
    → If no → process + record ID in DB (same transaction)</code></pre>
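<p>The application-level approach above can be sketched with SQLite from the standard library. The table names, the <code>handle_deposit</code> function, and the deposit scenario are hypothetical; the point is that the dedup record and the business update commit in the same transaction.</p>

```python
import sqlite3


def make_db():
    """Create an in-memory database with a dedup table and a balance table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE processed (message_id TEXT PRIMARY KEY)")
    db.execute("CREATE TABLE balances (account TEXT PRIMARY KEY, amount INTEGER)")
    db.execute("INSERT INTO balances VALUES ('acct-1', 0)")
    db.commit()
    return db


def handle_deposit(db, message_id, account, amount):
    """Apply a deposit once, even if the broker redelivers the message."""
    try:
        with db:  # one transaction: commit on success, roll back on error
            # The primary key rejects a message_id that was already recorded.
            db.execute("INSERT INTO processed VALUES (?)", (message_id,))
            db.execute(
                "UPDATE balances SET amount = amount + ? WHERE account = ?",
                (amount, account),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # duplicate delivery: skip
```

<p>Delivering <code>msg-1</code> twice changes the balance only once: the second attempt violates the primary key, the transaction rolls back, and the handler reports a duplicate.</p>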
</section>
<section id="dead-letter-queue-dlq" class="level3">
<h3 class="anchored" data-anchor-id="dead-letter-queue-dlq">Dead Letter Queue (DLQ)</h3>
<pre><code>Message processing flow:
  1. Consumer picks up message
  2. Processing fails → retry (exponential backoff: 1s, 5s, 30s, 5min)
  3. After max retries (e.g., 5 attempts) → move to Dead Letter Queue
  4. DLQ messages are inspected manually or by automated systems
  5. Fix the bug → replay DLQ messages back to original queue

Why DLQ matters:
  - Prevents poison messages from blocking the queue
  - Preserves failed messages for debugging
  - Allows retry after fix is deployed</code></pre>
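<p>The flow above can be sketched as a single consumer function. The name <code>consume</code>, the list used as the dead-letter queue, and the injectable <code>sleep</code> are assumptions for illustration; the delays follow the 1s, 5s, 30s, 5min schedule from the notes, giving five attempts in total.</p>

```python
import time


def consume(message, handler, dead_letters,
            delays=(1, 5, 30, 300), sleep=time.sleep):
    """Try the handler once, then once more after each delay; dead-letter on exhaustion."""
    last_error = None
    for attempt in range(len(delays) + 1):
        try:
            handler(message)
            return True
        except Exception as exc:  # broad on purpose: any failure triggers a retry
            last_error = exc
            if attempt != len(delays):
                sleep(delays[attempt])  # back off before the next attempt
    # All attempts failed: keep the message and the error for later inspection.
    dead_letters.append({"message": message, "error": repr(last_error)})
    return False
```

<p>In a real broker the retry delay is usually handled by redelivery settings or a delay queue rather than by sleeping inside the consumer, which would block the messages behind it.</p>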
<hr>
</section>
</section>
<section id="q4-how-do-you-design-a-microservices-architecture" class="level2">
<h2 class="anchored" data-anchor-id="q4-how-do-you-design-a-microservices-architecture">Q4: How Do You Design a Microservices Architecture?</h2>
<p><strong>Answer:</strong></p>
<p>Microservices architecture structures an application as a collection of loosely coupled, independently deployable services, each owning its own data and business logic.</p>
<div class="cell" data-layout-align="default">
<div class="cell-output-display">
<div>
<p></p><figure class="figure"><p></p>
<div>
<pre class="mermaid mermaid-js">graph TD
    CLIENT["Client"]
    CLIENT --&gt; GW["API Gateway"]
    GW --&gt; US["User Service&lt;br/&gt;(PostgreSQL)"]
    GW --&gt; OS["Order Service&lt;br/&gt;(MySQL)"]
    GW --&gt; PS["Payment Service&lt;br/&gt;(MongoDB)"]
    GW --&gt; NS["Notification Service&lt;br/&gt;(Redis)"]

    OS --&gt;|"Event: order_created"| MQ["Message Bus&lt;br/&gt;(Kafka)"]
    MQ --&gt; PS
    MQ --&gt; NS
    OS --&gt;|"gRPC"| US

    subgraph SD["Service Discovery"]
        REG["Service Registry&lt;br/&gt;(Consul / Eureka)"]
    end

    US --&gt; REG
    OS --&gt; REG
    PS --&gt; REG

    style GW fill:#56cc9d,stroke:#333,color:#fff
    style MQ fill:#ffce67,stroke:#333
    style SD fill:#6cc3d5,stroke:#333,color:#fff
</pre>
</div>
<p></p></figure><p></p>
</div>
</div>
</div>
<section id="microservices-design-principles" class="level3">
<h3 class="anchored" data-anchor-id="microservices-design-principles">Microservices Design Principles</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 33%">
<col style="width: 39%">
<col style="width: 27%">
</colgroup>
<thead>
<tr class="header">
<th>Principle</th>
<th>Description</th>
<th>Example</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Single responsibility</strong></td>
<td>Each service does one thing well</td>
<td>User Service only handles user CRUD + auth</td>
</tr>
<tr class="even">
<td><strong>Database per service</strong></td>
<td>No shared databases</td>
<td>Order Service has its own MySQL instance</td>
</tr>
<tr class="odd">
<td><strong>API-first</strong></td>
<td>Define contracts before implementation</td>
<td>OpenAPI spec agreed before coding</td>
</tr>
<tr class="even">
<td><strong>Decentralized governance</strong></td>
<td>Teams choose their own tech stack</td>
<td>User svc in Go, Analytics in Python</td>
</tr>
<tr class="odd">
<td><strong>Design for failure</strong></td>
<td>Assume any service can fail</td>
<td>Circuit breakers, retries, fallbacks</td>
</tr>
<tr class="even">
<td><strong>Smart endpoints, dumb pipes</strong></td>
<td>Logic in services, not in the message bus</td>
<td>Services process events, Kafka just delivers</td>
</tr>
</tbody>
</table>
</section>
<section id="service-communication-patterns" class="level3">
<h3 class="anchored" data-anchor-id="service-communication-patterns">Service Communication Patterns</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 26%">
<col style="width: 17%">
<col style="width: 29%">
<col style="width: 26%">
</colgroup>
<thead>
<tr class="header">
<th>Pattern</th>
<th>Type</th>
<th>Use Case</th>
<th>Example</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>REST/HTTP</strong></td>
<td>Synchronous</td>
<td>Simple CRUD operations</td>
<td>GET /users/123</td>
</tr>
<tr class="even">
<td><strong>gRPC</strong></td>
<td>Synchronous</td>
<td>Low-latency internal calls</td>
<td>Service-to-service with protobuf</td>
</tr>
<tr class="odd">
<td><strong>Event-driven (async)</strong></td>
<td>Asynchronous</td>
<td>Decouple services, eventual consistency</td>
<td>OrderCreated event → Payment, Notification</td>
</tr>
<tr class="even">
<td><strong>Saga</strong></td>
<td>Choreography/Orchestration</td>
<td>Distributed transactions</td>
<td>Order → Payment → Inventory (compensating on failure)</td>
</tr>
</tbody>
</table>
</section>
<section id="saga-pattern-for-distributed-transactions" class="level3">
<h3 class="anchored" data-anchor-id="saga-pattern-for-distributed-transactions">Saga Pattern for Distributed Transactions</h3>
<div class="cell" data-layout-align="default">
<div class="cell-output-display">
<div>
<p></p><figure class="figure"><p></p>
<div>
<pre class="mermaid mermaid-js">graph LR
    subgraph Happy["Happy Path"]
        O1["Create Order&lt;br/&gt;(PENDING)"] --&gt; P1["Reserve Payment"]
        P1 --&gt; I1["Reserve Inventory"]
        I1 --&gt; O2["Confirm Order&lt;br/&gt;(CONFIRMED)"]
    end

    subgraph Compensate["Compensation (on failure)"]
        I_FAIL["Inventory fails"] --&gt; P_COMP["Refund Payment"]
        P_COMP --&gt; O_COMP["Cancel Order"]
    end

    style Happy fill:#56cc9d,stroke:#333,color:#fff
    style Compensate fill:#ff7851,stroke:#333,color:#fff
</pre>
</div>
<p></p></figure><p></p>
</div>
</div>
</div>
<table class="caption-top table">
<colgroup>
<col style="width: 30%">
<col style="width: 36%">
<col style="width: 16%">
<col style="width: 16%">
</colgroup>
<thead>
<tr class="header">
<th>Saga Type</th>
<th>Coordination</th>
<th>Pros</th>
<th>Cons</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Choreography</strong></td>
<td>Each service listens to events and acts</td>
<td>Decoupled, no central coordinator</td>
<td>Hard to track overall flow, debugging complex</td>
</tr>
<tr class="even">
<td><strong>Orchestration</strong></td>
<td>Central orchestrator directs the workflow</td>
<td>Easy to understand and monitor</td>
<td>Orchestrator can become a single point of failure unless it is replicated</td>
</tr>
</tbody>
</table>
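<p>An orchestrated saga boils down to running steps in order and undoing the completed ones in reverse when a step fails. The following is a minimal in-process sketch; <code>run_saga</code> and the step tuples are hypothetical, and a real orchestrator would also persist its progress so it can resume after a crash.</p>

```python
def run_saga(steps):
    """Run (name, action, compensation) steps in order.

    If an action raises, run the compensations of the completed steps in
    reverse order and report failure.
    """
    completed = []
    log = []
    for name, action, compensate in steps:
        try:
            action()
            log.append(f"done:{name}")
            completed.append((name, compensate))
        except Exception:
            log.append(f"failed:{name}")
            for done_name, undo in reversed(completed):
                undo()
                log.append(f"compensated:{done_name}")
            return False, log
    return True, log
```

<p>If the inventory step fails after order and payment succeeded, the log shows the payment compensation running first and the order cancellation second, matching the compensation path in the diagram.</p>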
</section>
<section id="service-discovery" class="level3">
<h3 class="anchored" data-anchor-id="service-discovery">Service Discovery</h3>
<pre><code>Problem: Services scale dynamically (pods come and go).
         How does Service A find Service B's current address?

Solution: Service Registry
  1. Service starts → registers itself (IP:port) with registry
  2. Service wants to call another → queries registry for addresses
  3. Registry health-checks registered services, removes dead ones
  4. Client-side load balancing across returned addresses

Tools:
  - Consul (HashiCorp) — service mesh + KV store + health checks
  - Eureka (Netflix) — Java-focused, Spring Cloud native
  - Kubernetes DNS — built-in (service-name.namespace.svc.cluster.local)
  - etcd — distributed KV store (used by Kubernetes internally)</code></pre>
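<p>A toy registry makes the register, query, and expire cycle concrete. This sketch uses heartbeats with a time-to-live, which is one way to detect dead instances (the notes above describe the registry actively health-checking instead). The class and its methods are hypothetical and do not mirror the API of Consul, Eureka, or etcd.</p>

```python
import time


class ServiceRegistry:
    """Toy registry: instances stay listed while their heartbeats are fresh."""

    def __init__(self, ttl_seconds=10, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._last_seen = {}  # (service, address) -> last heartbeat time

    def heartbeat(self, service, address):
        # Registration and renewal are the same call in this sketch.
        self._last_seen[(service, address)] = self._clock()

    def lookup(self, service):
        """Return addresses whose last heartbeat is within the TTL."""
        cutoff = self._clock() - self._ttl
        return sorted(
            address
            for (name, address), seen in self._last_seen.items()
            if name == service and seen >= cutoff
        )
```

<p>A caller would pick one address from <code>lookup("orders")</code>, for example round-robin, which is the client-side load balancing mentioned in step 4.</p>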
<hr>
</section>
</section>
<section id="q5-what-are-database-replication-and-partitioning-strategies" class="level2">
<h2 class="anchored" data-anchor-id="q5-what-are-database-replication-and-partitioning-strategies">Q5: What Are Database Replication and Partitioning Strategies?</h2>
<p><strong>Answer:</strong></p>
<p>Replication and partitioning are the two fundamental mechanisms for scaling databases beyond a single machine, addressing read throughput, write throughput, storage capacity, and availability.</p>
<div class="cell" data-layout-align="default">
<div class="cell-output-display">
<div>
<p></p><figure class="figure"><p></p>
<div>
<pre class="mermaid mermaid-js">graph TD
    subgraph Replication["Replication (copies of same data)"]
        PRIMARY["Primary (writes)"]
        PRIMARY --&gt;|"Async/Sync replication"| REP1["Replica 1 (reads)"]
        PRIMARY --&gt;|"Async/Sync replication"| REP2["Replica 2 (reads)"]
        PRIMARY --&gt;|"Async/Sync replication"| REP3["Replica 3 (reads)"]
    end

    subgraph Partitioning["Partitioning / Sharding (split data)"]
        ROUTER["Router"]
        ROUTER --&gt; SHARD1["Shard 1&lt;br/&gt;Users A-H"]
        ROUTER --&gt; SHARD2["Shard 2&lt;br/&gt;Users I-P"]
        ROUTER --&gt; SHARD3["Shard 3&lt;br/&gt;Users Q-Z"]
    end

    style PRIMARY fill:#56cc9d,stroke:#333,color:#fff
    style ROUTER fill:#ffce67,stroke:#333
</pre>
</div>
<p></p></figure><p></p>
</div>
</div>
</div>
<section id="replication-strategies" class="level3">
<h3 class="anchored" data-anchor-id="replication-strategies">Replication Strategies</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 18%">
<col style="width: 23%">
<col style="width: 23%">
<col style="width: 16%">
<col style="width: 18%">
</colgroup>
<thead>
<tr class="header">
<th>Strategy</th>
<th>How It Works</th>
<th>Consistency</th>
<th>Latency</th>
<th>Use Case</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Synchronous</strong></td>
<td>Primary waits for all replicas to ACK</td>
<td>Strong</td>
<td>High (slowest replica)</td>
<td>Financial transactions</td>
</tr>
<tr class="even">
<td><strong>Semi-synchronous</strong></td>
<td>Primary waits for at least 1 replica ACK</td>
<td>Durable on at least 1 replica; other replicas may lag</td>
<td>Medium</td>
<td>Critical data with some tolerance</td>
</tr>
<tr class="odd">
<td><strong>Asynchronous</strong></td>
<td>Primary doesn’t wait, replicas catch up</td>
<td>Eventual</td>
<td>Low</td>
<td>Read-heavy workloads, analytics</td>
</tr>
</tbody>
</table>
</section>
<section id="replication-topologies" class="level3">
<h3 class="anchored" data-anchor-id="replication-topologies">Replication Topologies</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 28%">
<col style="width: 37%">
<col style="width: 17%">
<col style="width: 17%">
</colgroup>
<thead>
<tr class="header">
<th>Topology</th>
<th>Description</th>
<th>Pros</th>
<th>Cons</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Single-leader</strong></td>
<td>One primary (writes), N replicas (reads)</td>
<td>Simple, no conflicts</td>
<td>Write bottleneck on primary</td>
</tr>
<tr class="even">
<td><strong>Multi-leader</strong></td>
<td>Multiple primaries, each accepts writes</td>
<td>Write scaling, geo-distributed</td>
<td>Conflict resolution needed</td>
</tr>
<tr class="odd">
<td><strong>Leaderless</strong></td>
<td>Any node accepts reads/writes (quorum)</td>
<td>High availability, no failover</td>
<td>Complex, conflict resolution</td>
</tr>
</tbody>
</table>
</section>
<section id="replication-lag-problems" class="level3">
<h3 class="anchored" data-anchor-id="replication-lag-problems">Replication Lag Problems</h3>
<pre><code>Scenario: User updates profile (write to primary), immediately reads (from replica)
Problem: Replica hasn't received the update yet → shows stale data

Solutions:
  1. Read-your-writes consistency:
     → After write, read from primary for N seconds
     → Or track last-write timestamp, read from replica only if up-to-date

  2. Monotonic reads:
     → Always route same user to same replica (sticky reads)
     → Prevents seeing data go "backward"

  3. Causal consistency:
     → Track dependencies between writes
     → Replica only serves reads after all causal dependencies are applied</code></pre>
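<p>The simplest form of read-your-writes, pinning a user's reads to the primary for a few seconds after they write, can be sketched as a small router. The class name and the 5-second window are illustrative assumptions; the right window depends on the replication lag actually observed.</p>

```python
import time


class ReadRouter:
    """Send a user's reads to the primary for a short window after they write."""

    def __init__(self, pin_seconds=5.0, clock=time.monotonic):
        self._pin_seconds = pin_seconds
        self._clock = clock
        self._last_write = {}  # user_id -> time of most recent write

    def record_write(self, user_id):
        self._last_write[user_id] = self._clock()

    def target_for_read(self, user_id):
        last = self._last_write.get(user_id)
        if last is not None and self._pin_seconds >= self._clock() - last:
            return "primary"  # recent writer: replicas may still be behind
        return "replica"
```

<p>Only the user who just wrote is routed to the primary; everyone else keeps reading from replicas, so the extra load on the primary stays proportional to the write rate.</p>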
</section>
<section id="sharding-partitioning-deep-dive" class="level3">
<h3 class="anchored" data-anchor-id="sharding-partitioning-deep-dive">Sharding (Partitioning) Deep Dive</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 47%">
<col style="width: 22%">
<col style="width: 15%">
<col style="width: 15%">
</colgroup>
<thead>
<tr class="header">
<th>Shard Key Strategy</th>
<th>Example</th>
<th>Pros</th>
<th>Cons</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Hash-based</strong></td>
<td><code>shard = hash(user_id) % 4</code></td>
<td>Even distribution</td>
<td>Range queries span all shards</td>
</tr>
<tr class="even">
<td><strong>Range-based</strong></td>
<td><code>shard1: dates Jan-Mar</code></td>
<td>Efficient range scans</td>
<td>Hot shards (recent data accessed most)</td>
</tr>
<tr class="odd">
<td><strong>Geographic</strong></td>
<td><code>shard_us, shard_eu, shard_asia</code></td>
<td>Data locality, compliance</td>
<td>Uneven if one region dominates</td>
</tr>
<tr class="even">
<td><strong>Directory</strong></td>
<td>Lookup table: <code>user123 → shard2</code></td>
<td>Maximum flexibility</td>
<td>Directory is bottleneck/SPOF</td>
</tr>
</tbody>
</table>
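<p>As a rough sketch of the hash-based row, the function below maps a key to a shard using a stable hash (Python's built-in <code>hash()</code> is salted per process for strings, so it is not suitable here). The helper <code>moved_fraction</code> is a hypothetical addition that illustrates a well-known weakness of plain modulo hashing: changing the shard count remaps most keys.</p>

```python
import hashlib


def shard_for(key: str, num_shards: int = 4) -> int:
    """Map a key to a shard index using a hash that is stable across processes."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def moved_fraction(keys: list[str], old_shards: int, new_shards: int) -> float:
    """Fraction of keys whose shard changes when the shard count changes."""
    moved = sum(1 for k in keys if shard_for(k, old_shards) != shard_for(k, new_shards))
    return moved / len(keys)


# Example: going from 4 to 5 shards relocates most keys under modulo hashing.
keys = [f"user{i}" for i in range(1000)]
fraction_moved = moved_fraction(keys, 4, 5)
```

<p>Consistent hashing or a directory layer is the usual way to limit this data movement when shards are added.</p>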
</section>
<section id="cross-shard-operations" class="level3">
<h3 class="anchored" data-anchor-id="cross-shard-operations">Cross-Shard Operations</h3>
<pre><code>Challenge: Query that spans multiple shards (e.g., "all orders &gt; $100")

Approaches:
  1. Scatter-gather: Query all shards, merge results (expensive)
  2. Denormalize: Copy needed data into each shard (storage trade-off)
  3. Global index: Secondary index service spans all shards
  4. Avoid: Design schema so most queries hit single shard
     → Shard by user_id, and most queries are user-scoped</code></pre>
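<p>Scatter-gather (approach 1) can be sketched as follows. The shards here are hypothetical in-memory lists of <code>(order_id, amount)</code> rows; in a real system <code>query_shard</code> would issue a network query to each shard.</p>

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory shards; each holds (order_id, amount) rows.
SHARDS = [
    [(1, 250.0), (2, 40.0)],
    [(3, 120.0)],
    [(4, 99.0), (5, 500.0)],
]


def query_shard(rows, min_amount):
    # Stand-in for a per-shard query such as "amount > min_amount".
    return [row for row in rows if row[1] > min_amount]


def scatter_gather(shards, min_amount):
    # Scatter: query every shard in parallel.
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = list(pool.map(lambda rows: query_shard(rows, min_amount), shards))
    # Gather: merge the partial results and sort by amount, largest first.
    merged = [row for partial in partials for row in partial]
    return sorted(merged, key=lambda row: row[1], reverse=True)
```

<p>The cost grows with the number of shards, and the slowest shard sets the overall latency, which is why the table of approaches ends with "avoid".</p>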
<hr>
</section>
</section>
<section id="q6-how-does-kubernetes-work-and-how-do-you-design-for-container-orchestration" class="level2">
<h2 class="anchored" data-anchor-id="q6-how-does-kubernetes-work-and-how-do-you-design-for-container-orchestration">Q6: How Does Kubernetes Work and How Do You Design for Container Orchestration?</h2>
<p><strong>Answer:</strong></p>
<p>Kubernetes (K8s) is a container orchestration platform that automates deployment, scaling, and management of containerized applications. It’s the de facto standard for running microservices in production.</p>
<div class="cell" data-layout-align="default">
<div class="cell-output-display">
<div>
<p></p><figure class="figure"><p></p>
<div>
<pre class="mermaid mermaid-js">graph TD
    subgraph ControlPlane["Control Plane"]
        API["API Server&lt;br/&gt;(kube-apiserver)"]
        SCHED["Scheduler&lt;br/&gt;(kube-scheduler)"]
        CM["Controller Manager"]
        ETCD["etcd&lt;br/&gt;(cluster state store)"]
    end

    subgraph WorkerNode["Worker Node 1"]
        KUBELET["kubelet"]
        PROXY["kube-proxy"]
        POD1["Pod A&lt;br/&gt;(Container 1)"]
        POD2["Pod B&lt;br/&gt;(Container 2, Container 3)"]
    end

    subgraph WorkerNode2["Worker Node 2"]
        KUBELET2["kubelet"]
        POD3["Pod C"]
        POD4["Pod D"]
    end

    API --&gt; SCHED
    API --&gt; CM
    API --&gt; ETCD
    API --&gt; KUBELET
    API --&gt; KUBELET2

    style ControlPlane fill:#56cc9d,stroke:#333,color:#fff
    style WorkerNode fill:#ffce67,stroke:#333
    style WorkerNode2 fill:#6cc3d5,stroke:#333,color:#fff
</pre>
</div>
<p></p></figure><p></p>
</div>
</div>
</div>
<section id="core-kubernetes-objects" class="level3">
<h3 class="anchored" data-anchor-id="core-kubernetes-objects">Core Kubernetes Objects</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 30%">
<col style="width: 34%">
<col style="width: 34%">
</colgroup>
<thead>
<tr class="header">
<th>Object</th>
<th>Purpose</th>
<th>Example</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Pod</strong></td>
<td>Smallest deployable unit (1+ containers)</td>
<td>Single instance of your app</td>
</tr>
<tr class="even">
<td><strong>Deployment</strong></td>
<td>Manages desired state of Pods (replicas, rolling updates)</td>
<td>“Run 3 replicas of user-service v2”</td>
</tr>
<tr class="odd">
<td><strong>Service</strong></td>
<td>Stable network endpoint for a set of Pods</td>
<td>Load-balanced IP for user-service Pods</td>
</tr>
<tr class="even">
<td><strong>Ingress</strong></td>
<td>External HTTP routing to services</td>
<td><code>api.example.com/users → user-service</code></td>
</tr>
<tr class="odd">
<td><strong>ConfigMap</strong></td>
<td>Non-sensitive configuration</td>
<td>Database host, feature flags</td>
</tr>
<tr class="even">
<td><strong>Secret</strong></td>
<td>Sensitive data (base64-encoded by default; encryption at rest must be enabled separately)</td>
<td>DB passwords, API keys</td>
</tr>
<tr class="odd">
<td><strong>HPA</strong></td>
<td>Horizontal Pod Autoscaler</td>
<td>Scale Pods based on CPU/memory/custom metrics</td>
</tr>
<tr class="even">
<td><strong>PVC</strong></td>
<td>Persistent Volume Claim</td>
<td>Attach storage to stateful Pods</td>
</tr>
</tbody>
</table>
</section>
<section id="deployment-strategies-in-kubernetes" class="level3">
<h3 class="anchored" data-anchor-id="deployment-strategies-in-kubernetes">Deployment Strategies in Kubernetes</h3>
<div class="cell" data-layout-align="default">
<div class="cell-output-display">
<div>
<p></p><figure class="figure"><p></p>
<div>
<pre class="mermaid mermaid-js">graph LR
    subgraph Rolling["Rolling Update (default)"]
        R1["v1 v1 v1"] --&gt; R2["v2 v1 v1"] --&gt; R3["v2 v2 v1"] --&gt; R4["v2 v2 v2"]
    end

    subgraph BlueGreen["Blue-Green"]
        BG1["Blue (v1) ← traffic"] --&gt; BG2["Green (v2) ← traffic"]
    end

    subgraph Canary["Canary"]
        CAN1["v1: 90% traffic&lt;br/&gt;v2: 10% traffic"] --&gt; CAN2["v1: 0%&lt;br/&gt;v2: 100%"]
    end

    style Rolling fill:#56cc9d,stroke:#333,color:#fff
    style BlueGreen fill:#ffce67,stroke:#333
    style Canary fill:#6cc3d5,stroke:#333,color:#fff
</pre>
</div>
<p></p></figure><p></p>
</div>
</div>
</div>
<table class="caption-top table">
<colgroup>
<col style="width: 25%">
<col style="width: 33%">
<col style="width: 25%">
<col style="width: 15%">
</colgroup>
<thead>
<tr class="header">
<th>Strategy</th>
<th>How It Works</th>
<th>Rollback</th>
<th>Risk</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Rolling update</strong></td>
<td>Replace Pods one by one</td>
<td>Manual (<code>kubectl rollout undo</code>); a failed rollout stalls rather than reverting on its own</td>
<td>Brief period with mixed versions</td>
</tr>
<tr class="even">
<td><strong>Blue-Green</strong></td>
<td>Run two full environments, switch traffic</td>
<td>Instant (switch back to blue)</td>
<td>2x resources during deployment</td>
</tr>
<tr class="odd">
<td><strong>Canary</strong></td>
<td>Route small % of traffic to new version</td>
<td>Instant (route all to old)</td>
<td>Complex routing rules</td>
</tr>
<tr class="even">
<td><strong>A/B testing</strong></td>
<td>Route by user attributes (region, ID)</td>
<td>Instant</td>
<td>Requires feature flag infrastructure</td>
</tr>
</tbody>
</table>
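<p>One common way to implement the canary split is to hash a stable request attribute into a bucket, so that each user consistently sees the same version. The sketch below assumes routing by <code>user_id</code>; the function name and bucket scheme are illustrative, and in practice this logic usually lives in an ingress controller or service mesh rather than in application code.</p>

```python
import hashlib


def pick_version(user_id: str, canary_percent: int) -> str:
    """Deterministically send roughly canary_percent of users to v2."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # stable bucket in 0..99
    return "v2" if canary_percent > bucket else "v1"


# Example: with a 10% canary, about one user in ten lands on v2.
users = [f"user{i}" for i in range(10000)]
canary_share = sum(pick_version(u, 10) == "v2" for u in users) / len(users)
```

<p>Because the bucket depends only on the user ID, raising the percentage only moves additional users to v2; nobody already on v2 is flipped back.</p>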
</section>
<section id="resource-management" class="level3">
<h3 class="anchored" data-anchor-id="resource-management">Resource Management</h3>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb9" style="background: #f1f3f5;"><pre class="sourceCode yaml code-with-copy"><code class="sourceCode yaml"><span id="cb9-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Pod resource specification</span></span>
<span id="cb9-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">resources</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span></span>
<span id="cb9-3"><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">  </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">requests</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">           # Guaranteed minimum</span></span>
<span id="cb9-4"><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">    </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">cpu</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> </span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"250m"</span><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">       # 0.25 CPU cores</span></span>
<span id="cb9-5"><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">    </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">memory</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> </span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"256Mi"</span><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">   # 256 MiB RAM</span></span>
<span id="cb9-6"><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">  </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">limits</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">             # Maximum allowed</span></span>
<span id="cb9-7"><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">    </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">cpu</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> </span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"1000m"</span><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">      # 1 CPU core</span></span>
<span id="cb9-8"><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">    </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">memory</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> </span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"512Mi"</span><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">   # 512 MiB RAM</span></span>
<span id="cb9-9"></span>
<span id="cb9-10"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># HPA (Horizontal Pod Autoscaler)</span></span>
<span id="cb9-11"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Scale between 3-10 pods to keep average CPU near 70% of requests</span></span>
<span id="cb9-12"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">minReplicas</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> </span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span></span>
<span id="cb9-13"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">maxReplicas</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> </span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span></span>
<span id="cb9-14"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">targetCPUUtilizationPercentage</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> </span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">70</span></span></code></pre></div></div>
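<p>The autoscaler's core calculation is documented by Kubernetes as <code>desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric)</code>. The sketch below applies that formula with the 3-10 replica bounds and 70% target from the configuration above. It deliberately omits the tolerance band and stabilization windows that the real controller also applies.</p>

```python
import math


def desired_replicas(current_replicas: int, current_cpu_pct: float,
                     target_cpu_pct: float = 70.0,
                     min_replicas: int = 3, max_replicas: int = 10) -> int:
    """Simplified HPA calculation: scale proportionally, then clamp to the bounds."""
    raw = math.ceil(current_replicas * current_cpu_pct / target_cpu_pct)
    return max(min_replicas, min(max_replicas, raw))
```

<p>For example, 3 Pods averaging 90% CPU give <code>ceil(3 × 90 / 70) = 4</code> Pods, while a very large spike is capped at <code>maxReplicas</code>.</p>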
</section>
<section id="kubernetes-networking" class="level3">
<h3 class="anchored" data-anchor-id="kubernetes-networking">Kubernetes Networking</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 50%">
<col style="width: 50%">
</colgroup>
<thead>
<tr class="header">
<th>Concept</th>
<th>Purpose</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>ClusterIP</strong></td>
<td>Internal service (only within cluster)</td>
</tr>
<tr class="even">
<td><strong>NodePort</strong></td>
<td>Expose service on each node’s IP at a static port</td>
</tr>
<tr class="odd">
<td><strong>LoadBalancer</strong></td>
<td>Provision a cloud load balancer (e.g., AWS NLB) for external traffic</td>
</tr>
<tr class="even">
<td><strong>Ingress</strong></td>
<td>L7 routing rules (path-based, host-based)</td>
</tr>
<tr class="odd">
<td><strong>Network Policy</strong></td>
<td>Firewall rules between Pods (default: all-open)</td>
</tr>
<tr class="even">
<td><strong>Service Mesh</strong></td>
<td>Sidecar proxy for mTLS, observability, traffic control</td>
</tr>
</tbody>
</table>
<hr>
</section>
</section>
<section id="q7-how-do-you-design-a-cicd-pipeline" class="level2">
<h2 class="anchored" data-anchor-id="q7-how-do-you-design-a-cicd-pipeline">Q7: How Do You Design a CI/CD Pipeline?</h2>
<p><strong>Answer:</strong></p>
<p>CI/CD (Continuous Integration / Continuous Delivery) automates the process of building, testing, and deploying software. A well-designed pipeline ensures rapid, reliable releases with minimal manual intervention.</p>
<div class="cell" data-layout-align="default">
<div class="cell-output-display">
<div>
<p></p><figure class="figure"><p></p>
<div>
<pre class="mermaid mermaid-js">graph LR
    DEV["Developer&lt;br/&gt;pushes code"]
    DEV --&gt; CI["CI Pipeline"]

    subgraph CI["Continuous Integration"]
        BUILD["Build&lt;br/&gt;(compile, deps)"]
        LINT["Lint &amp;&lt;br/&gt;Static Analysis"]
        TEST["Unit Tests"]
        INT["Integration Tests"]
        SEC["Security Scan&lt;br/&gt;(SAST, deps)"]
        IMG["Build Container&lt;br/&gt;Image"]
    end

    CI --&gt; CD["CD Pipeline"]

    subgraph CD["Continuous Delivery"]
        STAGE["Deploy to&lt;br/&gt;Staging"]
        E2E["E2E Tests&lt;br/&gt;(staging)"]
        APPROVE["Manual Approval&lt;br/&gt;(optional)"]
        PROD["Deploy to&lt;br/&gt;Production"]
        SMOKE["Smoke Tests&lt;br/&gt;(production)"]
    end

    style CI fill:#56cc9d,stroke:#333,color:#fff
    style CD fill:#6cc3d5,stroke:#333,color:#fff
</pre>
</div>
<p></p></figure><p></p>
</div>
</div>
</div>
<section id="cicd-pipeline-stages" class="level3">
<h3 class="anchored" data-anchor-id="cicd-pipeline-stages">CI/CD Pipeline Stages</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 18%">
<col style="width: 23%">
<col style="width: 18%">
<col style="width: 39%">
</colgroup>
<thead>
<tr class="header">
<th>Stage</th>
<th>Purpose</th>
<th>Tools</th>
<th>Feedback Time</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Lint / Format</strong></td>
<td>Code style consistency</td>
<td>ESLint, Black, gofmt</td>
<td>&lt; 30s</td>
</tr>
<tr class="even">
<td><strong>Unit Tests</strong></td>
<td>Test individual functions/classes</td>
<td>pytest, JUnit, Jest</td>
<td>1-5 min</td>
</tr>
<tr class="odd">
<td><strong>Build</strong></td>
<td>Compile code, resolve dependencies</td>
<td>Maven, npm, pip</td>
<td>1-3 min</td>
</tr>
<tr class="even">
<td><strong>Integration Tests</strong></td>
<td>Test service interactions</td>
<td>Testcontainers, docker-compose</td>
<td>5-15 min</td>
</tr>
<tr class="odd">
<td><strong>Security Scan (SAST)</strong></td>
<td>Find vulnerabilities in code</td>
<td>Snyk, SonarQube, Semgrep</td>
<td>2-5 min</td>
</tr>
<tr class="even">
<td><strong>Container Build</strong></td>
<td>Build Docker image, push to registry</td>
<td>Docker, Buildah, Kaniko</td>
<td>2-5 min</td>
</tr>
<tr class="odd">
<td><strong>Deploy to Staging</strong></td>
<td>Deploy to pre-production environment</td>
<td>ArgoCD, Helm, Terraform</td>
<td>3-10 min</td>
</tr>
<tr class="even">
<td><strong>E2E Tests</strong></td>
<td>Full user flow tests in staging</td>
<td>Playwright, Cypress, Selenium</td>
<td>10-30 min</td>
</tr>
<tr class="odd">
<td><strong>Deploy to Production</strong></td>
<td>Rolling update / canary / blue-green</td>
<td>ArgoCD, Spinnaker, Flux</td>
<td>5-15 min</td>
</tr>
<tr class="even">
<td><strong>Smoke Tests</strong></td>
<td>Verify critical paths in production</td>
<td>Custom health checks, synthetic monitors</td>
<td>1-3 min</td>
</tr>
</tbody>
</table>
</section>
<section id="cicd-best-practices" class="level3">
<h3 class="anchored" data-anchor-id="cicd-best-practices">CI/CD Best Practices</h3>
<pre><code>Pipeline design principles:
  1. Fast feedback: fail early (lint → unit tests → integration)
  2. Immutable artifacts: build once, deploy to all environments
  3. Environment parity: staging mirrors production
  4. Infrastructure as Code: Terraform/Pulumi for infra changes
  5. GitOps: desired state in Git, reconciler applies it (ArgoCD)
  6. Feature flags: decouple deployment from release
  7. Rollback plan: every deployment has automated rollback trigger

Branch strategy:
  - Trunk-based development (preferred for fast teams):
    → Short-lived feature branches (&lt; 1 day)
    → Merge to main frequently
    → Feature flags hide incomplete work
    → Main is always deployable

  - GitFlow (for teams needing release management):
    → develop → feature branches → release branches → main
    → More overhead, longer release cycles</code></pre>
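<p>The "fast feedback" principle amounts to ordering stages from cheapest to most expensive and stopping at the first failure. A minimal sketch, in which the stage names and the <code>run_pipeline</code> helper are hypothetical stand-ins for real CI jobs:</p>

```python
from typing import Callable

# A stage is a name plus a zero-argument callable that returns True on success.
Stage = tuple[str, Callable[[], bool]]


def run_pipeline(stages: list[Stage]) -> tuple[bool, list[str]]:
    """Run stages in order and stop at the first failure (fail early)."""
    executed = []
    for name, step in stages:
        executed.append(name)
        if not step():
            return False, executed  # later, slower stages never start
    return True, executed
```

<p>With stages ordered lint, unit tests, integration tests, a failing unit test means the 5-15 minute integration stage is never started.</p>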
</section>
<section id="gitops-with-argocd" class="level3">
<h3 class="anchored" data-anchor-id="gitops-with-argocd">GitOps with ArgoCD</h3>
<pre><code>GitOps workflow:
  1. Developer merges PR → main branch
  2. CI pipeline builds image → pushes to registry (e.g., v1.2.3)
  3. CI updates manifest repo (Helm values / kustomize with new image tag)
  4. ArgoCD detects drift between Git manifest and cluster state
  5. ArgoCD applies changes to Kubernetes cluster
  6. If deployment fails health checks → ArgoCD marks the app Degraded; roll back with git revert (automated rollback needs extra tooling such as Argo Rollouts)

Benefits:
  - Git is single source of truth
  - Full audit trail (who changed what, when)
  - Easy rollback (git revert)
  - Declarative (describe desired state, not imperative steps)</code></pre>
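<p>Steps 4 and 5 describe a reconciliation loop: compare the desired state in Git with the live state, then apply the difference. The toy model below represents both states as a mapping from workload name to image tag. It is an illustration of the idea only, not ArgoCD's actual algorithm, and the function names are hypothetical.</p>

```python
def diff_state(desired: dict[str, str], actual: dict[str, str]) -> dict[str, list[str]]:
    """Compare desired (Git) and actual (cluster) image tags per workload."""
    return {
        "create": sorted(k for k in desired if k not in actual),
        "update": sorted(k for k in desired if k in actual and desired[k] != actual[k]),
        "delete": sorted(k for k in actual if k not in desired),
    }


def reconcile(desired: dict[str, str], actual: dict[str, str]) -> dict[str, str]:
    """Apply the diff so the returned state converges to the desired state."""
    plan = diff_state(desired, actual)
    new_state = dict(actual)
    for name in plan["create"] + plan["update"]:
        new_state[name] = desired[name]
    for name in plan["delete"]:
        del new_state[name]
    return new_state
```

<p>Running <code>reconcile</code> repeatedly is safe: once the states match, the diff is empty and nothing changes, which is the property that makes <code>git revert</code> a valid rollback.</p>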
<hr>
</section>
</section>
<section id="q8-how-do-you-design-a-monitoring-and-observability-system" class="level2">
<h2 class="anchored" data-anchor-id="q8-how-do-you-design-a-monitoring-and-observability-system">Q8: How Do You Design a Monitoring and Observability System?</h2>
<p><strong>Answer:</strong></p>
<p>Observability is the ability to understand a system’s internal state by examining its external outputs. The three pillars are <strong>metrics</strong>, <strong>logs</strong>, and <strong>traces</strong>. Together they enable debugging, alerting, and performance optimization.</p>
<div class="cell" data-layout-align="default">
<div class="cell-output-display">
<div>
<p></p><figure class="figure"><p></p>
<div>
<pre class="mermaid mermaid-js">graph TD
    APPS["Applications / Services"]
    APPS --&gt;|"Metrics"| PROM["Prometheus&lt;br/&gt;(time-series DB)"]
    APPS --&gt;|"Logs"| ELK["ELK Stack&lt;br/&gt;(Elasticsearch + Logstash + Kibana)"]
    APPS --&gt;|"Traces"| JAEGER["Jaeger / Zipkin&lt;br/&gt;(distributed tracing)"]

    PROM --&gt; GRAFANA["Grafana&lt;br/&gt;(dashboards)"]
    ELK --&gt; GRAFANA
    PROM --&gt; ALERT["Alertmanager&lt;br/&gt;(PagerDuty, Slack)"]

    style PROM fill:#56cc9d,stroke:#333,color:#fff
    style ELK fill:#ffce67,stroke:#333
    style JAEGER fill:#6cc3d5,stroke:#333,color:#fff
</pre>
</div>
<p></p></figure><p></p>
</div>
</div>
</div>
<section id="three-pillars-of-observability" class="level3">
<h3 class="anchored" data-anchor-id="three-pillars-of-observability">Three Pillars of Observability</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 30%">
<col style="width: 23%">
<col style="width: 19%">
<col style="width: 26%">
</colgroup>
<thead>
<tr class="header">
<th>Pillar</th>
<th>What</th>
<th>Why</th>
<th>Tools</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Metrics</strong></td>
<td>Numeric measurements over time (counters, gauges, histograms)</td>
<td>Alerting, capacity planning, SLOs</td>
<td>Prometheus, Datadog, CloudWatch</td>
</tr>
<tr class="even">
<td><strong>Logs</strong></td>
<td>Structured event records</td>
<td>Debugging specific issues, audit trail</td>
<td>ELK, Loki, Splunk, CloudWatch Logs</td>
</tr>
<tr class="odd">
<td><strong>Traces</strong></td>
<td>Request journey across services</td>
<td>Find latency bottlenecks across microservices</td>
<td>Jaeger, Zipkin, AWS X-Ray, OpenTelemetry</td>
</tr>
</tbody>
</table>
</section>
<section id="key-metrics-the-four-golden-signals" class="level3">
<h3 class="anchored" data-anchor-id="key-metrics-the-four-golden-signals">Key Metrics (The Four Golden Signals)</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 16%">
<col style="width: 33%">
<col style="width: 50%">
</colgroup>
<thead>
<tr class="header">
<th>Signal</th>
<th>What to Measure</th>
<th>Alert Threshold Example</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Latency</strong></td>
<td>Time to serve a request (P50, P95, P99)</td>
<td>P99 &gt; 500ms for 5 minutes</td>
</tr>
<tr class="even">
<td><strong>Traffic</strong></td>
<td>Requests per second (QPS)</td>
<td>Sudden drop &gt; 50% (may indicate an upstream failure)</td>
</tr>
<tr class="odd">
<td><strong>Errors</strong></td>
<td>Error rate (5xx, timeouts, failed operations)</td>
<td>Error rate &gt; 1% for 3 minutes</td>
</tr>
<tr class="even">
<td><strong>Saturation</strong></td>
<td>Resource utilization (CPU, memory, disk, connections)</td>
<td>CPU &gt; 80% for 10 minutes</td>
</tr>
</tbody>
</table>
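<p>The latency percentiles in the table can be computed from raw samples with the nearest-rank method, sketched below. Monitoring systems such as Prometheus typically estimate percentiles from histogram buckets instead, so treat this as an illustration of the definition rather than of any tool's implementation.</p>

```python
import math


def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]
```

<p>With four samples of 80, 95, 120, and 500 ms, P50 is 95 ms but P99 is 500 ms, which is why tail percentiles are better alert signals than averages.</p>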
</section>
<section id="structured-logging" class="level3">
<h3 class="anchored" data-anchor-id="structured-logging">Structured Logging</h3>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb12" style="background: #f1f3f5;"><pre class="sourceCode json code-with-copy"><code class="sourceCode json"><span id="cb12-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">{</span></span>
<span id="cb12-2">  <span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">"timestamp"</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">:</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"2026-05-21T10:30:45.123Z"</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">,</span></span>
<span id="cb12-3">  <span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">"level"</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">:</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"ERROR"</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">,</span></span>
<span id="cb12-4">  <span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">"service"</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">:</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"order-service"</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">,</span></span>
<span id="cb12-5">  <span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">"trace_id"</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">:</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"abc-123-def-456"</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">,</span></span>
<span id="cb12-6">  <span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">"span_id"</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">:</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"span-789"</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">,</span></span>
<span id="cb12-7">  <span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">"user_id"</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">:</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"user_42"</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">,</span></span>
<span id="cb12-8">  <span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">"method"</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">:</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"POST"</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">,</span></span>
<span id="cb12-9">  <span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">"path"</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">:</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"/api/v1/orders"</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">,</span></span>
<span id="cb12-10">  <span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">"status_code"</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">:</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">500</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">,</span></span>
<span id="cb12-11">  <span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">"duration_ms"</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">:</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2345</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">,</span></span>
<span id="cb12-12">  <span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">"error"</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">:</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"ConnectionRefusedError: payment-service:8080"</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">,</span></span>
<span id="cb12-13">  <span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">"message"</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">:</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Failed to process payment for order"</span></span>
<span id="cb12-14"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">}</span></span></code></pre></div></div>
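<p>A log line like the one above can be produced with Python's standard <code>logging</code> module and a custom formatter. This is a minimal sketch: the <code>JsonFormatter</code> class and <code>build_logger</code> helper are hypothetical, only a subset of the fields is emitted, and the timestamp is local time without the millisecond and <code>Z</code> suffix shown in the example.</p>

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": self.formatTime(record, "%Y-%m-%dT%H:%M:%S"),
            "level": record.levelname,
            # Fields passed via `extra=` become attributes on the record.
            "service": getattr(record, "service", "unknown"),
            "trace_id": getattr(record, "trace_id", None),
            "message": record.getMessage(),
        }
        return json.dumps(payload)


def build_logger(stream) -> logging.Logger:
    logger = logging.getLogger("order-service-demo")
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger


# Usage: log one error into an in-memory buffer and parse it back.
buffer = io.StringIO()
log = build_logger(buffer)
log.error("Failed to process payment for order",
          extra={"service": "order-service", "trace_id": "abc-123-def-456"})
entry = json.loads(buffer.getvalue())
```

<p>Carrying <code>trace_id</code> in every line is what lets a log search be joined to the corresponding distributed trace.</p>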
</section>
<section id="distributed-tracing" class="level3">
<h3 class="anchored" data-anchor-id="distributed-tracing">Distributed Tracing</h3>
<pre><code>Request: User places order
  
  ┌─ API Gateway (12ms) ─────────────────────────────────────┐
  │  ┌─ Order Service (45ms) ──────────────────────────────┐  │
  │  │  ┌─ User Service (8ms) ────┐                        │  │
  │  │  └─────────────────────────┘                        │  │
  │  │  ┌─ Payment Service (320ms) ← BOTTLENECK ─────────┐│  │
  │  │  │  ┌─ Stripe API (280ms) ───────────────────────┐││  │
  │  │  │  └────────────────────────────────────────────┘││  │
  │  │  └────────────────────────────────────────────────┘│  │
  │  │  ┌─ Inventory Service (15ms)──┐                     │  │
  │  │  └────────────────────────────┘                     │  │
  │  └─────────────────────────────────────────────────────┘  │
  └────────────────────────────────────────────────────────────┘
  Total: 392ms (Payment → Stripe is 71% of total time)</code></pre>
</section>
<section id="slos-slis-and-slas" class="level3">
<h3 class="anchored" data-anchor-id="slos-slis-and-slas">SLOs, SLIs, and SLAs</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 23%">
<col style="width: 42%">
<col style="width: 34%">
</colgroup>
<thead>
<tr class="header">
<th>Term</th>
<th>Definition</th>
<th>Example</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>SLI</strong> (Service Level Indicator)</td>
<td>The metric you measure</td>
<td>P99 latency, availability %, error rate</td>
</tr>
<tr class="even">
<td><strong>SLO</strong> (Service Level Objective)</td>
<td>The target for the SLI</td>
<td>P99 latency &lt; 200ms, 99.9% availability</td>
</tr>
<tr class="odd">
<td><strong>SLA</strong> (Service Level Agreement)</td>
<td>Contract with consequences if the agreed service level is missed</td>
<td>99.9% uptime or customer gets credits</td>
</tr>
<tr class="even">
<td><strong>Error Budget</strong></td>
<td>How much failure is allowed before violating SLO</td>
<td>99.9% = 43 minutes downtime/month budget</td>
</tr>
</tbody>
</table>
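<p>The error budget figure follows directly from the SLO: the budget is the fraction of the window in which the service is allowed to be unavailable. A small sketch of the arithmetic, assuming a 30-day window (the function name is illustrative):</p>

```python
def error_budget_minutes(slo_percent: float, window_days: int = 30) -> float:
    """Allowed downtime in minutes for an availability SLO over a window."""
    total_minutes = window_days * 24 * 60  # 43,200 minutes in 30 days
    return total_minutes * (100 - slo_percent) / 100
```

<p>For 99.9% this gives about 43.2 minutes per 30 days; each additional nine divides the budget by ten, so 99.99% leaves roughly 4.3 minutes.</p>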
</section>
<section id="alerting-strategy" class="level3">
<h3 class="anchored" data-anchor-id="alerting-strategy">Alerting Strategy</h3>
<pre><code>Alert design principles:
  1. Alert on symptoms, not causes
     ✅ "Error rate &gt; 5% for 3 min"  (symptom)
     ❌ "CPU &gt; 90%"  (may not impact users)

  2. Severity levels:
     - P1 (Critical): Revenue impacting, page immediately
     - P2 (High): Degraded service, page during business hours
     - P3 (Medium): Non-urgent, ticket in queue
     - P4 (Low): Informational, dashboard only

  3. Reduce noise:
     - Group related alerts
     - Require duration threshold (not single spike)
     - Suppress during maintenance windows
     - Escalation: Slack → PagerDuty → phone call</code></pre>
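<p>The "require duration threshold" rule can be sketched as a check over the most recent samples. The example assumes one error-rate sample per minute, so three samples correspond to the "for 3 min" condition above; the function name and parameters are hypothetical.</p>

```python
def should_fire(error_rates: list[float], threshold: float = 0.05,
                required_samples: int = 3) -> bool:
    """Fire only if the most recent `required_samples` readings all exceed the threshold."""
    if required_samples > len(error_rates):
        return False  # not enough history yet
    recent = error_rates[-required_samples:]
    return all(rate > threshold for rate in recent)
```

<p>A single spike such as <code>[0.01, 0.20, 0.01]</code> does not fire, whereas a sustained breach such as <code>[0.06, 0.07, 0.08]</code> does.</p>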
<hr>
</section>
</section>
<section id="q9-how-does-event-driven-architecture-work" class="level2">
<h2 class="anchored" data-anchor-id="q9-how-does-event-driven-architecture-work">Q9: How Does Event-Driven Architecture Work?</h2>
<p><strong>Answer:</strong></p>
<p>Event-Driven Architecture (EDA) is a design pattern where services communicate by producing and consuming events (facts about something that happened). It enables loose coupling, real-time processing, and scalable async workflows.</p>
<div class="cell" data-layout-align="default">
<div class="cell-output-display">
<div>
<p></p><figure class="figure"><p></p>
<div>
<pre class="mermaid mermaid-js">graph TD
    subgraph Producers["Event Producers"]
        US["User Service&lt;br/&gt;→ UserCreated"]
        OS["Order Service&lt;br/&gt;→ OrderPlaced"]
        PS["Payment Service&lt;br/&gt;→ PaymentProcessed"]
    end

    subgraph EventBus["Event Bus / Broker (Kafka)"]
        T1["Topic: user-events"]
        T2["Topic: order-events"]
        T3["Topic: payment-events"]
    end

    subgraph Consumers["Event Consumers"]
        EMAIL["Email Service"]
        ANALYTICS["Analytics Service"]
        INVENTORY["Inventory Service"]
        SEARCH["Search Indexer"]
    end

    US --&gt; T1
    OS --&gt; T2
    PS --&gt; T3
    T1 --&gt; EMAIL
    T1 --&gt; ANALYTICS
    T2 --&gt; INVENTORY
    T2 --&gt; ANALYTICS
    T3 --&gt; EMAIL
    T3 --&gt; SEARCH

    style EventBus fill:#56cc9d,stroke:#333,color:#fff
    style Consumers fill:#ffce67,stroke:#333
</pre>
</div>
<p></p></figure><p></p>
</div>
</div>
</div>
<section id="event-types" class="level3">
<h3 class="anchored" data-anchor-id="event-types">Event Types</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 17%">
<col style="width: 38%">
<col style="width: 26%">
<col style="width: 17%">
</colgroup>
<thead>
<tr class="header">
<th>Type</th>
<th>Description</th>
<th>Example</th>
<th>Size</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Domain Event</strong></td>
<td>Something significant happened in the business</td>
<td><code>OrderPlaced</code>, <code>UserRegistered</code></td>
<td>Small (metadata + IDs)</td>
</tr>
<tr class="even">
<td><strong>Integration Event</strong></td>
<td>Event shared between services (bounded contexts)</td>
<td><code>PaymentCompleted</code> consumed by Order service</td>
<td>Small</td>
</tr>
<tr class="odd">
<td><strong>Event-Carried State Transfer</strong></td>
<td>Event contains full state (eliminates need to query source)</td>
<td><code>OrderPlaced { items: [...], total: 99.50, address: {...} }</code></td>
<td>Large</td>
</tr>
<tr class="even">
<td><strong>Change Data Capture (CDC)</strong></td>
<td>Database changes streamed as events</td>
<td>Debezium captures INSERT/UPDATE/DELETE from the database's transaction log (e.g., MySQL binlog, PostgreSQL WAL)</td>
<td>Row-level</td>
</tr>
</tbody>
</table>
</section>
<section id="event-sourcing" class="level3">
<h3 class="anchored" data-anchor-id="event-sourcing">Event Sourcing</h3>
<div class="cell" data-layout-align="default">
<div class="cell-output-display">
<div>
<p></p><figure class="figure"><p></p>
<div>
<pre class="mermaid mermaid-js">graph LR
    CMD["Command:&lt;br/&gt;PlaceOrder"]
    CMD --&gt; ES["Event Store&lt;br/&gt;(append-only log)"]
    ES --&gt; E1["OrderCreated"]
    ES --&gt; E2["ItemAdded (x3)"]
    ES --&gt; E3["PaymentReceived"]
    ES --&gt; E4["OrderShipped"]

    ES --&gt;|"Replay events"| STATE["Current State:&lt;br/&gt;Order #123&lt;br/&gt;Status: Shipped&lt;br/&gt;Items: 3&lt;br/&gt;Total: $59.99"]

    style ES fill:#56cc9d,stroke:#333,color:#fff
    style STATE fill:#6cc3d5,stroke:#333,color:#fff
</pre>
</div>
<p></p></figure><p></p>
</div>
</div>
</div>
<table class="caption-top table">
<colgroup>
<col style="width: 18%">
<col style="width: 44%">
<col style="width: 37%">
</colgroup>
<thead>
<tr class="header">
<th>Aspect</th>
<th>Traditional (CRUD)</th>
<th>Event Sourcing</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Storage</strong></td>
<td>Current state only</td>
<td>Full history of events</td>
</tr>
<tr class="even">
<td><strong>State</strong></td>
<td>Mutable (UPDATE/DELETE)</td>
<td>Immutable (append-only)</td>
</tr>
<tr class="odd">
<td><strong>Audit trail</strong></td>
<td>Requires separate logging</td>
<td>Built-in (every change is an event)</td>
</tr>
<tr class="even">
<td><strong>Debugging</strong></td>
<td>“Why is it in this state?”</td>
<td>Replay events to see exactly what happened</td>
</tr>
<tr class="odd">
<td><strong>Complexity</strong></td>
<td>Simple CRUD operations</td>
<td>Event replay, projections, eventual consistency</td>
</tr>
<tr class="even">
<td><strong>Best for</strong></td>
<td>Simple domains</td>
<td>Financial systems, audit-heavy, undo/redo needed</td>
</tr>
</tbody>
</table>
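<p>To make "replay events" concrete, here is a minimal in-memory sketch. The event shapes, the <code>Order</code> class, and the <code>apply_event</code>/<code>replay</code> helpers are assumptions for illustration; a real event store would persist the log and usually add snapshots.</p>

```python
from dataclasses import dataclass, field


@dataclass
class Order:
    order_id: str = ""
    status: str = "New"
    items: list = field(default_factory=list)
    total_cents: int = 0  # integer cents avoid float rounding on money


def apply_event(order: Order, event: dict) -> Order:
    """Fold one event into the current state."""
    kind, data = event["type"], event["data"]
    if kind == "OrderCreated":
        order.order_id = data["order_id"]
        order.status = "Created"
    elif kind == "ItemAdded":
        order.items.append(data["sku"])
        order.total_cents += data["price_cents"]
    elif kind == "PaymentReceived":
        order.status = "Paid"
    elif kind == "OrderShipped":
        order.status = "Shipped"
    return order


def replay(events: list) -> Order:
    """Rebuild state from scratch by applying events in order."""
    order = Order()
    for event in events:
        apply_event(order, event)
    return order


# Append-only log for one order (illustrative data).
EVENTS = [
    {"type": "OrderCreated", "data": {"order_id": "123"}},
    {"type": "ItemAdded", "data": {"sku": "A", "price_cents": 1999}},
    {"type": "ItemAdded", "data": {"sku": "B", "price_cents": 1500}},
    {"type": "ItemAdded", "data": {"sku": "C", "price_cents": 2500}},
    {"type": "PaymentReceived", "data": {}},
    {"type": "OrderShipped", "data": {}},
]

current = replay(EVENTS)
as_of_payment = replay(EVENTS[:5])  # state at an earlier point in history
print(current.status, len(current.items), current.total_cents)
print(as_of_payment.status)
```

<p>Replaying a prefix of the log is what gives event sourcing its audit and debugging properties: any past state can be reconstructed without a separate history table.</p>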
</section>
<section id="cqrs-command-query-responsibility-segregation" class="level3">
<h3 class="anchored" data-anchor-id="cqrs-command-query-responsibility-segregation">CQRS (Command Query Responsibility Segregation)</h3>
<div class="cell" data-layout-align="default">
<div class="cell-output-display">
<div>
<p></p><figure class="figure"><p></p>
<div>
<pre class="mermaid mermaid-js">graph TD
    CLIENT["Client"]
    CLIENT --&gt;|"Write (Command)"| WRITE["Write Model&lt;br/&gt;(normalized DB)"]
    CLIENT --&gt;|"Read (Query)"| READ["Read Model&lt;br/&gt;(denormalized views)"]

    WRITE --&gt;|"Events"| PROJ["Projection Service"]
    PROJ --&gt; READ

    style WRITE fill:#56cc9d,stroke:#333,color:#fff
    style READ fill:#6cc3d5,stroke:#333,color:#fff
    style PROJ fill:#ffce67,stroke:#333
</pre>
</div>
<p></p></figure><p></p>
</div>
</div>
</div>
<table class="caption-top table">
<colgroup>
<col style="width: 38%">
<col style="width: 61%">
</colgroup>
<thead>
<tr class="header">
<th>Aspect</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Why CQRS</strong></td>
<td>Reads and writes have different performance profiles and scaling needs</td>
</tr>
<tr class="even">
<td><strong>Write side</strong></td>
<td>Normalized, optimized for consistency and validation</td>
</tr>
<tr class="odd">
<td><strong>Read side</strong></td>
<td>Denormalized, pre-computed views optimized for queries</td>
</tr>
<tr class="even">
<td><strong>Sync mechanism</strong></td>
<td>Events from write side update read projections (async)</td>
</tr>
<tr class="odd">
<td><strong>Trade-off</strong></td>
<td>Eventual consistency between write and read models</td>
</tr>
<tr class="even">
<td><strong>Pairs with</strong></td>
<td>Event Sourcing (events feed both write log and read projections)</td>
</tr>
</tbody>
</table>
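<p>The write/read split can be sketched with a list standing in for the write-side event stream and a projection that catches up asynchronously. Everything here (<code>place_order</code>, <code>OrderSummaryProjection</code>, the event fields) is a hypothetical illustration, not a specific framework's API.</p>

```python
def place_order(log: list, customer_id: str, total_cents: int) -> None:
    """Write side: validate the command, then append an event."""
    if total_cents <= 0:
        raise ValueError("order total must be positive")
    log.append({"type": "OrderPlaced",
                "customer_id": customer_id,
                "total_cents": total_cents})


class OrderSummaryProjection:
    """Read side: denormalized per-customer totals, updated from events."""

    def __init__(self):
        self.offset = 0        # position in the log already projected
        self.by_customer = {}  # customer_id -> precomputed summary row

    def catch_up(self, log: list) -> None:
        for event in log[self.offset:]:
            if event["type"] == "OrderPlaced":
                row = self.by_customer.setdefault(
                    event["customer_id"], {"orders": 0, "spent_cents": 0})
                row["orders"] += 1
                row["spent_cents"] += event["total_cents"]
        self.offset = len(log)

    def query(self, customer_id: str) -> dict:
        return self.by_customer.get(customer_id, {"orders": 0, "spent_cents": 0})


log = []
projection = OrderSummaryProjection()
place_order(log, "c1", 2500)
print(projection.query("c1"))  # read model has not caught up yet
projection.catch_up(log)
print(projection.query("c1"))  # now reflects the write
```

<p>The gap between the two <code>query</code> calls is the eventual-consistency trade-off from the table: reads are cheap lookups, but they can briefly lag behind writes.</p>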
</section>
<section id="idempotent-event-processing" class="level3">
<h3 class="anchored" data-anchor-id="idempotent-event-processing">Idempotent Event Processing</h3>
<pre><code>Problem: Network failures → events may be delivered multiple times.
         Consumer must handle duplicates safely.

Solutions:
  1. Idempotency key in every event:
     Event: { "id": "evt_abc123", "type": "PaymentReceived", "data": {...} }
     Consumer: 
       - Before processing, check: "Have I seen evt_abc123?"
       - If yes → skip
       - If no → process + record evt_abc123 in processed_events table
         (in the same transaction, so a crash cannot leave one without the other)

  2. Idempotent operations (naturally safe):
     - SET operations (overwrite): last write wins
     - Upsert with same data: same result regardless of count

  3. Transactional outbox pattern (producer side):
     - Write business data + event to same DB (single transaction)
     - Background process reads outbox table → publishes to Kafka
     - Guarantees: if data saved, event will eventually publish
       (at-least-once: the relay may publish duplicates, so consumers
       still need solution 1 or 2)</code></pre>
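<p>Solution 1 can be sketched with SQLite from the Python standard library. The table names and the <code>handle_payment</code> function are assumptions for illustration; the point is that the dedup record and the side effect commit in one transaction.</p>

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE processed_events (event_id TEXT PRIMARY KEY)")
    db.execute("CREATE TABLE balances (account TEXT PRIMARY KEY, cents INTEGER NOT NULL)")
    return db


def handle_payment(db: sqlite3.Connection, event: dict) -> bool:
    """Apply a PaymentReceived event at most once. Returns True if applied."""
    account = event["data"]["account"]
    cents = event["data"]["cents"]
    try:
        with db:  # one transaction: commits on success, rolls back on error
            # The primary key rejects an event id that was already recorded.
            db.execute("INSERT INTO processed_events (event_id) VALUES (?)",
                       (event["id"],))
            db.execute("INSERT OR IGNORE INTO balances (account, cents) VALUES (?, 0)",
                       (account,))
            db.execute("UPDATE balances SET cents = cents + ? WHERE account = ?",
                       (cents, account))
        return True
    except sqlite3.IntegrityError:
        return False  # duplicate delivery: nothing was changed


db = make_db()
evt = {"id": "evt_abc123", "type": "PaymentReceived",
       "data": {"account": "acct_1", "cents": 500}}
print(handle_payment(db, evt))  # first delivery is applied
print(handle_payment(db, evt))  # redelivery is skipped
```

<p>Relying on the primary-key constraint instead of a separate "have I seen this?" query avoids a race between two consumers that receive the same event at the same time.</p>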
<hr>
</section>
</section>
<section id="q10-how-do-you-design-for-service-mesh-and-inter-service-communication" class="level2">
<h2 class="anchored" data-anchor-id="q10-how-do-you-design-for-service-mesh-and-inter-service-communication">Q10: How Do You Design for Service Mesh and Inter-Service Communication?</h2>
<p><strong>Answer:</strong></p>
<p>A service mesh is an infrastructure layer that handles service-to-service communication, providing observability, security (mTLS), and traffic management without changing application code. It’s typically implemented as sidecar proxies alongside each service.</p>
<div class="cell" data-layout-align="default">
<div class="cell-output-display">
<div>
<p></p><figure class="figure"><p></p>
<div>
<pre class="mermaid mermaid-js">graph TD
    subgraph PodA["Pod: Order Service"]
        APP_A["Order Service&lt;br/&gt;(application)"]
        PROXY_A["Envoy Sidecar&lt;br/&gt;(proxy)"]
    end

    subgraph PodB["Pod: Payment Service"]
        APP_B["Payment Service&lt;br/&gt;(application)"]
        PROXY_B["Envoy Sidecar&lt;br/&gt;(proxy)"]
    end

    subgraph PodC["Pod: User Service"]
        APP_C["User Service&lt;br/&gt;(application)"]
        PROXY_C["Envoy Sidecar&lt;br/&gt;(proxy)"]
    end

    PROXY_A --&gt;|"mTLS"| PROXY_B
    PROXY_A --&gt;|"mTLS"| PROXY_C

    CP["Control Plane&lt;br/&gt;(Istio / Linkerd)"]
    CP --&gt;|"Config, certs"| PROXY_A
    CP --&gt;|"Config, certs"| PROXY_B
    CP --&gt;|"Config, certs"| PROXY_C

    style CP fill:#56cc9d,stroke:#333,color:#fff
    style PodA fill:#ffce67,stroke:#333
    style PodB fill:#6cc3d5,stroke:#333,color:#fff
</pre>
</div>
<p></p></figure><p></p>
</div>
</div>
</div>
<section id="what-a-service-mesh-provides" class="level3">
<h3 class="anchored" data-anchor-id="what-a-service-mesh-provides">What a Service Mesh Provides</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 25%">
<col style="width: 37%">
<col style="width: 37%">
</colgroup>
<thead>
<tr class="header">
<th>Feature</th>
<th>Description</th>
<th>Without Mesh</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>mTLS</strong></td>
<td>Automatic encryption + identity between all services</td>
<td>Each service manages certs manually</td>
</tr>
<tr class="even">
<td><strong>Traffic management</strong></td>
<td>Canary releases, A/B testing, fault injection</td>
<td>Custom load balancer config per service</td>
</tr>
<tr class="odd">
<td><strong>Observability</strong></td>
<td>Automatic metrics, traces, access logs from proxy</td>
<td>Instrument every service manually</td>
</tr>
<tr class="even">
<td><strong>Retries &amp; timeouts</strong></td>
<td>Configurable retry policies per route</td>
<td>Each service implements retry logic</td>
</tr>
<tr class="odd">
<td><strong>Circuit breaking</strong></td>
<td>Auto-stop traffic to failing services</td>
<td>Library-based (Hystrix, Resilience4j)</td>
</tr>
<tr class="even">
<td><strong>Rate limiting</strong></td>
<td>Per-service traffic control</td>
<td>Centralized rate limiter service</td>
</tr>
<tr class="odd">
<td><strong>Access control</strong></td>
<td>Policy-based authorization (which service can call which)</td>
<td>Manual firewall rules / code checks</td>
</tr>
</tbody>
</table>
</section>
<section id="service-mesh-comparison" class="level3">
<h3 class="anchored" data-anchor-id="service-mesh-comparison">Service Mesh Comparison</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 21%">
<col style="width: 17%">
<col style="width: 21%">
<col style="width: 39%">
</colgroup>
<thead>
<tr class="header">
<th>Feature</th>
<th>Istio</th>
<th>Linkerd</th>
<th>Consul Connect</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Proxy</strong></td>
<td>Envoy</td>
<td>Linkerd2-proxy (Rust)</td>
<td>Envoy or built-in</td>
</tr>
<tr class="even">
<td><strong>Complexity</strong></td>
<td>High (many CRDs)</td>
<td>Low (lightweight)</td>
<td>Medium</td>
</tr>
<tr class="odd">
<td><strong>Performance</strong></td>
<td>Moderate overhead</td>
<td>Low overhead</td>
<td>Low overhead</td>
</tr>
<tr class="even">
<td><strong>Features</strong></td>
<td>Full-featured (traffic, security, observability)</td>
<td>Core features, simple</td>
<td>Service discovery + mesh</td>
</tr>
<tr class="odd">
<td><strong>Best for</strong></td>
<td>Large orgs needing full control</td>
<td>Teams wanting simplicity</td>
<td>HashiCorp ecosystem users</td>
</tr>
</tbody>
</table>
</section>
<section id="traffic-management-patterns" class="level3">
<h3 class="anchored" data-anchor-id="traffic-management-patterns">Traffic Management Patterns</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 27%">
<col style="width: 27%">
<col style="width: 45%">
</colgroup>
<thead>
<tr class="header">
<th>Pattern</th>
<th>Purpose</th>
<th>Configuration</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Canary</strong></td>
<td>Route 5% traffic to v2, 95% to v1</td>
<td>Weight-based routing</td>
</tr>
<tr class="even">
<td><strong>Header-based routing</strong></td>
<td>Internal testers get v2 via header <code>x-version: canary</code></td>
<td>Match rules on headers</td>
</tr>
<tr class="odd">
<td><strong>Fault injection</strong></td>
<td>Inject 500ms delay to test resilience</td>
<td>Delay/abort rules for testing</td>
</tr>
<tr class="even">
<td><strong>Mirroring</strong></td>
<td>Copy production traffic to test environment</td>
<td>Traffic shadowing (no impact to users)</td>
</tr>
<tr class="odd">
<td><strong>Circuit breaking</strong></td>
<td>Max 100 concurrent requests per service</td>
<td>Connection pool limits</td>
</tr>
<tr class="even">
<td><strong>Retry budget</strong></td>
<td>Max 20% additional requests as retries</td>
<td>Retry budget ratio (prevents retry storms)</td>
</tr>
</tbody>
</table>
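<p>The canary and header-based rows combine naturally: a header override wins, otherwise a fixed share of traffic goes to the new version. The sketch below is not how any particular mesh implements it; it assumes sticky, hash-based bucketing (many proxies pick randomly per request instead), and <code>pick_version</code> and <code>route</code> are hypothetical names.</p>

```python
import hashlib


def pick_version(request_key: str, canary_percent: int = 5) -> str:
    """Send roughly canary_percent of keys to v2, the rest to v1.
    Hashing the key keeps each user on the same version across requests."""
    digest = hashlib.sha256(request_key.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # 0..99
    return "v2" if bucket < canary_percent else "v1"


def route(headers: dict, request_key: str, canary_percent: int = 5) -> str:
    """Header match first (internal testers), then weight-based routing."""
    if headers.get("x-version") == "canary":
        return "v2"
    return pick_version(request_key, canary_percent)


share = sum(route({}, f"user-{i}") == "v2" for i in range(10_000)) / 10_000
print(f"canary share: {share:.1%}")
print(route({"x-version": "canary"}, "user-1"))
```

<p>In a mesh this logic lives in proxy configuration, so rolling from 5% to 50% to 100% is a config change and needs no application redeploy.</p>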
</section>
<section id="when-to-use-and-not-use-a-service-mesh" class="level3">
<h3 class="anchored" data-anchor-id="when-to-use-and-not-use-a-service-mesh">When to Use (and NOT Use) a Service Mesh</h3>
<pre><code>Use a service mesh when:
  ✅ Running 10+ microservices in production
  ✅ Need mTLS between all services (zero trust)
  ✅ Want consistent observability without code changes
  ✅ Complex traffic routing (canary, A/B, fault injection)
  ✅ Need policy-based access control

Do NOT use when:
  ❌ Fewer than 5 services (overhead not worth it)
  ❌ Team doesn't have Kubernetes expertise
  ❌ Simple request-response with no special routing
  ❌ Latency-critical paths where sidecar overhead matters (~1-3ms)
  ❌ Monolith or early-stage product</code></pre>
<hr>
</section>
</section>
<section id="summary-table" class="level2">
<h2 class="anchored" data-anchor-id="summary-table">Summary Table</h2>
<table class="caption-top table">
<colgroup>
<col style="width: 13%">
<col style="width: 30%">
<col style="width: 56%">
</colgroup>
<thead>
<tr class="header">
<th>#</th>
<th>Topic</th>
<th>Key Concepts</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td>1</td>
<td><strong>Load Balancing</strong></td>
<td>L4 vs L7, algorithms (round robin, least connections, consistent hashing), health checks, sticky sessions</td>
</tr>
<tr class="even">
<td>2</td>
<td><strong>Caching</strong></td>
<td>Cache-aside, write-through, write-behind, eviction policies, Redis vs Memcached, stampede prevention</td>
</tr>
<tr class="odd">
<td>3</td>
<td><strong>Message Queues</strong></td>
<td>Kafka vs RabbitMQ vs SQS, delivery guarantees, DLQ, partitions, consumer groups</td>
</tr>
<tr class="even">
<td>4</td>
<td><strong>Microservices</strong></td>
<td>Service communication, Saga pattern, service discovery, database per service</td>
</tr>
<tr class="odd">
<td>5</td>
<td><strong>Database Scaling</strong></td>
<td>Replication (sync/async), sharding strategies, replication lag, cross-shard queries</td>
</tr>
<tr class="even">
<td>6</td>
<td><strong>Kubernetes</strong></td>
<td>Pods, Deployments, Services, HPA, rolling/blue-green/canary deploys, resource limits</td>
</tr>
<tr class="odd">
<td>7</td>
<td><strong>CI/CD</strong></td>
<td>Pipeline stages, GitOps, ArgoCD, trunk-based development, immutable artifacts</td>
</tr>
<tr class="even">
<td>8</td>
<td><strong>Monitoring</strong></td>
<td>Metrics/Logs/Traces, four golden signals, SLOs, alerting strategy, distributed tracing</td>
</tr>
<tr class="odd">
<td>9</td>
<td><strong>Event-Driven Architecture</strong></td>
<td>Event sourcing, CQRS, CDC, idempotent processing, transactional outbox</td>
</tr>
<tr class="even">
<td>10</td>
<td><strong>Service Mesh</strong></td>
<td>Sidecar proxy (Envoy), mTLS, traffic management, Istio vs Linkerd, when to use</td>
</tr>
</tbody>
</table>
<hr>
</section>
<section id="whats-next" class="level2">
<h2 class="anchored" data-anchor-id="whats-next">What’s Next?</h2>
<p>This article covered infrastructure components and operational patterns. Continue with:</p>
<ul>
<li><strong>Foundational concepts:</strong> <a href="../../posts/system-design/System-Design-Interview-QA-1.html">System Design Interview QA - 1</a> — scalability, CAP theorem, APIs, networking, security</li>
<li><strong>Hands-on design problems:</strong> <a href="../../posts/system-design/System-Design-Interview-QA-3.html">System Design Interview QA - 3</a> — URL shortener, chat system, news feed, video streaming</li>
<li><strong>Design patterns:</strong> <a href="../../posts/design-pattern/Design-Pattern-Interview-QA-1.html">Design Pattern Interview QA - 1</a></li>
<li><strong>Enterprise patterns (Spring, CQRS):</strong> <a href="../../posts/design-pattern/Design-Pattern-Interview-QA-2.html">Design Pattern Interview QA - 2</a></li>
</ul>


</section>

 ]]></description>
  <guid>https://vectoringai.com/posts/system-design/System-Design-Interview-QA-2.html</guid>
  <pubDate>Thu, 21 May 2026 00:00:00 GMT</pubDate>
  <media:content url="https://vectoringai.com/images/system-design/thumb_system_design_interview_qa_300.png" medium="image" type="image/png" height="96" width="144"/>
</item>
<item>
  <title>System Design Interview QA - 3</title>
  <dc:creator>Vectoring AI</dc:creator>
  <link>https://vectoringai.com/posts/system-design/System-Design-Interview-QA-3.html</link>
  <description><![CDATA[ 




<section id="introduction" class="level2">
<h2 class="anchored" data-anchor-id="introduction">Introduction</h2>
<p>This is <strong>Part 3</strong> of our System Design Interview QA series, covering the <strong>10 most frequently asked system design questions</strong> at FAANG+ companies. Each question follows the proven 4-step framework: <strong>Requirements → High-Level Design → Deep Dive → Trade-offs</strong>.</p>
<blockquote class="blockquote">
<p>For foundational concepts, see <a href="../../posts/system-design/System-Design-Interview-QA-1.html">System Design Interview QA - 1</a>. For infrastructure deep dives, see <a href="../../posts/system-design/System-Design-Interview-QA-2.html">System Design Interview QA - 2</a>. For design patterns, see <a href="../../posts/design-pattern/Design-Pattern-Interview-QA-1.html">Design Pattern Interview QA - 1</a>.</p>
</blockquote>
<hr>
</section>
<section id="q1-design-a-url-shortener-tinyurl" class="level2">
<h2 class="anchored" data-anchor-id="q1-design-a-url-shortener-tinyurl">Q1: Design a URL Shortener (TinyURL)</h2>
<p><strong>Answer:</strong></p>
<p>A URL shortener maps long URLs to short, unique aliases (e.g., <code>https://tiny.url/a1b2c3</code>) and redirects users to the original URL.</p>
<div class="cell" data-layout-align="default">
<div class="cell-output-display">
<div>
<p></p><figure class="figure"><p></p>
<div>
<pre class="mermaid mermaid-js">graph TD
    USER["User"]
    USER --&gt;|"POST /shorten&lt;br/&gt;{url: 'https://very-long-url.com/...'}"| API["API Service"]
    API --&gt; KEYGEN["Key Generator&lt;br/&gt;(Base62 encoding)"]
    API --&gt; DB["Database&lt;br/&gt;(short_code → original_url)"]
    API --&gt;|"Returns: tiny.url/a1b2c3"| USER

    USER2["Visitor"]
    USER2 --&gt;|"GET /a1b2c3"| LB["Load Balancer"]
    LB --&gt; CACHE["Cache (Redis)&lt;br/&gt;(hot URLs)"]
    CACHE --&gt;|"miss"| DB
    LB --&gt;|"301 Redirect"| USER2

    style API fill:#56cc9d,stroke:#333,color:#fff
    style CACHE fill:#ffce67,stroke:#333
    style DB fill:#6cc3d5,stroke:#333,color:#fff
</pre>
</div>
<p></p></figure><p></p>
</div>
</div>
</div>
<section id="requirements" class="level3">
<h3 class="anchored" data-anchor-id="requirements">Requirements</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 31%">
<col style="width: 68%">
</colgroup>
<thead>
<tr class="header">
<th>Type</th>
<th>Requirement</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Functional</strong></td>
<td>Shorten a URL → return short link; Redirect short link → original URL; Optional: custom aliases, expiration, analytics</td>
</tr>
<tr class="even">
<td><strong>Non-functional</strong></td>
<td>Low latency redirects (&lt;100ms); High availability (99.99%); 100M URLs/day write, 10:1 read-to-write ratio</td>
</tr>
<tr class="odd">
<td><strong>Capacity</strong></td>
<td>~36.5B URLs/year (100M/day × 365); ~1KB per record → ~36.5TB storage/year; ~1.2K writes/sec and ~12K reads/sec on average, so budget for ~100K reads/sec at peak</td>
</tr>
</tbody>
</table>
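<p>Capacity figures are worth deriving from the stated inputs during an interview. A small sketch of the arithmetic, using the write rate and read-to-write ratio from the non-functional row as inputs; <code>estimate</code> is a hypothetical helper and the 1 KB record size is taken as exactly 1,000 bytes for simplicity:</p>

```python
SECONDS_PER_DAY = 86_400


def estimate(urls_per_day: int, read_write_ratio: int, bytes_per_record: int) -> dict:
    """Back-of-the-envelope averages; peaks need extra headroom on top."""
    writes_per_sec = urls_per_day / SECONDS_PER_DAY
    urls_per_year = urls_per_day * 365
    return {
        "writes_per_sec": writes_per_sec,
        "reads_per_sec": writes_per_sec * read_write_ratio,
        "urls_per_year": urls_per_year,
        "storage_tb_per_year": urls_per_year * bytes_per_record / 1e12,
    }


numbers = estimate(urls_per_day=100_000_000, read_write_ratio=10, bytes_per_record=1_000)
for name, value in numbers.items():
    print(f"{name}: {value:,.1f}")
```

<p>Swap in your own inputs; the useful habit is stating the assumptions out loud and checking that the daily, yearly, and per-second figures agree with each other.</p>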
</section>
<section id="key-design-decisions" class="level3">
<h3 class="anchored" data-anchor-id="key-design-decisions">Key Design Decisions</h3>
<pre><code>Short Code Generation:
  Option A: Hash (MD5/SHA256) → take first 7 chars → collision check
  Option B: Pre-generated key service (counter-based, Base62 encoded)
  Option C: Snowflake ID → Base62 encode

  Recommended: Counter-based with Base62 encoding
    - 7 chars of Base62 = 62^7 = ~3.5 trillion unique codes
    - No collision checking needed
    - Monotonically increasing → good for DB indexing</code></pre>
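<p>The counter-plus-Base62 option can be sketched in a few lines. The alphabet order below (digits, uppercase, lowercase) is an assumption; any fixed 62-character alphabet works as long as encoder and decoder agree.</p>

```python
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
BASE = len(ALPHABET)  # 62


def encode_base62(n: int) -> str:
    """Turn a non-negative counter value into a short code."""
    if n < 0:
        raise ValueError("counter must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n:
        n, remainder = divmod(n, BASE)
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))  # most significant digit first


def decode_base62(code: str) -> int:
    """Inverse of encode_base62."""
    n = 0
    for ch in code:
        n = n * BASE + ALPHABET.index(ch)
    return n


counter = 123_456_789
code = encode_base62(counter)
print(code, decode_base62(code) == counter)
print(len(encode_base62(BASE ** 7 - 1)))  # largest value that fits in 7 chars
```

<p>Because each counter value maps to exactly one code, uniqueness comes from the counter itself and no collision lookup is needed on write.</p>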
</section>
<section id="database-schema" class="level3">
<h3 class="anchored" data-anchor-id="database-schema">Database Schema</h3>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode sql code-with-copy"><code class="sourceCode sql"><span id="cb2-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-- URLs table</span></span>
<span id="cb2-2"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">CREATE</span> <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">TABLE</span> urls (</span>
<span id="cb2-3">    short_code  <span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">VARCHAR</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">7</span>) <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">PRIMARY</span> <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">KEY</span>,  <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-- Base62 encoded</span></span>
<span id="cb2-4">    original_url TEXT <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">NOT</span> <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">NULL</span>,</span>
<span id="cb2-5">    user_id     BIGINT,</span>
<span id="cb2-6">    created_at  <span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">TIMESTAMP</span> <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">DEFAULT</span> NOW(),</span>
<span id="cb2-7">    expires_at  <span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">TIMESTAMP</span>,</span>
<span id="cb2-8">    click_count BIGINT <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">DEFAULT</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span></span>
<span id="cb2-9">);</span>
<span id="cb2-10"></span>
<span id="cb2-11"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-- Analytics (separate table for write performance)</span></span>
<span id="cb2-12"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">CREATE</span> <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">TABLE</span> clicks (</span>
<span id="cb2-13">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">id</span>          BIGSERIAL <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">PRIMARY</span> <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">KEY</span>,</span>
<span id="cb2-14">    short_code  <span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">VARCHAR</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">7</span>),</span>
<span id="cb2-15">    clicked_at  <span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">TIMESTAMP</span> <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">DEFAULT</span> NOW(),</span>
<span id="cb2-16">    user_agent  TEXT,</span>
<span id="cb2-17">    ip_address  INET,</span>
<span id="cb2-18">    referrer    TEXT</span>
<span id="cb2-19">);</span></code></pre></div></div>
</section>
<section id="trade-offs" class="level3">
<h3 class="anchored" data-anchor-id="trade-offs">Trade-offs</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 21%">
<col style="width: 21%">
<col style="width: 21%">
<col style="width: 34%">
</colgroup>
<thead>
<tr class="header">
<th>Decision</th>
<th>Option A</th>
<th>Option B</th>
<th>Recommendation</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Storage</strong></td>
<td>SQL (PostgreSQL)</td>
<td>NoSQL (DynamoDB)</td>
<td>NoSQL for scale — simple key-value access pattern</td>
</tr>
<tr class="even">
<td><strong>Redirect</strong></td>
<td>301 (permanent)</td>
<td>302 (temporary)</td>
<td>302 if you need analytics; 301 for caching</td>
</tr>
<tr class="odd">
<td><strong>Caching</strong></td>
<td>Cache all</td>
<td>Cache hot URLs only</td>
<td>Cache hot URLs in Redis (80/20 rule)</td>
</tr>
<tr class="even">
<td><strong>ID generation</strong></td>
<td>Centralized counter</td>
<td>Distributed (Snowflake)</td>
<td>Distributed for multi-region</td>
</tr>
</tbody>
</table>
<hr>
</section>
</section>
<section id="q2-design-a-rate-limiter" class="level2">
<h2 class="anchored" data-anchor-id="q2-design-a-rate-limiter">Q2: Design a Rate Limiter</h2>
<p><strong>Answer:</strong></p>
<p>A rate limiter controls the rate of requests a client can send to an API, protecting against abuse, DDoS attacks, and resource exhaustion.</p>
<div class="cell" data-layout-align="default">
<div class="cell-output-display">
<div>
<p></p><figure class="figure"><p></p>
<div>
<pre class="mermaid mermaid-js">graph TD
    CLIENT["Client"]
    CLIENT --&gt; RL["Rate Limiter&lt;br/&gt;(middleware / API gateway)"]
    RL --&gt;|"Under limit"| API["API Servers"]
    RL --&gt;|"Over limit"| REJECT["429 Too Many Requests&lt;br/&gt;Retry-After: 30"]
    RL --&gt; STORE["Rules &amp; Counter Store&lt;br/&gt;(Redis)"]

    subgraph Algorithms
        A1["Fixed Window"]
        A2["Sliding Window Log"]
        A3["Sliding Window Counter"]
        A4["Token Bucket"]
        A5["Leaky Bucket"]
    end

    style RL fill:#56cc9d,stroke:#333,color:#fff
    style REJECT fill:#ff7851,stroke:#333,color:#fff
</pre>
</div>
<p></p></figure><p></p>
</div>
</div>
</div>
<section id="requirements-1" class="level3">
<h3 class="anchored" data-anchor-id="requirements-1">Requirements</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 31%">
<col style="width: 68%">
</colgroup>
<thead>
<tr class="header">
<th>Type</th>
<th>Requirement</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Functional</strong></td>
<td>Limit requests per client (IP, user ID, API key); Return rate limit headers; Support different limits per endpoint</td>
</tr>
<tr class="even">
<td><strong>Non-functional</strong></td>
<td>Ultra-low latency (&lt;1ms overhead); Distributed (works across multiple servers); Highly available; Accurate counting</td>
</tr>
</tbody>
</table>
</section>
<section id="algorithm-comparison" class="level3">
<h3 class="anchored" data-anchor-id="algorithm-comparison">Algorithm Comparison</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 30%">
<col style="width: 36%">
<col style="width: 16%">
<col style="width: 16%">
</colgroup>
<thead>
<tr class="header">
<th>Algorithm</th>
<th>How It Works</th>
<th>Pros</th>
<th>Cons</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Fixed Window</strong></td>
<td>Count requests in fixed time windows (e.g., per minute)</td>
<td>Simple, low memory</td>
<td>Burst at window boundaries (2x allowed)</td>
</tr>
<tr class="even">
<td><strong>Sliding Window Log</strong></td>
<td>Store timestamp of each request, count in sliding window</td>
<td>Accurate</td>
<td>High memory (stores all timestamps)</td>
</tr>
<tr class="odd">
<td><strong>Sliding Window Counter</strong></td>
<td>Weighted count across current + previous window</td>
<td>Near-accurate + low memory</td>
<td>Approximate (assumes requests in the previous window were evenly spread)</td>
</tr>
<tr class="even">
<td><strong>Token Bucket</strong></td>
<td>Tokens added at fixed rate, each request consumes one</td>
<td>Allows controlled bursts</td>
<td>Slightly complex</td>
</tr>
<tr class="odd">
<td><strong>Leaky Bucket</strong></td>
<td>Requests queue and process at fixed rate</td>
<td>Smooth output rate</td>
<td>Doesn’t allow bursts</td>
</tr>
</tbody>
</table>
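<p>The sliding window counter is the least obvious row in the table, so here is a single-process sketch of it. The class name and the explicit <code>now</code> argument are choices made for illustration and testability; a distributed version would keep the two counters in a shared store such as Redis.</p>

```python
import math


class SlidingWindowCounter:
    """Estimate requests in the last window_s seconds from two fixed-window
    counters, weighting the previous window by how much of it still overlaps."""

    def __init__(self, limit: int, window_s: float):
        self.limit = limit
        self.window_s = window_s
        self.current_window = None  # index of the fixed window being counted
        self.current_count = 0
        self.previous_count = 0

    def allow(self, now: float) -> bool:
        window = math.floor(now / self.window_s)
        if window != self.current_window:
            # Roll over. If more than one window passed, the previous one is empty.
            adjacent = (self.current_window is not None
                        and window == self.current_window + 1)
            self.previous_count = self.current_count if adjacent else 0
            self.current_count = 0
            self.current_window = window
        elapsed = (now - window * self.window_s) / self.window_s  # 0..1
        estimated = self.previous_count * (1 - elapsed) + self.current_count
        if estimated + 1 > self.limit:
            return False
        self.current_count += 1
        return True


limiter = SlidingWindowCounter(limit=10, window_s=60)
first_window = sum(limiter.allow(float(t)) for t in range(12))
mid_next_window = sum(limiter.allow(90.0) for _ in range(10))
print(first_window, mid_next_window)
```

<p>Halfway through the second window, half of the previous window's requests still count against the limit, which is what smooths out the boundary burst that a plain fixed window allows.</p>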
</section>
<section id="token-bucket-design-recommended" class="level3">
<h3 class="anchored" data-anchor-id="token-bucket-design-recommended">Token Bucket Design (Recommended)</h3>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Redis-based Token Bucket (distributed)</span></span>
<span id="cb3-2"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Key: rate_limit:{client_id}</span></span>
<span id="cb3-3"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Fields: tokens (float), last_refill (timestamp)</span></span>
<span id="cb3-4"></span>
<span id="cb3-5"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">async</span> <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> is_allowed(redis, client_id: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">str</span>, max_tokens: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">int</span>, refill_rate: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">float</span>) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">bool</span>:</span>
<span id="cb3-6">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">"""</span></span>
<span id="cb3-7"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">    max_tokens: bucket capacity (e.g., 100)</span></span>
<span id="cb3-8"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">    refill_rate: tokens per second (e.g., 10)</span></span>
<span id="cb3-9"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">    """</span></span>
<span id="cb3-10">    key <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"rate_limit:</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>client_id<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span></span>
<span id="cb3-11">    now <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> time.time()</span>
<span id="cb3-12"></span>
<span id="cb3-13">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Lua script for atomicity (no race conditions)</span></span>
<span id="cb3-14">    lua_script <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"""</span></span>
<span id="cb3-15"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">    local tokens = tonumber(redis.call('hget', KEYS[1], 'tokens') or ARGV[1])</span></span>
<span id="cb3-16"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">    local last_refill = tonumber(redis.call('hget', KEYS[1], 'last_refill') or ARGV[3])</span></span>
<span id="cb3-17"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">    local now = tonumber(ARGV[3])</span></span>
<span id="cb3-18"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">    local max_tokens = tonumber(ARGV[1])</span></span>
<span id="cb3-19"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">    local refill_rate = tonumber(ARGV[2])</span></span>
<span id="cb3-20"></span>
<span id="cb3-21"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">    -- Refill tokens based on elapsed time</span></span>
<span id="cb3-22"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">    local elapsed = now - last_refill</span></span>
<span id="cb3-23"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">    tokens = math.min(max_tokens, tokens + elapsed * refill_rate)</span></span>
<span id="cb3-24"></span>
<span id="cb3-25"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">    if tokens &gt;= 1 then</span></span>
<span id="cb3-26"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">        tokens = tokens - 1</span></span>
<span id="cb3-27"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">        redis.call('hset', KEYS[1], 'tokens', tokens, 'last_refill', now)</span></span>
<span id="cb3-28"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">        redis.call('expire', KEYS[1], 3600)</span></span>
<span id="cb3-29"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">        return 1  -- Allowed</span></span>
<span id="cb3-30"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">    else</span></span>
<span id="cb3-31"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">        redis.call('hset', KEYS[1], 'tokens', tokens, 'last_refill', now)</span></span>
<span id="cb3-32"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">        return 0  -- Rejected</span></span>
<span id="cb3-33"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">    end</span></span>
<span id="cb3-34"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">    """</span></span>
<span id="cb3-35">    result <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">await</span> redis.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">eval</span>(lua_script, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, key, max_tokens, refill_rate, now)</span>
<span id="cb3-36">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> result <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span></span></code></pre></div></div>
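<p>The refill arithmetic inside the Lua script can be checked without Redis. Below is a minimal in-memory sketch of the same token-bucket update, assuming a single process. The <code>TokenBucket</code> class, <code>new_bucket</code> helper, and explicit <code>now</code> argument are illustrative names and not part of the original code.</p>

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    """Single-process stand-in for the Redis hash used above (illustrative only)."""
    max_tokens: float
    refill_rate: float  # tokens added per second
    tokens: float
    last_refill: float

    def allow(self, now: float) -> bool:
        # Refill based on elapsed time, capped at bucket capacity.
        elapsed = now - self.last_refill
        self.tokens = min(self.max_tokens, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1  # spend one token for this request
            return True
        return False


def new_bucket(max_tokens: float, refill_rate: float, now: float) -> TokenBucket:
    # A new client starts with a full bucket, matching the script's defaults.
    return TokenBucket(max_tokens, refill_rate, tokens=max_tokens, last_refill=now)
```

<p>Passing <code>now</code> explicitly keeps the logic deterministic in tests; the Redis version plays the same role by sending the timestamp as <code>ARGV[3]</code>.</p>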
</section>
<section id="where-to-place-the-rate-limiter" class="level3">
<h3 class="anchored" data-anchor-id="where-to-place-the-rate-limiter">Where to Place the Rate Limiter</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 45%">
<col style="width: 27%">
<col style="width: 27%">
</colgroup>
<thead>
<tr class="header">
<th>Location</th>
<th>Pros</th>
<th>Cons</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>API Gateway</strong> (recommended)</td>
<td>Centralized, handles all services</td>
<td>Single point of failure unless deployed redundantly</td>
</tr>
<tr class="even">
<td><strong>Middleware</strong> (per service)</td>
<td>Fine-grained, service-specific rules</td>
<td>Each service must implement</td>
</tr>
<tr class="odd">
<td><strong>Client-side</strong></td>
<td>Reduces unnecessary requests</td>
<td>Can be bypassed</td>
</tr>
<tr class="even">
<td><strong>CDN/Edge</strong></td>
<td>Stops attacks before reaching origin</td>
<td>Limited rule flexibility</td>
</tr>
</tbody>
</table>
<hr>
</section>
</section>
<section id="q3-design-a-chatmessaging-system-whatsapp" class="level2">
<h2 class="anchored" data-anchor-id="q3-design-a-chatmessaging-system-whatsapp">Q3: Design a Chat/Messaging System (WhatsApp)</h2>
<p><strong>Answer:</strong></p>
<p>A real-time messaging system supports 1-on-1 and group messaging with delivery guarantees, presence tracking, and message persistence.</p>
<div class="cell" data-layout-align="default">
<div class="cell-output-display">
<div>
<p></p><figure class="figure"><p></p>
<div>
<pre class="mermaid mermaid-js">graph TD
    SENDER["Sender (Alice)"]
    SENDER --&gt;|"WebSocket"| GW1["Chat Gateway&lt;br/&gt;(maintains connections)"]
    GW1 --&gt; MQ["Message Queue&lt;br/&gt;(Kafka)"]
    MQ --&gt; ROUTER["Message Router&lt;br/&gt;(fan-out service)"]
    ROUTER --&gt; GW2["Chat Gateway&lt;br/&gt;(Bob's server)"]
    GW2 --&gt;|"WebSocket"| RECEIVER["Receiver (Bob)"]

    MQ --&gt; DB["Message Store&lt;br/&gt;(Cassandra)"]
    ROUTER -.-&gt;|"Bob offline"| PUSH["Push Notification&lt;br/&gt;(APNs / FCM)"]

    style GW1 fill:#56cc9d,stroke:#333,color:#fff
    style MQ fill:#ffce67,stroke:#333
    style DB fill:#6cc3d5,stroke:#333,color:#fff
</pre>
</div>
<p></p></figure><p></p>
</div>
</div>
</div>
<section id="requirements-2" class="level3">
<h3 class="anchored" data-anchor-id="requirements-2">Requirements</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 31%">
<col style="width: 68%">
</colgroup>
<thead>
<tr class="header">
<th>Type</th>
<th>Requirement</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Functional</strong></td>
<td>1-on-1 messaging; Group chat (up to 500 members); Sent/Delivered/Read receipts; Online/offline presence; Message history</td>
</tr>
<tr class="even">
<td><strong>Non-functional</strong></td>
<td>Low latency (&lt;300ms end-to-end); High availability (99.99%); Eventual consistency acceptable; 50B messages/day</td>
</tr>
<tr class="odd">
<td><strong>Capacity</strong></td>
<td>500M DAU; 100 messages/user/day; ~100 bytes/message → ~5TB/day</td>
</tr>
</tbody>
</table>
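<p>The capacity row can be reproduced with a few lines of arithmetic. This is a back-of-the-envelope sketch using only the figures from the table; the per-second rate is a daily average, so peak traffic would be higher.</p>

```python
DAU = 500_000_000            # daily active users (from the table)
MESSAGES_PER_USER = 100      # messages per user per day
BYTES_PER_MESSAGE = 100      # average payload size in bytes
SECONDS_PER_DAY = 86_400

messages_per_day = DAU * MESSAGES_PER_USER
storage_bytes_per_day = messages_per_day * BYTES_PER_MESSAGE
avg_messages_per_second = messages_per_day / SECONDS_PER_DAY

print(f"{messages_per_day / 1e9:.0f}B messages/day")
print(f"{storage_bytes_per_day / 1e12:.0f} TB/day")
print(f"{avg_messages_per_second:,.0f} messages/s on average")
```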
</section>
<section id="key-components" class="level3">
<h3 class="anchored" data-anchor-id="key-components">Key Components</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 35%">
<col style="width: 35%">
<col style="width: 29%">
</colgroup>
<thead>
<tr class="header">
<th>Component</th>
<th>Technology</th>
<th>Purpose</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Connection layer</strong></td>
<td>WebSocket servers</td>
<td>Persistent bidirectional connections</td>
</tr>
<tr class="even">
<td><strong>Message queue</strong></td>
<td>Kafka</td>
<td>Decouple send/receive, handle spikes</td>
</tr>
<tr class="odd">
<td><strong>Message store</strong></td>
<td>Cassandra</td>
<td>Write-heavy, append-only, partitioned by chat_id</td>
</tr>
<tr class="even">
<td><strong>User presence</strong></td>
<td>Redis</td>
<td>Track online/offline status with TTL</td>
</tr>
<tr class="odd">
<td><strong>Push notifications</strong></td>
<td>APNs/FCM</td>
<td>Deliver to offline users</td>
</tr>
<tr class="even">
<td><strong>Media storage</strong></td>
<td>S3 + CDN</td>
<td>Images, videos, voice notes</td>
</tr>
</tbody>
</table>
</section>
<section id="message-delivery-flow" class="level3">
<h3 class="anchored" data-anchor-id="message-delivery-flow">Message Delivery Flow</h3>
<pre><code>1. Alice sends message via WebSocket → Chat Gateway
2. Gateway publishes to Kafka topic (partitioned by chat_id)
3. Message Router consumes from Kafka:
   a. Persist message to Cassandra (status: "sent")
   b. Look up Bob's connection in Session Store (Redis)
   c. If online → push via WebSocket → update status: "delivered"
   d. If offline → send push notification
4. Bob opens app → fetch undelivered messages → update: "delivered"
5. Bob reads message → client sends ACK → update: "read"</code></pre>
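<p>The status transitions in steps 3 to 5 can be sketched as a small state machine. This is a toy model, not a production router: plain dicts and lists stand in for Cassandra, the Redis session store, and APNs/FCM, and the <code>Router</code> class and its method names are hypothetical.</p>

```python
from dataclasses import dataclass, field


@dataclass
class Router:
    """Toy message router; in-memory containers replace the real backing services."""
    store: dict = field(default_factory=dict)      # message_id -> record (message store)
    sessions: dict = field(default_factory=dict)   # user_id -> gateway id (session store)
    delivered: list = field(default_factory=list)  # (gateway, message_id) pushed over WebSocket
    pushes: list = field(default_factory=list)     # (user_id, message_id) sent as push notification

    def route(self, message_id: str, chat_id: str, recipient: str, body: str) -> str:
        # Step 3a: persist first so the message survives a delivery failure.
        record = {"chat_id": chat_id, "to": recipient, "body": body, "status": "sent"}
        self.store[message_id] = record
        # Step 3b: look up the recipient's live connection.
        gateway = self.sessions.get(recipient)
        if gateway is not None:
            # Step 3c: online, so deliver and mark as delivered.
            self.delivered.append((gateway, message_id))
            record["status"] = "delivered"
        else:
            # Step 3d: offline, so fall back to a push notification.
            self.pushes.append((recipient, message_id))
        return record["status"]

    def fetch_undelivered(self, user: str) -> list:
        # Step 4: on reconnect, hand over pending messages and mark them delivered.
        pending = [m for m, r in self.store.items() if r["to"] == user and r["status"] == "sent"]
        for message_id in pending:
            self.store[message_id]["status"] = "delivered"
        return pending

    def ack_read(self, message_id: str) -> None:
        # Step 5: the client's read ACK moves the message to its final state.
        self.store[message_id]["status"] = "read"
```

<p>Persisting before delivery is the design choice that matters here: if the gateway push fails, the message is still in the store with status <code>"sent"</code> and is picked up by the next reconnect.</p>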
</section>
<section id="group-messaging-fan-out" class="level3">
<h3 class="anchored" data-anchor-id="group-messaging-fan-out">Group Messaging Fan-out</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 30%">
<col style="width: 39%">
<col style="width: 30%">
</colgroup>
<thead>
<tr class="header">
<th>Strategy</th>
<th>How It Works</th>
<th>Best For</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Fan-out on write</strong></td>
<td>Copy message to each member’s inbox at send time</td>
<td>Small groups (&lt;100 members)</td>
</tr>
<tr class="even">
<td><strong>Fan-out on read</strong></td>
<td>Store once, recipients pull on connect</td>
<td>Large groups / channels</td>
</tr>
<tr class="odd">
<td><strong>Hybrid</strong></td>
<td>Fan-out on write for small groups, on read for large</td>
<td>Production systems with mixed group sizes (WhatsApp-scale messengers)</td>
</tr>
</tbody>
</table>
<hr>
</section>
</section>
<section id="q4-design-a-social-media-news-feed-twitterinstagram" class="level2">
<h2 class="anchored" data-anchor-id="q4-design-a-social-media-news-feed-twitterinstagram">Q4: Design a Social Media News Feed (Twitter/Instagram)</h2>
<p><strong>Answer:</strong></p>
<p>A news feed system aggregates and ranks posts from users you follow, delivering a personalized, near real-time content stream.</p>
<div class="cell" data-layout-align="default">
<div class="cell-output-display">
<div>
<p></p><figure class="figure"><p></p>
<div>
<pre class="mermaid mermaid-js">graph TD
    POSTER["User Posts Tweet"]
    POSTER --&gt; PS["Post Service"]
    PS --&gt; DB["Post Store"]
    PS --&gt; FANOUT["Fan-out Service"]
    FANOUT --&gt; CACHE["Feed Cache&lt;br/&gt;(per user, Redis)"]

    READER["User Opens Feed"]
    READER --&gt; FS["Feed Service"]
    FS --&gt; CACHE
    FS --&gt; RANK["Ranking Service&lt;br/&gt;(ML model)"]
    RANK --&gt; FEED["Merged &amp;&lt;br/&gt;Ranked Feed"]

    style FANOUT fill:#56cc9d,stroke:#333,color:#fff
    style CACHE fill:#ffce67,stroke:#333
    style RANK fill:#6cc3d5,stroke:#333,color:#fff
</pre>
</div>
<p></p></figure><p></p>
</div>
</div>
</div>
<section id="requirements-3" class="level3">
<h3 class="anchored" data-anchor-id="requirements-3">Requirements</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 31%">
<col style="width: 68%">
</colgroup>
<thead>
<tr class="header">
<th>Type</th>
<th>Requirement</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Functional</strong></td>
<td>Create posts (text, images, video); Follow/unfollow users; View personalized news feed; Like, comment, share</td>
</tr>
<tr class="even">
<td><strong>Non-functional</strong></td>
<td>Feed generation &lt;500ms; High availability; 500M DAU; Eventual consistency acceptable</td>
</tr>
<tr class="odd">
<td><strong>Capacity</strong></td>
<td>2 posts/user/day → 1B posts/day; Average 300 followers; Feed shows top 20 posts</td>
</tr>
</tbody>
</table>
</section>
<section id="fan-out-strategy-the-core-decision" class="level3">
<h3 class="anchored" data-anchor-id="fan-out-strategy-the-core-decision">Fan-out Strategy: The Core Decision</h3>
<div class="cell" data-layout-align="default">
<div class="cell-output-display">
<div>
<p></p><figure class="figure"><p></p>
<div>
<pre class="mermaid mermaid-js">graph LR
    subgraph FanOutWrite["Fan-out on Write (Push)"]
        POST1["New Post"] --&gt; COPY["Copy to all&lt;br/&gt;followers' feeds"]
        COPY --&gt; F1["Alice's Feed Cache"]
        COPY --&gt; F2["Bob's Feed Cache"]
        COPY --&gt; F3["Carol's Feed Cache"]
    end

    subgraph FanOutRead["Fan-out on Read (Pull)"]
        OPEN["Open Feed"] --&gt; FETCH["Fetch posts from&lt;br/&gt;all followed users"]
        FETCH --&gt; M1["User A's posts"]
        FETCH --&gt; M2["User B's posts"]
        FETCH --&gt; M3["User C's posts"]
    end

    style FanOutWrite fill:#56cc9d,stroke:#333,color:#fff
    style FanOutRead fill:#6cc3d5,stroke:#333,color:#fff
</pre>
</div>
<p></p></figure><p></p>
</div>
</div>
</div>
<table class="caption-top table">
<colgroup>
<col style="width: 14%">
<col style="width: 44%">
<col style="width: 40%">
</colgroup>
<thead>
<tr class="header">
<th>Aspect</th>
<th>Fan-out on Write (Push)</th>
<th>Fan-out on Read (Pull)</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>When</strong></td>
<td>At post creation time</td>
<td>At feed request time</td>
</tr>
<tr class="even">
<td><strong>Latency</strong></td>
<td>Fast reads (pre-computed)</td>
<td>Slow reads (compute on demand)</td>
</tr>
<tr class="odd">
<td><strong>Write cost</strong></td>
<td>High (copy to all followers)</td>
<td>Low (store once)</td>
</tr>
<tr class="even">
<td><strong>Hot users</strong></td>
<td>Celebrity with 50M followers → 50M writes</td>
<td>No write amplification</td>
</tr>
<tr class="odd">
<td><strong>Best for</strong></td>
<td>Normal users (&lt;10K followers)</td>
<td>Celebrities / high-follower users</td>
</tr>
</tbody>
</table>
</section>
<section id="hybrid-approach-twitterinstagrams-actual-design" class="level3">
<h3 class="anchored" data-anchor-id="hybrid-approach-twitterinstagrams-actual-design">Hybrid Approach (Twitter/Instagram’s Actual Design)</h3>
<pre><code>Normal users (&lt;10K followers):
  → Fan-out on write: pre-compute feed for all followers
  → Feed is ready when they open the app

Celebrity users (&gt;10K followers):
  → Fan-out on read: don't pre-compute
  → When user opens feed, merge:
      - Pre-computed feed (from normal users they follow)
      - On-demand fetch (from celebrities they follow)
  → Rank the merged result</code></pre>
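<p>The hybrid rule above can be sketched in a few dozen lines. This is an in-memory illustration under stated assumptions: dicts stand in for the post store and the Redis feed cache, newest-first ordering stands in for the ranking step, and the <code>FeedSystem</code> class and its method names are hypothetical.</p>

```python
from collections import defaultdict

CELEBRITY_THRESHOLD = 10_000  # follower cutoff from the text above


class FeedSystem:
    """In-memory sketch; dicts stand in for the post store and per-user feed cache."""

    def __init__(self, celebrity_threshold: int = CELEBRITY_THRESHOLD):
        self.celebrity_threshold = celebrity_threshold
        self.followers = defaultdict(set)    # author -> users who follow them
        self.following = defaultdict(set)    # user -> authors they follow
        self.posts = defaultdict(list)       # author -> [(timestamp, post_id)]
        self.feed_cache = defaultdict(list)  # user -> [(timestamp, post_id)]

    def follow(self, user, author):
        self.followers[author].add(user)
        self.following[user].add(author)

    def is_celebrity(self, author) -> bool:
        return len(self.followers[author]) > self.celebrity_threshold

    def publish(self, author, post_id, timestamp):
        self.posts[author].append((timestamp, post_id))
        if not self.is_celebrity(author):
            # Fan-out on write: push into every follower's cached feed.
            for user in self.followers[author]:
                self.feed_cache[user].append((timestamp, post_id))

    def read_feed(self, user, limit: int = 20) -> list:
        # A set removes duplicates if an author crossed the threshold after posting.
        merged = set(self.feed_cache[user])
        # Fan-out on read: pull celebrity posts only when the feed is opened.
        for author in self.following[user]:
            if self.is_celebrity(author):
                merged.update(self.posts[author])
        # Newest first stands in for the ranking step.
        ordered = sorted(merged, reverse=True)
        return [post_id for _, post_id in ordered[:limit]]
```

<p>The write cost for a celebrity post is a single append, while readers pay a small merge at feed-open time. The threshold is a constructor argument so the cutoff can be tuned or tested with small numbers.</p>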
</section>
<section id="feed-ranking" class="level3">
<h3 class="anchored" data-anchor-id="feed-ranking">Feed Ranking</h3>
<table class="caption-top table">
<thead>
<tr class="header">
<th>Signal</th>
<th>Weight</th>
<th>Source</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td>Recency</td>
<td>High</td>
<td>Post timestamp</td>
</tr>
<tr class="even">
<td>Engagement</td>
<td>High</td>
<td>Likes, comments, shares on the post</td>
</tr>
<tr class="odd">
<td>Relationship</td>
<td>Medium</td>
<td>Interaction history with poster</td>
</tr>
<tr class="even">
<td>Content type</td>
<td>Medium</td>
<td>User preference (video vs text)</td>
</tr>
<tr class="odd">
<td>Diversity</td>
<td>Low</td>
<td>Avoid showing too many posts from one user</td>
</tr>
</tbody>
</table>
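<p>One simple way to combine these signals is a weighted sum. The sketch below is only a stand-in for the ML ranking model: the numeric weights are hypothetical values chosen to mirror the High / Medium / Low labels, every signal is assumed to be normalized to the range 0 to 1, and the six-hour recency half-life is an assumption.</p>

```python
# Hypothetical numeric weights mirroring the High / Medium / Low labels in the table.
WEIGHTS = {
    "recency": 0.30,
    "engagement": 0.30,
    "relationship": 0.15,
    "content_type": 0.15,
    "diversity": 0.10,
}


def recency_signal(age_hours: float, half_life_hours: float = 6.0) -> float:
    # Exponential decay: 1.0 for a brand-new post, 0.5 after one half-life.
    return 0.5 ** (age_hours / half_life_hours)


def score(signals: dict) -> float:
    # Each signal is expected in [0, 1]; missing signals count as 0.
    return sum(weight * signals.get(name, 0.0) for name, weight in WEIGHTS.items())


def rank(posts: list, limit: int = 20) -> list:
    # posts: list of (post_id, signals) pairs; highest score first.
    ordered = sorted(posts, key=lambda item: score(item[1]), reverse=True)
    return [post_id for post_id, _ in ordered[:limit]]
```

<p>With these weights, a day-old post with strong engagement can outrank a fresh post with little engagement, which is the trade-off the table describes.</p>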
<hr>
</section>
</section>
<section id="q5-design-a-file-storage-system-dropboxgoogle-drive" class="level2">
<h2 class="anchored" data-anchor-id="q5-design-a-file-storage-system-dropboxgoogle-drive">Q5: Design a File Storage System (Dropbox/Google Drive)</h2>
<p><strong>Answer:</strong></p>
<p>A cloud file storage system lets users upload, download, and sync files across devices with high reliability and global availability.</p>
<div class="cell" data-layout-align="default">
<div class="cell-output-display">
<div>
<p></p><figure class="figure"><p></p>
<div>
<pre class="mermaid mermaid-js">graph TD
    CLIENT["Desktop / Mobile Client"]
    CLIENT --&gt;|"Upload (chunked)"| API["API Gateway"]
    API --&gt; META["Metadata Service"]
    META --&gt; METADB["Metadata DB&lt;br/&gt;(MySQL)"]
    API --&gt; UPLOAD["Upload Service"]
    UPLOAD --&gt; QUEUE["Upload Queue"]
    QUEUE --&gt; STORE["Object Storage&lt;br/&gt;(S3)"]

    CLIENT2["Another Device"]
    CLIENT2 --&gt; SYNC["Sync Service"]
    SYNC --&gt; NOTIFY["Notification Service&lt;br/&gt;(WebSocket / Long Poll)"]
    SYNC --&gt; CDN["CDN&lt;br/&gt;(Download cache)"]

    style API fill:#56cc9d,stroke:#333,color:#fff
    style STORE fill:#6cc3d5,stroke:#333,color:#fff
    style NOTIFY fill:#ffce67,stroke:#333
</pre>
</div>
<p></p></figure><p></p>
</div>
</div>
</div>
<section id="requirements-4" class="level3">
<h3 class="anchored" data-anchor-id="requirements-4">Requirements</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 31%">
<col style="width: 68%">
</colgroup>
<thead>
<tr class="header">
<th>Type</th>
<th>Requirement</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Functional</strong></td>
<td>Upload/download files; Sync across devices; File versioning; Share files/folders</td>
</tr>
<tr class="even">
<td><strong>Non-functional</strong></td>
<td>High reliability (no data loss); Low latency downloads; Support files up to 10GB; 100M users, 1M DAU</td>
</tr>
<tr class="odd">
<td><strong>Capacity</strong></td>
<td>1 file per active user per day (1M DAU), avg 5MB → 5TB/day; ~1.8PB/year before deduplication and replication</td>
</tr>
</tbody>
</table>
</section>
<section id="chunked-upload-design" class="level3">
<h3 class="anchored" data-anchor-id="chunked-upload-design">Chunked Upload Design</h3>
<pre><code>Why chunking?
  - Resume interrupted uploads (mobile networks)
  - Deduplicate at chunk level (save storage)
  - Parallel upload of chunks (faster)
  - Delta sync: only upload changed chunks

Chunk size: 4MB (balance between overhead and resume granularity)

Upload flow:
  1. Client splits file into 4MB chunks
  2. Client computes SHA-256 hash per chunk
  3. Client asks server: "Do you have chunk with hash X?"
     - Yes → skip (deduplication)
     - No → upload chunk
  4. After all chunks uploaded → server assembles file
  5. Server updates metadata DB with file record
  6. Notification service alerts other devices to sync</code></pre>
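<p>Steps 1 to 4 of the upload flow can be sketched as follows. This is a minimal illustration, assuming fixed-size chunks and a plain dict in place of object storage; <code>upload_file</code>, <code>assemble</code>, and <code>chunk_store</code> are hypothetical names.</p>

```python
import hashlib

CHUNK_SIZE = 4 * 1024 * 1024  # 4MB, as in the text above


def split_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> list:
    # Step 1: fixed-size chunks; the last one may be shorter.
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def upload_file(data: bytes, chunk_store: dict, chunk_size: int = CHUNK_SIZE):
    """Return (manifest, uploaded_count). chunk_store is a dict standing in for S3."""
    manifest = []  # ordered chunk hashes; what the metadata DB would record
    uploaded = 0
    for chunk in split_chunks(data, chunk_size):
        digest = hashlib.sha256(chunk).hexdigest()  # step 2: hash per chunk
        if digest not in chunk_store:               # step 3: "do you have hash X?"
            chunk_store[digest] = chunk
            uploaded += 1
        manifest.append(digest)
    return manifest, uploaded


def assemble(manifest: list, chunk_store: dict) -> bytes:
    # Step 4: rebuild the file from its ordered chunk hashes.
    return b"".join(chunk_store[digest] for digest in manifest)
```

<p>Uploading a second version that changes only one chunk transfers just that chunk. Note that with fixed-size chunks, inserting bytes near the start of a file shifts every later chunk boundary, so delta sync works best for in-place edits and appends.</p>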
</section>
<section id="data-model" class="level3">
<h3 class="anchored" data-anchor-id="data-model">Data Model</h3>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb7" style="background: #f1f3f5;"><pre class="sourceCode sql code-with-copy"><code class="sourceCode sql"><span id="cb7-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-- Files metadata</span></span>
<span id="cb7-2"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">CREATE</span> <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">TABLE</span> files (</span>
<span id="cb7-3">    file_id     UUID <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">PRIMARY</span> <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">KEY</span>,</span>
<span id="cb7-4">    user_id     BIGINT <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">NOT</span> <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">NULL</span>,</span>
<span id="cb7-5">    filename    <span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">VARCHAR</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">255</span>),</span>
<span id="cb7-6">    path        TEXT,</span>
<span id="cb7-7">    size_bytes  BIGINT,</span>
<span id="cb7-8">    checksum    <span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">VARCHAR</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">64</span>),</span>
<span id="cb7-9">    version     <span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">INT</span> <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">DEFAULT</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,</span>
<span id="cb7-10">    created_at  <span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">TIMESTAMP</span>,</span>
<span id="cb7-11">    updated_at  <span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">TIMESTAMP</span></span>
<span id="cb7-12">);</span>
<span id="cb7-13"></span>
<span id="cb7-14"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-- File chunks (for dedup and resume)</span></span>
<span id="cb7-15"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">CREATE</span> <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">TABLE</span> chunks (</span>
<span id="cb7-16">    chunk_hash  <span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">VARCHAR</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">64</span>) <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">PRIMARY</span> <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">KEY</span>,  <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-- SHA-256</span></span>
<span id="cb7-17">    size_bytes  <span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">INT</span>,</span>
<span id="cb7-18">    s3_location TEXT,</span>
<span id="cb7-19">    ref_count   <span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">INT</span> <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">DEFAULT</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>  <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-- for garbage collection</span></span>
<span id="cb7-20">);</span>
<span id="cb7-21"></span>
<span id="cb7-22"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-- File-to-chunk mapping</span></span>
<span id="cb7-23"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">CREATE</span> <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">TABLE</span> file_chunks (</span>
<span id="cb7-24">    file_id     UUID,</span>
<span id="cb7-25">    chunk_index <span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">INT</span>,</span>
<span id="cb7-26">    chunk_hash  <span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">VARCHAR</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">64</span>),</span>
<span id="cb7-27">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">PRIMARY</span> <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">KEY</span> (file_id, chunk_index)</span>
<span id="cb7-28">);</span></code></pre></div></div>
</section>
<section id="sync-and-conflict-resolution" class="level3">
<h3 class="anchored" data-anchor-id="sync-and-conflict-resolution">Sync and Conflict Resolution</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 34%">
<col style="width: 65%">
</colgroup>
<thead>
<tr class="header">
<th>Scenario</th>
<th>Resolution Strategy</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td>Same file edited on 2 devices</td>
<td>Create conflict copy with timestamp</td>
</tr>
<tr class="even">
<td>File deleted on one device, edited on another</td>
<td>Keep the edited version, log deletion</td>
</tr>
<tr class="odd">
<td>Concurrent uploads of same new file</td>
<td>Last-write-wins or merge (depends on file type)</td>
</tr>
<tr class="even">
<td>Offline edits</td>
<td>Queue changes locally, sync when online</td>
</tr>
</tbody>
</table>
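<p>The first row of the table can be sketched with the <code>version</code> column from the data model: each device sends the version it last synced, and a mismatch produces a conflict copy instead of an overwrite. The naming scheme and the <code>apply_edit</code> function are hypothetical, and a dict stands in for the metadata DB.</p>

```python
from datetime import datetime, timezone
from pathlib import PurePosixPath


def conflict_copy_name(filename: str, device: str, when: datetime) -> str:
    # Hypothetical naming scheme: "<stem> (conflict from <device> <UTC timestamp>)<suffix>"
    path = PurePosixPath(filename)
    stamp = when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H-%M-%SZ")
    return f"{path.stem} (conflict from {device} {stamp}){path.suffix}"


def apply_edit(server_files: dict, filename: str, base_version: int,
               content: bytes, device: str, when: datetime) -> str:
    """server_files maps filename -> (version, content). Returns the name written to."""
    current = server_files.get(filename)
    if current is None or current[0] == base_version:
        # No other device changed the file since this one last synced: accept the edit.
        next_version = 1 if current is None else current[0] + 1
        server_files[filename] = (next_version, content)
        return filename
    # Another device already advanced the version: keep both as a conflict copy.
    copy_name = conflict_copy_name(filename, device, when)
    server_files[copy_name] = (1, content)
    return copy_name
```

<p>Comparing versions rather than timestamps avoids depending on device clocks, which may disagree.</p>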
<hr>
</section>
</section>
<section id="q6-design-a-video-streaming-platform-youtubenetflix" class="level2">
<h2 class="anchored" data-anchor-id="q6-design-a-video-streaming-platform-youtubenetflix">Q6: Design a Video Streaming Platform (YouTube/Netflix)</h2>
<p><strong>Answer:</strong></p>
<p>A video streaming platform handles upload, transcoding, storage, and adaptive delivery of video content to millions of concurrent viewers.</p>
<div class="cell" data-layout-align="default">
<div class="cell-output-display">
<div>
<p></p><figure class="figure"><p></p>
<div>
<pre class="mermaid mermaid-js">graph TD
    CREATOR["Content Creator"]
    CREATOR --&gt;|"Upload video"| UPLOAD["Upload Service"]
    UPLOAD --&gt; QUEUE["Transcoding Queue&lt;br/&gt;(SQS/Kafka)"]
    QUEUE --&gt; TRANSCODE["Transcoding Workers&lt;br/&gt;(multiple resolutions)"]
    TRANSCODE --&gt; STORE["Object Storage&lt;br/&gt;(S3 / GCS)"]
    STORE --&gt; CDN["CDN&lt;br/&gt;(Edge servers worldwide)"]

    VIEWER["Viewer"]
    VIEWER --&gt;|"Adaptive bitrate"| CDN
    VIEWER --&gt; API["API Service&lt;br/&gt;(metadata, search, recommendations)"]
    API --&gt; METADB["Metadata DB"]

    style TRANSCODE fill:#56cc9d,stroke:#333,color:#fff
    style CDN fill:#ffce67,stroke:#333
    style STORE fill:#6cc3d5,stroke:#333,color:#fff
</pre>
</div>
<p></p></figure><p></p>
</div>
</div>
</div>
<section id="requirements-5" class="level3">
<h3 class="anchored" data-anchor-id="requirements-5">Requirements</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 31%">
<col style="width: 68%">
</colgroup>
<thead>
<tr class="header">
<th>Type</th>
<th>Requirement</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Functional</strong></td>
<td>Upload videos; Stream videos (adaptive bitrate); Search and browse; Like, comment, subscribe</td>
</tr>
<tr class="even">
<td><strong>Non-functional</strong></td>
<td>Low startup latency (&lt;2s); Smooth playback (minimal rebuffering); Global availability; 1B DAU, 5M videos uploaded/day</td>
</tr>
<tr class="odd">
<td><strong>Capacity</strong></td>
<td>Avg video: 200MB raw → 500MB transcoded (multiple resolutions); 5M uploads/day → ~1PB raw + ~2.5PB transcoded ≈ 3.5PB new storage/day</td>
</tr>
</tbody>
</table>
</section>
<section id="video-processing-pipeline" class="level3">
<h3 class="anchored" data-anchor-id="video-processing-pipeline">Video Processing Pipeline</h3>
<pre><code>Upload → Original Storage → Transcoding → CDN Distribution

Transcoding outputs (per video; file sizes assume a clip of roughly 9 minutes):
  ┌────────────────────────────────────────┐
  │ Resolution   Bitrate    File Size      │
  │ 360p         800 kbps   ~50MB          │
  │ 480p         1.5 Mbps   ~100MB         │
  │ 720p         3 Mbps     ~200MB         │
  │ 1080p        6 Mbps     ~400MB         │
  │ 4K           20 Mbps    ~1.5GB         │
  └────────────────────────────────────────┘
  + Audio tracks (multiple languages)
  + Subtitles (multiple languages)
  + Thumbnail generation (every 10s for preview)</code></pre>
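<p>The file sizes in the table follow from bitrate multiplied by duration, and the listed figures are consistent with a clip of roughly eight to ten minutes. The sketch below shows the conversion, assuming decimal units (1MB = 1,000,000 bytes) and ignoring audio and container overhead; the function names are illustrative.</p>

```python
RENDITIONS_KBPS = {  # video bitrates from the table above
    "360p": 800,
    "480p": 1_500,
    "720p": 3_000,
    "1080p": 6_000,
    "4K": 20_000,
}


def size_mb(bitrate_kbps: float, duration_s: float) -> float:
    # kilobits/s * seconds = kilobits; / 8 = kilobytes; / 1000 = megabytes.
    return bitrate_kbps * duration_s / 8 / 1000


def ladder_sizes(duration_s: float) -> dict:
    # Size of every rendition for a video of the given length.
    return {name: size_mb(kbps, duration_s) for name, kbps in RENDITIONS_KBPS.items()}
```

<p>Summing the values of <code>ladder_sizes(duration)</code> gives the per-video storage footprint across all renditions, which is the number that drives the daily storage estimate.</p>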
</section>
<section id="adaptive-bitrate-streaming-abr" class="level3">
<h3 class="anchored" data-anchor-id="adaptive-bitrate-streaming-abr">Adaptive Bitrate Streaming (ABR)</h3>
<div class="cell" data-layout-align="default">
<div class="cell-output-display">
<div>
<p></p><figure class="figure"><p></p>
<div>
<pre class="mermaid mermaid-js">graph LR
    CLIENT["Video Player"]
    CLIENT --&gt;|"Measures bandwidth"| ABR["ABR Algorithm&lt;br/&gt;(DASH / HLS)"]
    ABR --&gt;|"Good network"| HD["1080p chunks"]
    ABR --&gt;|"Poor network"| SD["480p chunks"]
    ABR --&gt;|"Very poor"| LOW["360p chunks"]

    style ABR fill:#56cc9d,stroke:#333,color:#fff
</pre>
</div>
<p></p></figure><p></p>
</div>
</div>
</div>
<table class="caption-top table">
<colgroup>
<col style="width: 31%">
<col style="width: 28%">
<col style="width: 40%">
</colgroup>
<thead>
<tr class="header">
<th>Protocol</th>
<th>Used By</th>
<th>Segment Size</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>HLS</strong> (HTTP Live Streaming)</td>
<td>Apple, most platforms</td>
<td>2-10s segments</td>
</tr>
<tr class="even">
<td><strong>DASH</strong> (Dynamic Adaptive Streaming over HTTP)</td>
<td>YouTube, Netflix</td>
<td>2-10s segments</td>
</tr>
</tbody>
</table>
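<p>Both protocols leave the choice of rendition to the player. A minimal throughput-based heuristic, assuming the bitrate ladder from the transcoding table and a hypothetical safety factor of 0.8, might look like this. Production players typically also weigh buffer occupancy, which this sketch omits.</p>

```python
LADDER_KBPS = [  # (rendition, bitrate) sorted from lowest to highest
    ("360p", 800),
    ("480p", 1_500),
    ("720p", 3_000),
    ("1080p", 6_000),
    ("4K", 20_000),
]


def estimate_bandwidth_kbps(segment_bytes: int, download_seconds: float) -> float:
    # Throughput of the last segment download, in kilobits per second.
    return segment_bytes * 8 / 1000 / download_seconds


def pick_rendition(bandwidth_kbps: float, safety_factor: float = 0.8) -> str:
    # Use only a fraction of measured bandwidth so small dips do not stall playback.
    budget = bandwidth_kbps * safety_factor
    choice = LADDER_KBPS[0][0]  # always fall back to the lowest rendition
    for name, bitrate in LADDER_KBPS:
        if bitrate <= budget:
            choice = name
    return choice
```

<p>The player would call <code>estimate_bandwidth_kbps</code> after each segment download and request the next segment at the rendition returned by <code>pick_rendition</code>.</p>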
</section>
<section id="cdn-strategy" class="level3">
<h3 class="anchored" data-anchor-id="cdn-strategy">CDN Strategy</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 43%">
<col style="width: 56%">
</colgroup>
<thead>
<tr class="header">
<th>Approach</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Push popular content</strong></td>
<td>Pre-load trending videos to edge servers</td>
</tr>
<tr class="even">
<td><strong>Pull on demand</strong></td>
<td>Edge fetches from origin on first request, then caches</td>
</tr>
<tr class="odd">
<td><strong>Regional origin</strong></td>
<td>Multiple origin servers in different regions</td>
</tr>
<tr class="even">
<td><strong>Long tail</strong></td>
<td>Less popular content served from fewer / central CDN nodes</td>
</tr>
</tbody>
</table>
<hr>
</section>
</section>
<section id="q7-design-a-notification-system" class="level2">
<h2 class="anchored" data-anchor-id="q7-design-a-notification-system">Q7: Design a Notification System</h2>
<p><strong>Answer:</strong></p>
<p>A notification system delivers timely, relevant notifications across multiple channels (push, SMS, email) to billions of users.</p>
<div class="cell" data-layout-align="default">
<div class="cell-output-display">
<div>
<p></p><figure class="figure"><p></p>
<div>
<pre class="mermaid mermaid-js">graph TD
    TRIGGER["Event Triggers&lt;br/&gt;(order shipped, new follower, etc.)"]
    TRIGGER --&gt; NS["Notification Service"]
    NS --&gt; PREF["User Preferences&lt;br/&gt;(channels, frequency, opt-outs)"]
    NS --&gt; TEMPLATE["Template Service&lt;br/&gt;(personalize message)"]
    NS --&gt; QUEUE["Priority Queues&lt;br/&gt;(Kafka / SQS)"]

    QUEUE --&gt; PUSH["Push Worker&lt;br/&gt;(APNs / FCM)"]
    QUEUE --&gt; SMS["SMS Worker&lt;br/&gt;(Twilio)"]
    QUEUE --&gt; EMAIL["Email Worker&lt;br/&gt;(SES / SendGrid)"]

    PUSH --&gt; USER["User Device"]
    SMS --&gt; USER
    EMAIL --&gt; USER

    style NS fill:#56cc9d,stroke:#333,color:#fff
    style QUEUE fill:#ffce67,stroke:#333
</pre>
</div>
<p></p></figure><p></p>
</div>
</div>
</div>
<section id="requirements-6" class="level3">
<h3 class="anchored" data-anchor-id="requirements-6">Requirements</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 31%">
<col style="width: 68%">
</colgroup>
<thead>
<tr class="header">
<th>Type</th>
<th>Requirement</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Functional</strong></td>
<td>Multi-channel: push, SMS, email, in-app; User preferences and opt-out; Notification templates; Rate limiting per user</td>
</tr>
<tr class="even">
<td><strong>Non-functional</strong></td>
<td>Soft real-time (&lt;30s for push, minutes for email); At-least-once delivery; 10B notifications/day; Pluggable providers</td>
</tr>
</tbody>
</table>
</section>
<section id="architecture-deep-dive" class="level3">
<h3 class="anchored" data-anchor-id="architecture-deep-dive">Architecture Deep Dive</h3>
<pre><code>Event flow:
  1. Service emits event: {"type": "order_shipped", "user_id": 123, "data": {...}}
  2. Notification Service:
     a. Check user preferences (opted-in channels, quiet hours)
     b. Check rate limits (max 5 push/hour per user)
     c. Render template with user data
     d. Enqueue to channel-specific queues with priority
  3. Channel workers:
     a. Dequeue message
     b. Call provider API (APNs, Twilio, SES)
     c. Handle retries with exponential backoff
     d. Log delivery status
  4. Analytics:
     - Track: sent, delivered, opened, clicked, unsubscribed</code></pre>
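<p>Steps 2 and 3a of the event flow can be sketched in memory. This is an illustration under stated assumptions: dicts and heaps stand in for the preference store, Redis counters, and Kafka/SQS queues; quiet hours are omitted; and the <code>NotificationService</code> class and its method names are hypothetical.</p>

```python
import heapq
import itertools
from collections import defaultdict, deque

MAX_PUSH_PER_HOUR = 5  # limit quoted in step 2b


class NotificationService:
    """In-memory sketch; real systems would back these with a DB, Redis, and Kafka/SQS."""

    def __init__(self, preferences: dict, templates: dict):
        self.preferences = preferences    # user_id -> set of opted-in channels
        self.templates = templates        # event type -> format string
        self.recent = defaultdict(deque)  # (user_id, channel) -> send timestamps
        self.queues = defaultdict(list)   # channel -> heap of (priority, seq, user, text)
        self._seq = itertools.count()     # tie-breaker keeps FIFO order within a priority

    def _within_rate_limit(self, user_id, channel, now) -> bool:
        window = self.recent[(user_id, channel)]
        while window and now - window[0] >= 3600:
            window.popleft()              # drop sends older than one hour
        return channel != "push" or len(window) < MAX_PUSH_PER_HOUR

    def handle(self, event: dict, channel: str, priority: int, now: float) -> bool:
        user_id = event["user_id"]
        if channel not in self.preferences.get(user_id, set()):
            return False                  # 2a: user has not opted in to this channel
        if not self._within_rate_limit(user_id, channel, now):
            return False                  # 2b: over the hourly limit
        text = self.templates[event["type"]].format(**event["data"])  # 2c: render
        heapq.heappush(self.queues[channel], (priority, next(self._seq), user_id, text))  # 2d
        self.recent[(user_id, channel)].append(now)
        return True

    def dequeue(self, channel: str):
        # 3a: the lowest priority number is served first.
        _, _, user_id, text = heapq.heappop(self.queues[channel])
        return user_id, text
```

<p>Checking preferences and rate limits before rendering keeps rejected notifications cheap, since no template work or queue write happens for them.</p>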
</section>
<section id="handling-scale-and-reliability" class="level3">
<h3 class="anchored" data-anchor-id="handling-scale-and-reliability">Handling Scale and Reliability</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 55%">
<col style="width: 45%">
</colgroup>
<thead>
<tr class="header">
<th>Challenge</th>
<th>Solution</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td>Provider failures</td>
<td>Retry with exponential backoff + fallback providers</td>
</tr>
<tr class="even">
<td>Duplicate notifications</td>
<td>Idempotency key per notification (dedup in Redis)</td>
</tr>
<tr class="odd">
<td>Quiet hours / time zones</td>
<td>Store user timezone, schedule delivery accordingly</td>
</tr>
<tr class="even">
<td>Notification fatigue</td>
<td>Rate limiting + batching (digest emails)</td>
</tr>
<tr class="odd">
<td>Provider rate limits</td>
<td>Queue with controlled concurrency per provider</td>
</tr>
<tr class="even">
<td>Delivery tracking</td>
<td>Webhook callbacks from providers + polling</td>
</tr>
</tbody>
</table>
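<p>The per-user rate limit mentioned earlier (for example, at most 5 push notifications per hour) could be enforced with a sliding-window log. The sketch below is one possible approach, assuming a single process and in-memory state; <code>PerUserRateLimiter</code> is a hypothetical name, and a distributed deployment would keep the timestamps in a shared store instead.</p>

```python
from collections import defaultdict, deque


class PerUserRateLimiter:
    def __init__(self, limit: int = 5, window_seconds: float = 3600.0) -> None:
        self.limit = limit
        self.window = window_seconds
        self._sent = defaultdict(deque)  # user_id -> timestamps of recent sends

    def allow(self, user_id: int, now: float) -> bool:
        """Allow a send only if the user has not used up `limit` sends in the window."""
        sent = self._sent[user_id]
        while sent and now - sent[0] >= self.window:
            sent.popleft()  # drop timestamps that have left the window
        if len(sent) >= self.limit:
            return False
        sent.append(now)
        return True
```

<p>Notifications that are rejected here do not have to be dropped; they can be held back and merged into a digest, which is the batching strategy listed in the table.</p>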
<hr>
</section>
</section>
<section id="q8-design-a-search-autocomplete-system" class="level2">
<h2 class="anchored" data-anchor-id="q8-design-a-search-autocomplete-system">Q8: Design a Search Autocomplete System</h2>
<p><strong>Answer:</strong></p>
<p>An autocomplete system suggests query completions in real time as users type, based on popularity, personalization, and recency.</p>
<div class="cell" data-layout-align="default">
<div class="cell-output-display">
<div>
<p></p><figure class="figure"><p></p>
<div>
<pre class="mermaid mermaid-js">graph TD
    USER["User types: 'syst'"]
    USER --&gt; API["Autocomplete API&lt;br/&gt;(&lt;100ms response)"]
    API --&gt; CACHE["Local Cache&lt;br/&gt;(per server)"]
    CACHE --&gt;|"miss"| TRIE["Trie Service&lt;br/&gt;(in-memory)"]
    TRIE --&gt; RESULTS["Top-K results:&lt;br/&gt;1. system design&lt;br/&gt;2. systems programming&lt;br/&gt;3. systematic review"]

    subgraph Offline["Offline Pipeline (hourly)"]
        LOGS["Search Logs"] --&gt; AGG["Aggregation&lt;br/&gt;(MapReduce)"]
        AGG --&gt; BUILD["Build Trie&lt;br/&gt;(top queries)"]
        BUILD --&gt; DEPLOY["Deploy to&lt;br/&gt;Trie Servers"]
    end

    style API fill:#56cc9d,stroke:#333,color:#fff
    style TRIE fill:#6cc3d5,stroke:#333,color:#fff
    style Offline fill:#ffce67,stroke:#333
</pre>
</div>
<p></p></figure><p></p>
</div>
</div>
</div>
<section id="requirements-7" class="level3">
<h3 class="anchored" data-anchor-id="requirements-7">Requirements</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 31%">
<col style="width: 68%">
</colgroup>
<thead>
<tr class="header">
<th>Type</th>
<th>Requirement</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Functional</strong></td>
<td>Return top 5-10 suggestions per prefix; Rank by popularity / recency / personalization; Handle misspellings (fuzzy match)</td>
</tr>
<tr class="even">
<td><strong>Non-functional</strong></td>
<td>P99 latency &lt;100ms; Support 100K QPS; Update suggestions without downtime</td>
</tr>
</tbody>
</table>
</section>
<section id="trie-data-structure" class="level3">
<h3 class="anchored" data-anchor-id="trie-data-structure">Trie Data Structure</h3>
<pre><code>Trie for ["system", "systems", "syslog", "syntax"]:

         root
          |
          s
          |
          y
         / \
        s   n
       / \   \
      t   l   t
      |   |   |
      e   o   a
      |   |   |
      m   g   x
      |
      s
Each node stores:
  - Character
  - Top-K queries that start with this node's prefix
  - Frequency / score for ranking</code></pre>
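<p>A trie that caches the top-K queries on every node, as described above, might look like the following Python sketch. It assumes each query is inserted once with its aggregated count (as an offline build would do); <code>TrieNode</code> and <code>AutocompleteTrie</code> are illustrative names, and the counts in any usage are made up.</p>

```python
from dataclasses import dataclass, field


@dataclass
class TrieNode:
    children: dict[str, "TrieNode"] = field(default_factory=dict)
    top: list[tuple[int, str]] = field(default_factory=list)  # (count, query), best first


class AutocompleteTrie:
    def __init__(self, k: int = 5) -> None:
        self.k = k
        self.root = TrieNode()

    def insert(self, query: str, count: int) -> None:
        """Add a query, updating the cached top-K list on every prefix node."""
        node = self.root
        for ch in query:
            node = node.children.setdefault(ch, TrieNode())
            node.top.append((count, query))
            # Highest count first; ties broken alphabetically.
            node.top.sort(key=lambda item: (-item[0], item[1]))
            del node.top[self.k:]

    def suggest(self, prefix: str) -> list[str]:
        """Walk to the prefix node and return its precomputed suggestions."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        return [query for _, query in node.top]
```

<p>Because the top-K list is precomputed per node, a lookup costs one step per typed character and never scans the subtree, which is what makes the latency target realistic. The trade-off is extra memory on every node.</p>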
</section>
<section id="two-phase-architecture" class="level3">
<h3 class="anchored" data-anchor-id="two-phase-architecture">Two-Phase Architecture</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 18%">
<col style="width: 28%">
<col style="width: 23%">
<col style="width: 28%">
</colgroup>
<thead>
<tr class="header">
<th>Phase</th>
<th>Component</th>
<th>Latency</th>
<th>Frequency</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Online</strong> (serve)</td>
<td>Trie servers + cache</td>
<td>&lt;100ms</td>
<td>Per keystroke</td>
</tr>
<tr class="even">
<td><strong>Offline</strong> (build)</td>
<td>MapReduce + Trie builder</td>
<td>Minutes</td>
<td>Every 15-60 min</td>
</tr>
</tbody>
</table>
</section>
<section id="ranking-signals" class="level3">
<h3 class="anchored" data-anchor-id="ranking-signals">Ranking Signals</h3>
<table class="caption-top table">
<thead>
<tr class="header">
<th>Signal</th>
<th>Description</th>
<th>Weight</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td>Query frequency</td>
<td>How often this query is searched</td>
<td>High</td>
</tr>
<tr class="even">
<td>Recency</td>
<td>Trending queries weighted higher</td>
<td>Medium</td>
</tr>
<tr class="odd">
<td>Personalization</td>
<td>User’s past search history</td>
<td>Medium</td>
</tr>
<tr class="even">
<td>Freshness</td>
<td>New events (e.g., breaking news)</td>
<td>Variable</td>
</tr>
</tbody>
</table>
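<p>One way to fold the frequency and recency signals into a single score is sketched below. The formula (log-scaled count multiplied by an exponential decay) and the 24-hour half-life are illustrative assumptions, not values from this article; personalization and freshness boosts would be additional terms.</p>

```python
import math


def score(count: int, age_hours: float, half_life_hours: float = 24.0) -> float:
    """Log-scaled frequency, halved for every half_life_hours since the query was last seen."""
    return math.log1p(count) * 0.5 ** (age_hours / half_life_hours)
```

<p>With these assumed parameters, a query seen 50 times just now outranks one seen 1,000 times three days ago, which is the kind of behavior the recency signal is meant to produce.</p>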
<hr>
</section>
</section>
<section id="q9-design-a-distributed-key-value-store" class="level2">
<h2 class="anchored" data-anchor-id="q9-design-a-distributed-key-value-store">Q9: Design a Distributed Key-Value Store</h2>
<p><strong>Answer:</strong></p>
<p>A distributed key-value store provides fast, reliable storage and retrieval of data across a cluster of machines, handling partitioning, replication, and failure recovery.</p>
<div class="cell" data-layout-align="default">
<div class="cell-output-display">
<div>
<p></p><figure class="figure"><p></p>
<div>
<pre class="mermaid mermaid-js">graph TD
    CLIENT["Client"]
    CLIENT --&gt; COORD["Coordinator Node&lt;br/&gt;(routes to correct partition)"]
    COORD --&gt; N1["Node 1&lt;br/&gt;(keys A-H)"]
    COORD --&gt; N2["Node 2&lt;br/&gt;(keys I-P)"]
    COORD --&gt; N3["Node 3&lt;br/&gt;(keys Q-Z)"]

    N1 --&gt; R1["Replica 1a"]
    N1 --&gt; R2["Replica 1b"]

    subgraph Ring["Consistent Hashing Ring"]
        H1["Hash(key) →&lt;br/&gt;walk clockwise →&lt;br/&gt;find node"]
    end

    style COORD fill:#56cc9d,stroke:#333,color:#fff
    style Ring fill:#ffce67,stroke:#333
</pre>
</div>
<p></p></figure><p></p>
</div>
</div>
</div>
<section id="requirements-8" class="level3">
<h3 class="anchored" data-anchor-id="requirements-8">Requirements</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 31%">
<col style="width: 68%">
</colgroup>
<thead>
<tr class="header">
<th>Type</th>
<th>Requirement</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Functional</strong></td>
<td><code>get(key) → value</code>; <code>put(key, value)</code>; <code>delete(key)</code>; Support values up to 1MB</td>
</tr>
<tr class="even">
<td><strong>Non-functional</strong></td>
<td>High availability (AP system); Tunable consistency; Low latency (&lt;10ms P99); Horizontal scaling (add nodes without downtime)</td>
</tr>
</tbody>
</table>
</section>
<section id="cap-theorem-trade-offs" class="level3">
<h3 class="anchored" data-anchor-id="cap-theorem-trade-offs">CAP Theorem Trade-offs</h3>
<div class="cell" data-layout-align="default">
<div class="cell-output-display">
<div>
<p></p><figure class="figure"><p></p>
<div>
<pre class="mermaid mermaid-js">graph TD
    CAP["CAP Theorem:&lt;br/&gt;under a partition,&lt;br/&gt;choose C or A"]
    CAP --&gt; C["Consistency&lt;br/&gt;(every read gets latest write)"]
    CAP --&gt; A["Availability&lt;br/&gt;(every request gets a response)"]
    CAP --&gt; P["Partition Tolerance&lt;br/&gt;(works despite network failures)"]

    C --- CP["CP Systems:&lt;br/&gt;MongoDB, HBase, ZooKeeper"]
    A --- AP["AP Systems:&lt;br/&gt;Cassandra, DynamoDB, CouchDB"]

    style CAP fill:#56cc9d,stroke:#333,color:#fff
    style CP fill:#6cc3d5,stroke:#333,color:#fff
    style AP fill:#ffce67,stroke:#333
</pre>
</div>
<p></p></figure><p></p>
</div>
</div>
</div>
</section>
<section id="key-design-components" class="level3">
<h3 class="anchored" data-anchor-id="key-design-components">Key Design Components</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 30%">
<col style="width: 38%">
<col style="width: 30%">
</colgroup>
<thead>
<tr class="header">
<th>Component</th>
<th>Design Choice</th>
<th>Rationale</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Partitioning</strong></td>
<td>Consistent hashing with virtual nodes</td>
<td>Even distribution, minimal reshuffling when nodes join/leave</td>
</tr>
<tr class="even">
<td><strong>Replication</strong></td>
<td>Replicate to N=3 clockwise neighbors</td>
<td>Fault tolerance</td>
</tr>
<tr class="odd">
<td><strong>Consistency</strong></td>
<td>Quorum: W + R &gt; N (configurable)</td>
<td>Tunable: W=1,R=3 (fast writes) or W=2,R=2 (balanced)</td>
</tr>
<tr class="even">
<td><strong>Conflict resolution</strong></td>
<td>Vector clocks (detect conflicts) or last-write-wins (timestamps)</td>
<td>Handle concurrent writes during partitions</td>
</tr>
<tr class="odd">
<td><strong>Failure detection</strong></td>
<td>Gossip protocol</td>
<td>Decentralized, scalable node health checks</td>
</tr>
<tr class="even">
<td><strong>Write path</strong></td>
<td>Write-ahead log → MemTable → SSTable</td>
<td>Fast writes, durable, efficient reads</td>
</tr>
</tbody>
</table>
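<p>The partitioning and replication rows above can be illustrated with a small consistent-hashing ring. This is a sketch under stated assumptions: <code>HashRing</code> and <code>preference_list</code> are hypothetical names, 100 virtual nodes per server is an arbitrary choice, and MD5 is used only to spread keys evenly, not for security.</p>

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Stable 64-bit position on the ring."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes: list[str], vnodes: int = 100) -> None:
        # Each physical node owns `vnodes` points on the ring.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]
        self._node_count = len(set(nodes))

    def preference_list(self, key: str, n: int = 3) -> list[str]:
        """Walk clockwise from hash(key), collecting the first n distinct nodes."""
        if not self._ring:
            return []
        want = min(n, self._node_count)
        i = bisect.bisect_right(self._points, _hash(key))
        owners: list[str] = []
        while want > len(owners):
            node = self._ring[i % len(self._ring)][1]
            if node not in owners:
                owners.append(node)  # skip further virtual nodes of the same server
            i += 1
        return owners
```

<p>The first entry of the returned list acts as the primary owner and the rest are its replicas. Adding a server only inserts new points on the ring, so the only keys that move are those whose nearest clockwise point now belongs to the new server.</p>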
</section>
<section id="consistency-levels-dynamodbcassandra-style" class="level3">
<h3 class="anchored" data-anchor-id="consistency-levels-dynamodbcassandra-style">Consistency Levels (DynamoDB/Cassandra Style)</h3>
<table class="caption-top table">
<thead>
<tr class="header">
<th>Setting</th>
<th>Write (W)</th>
<th>Read (R)</th>
<th>Behavior</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Strong</strong></td>
<td>W=N (or W=1)</td>
<td>R=1 (or R=N)</td>
<td>Always latest value</td>
</tr>
<tr class="even">
<td><strong>Quorum</strong></td>
<td>W=2 (N=3)</td>
<td>R=2 (N=3)</td>
<td>Latest value if no concurrent writes</td>
</tr>
<tr class="odd">
<td><strong>Eventual</strong></td>
<td>W=1</td>
<td>R=1</td>
<td>Fastest, may read stale</td>
</tr>
</tbody>
</table>
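<p>The rule <code>W + R &gt; N</code> works because any write quorum and any read quorum must then share at least one replica, so a read always touches a node that saw the latest acknowledged write. The short Python sketch below checks that claim by brute force for small clusters; both function names are illustrative.</p>

```python
from itertools import combinations


def quorums_overlap(n: int, w: int, r: int) -> bool:
    """The quorum condition: W + R > N."""
    return w + r > n


def always_intersect(n: int, w: int, r: int) -> bool:
    """Brute force: does every w-subset of replicas share a member with every r-subset?"""
    replicas = range(n)
    return all(
        set(write_set).intersection(read_set)
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )
```

<p>For N=3, the settings W=2, R=2 and W=3, R=1 satisfy the condition, while W=1, R=1 does not, which is why that setting can return stale data.</p>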
<hr>
</section>
</section>
<section id="q10-design-an-api-gateway-and-load-balancer" class="level2">
<h2 class="anchored" data-anchor-id="q10-design-an-api-gateway-and-load-balancer">Q10: Design an API Gateway and Load Balancer</h2>
<p><strong>Answer:</strong></p>
<p>An API Gateway is the single entry point for all client requests, handling routing, authentication, rate limiting, and protocol translation. A Load Balancer distributes traffic across backend servers for high availability and throughput.</p>
<div class="cell" data-layout-align="default">
<div class="cell-output-display">
<div>
<p></p><figure class="figure"><p></p>
<div>
<pre class="mermaid mermaid-js">graph TD
    CLIENTS["Clients&lt;br/&gt;(Web, Mobile, Partners)"]
    CLIENTS --&gt; GW["API Gateway"]
    GW --&gt; AUTH["Auth Plugin&lt;br/&gt;(JWT / OAuth)"]
    GW --&gt; RL["Rate Limiter"]
    GW --&gt; ROUTE["Request Router"]
    GW --&gt; TRANSFORM["Protocol Translation&lt;br/&gt;(REST ↔ gRPC)"]

    ROUTE --&gt; LB1["Load Balancer&lt;br/&gt;(User Service)"]
    ROUTE --&gt; LB2["Load Balancer&lt;br/&gt;(Order Service)"]
    ROUTE --&gt; LB3["Load Balancer&lt;br/&gt;(Search Service)"]

    LB1 --&gt; US1["User Svc 1"]
    LB1 --&gt; US2["User Svc 2"]
    LB1 --&gt; US3["User Svc 3"]

    style GW fill:#56cc9d,stroke:#333,color:#fff
    style LB1 fill:#ffce67,stroke:#333
</pre>
</div>
<p></p></figure><p></p>
</div>
</div>
</div>
<section id="api-gateway-responsibilities" class="level3">
<h3 class="anchored" data-anchor-id="api-gateway-responsibilities">API Gateway Responsibilities</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 43%">
<col style="width: 56%">
</colgroup>
<thead>
<tr class="header">
<th>Function</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Routing</strong></td>
<td>Route <code>/api/users/*</code> → User Service, <code>/api/orders/*</code> → Order Service</td>
</tr>
<tr class="even">
<td><strong>Authentication</strong></td>
<td>Validate JWT/OAuth tokens before forwarding</td>
</tr>
<tr class="odd">
<td><strong>Rate limiting</strong></td>
<td>Per-client, per-endpoint throttling</td>
</tr>
<tr class="even">
<td><strong>Request transformation</strong></td>
<td>REST ↔︎ gRPC, request/response rewriting</td>
</tr>
<tr class="odd">
<td><strong>Circuit breaker</strong></td>
<td>Stop forwarding to unhealthy services</td>
</tr>
<tr class="even">
<td><strong>Caching</strong></td>
<td>Cache GET responses for static/semi-static data</td>
</tr>
<tr class="odd">
<td><strong>Logging &amp; metrics</strong></td>
<td>Centralized request logging, latency tracking</td>
</tr>
<tr class="even">
<td><strong>SSL termination</strong></td>
<td>Handle HTTPS at the edge</td>
</tr>
</tbody>
</table>
</section>
<section id="load-balancing-algorithms" class="level3">
<h3 class="anchored" data-anchor-id="load-balancing-algorithms">Load Balancing Algorithms</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 32%">
<col style="width: 38%">
<col style="width: 29%">
</colgroup>
<thead>
<tr class="header">
<th>Algorithm</th>
<th>How It Works</th>
<th>Best For</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Round Robin</strong></td>
<td>Distribute sequentially to each server</td>
<td>Equal-capacity servers</td>
</tr>
<tr class="even">
<td><strong>Weighted Round Robin</strong></td>
<td>Higher-capacity servers get more requests</td>
<td>Mixed hardware</td>
</tr>
<tr class="odd">
<td><strong>Least Connections</strong></td>
<td>Route to server with fewest active connections</td>
<td>Variable request duration</td>
</tr>
<tr class="even">
<td><strong>IP Hash</strong></td>
<td>Hash client IP → same server while the server pool is unchanged</td>
<td>Session affinity</td>
</tr>
<tr class="odd">
<td><strong>Consistent Hashing</strong></td>
<td>Hash-ring-based routing</td>
<td>Cache servers, stateful services</td>
</tr>
</tbody>
</table>
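<p>Three of the algorithms in the table are small enough to sketch directly. The Python below is a simplified, single-threaded illustration; the class and function names are hypothetical, and a real load balancer would also need locking and health awareness.</p>

```python
import hashlib
from itertools import cycle


class RoundRobin:
    def __init__(self, servers: list[str]) -> None:
        self._next = cycle(servers)

    def pick(self) -> str:
        return next(self._next)


class LeastConnections:
    def __init__(self, servers: list[str]) -> None:
        self.active = {server: 0 for server in servers}

    def acquire(self) -> str:
        """Pick the server with the fewest in-flight requests and count the new one."""
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server: str) -> None:
        """Call when a request finishes."""
        self.active[server] -= 1


def ip_hash(client_ip: str, servers: list[str]) -> str:
    """Same IP maps to the same server while the server list is unchanged."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]
```

<p>Note that <code>ip_hash</code> uses a plain modulo, so changing the number of servers remaps most clients. Consistent hashing, the last row of the table, exists to avoid exactly that.</p>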
</section>
<section id="health-checks" class="level3">
<h3 class="anchored" data-anchor-id="health-checks">Health Checks</h3>
<pre><code>Active health checks:
  - Gateway pings /health on each backend every 5-10s
  - Unhealthy after 3 consecutive failures
  - Healthy after 2 consecutive successes
  - Remove unhealthy servers from rotation

Passive health checks:
  - Monitor response codes and latency
  - If &gt;50% of requests to a server fail → mark unhealthy
  - Automatic recovery when success rate improves</code></pre>
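<p>The active health-check thresholds above (unhealthy after 3 consecutive failures, healthy again after 2 consecutive successes) amount to a small state machine per backend. A minimal Python sketch, with <code>BackendHealth</code> as a hypothetical name:</p>

```python
from dataclasses import dataclass


@dataclass
class BackendHealth:
    unhealthy_after: int = 3  # consecutive failures before removal from rotation
    healthy_after: int = 2    # consecutive successes before re-adding
    healthy: bool = True
    streak: int = 0           # consecutive probe results that contradict the current state

    def record(self, ok: bool) -> bool:
        """Record one probe result; return whether the backend is in rotation."""
        if ok == self.healthy:
            self.streak = 0  # result agrees with current state, so reset the streak
            return self.healthy
        self.streak += 1
        threshold = self.healthy_after if ok else self.unhealthy_after
        if self.streak >= threshold:
            self.healthy = ok
            self.streak = 0
        return self.healthy
```

<p>Requiring several consecutive results in either direction keeps one slow or dropped probe from flapping a server in and out of rotation.</p>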
</section>
<section id="high-availability-design" class="level3">
<h3 class="anchored" data-anchor-id="high-availability-design">High Availability Design</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 26%">
<col style="width: 73%">
</colgroup>
<thead>
<tr class="header">
<th>Layer</th>
<th>Redundancy Strategy</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>API Gateway</strong></td>
<td>Multiple instances behind DNS round-robin or network LB</td>
</tr>
<tr class="even">
<td><strong>Load Balancer</strong></td>
<td>Active-passive pair with virtual IP failover</td>
</tr>
<tr class="odd">
<td><strong>Backend services</strong></td>
<td>Minimum 3 instances per service, across availability zones</td>
</tr>
<tr class="even">
<td><strong>Database</strong></td>
<td>Primary-replica with automatic failover</td>
</tr>
<tr class="odd">
<td><strong>Cache</strong></td>
<td>Redis Cluster with replication</td>
</tr>
</tbody>
</table>
<hr>
</section>
</section>
<section id="summary-table" class="level2">
<h2 class="anchored" data-anchor-id="summary-table">Summary Table</h2>
<table class="caption-top table">
<colgroup>
<col style="width: 12%">
<col style="width: 33%">
<col style="width: 54%">
</colgroup>
<thead>
<tr class="header">
<th>#</th>
<th>System</th>
<th>Key Concepts</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td>1</td>
<td><strong>URL Shortener</strong></td>
<td>Base62 encoding, key generation, read-heavy caching, 301 vs 302</td>
</tr>
<tr class="even">
<td>2</td>
<td><strong>Rate Limiter</strong></td>
<td>Token bucket, sliding window, Redis counters, API gateway placement</td>
</tr>
<tr class="odd">
<td>3</td>
<td><strong>Chat System</strong></td>
<td>WebSocket, message queues, Cassandra, fan-out, delivery receipts</td>
</tr>
<tr class="even">
<td>4</td>
<td><strong>News Feed</strong></td>
<td>Fan-out on write vs read, hybrid approach, feed ranking</td>
</tr>
<tr class="odd">
<td>5</td>
<td><strong>File Storage</strong></td>
<td>Chunked upload, deduplication, delta sync, conflict resolution</td>
</tr>
<tr class="even">
<td>6</td>
<td><strong>Video Streaming</strong></td>
<td>Transcoding pipeline, adaptive bitrate, CDN, HLS/DASH</td>
</tr>
<tr class="odd">
<td>7</td>
<td><strong>Notification System</strong></td>
<td>Multi-channel, priority queues, rate limiting, template rendering</td>
</tr>
<tr class="even">
<td>8</td>
<td><strong>Search Autocomplete</strong></td>
<td>Trie, offline pipeline, top-K ranking, two-phase architecture</td>
</tr>
<tr class="odd">
<td>9</td>
<td><strong>Key-Value Store</strong></td>
<td>Consistent hashing, CAP theorem, quorum reads/writes, vector clocks</td>
</tr>
<tr class="even">
<td>10</td>
<td><strong>API Gateway</strong></td>
<td>Routing, auth, rate limiting, load balancing algorithms, circuit breaker</td>
</tr>
</tbody>
</table>
<hr>
</section>
<section id="system-design-interview-framework" class="level2">
<h2 class="anchored" data-anchor-id="system-design-interview-framework">System Design Interview Framework</h2>
<p>Use this framework for <strong>any</strong> system design question:</p>
<table class="caption-top table">
<colgroup>
<col style="width: 22%">
<col style="width: 37%">
<col style="width: 40%">
</colgroup>
<thead>
<tr class="header">
<th>Step</th>
<th>Duration</th>
<th>What to Do</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>1. Requirements</strong></td>
<td>5 min</td>
<td>Clarify functional + non-functional; estimate scale (QPS, storage)</td>
</tr>
<tr class="even">
<td><strong>2. High-Level Design</strong></td>
<td>10-15 min</td>
<td>Draw core components; define APIs; identify data flow</td>
</tr>
<tr class="odd">
<td><strong>3. Deep Dive</strong></td>
<td>15-20 min</td>
<td>Database schema; algorithm choices; scaling strategies</td>
</tr>
<tr class="even">
<td><strong>4. Wrap Up</strong></td>
<td>5 min</td>
<td>Review requirements; discuss bottlenecks; suggest improvements</td>
</tr>
</tbody>
</table>
<hr>
</section>
<section id="whats-next" class="level2">
<h2 class="anchored" data-anchor-id="whats-next">What’s Next?</h2>
<p>This article covered the top 10 system design interview questions. For related content:</p>
<ul>
<li><strong>Foundational concepts:</strong> <a href="../../posts/system-design/System-Design-Interview-QA-1.html">System Design Interview QA - 1</a></li>
<li><strong>Infrastructure deep dives:</strong> <a href="../../posts/system-design/System-Design-Interview-QA-2.html">System Design Interview QA - 2</a></li>
<li><strong>Design patterns:</strong> <a href="../../posts/design-pattern/Design-Pattern-Interview-QA-1.html">Design Pattern Interview QA - 1</a></li>
<li><strong>Enterprise patterns (Spring, CQRS, MVC):</strong> <a href="../../posts/design-pattern/Design-Pattern-Interview-QA-2.html">Design Pattern Interview QA - 2</a></li>
<li><strong>Production API design:</strong> <a href="../../posts/swe-interview/Python-SWE-Interview-QA-4.html">Python SWE Interview QA - 4</a></li>
</ul>


</section>

 ]]></description>
  <guid>https://vectoringai.com/posts/system-design/System-Design-Interview-QA-3.html</guid>
  <pubDate>Thu, 21 May 2026 00:00:00 GMT</pubDate>
  <media:content url="https://vectoringai.com/images/system-design/thumb_system_design_interview_qa_300.png" medium="image" type="image/png" height="96" width="144"/>
</item>
</channel>
</rss>
