Explain the Concept of Database Sharding and Replication

Medium | Hot | Major: software engineering | Meta, Amazon

Concept

To handle large-scale data and high query loads, modern databases use sharding and replication — two key strategies for scalability and fault tolerance.

  • Sharding (horizontal partitioning) divides data across multiple database servers.
  • Replication creates copies of the same data across multiple servers.

They address different goals: sharding improves scalability and performance, while replication improves availability and fault tolerance.


1. Database Replication — Redundancy and High Availability

Replication involves copying data from one database (the primary) to one or more replicas (secondary nodes).

Common Models:

| Model | Description |
| --- | --- |
| Master–Slave | Writes go to the master, reads can go to replicas. |
| Master–Master | Multiple writable nodes, sync via conflict resolution. |
| Synchronous Replication | Writes committed only after replicas confirm. |
| Asynchronous Replication | Master commits immediately, replicas catch up later. |

Benefits:

  • Increases read scalability (via read replicas).
  • Provides failover capability — if the master fails, a replica takes over.
  • Enables geo-distributed deployments (replicas close to users).

Example:

Client → Primary DB → Replicas (read-only)

Trade-offs:

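  • Consistency vs. availability (per the CAP theorem): synchronous replication keeps replicas consistent but adds write latency and can block writes when a replica is unreachable; asynchronous replication keeps accepting writes but replicas may briefly serve stale data.
  • Replication lag in asynchronous models: a read from a replica may not yet reflect a write the primary has already acknowledged.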
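The difference between the synchronous and asynchronous models, and where replication lag comes from, can be sketched with a toy in-memory primary. This is a minimal illustration, not how any particular database implements replication; the `Primary`, `Replica`, and `flush` names are hypothetical, and `flush` stands in for the background replication stream.

```python
from collections import deque


class Replica:
    def __init__(self):
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value


class Primary:
    """Toy primary that replicates writes synchronously or asynchronously."""

    def __init__(self, replicas, synchronous):
        self.data = {}
        self.replicas = replicas
        self.synchronous = synchronous
        self.backlog = deque()  # writes not yet shipped to replicas (async only)

    def write(self, key, value):
        self.data[key] = value
        if self.synchronous:
            # Synchronous: acknowledge only after every replica applied the write.
            for replica in self.replicas:
                replica.apply(key, value)
        else:
            # Asynchronous: acknowledge now, ship the write later.
            self.backlog.append((key, value))

    def flush(self):
        """Ship queued writes to the replicas (hypothetical background step)."""
        while self.backlog:
            key, value = self.backlog.popleft()
            for replica in self.replicas:
                replica.apply(key, value)


sync_replica = Replica()
sync_primary = Primary([sync_replica], synchronous=True)
sync_primary.write("user:1", "alice")
sync_read = sync_replica.data.get("user:1")    # replica already has the write

async_replica = Replica()
async_primary = Primary([async_replica], synchronous=False)
async_primary.write("user:1", "alice")
stale_read = async_replica.data.get("user:1")  # replication lag: not there yet
async_primary.flush()
fresh_read = async_replica.data.get("user:1")  # visible once the backlog drains
```

In the asynchronous case the replica misses the write until the backlog is flushed, which is the window in which a read replica can return stale data.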

2. Database Sharding — Partitioning for Scale

Sharding splits a large dataset across multiple independent databases called shards, each responsible for a subset of data.

How It Works:

  • Each shard stores a unique subset of rows based on a shard key (e.g., user ID, region).
  • The application routes queries to the correct shard using this key.

Example:

Shard 1 → User IDs 1–1,000,000
Shard 2 → User IDs 1,000,001–2,000,000
Shard 3 → User IDs 2,000,001–3,000,000
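Routing for a range layout like this can be sketched with a sorted list of boundaries and a binary search. The sketch assumes each upper bound is inclusive (so user ID 1,000,000 belongs to Shard 1) and uses hypothetical shard names; a real system would usually load the boundaries from configuration.

```python
from bisect import bisect_left

# Inclusive upper bound of each shard's user-ID range (assumed layout).
UPPER_BOUNDS = [1_000_000, 2_000_000, 3_000_000]
SHARD_NAMES = ["shard-1", "shard-2", "shard-3"]


def shard_for_user(user_id: int) -> str:
    """Return the shard whose range contains user_id."""
    if user_id < 1 or user_id > UPPER_BOUNDS[-1]:
        raise ValueError(f"user_id {user_id} is outside every shard range")
    # bisect_left finds the first bound that is >= user_id.
    return SHARD_NAMES[bisect_left(UPPER_BOUNDS, user_id)]
```

Because every ID maps to exactly one range, boundary values never land on two shards at once.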

Benefits:

  • Handles massive data volumes without a single node bottleneck.
  • Enables parallel reads/writes across shards.
  • Can reduce latency when shards are placed near the users whose data they hold (e.g., with a region-based shard key).

Challenges:

  • Complex to rebalance or reshard.
  • Cross-shard queries are slower and require aggregation layers.
  • Strong consistency across shards is hard to maintain.
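The cost of cross-shard queries comes from the scatter-gather pattern: the query is sent to every shard and the partial results are merged in an aggregation step. A minimal sketch, assuming hypothetical in-memory shards in place of real database connections:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical shards: each maps user_id -> order total.
SHARDS = [
    {1: 20, 2: 35},
    {1_000_001: 15},
    {2_000_001: 50, 2_000_002: 5},
]


def local_sum(shard: dict) -> int:
    """Stands in for running SELECT SUM(total) on a single shard."""
    return sum(shard.values())


def global_sum(shards) -> int:
    # Scatter: query every shard in parallel.
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = list(pool.map(local_sum, shards))
    # Gather: combine the per-shard partial results.
    return sum(partials)


total = global_sum(SHARDS)
```

The overall latency is bounded by the slowest shard, which is one reason a shard key that keeps common queries on a single shard is preferable.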

3. Sharding Strategies

| Strategy | Description | Example Use |
| --- | --- | --- |
| Range-based | Data split by value range | User IDs 1–1,000,000; 1,000,001–2,000,000 |
| Hash-based | Data distributed by hash function | hash(user_id) % 8 |
| Directory-based | Lookup table maps keys to shards | Dynamic partitioning system |
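The directory-based strategy can be sketched as a lookup table consulted on every request. The `ShardDirectory` class and the shard and tenant names below are hypothetical; in practice the directory would live in a small, highly available store instead of a Python dict.

```python
class ShardDirectory:
    """Maps individual keys to shards, with a default for unlisted keys."""

    def __init__(self, default_shard: str):
        self.default_shard = default_shard
        self.assignments = {}  # key -> shard name

    def assign(self, key: str, shard: str) -> None:
        # Moving a key is a directory update (the data copy itself is not shown).
        self.assignments[key] = shard

    def lookup(self, key: str) -> str:
        return self.assignments.get(key, self.default_shard)


directory = ShardDirectory(default_shard="shard-1")
directory.assign("tenant-large", "shard-7")  # isolate one heavy tenant
```

The flexibility comes at a price: the directory is an extra hop on the request path and must itself be kept available.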

Example:

shard_id = hash(customer_id) % total_shards

Best Practice: Choose a shard key that balances data evenly and supports efficient routing.
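A runnable version of the modulo formula, as a sketch: it swaps in SHA-256 from `hashlib` because Python's built-in `hash()` is randomized per process for strings, which would route the same key to different shards after a restart. The `moved_fraction` helper is a hypothetical addition that measures how many keys change shards when the shard count changes.

```python
import hashlib


def shard_for(customer_id: str, total_shards: int) -> int:
    """Map a key to a shard using a hash that is stable across processes."""
    digest = hashlib.sha256(customer_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % total_shards


def moved_fraction(keys, old_total: int, new_total: int) -> float:
    """Fraction of keys whose shard changes when the shard count changes."""
    moved = sum(
        1 for key in keys
        if shard_for(key, old_total) != shard_for(key, new_total)
    )
    return moved / len(keys)


keys = [f"customer-{i}" for i in range(10_000)]
fraction = moved_fraction(keys, old_total=8, new_total=9)
```

With plain modulo hashing, going from 8 to 9 shards relocates roughly eight out of nine keys in expectation, which is why resharding is expensive and why schemes such as consistent hashing or directory lookups are often used instead.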


4. Sharding + Replication Combined

In large distributed systems, both are used together:

| Layer | Function |
| --- | --- |
| Sharding | Horizontal partitioning of datasets |
| Replication | Redundancy for each shard |

Example:

Shard 1: Primary + 2 Replicas
Shard 2: Primary + 2 Replicas

This ensures:

  • Each shard handles only a portion of the data (scalability).
  • Each shard is replicated (fault tolerance).
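How the two layers fit together can be sketched in a few lines: a key is first routed to one shard, then writes go to that shard's primary and reads are spread across its replicas. All class and method names here are hypothetical, replication is done synchronously to keep the sketch short, and the failover step ignores real-world concerns such as leader election.

```python
import hashlib
import itertools


class Shard:
    def __init__(self, name: str, replica_count: int = 2):
        self.name = name
        self.primary = {}
        self.replicas = [{} for _ in range(replica_count)]
        self._next_replica = itertools.cycle(range(replica_count))

    def write(self, key, value):
        self.primary[key] = value
        for replica in self.replicas:  # synchronous copy, for brevity
            replica[key] = value

    def read(self, key):
        # Round-robin across replicas to spread read load.
        return self.replicas[next(self._next_replica)].get(key)

    def fail_over(self):
        """Promote the first replica after the primary is lost."""
        self.primary = self.replicas.pop(0)
        self._next_replica = itertools.cycle(range(len(self.replicas)))


class Cluster:
    def __init__(self, shard_count: int):
        self.shards = [Shard(f"shard-{i + 1}") for i in range(shard_count)]

    def shard_for(self, key: str) -> Shard:
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return self.shards[int.from_bytes(digest[:8], "big") % len(self.shards)]

    def put(self, key, value):
        self.shard_for(key).write(key, value)

    def get(self, key):
        return self.shard_for(key).read(key)


cluster = Cluster(shard_count=2)
cluster.put("user:42", "alice")
value = cluster.get("user:42")
```

Each key lives on exactly one shard, and losing that shard's primary does not lose the key because a replica already holds a copy.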

5. Real-World Applications

| Company / System | Implementation |
| --- | --- |
| Facebook | MySQL sharded by user ID; replicas across regions. |
| YouTube | Video metadata sharded by content ID. |
| MongoDB / Cassandra | Built-in auto-sharding and replication. |
| Amazon DynamoDB | Partitioned key-value store with multi-AZ replication. |

6. CAP Theorem Connection

  • Consistency (C) — All nodes see the same data.
  • Availability (A) — Every request gets a response.
  • Partition Tolerance (P) — System functions despite network partitions.

Sharding and replication force trade-offs:

  • Replication can favor availability (async) or consistency (sync).
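  • Sharding does not add partition tolerance by itself: it limits the impact of a failure to the affected shard, but more nodes and network links mean more places for a partition to occur, and global consistency across shards becomes harder.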

7. Interview Tip

  • Explain both concepts distinctly, then describe how they complement each other.
  • Mention shard key design, replication lag, and failover strategies.
  • Use examples (e.g., “Instagram shards by user ID for scaling user data”).
  • Be ready to sketch a high-level architecture diagram with primary-replica shards.

Summary Insight

Sharding scales data and write capacity horizontally; replication provides redundancy, failover, and read scalability. Together, they form the foundation of globally distributed, high-availability data systems.