Database Sharding

What is Sharding?

  1. Database sharding is a technique for distributing a single dataset across multiple servers.
  2. It is key to horizontal scaling, since the data can be stored on multiple machines.
  3. It can also improve the throughput of the database, because reads and writes are spread across several machines.

Advantages of Database Sharding

  1. Scalability : Sharding facilitates scaling out (horizontal scaling). By adding more machines to an existing stack, an organisation can handle more traffic and process requests faster.
  2. Performance : Sharding speeds up query response times. Splitting a massive table into multiple shards lets each query scan fewer rows and return result sets more rapidly.
  3. Reliability and Availability : An outage on one shard affects only the data stored on that shard, so the rest of the database can keep serving requests.

Drawbacks of Database Sharding

  1. Complexity : Designing and operating a sharded database is more complex than running a single one.
  2. Hotspots : Even a correctly implemented sharding scheme has a major impact on workflows. Teams must manage data across multiple shard locations while keeping the data evenly distributed, because an uneven split concentrates load on a few shards (hotspots). Watch https://www.youtube.com/watch?v=ES2ov9s4ias&ab_channel=CockroachDB to understand more about hotspots.
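
One rough way to spot an uneven distribution is to compare the largest shard with the average shard size. The sketch below is only an illustration: the function name skew_ratio, the shard labels, and the idea of counting rows per shard are assumptions, not part of any particular database.

```python
from collections import Counter


def skew_ratio(row_shards, shard_count):
    """Return the largest shard's row count divided by the mean row count.

    row_shards is a list with one shard label per stored row (hypothetical input).
    A result near 1.0 suggests an even split; larger values hint at a hotspot.
    """
    if shard_count <= 0 or not row_shards:
        raise ValueError("need at least one shard and one row")
    counts = Counter(row_shards)
    mean_rows = len(row_shards) / shard_count
    return max(counts.values()) / mean_rows


# Six rows spread evenly over three shards.
balanced = skew_ratio(["s0", "s1", "s2", "s0", "s1", "s2"], shard_count=3)

# Six rows where shard s0 holds four of them.
skewed = skew_ratio(["s0", "s0", "s0", "s0", "s1", "s2"], shard_count=3)
```

Row counts are only a proxy here. A shard can also become a hotspot because of traffic rather than size, so request counts per shard may be a better input in practice.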

Sharding Architectures

Key-based Sharding / Hash-based Sharding : This is one of the most common ways to split data across servers. A hash of the shard key decides which server stores each row. Examples are consistent hashing (for instance Ketama) and rendezvous hashing.
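
As a minimal sketch of hash-based routing, the snippet below uses rendezvous (highest random weight) hashing: every shard gets a score for a given key, and the key goes to the shard with the highest score. The shard names, the choice of SHA-256, and the helper names are assumptions made for illustration.

```python
import hashlib


def _score(key: str, shard: str) -> int:
    """Deterministic pseudo-random score for a (shard, key) pair."""
    digest = hashlib.sha256(f"{shard}:{key}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def pick_shard(key: str, shards: list[str]) -> str:
    """Return the shard with the highest score for this key."""
    if not shards:
        raise ValueError("at least one shard is required")
    return max(shards, key=lambda shard: _score(key, shard))


# Hypothetical shard names and keys.
shards = ["shard-a", "shard-b", "shard-c"]
placement = {user: pick_shard(user, shards) for user in ("user:1", "user:2", "user:3")}
```

With this scheme, removing a shard only moves the keys that lived on that shard. A plain hash(key) % N approach is simpler, but it reshuffles most keys whenever N changes.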

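Range Based Sharding : Data is split into contiguous ranges of the shard key, and each shard stores the rows whose key falls inside its range.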
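
A minimal sketch of range-based routing, assuming integer record IDs and three hypothetical shards whose boundaries are chosen purely for illustration. Each shard owns the IDs up to and including its upper bound, and a binary search finds the owner.

```python
import bisect

# Hypothetical layout: shard-0 owns IDs <= 1000, shard-1 owns 1001..2000,
# shard-2 owns 2001..3000.
UPPER_BOUNDS = [1000, 2000, 3000]
SHARD_NAMES = ["shard-0", "shard-1", "shard-2"]


def shard_for_id(record_id: int) -> str:
    """Return the shard whose range contains record_id."""
    index = bisect.bisect_left(UPPER_BOUNDS, record_id)
    if index == len(UPPER_BOUNDS):
        raise KeyError(f"no shard covers id {record_id}")
    return SHARD_NAMES[index]


first = shard_for_id(1000)   # upper bound is inclusive
second = shard_for_id(1001)
```

Range lookups such as "all IDs between 1200 and 1800" touch a single shard here, but a sequentially growing key would send every new write to the last shard.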

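Directory Based Sharding : A lookup table (the directory) records which shard holds each key, and every request consults it before touching the data.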
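
A minimal sketch of directory-based routing, assuming the directory is an in-memory dictionary. The class name ShardDirectory, its methods, and the tenant and shard names are hypothetical; a real system would keep this mapping in a durable, highly available store.

```python
class ShardDirectory:
    """Maps each key to the shard that currently holds it."""

    def __init__(self) -> None:
        self._locations: dict[str, str] = {}

    def assign(self, key: str, shard: str) -> None:
        """Record (or change) the shard that owns this key."""
        self._locations[key] = shard

    def locate(self, key: str) -> str:
        """Return the owning shard, or raise KeyError if the key is unknown."""
        if key not in self._locations:
            raise KeyError(f"no shard registered for {key!r}")
        return self._locations[key]


directory = ShardDirectory()
directory.assign("tenant:acme", "shard-eu")
directory.assign("tenant:globex", "shard-us")

# Moving a tenant only requires updating its directory entry
# (after the data itself has been copied).
directory.assign("tenant:acme", "shard-us")
current = directory.locate("tenant:acme")
```

The flexibility comes at a cost: the directory sits on every request path, so it can become a bottleneck or a single point of failure.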

Resources

  1. https://aws.amazon.com/what-is/database-sharding/
  2. HotSpots : https://www.youtube.com/watch?v=ES2ov9s4ias&ab_channel=CockroachDB
  3. Sharding Pattern : https://learn.microsoft.com/en-us/azure/architecture/patterns/sharding
