𝗦𝗵𝗮𝗿𝗱𝗶𝗻𝗴 𝗜𝗻 𝗔 𝗡𝘂𝘁𝘀𝗵𝗲𝗹𝗹

Sharding splits a large database into smaller pieces called shards. Instead of one massive database, you distribute data across multiple servers.

This method helps you:

  • Handle more data
  • Process more requests
  • Reduce load on single machines
  • Scale horizontally

You must decide how to route data to the correct shard. Here are the main strategies:

  1. Range Based Sharding You divide data based on a range of values. Example:
  • Shard 1: Users 1 to 3000
  • Shard 2: Users 3001 to 6000
  • Shard 3: Users 6001 to 10000
  1. Hash Based Sharding You use a mathematical function to pick a shard. Example: You use the modulo operator on a user ID. If the result is 0, the data goes to Shard 1. If the result is 1, it goes to Shard 2. This spreads data evenly.

  2. Directory Based Sharding You use a lookup table to find the correct shard. Example: A notification system looks up an app name like "YouTube" in a directory. The directory tells the system to use "Shard 1". This works like a file system folder.

  3. Geographical Sharding You store data based on location. Example:

  • India users go to the India shard.
  • USA users go to the USA shard.
  • European users go to the EU shard.
  1. Dynamic Sharding You do not hardcode ranges or hashes. Your application checks a configuration table at runtime. You can add new shards without changing your code.

  2. Hybrid Sharding You combine methods to get better results. A common pattern is Directory Based + Hash Based. First, you use a directory to find a group. Then, you use a hash to find the specific shard within that group. This gives you both flexibility and balance.

Source: https://dev.to/code_with_aravind/sharding-in-a-nutshell-5f6b