Redundancy
Master the foundation of reliability. Learn why having duplicates of servers, power, and networks is essential for system survival.
Redundancy: The Art of Duplication
Redundancy is the duplication of a system's critical components to increase its reliability, usually in the form of a backup or fail-safe.
The Logic (ELI5)
Think of a Spare Tire:
- Your car needs 4 wheels to drive.
- If one tire blows out, you are stuck on the road.
- The Spare Tire in the trunk is your Redundancy.
- You don't use it for normal driving, but if a main tire fails, it gets you moving again.
The Deep Dive
Types of Redundancy
- Active Redundancy (e.g., N+1): Multiple identical components run at the same time, with at least one more than the load requires. If one fails, the others keep serving (e.g., 5 servers when 4 can carry the load).
- Passive Redundancy (Standby): A second component sits idle, either powered off or running without taking traffic. It takes over only if the main one fails.
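The difference between the two types can be sketched in a few lines of Python. This is a minimal illustration, not a production failover mechanism: the `Component` class, its `healthy` flag, and both dispatch functions are hypothetical names invented for this example, and a real system would detect failures with health checks rather than a flag.

```python
class Component:
    """Hypothetical component with a simple health flag (an assumption for this sketch)."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy

    def handle(self, request):
        if not self.healthy:
            raise RuntimeError(f"{self.name} is down")
        return f"{self.name} handled {request}"


def active_dispatch(pool, request):
    """Active redundancy: every component in the pool is live.

    A failed component is simply skipped and a healthy one serves the request.
    """
    for component in pool:
        if component.healthy:
            return component.handle(request)
    raise RuntimeError("all components failed")


def passive_dispatch(primary, standby, request):
    """Passive redundancy: the standby serves traffic only when the primary is down."""
    if primary.healthy:
        return primary.handle(request)
    return standby.handle(request)


# Active: web-1 has failed, so web-2 keeps serving.
pool = [Component("web-1", healthy=False), Component("web-2")]
print(active_dispatch(pool, "GET /"))

# Passive: the standby is used only because the primary is down.
print(passive_dispatch(Component("db-primary", healthy=False),
                       Component("db-standby"), "SELECT 1"))
```

In the active case the surviving components were already carrying traffic, so failover is close to instant. In the passive case the standby may need time to start or warm up before it can serve, which this sketch does not model.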
Layers of Redundancy
- Hardware: Dual power supplies, RAID hard drives.
- Service: Multiple instances of the same microservice.
- Geographic: Servers in both Virginia (USA) and Tokyo (Japan), so a regional disaster such as a hurricane can't take down every copy of your site at once.
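A rough way to see why redundancy helps at every layer is the standard availability arithmetic: redundant copies within a layer combine in parallel, while the layers themselves combine in series. The sketch below assumes failures are independent, which is optimistic in practice, and the availability figures are made-up illustrative values, not measurements.

```python
import math


def parallel_availability(a, n):
    """Availability of n redundant copies, each up with probability a.

    Assumes independent failures: the layer is down only if all n copies are down.
    """
    return 1 - (1 - a) ** n


def series_availability(layers):
    """The system is up only if every layer is up, so availabilities multiply."""
    return math.prod(layers)


# Illustrative (assumed) numbers for each layer.
hardware = parallel_availability(0.99, 2)    # two power supplies
service = parallel_availability(0.99, 3)     # three service instances
geographic = parallel_availability(0.999, 2) # two regions

print(f"hardware layer:   {hardware:.6f}")
print(f"service layer:    {service:.6f}")
print(f"geographic layer: {geographic:.6f}")
print(f"whole system:     {series_availability([hardware, service, geographic]):.6f}")
```

Because the layers multiply, the whole system can never be more available than its weakest layer, which is why a single unduplicated component drags everything down.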
Interview Pulse
No Single Point of Failure (SPOF)
Your goal in any design is to eliminate every SPOF.
- Interview Scenario: "You have a Load Balancer, 3 Web Servers, and 1 Database."
- The Catch: The Database is a SPOF! If it dies, the whole system dies. (The single Load Balancer is one too.)
- The Fix: Add a redundant Database (Replica) that can take over when the primary fails.
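One simple way to reason about the scenario is to count instances per tier and flag any tier that has only one. The `find_spofs` helper and the tier names below are hypothetical, chosen only to mirror the interview scenario.

```python
def find_spofs(design):
    """Return the tiers that have fewer than two instances.

    `design` maps a tier name to its instance count (a simplified model:
    it ignores shared dependencies such as a single network link or power feed).
    """
    return [tier for tier, count in design.items() if count < 2]


# The interview scenario: 1 load balancer, 3 web servers, 1 database.
scenario = {"load_balancer": 1, "web_server": 3, "database": 1}
print(find_spofs(scenario))

# After adding a database replica and a second load balancer.
fixed = {"load_balancer": 2, "web_server": 3, "database": 2}
print(find_spofs(fixed))
```

Note that the count flags the lone load balancer as well as the database, so a complete answer covers both.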
Cost vs Reliability
Redundancy is expensive. You are paying for hardware that sits idle or runs below capacity most of the time. Pro Answer: Only add redundancy where the cost of a failure (downtime, loss of user trust) is higher than the cost of the extra hardware.
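The trade-off can be written as a back-of-the-envelope comparison. This is a sketch under stated assumptions: the function name is hypothetical, all dollar figures and downtime hours are invented for illustration, and hard-to-price costs such as lost user trust would have to be folded into the hourly figure.

```python
def redundancy_pays_off(downtime_hours_avoided, cost_per_hour, redundancy_cost):
    """Return True if the downtime cost avoided per year exceeds the yearly cost of redundancy.

    All three inputs are estimates; the result is only as good as those guesses.
    """
    avoided_cost = downtime_hours_avoided * cost_per_hour
    return avoided_cost > redundancy_cost


# Checkout service: 8 hours of downtime avoided at $5,000/hour vs. $12,000/year of extra hardware.
print(redundancy_pays_off(8, 5_000, 12_000))   # True: 40,000 > 12,000

# Internal reporting tool: the same 8 hours at $100/hour vs. the same hardware cost.
print(redundancy_pays_off(8, 100, 12_000))     # False: 800 < 12,000
```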