With hard drives, the question is not if but when they’ll fail.
Our systems are designed to handle hardware failures, including hard drive failures.
The theory
We use erasure coding, specifically the Reed-Solomon algorithm, to create parity data across multiple hard drives. Wikipedia has an article about it.
Put simply, erasure coding breaks the data up into fragments and generates additional parity fragments. Every fragment goes to a different hard drive.
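To make the idea concrete, here is a minimal sketch of a Reed-Solomon-style code. It is an illustration only, not our production implementation: it assumes 3 data and 3 parity fragments, works over the small prime field GF(257) instead of the byte-sized fields real systems use, and the names `encode`, `decode` and `lagrange_eval` are made up for this example. Each group of 3 bytes is treated as three points on a polynomial; the parity fragments are three more points on the same polynomial, so any 3 of the 6 fragments are enough to get the original bytes back.

```python
from itertools import combinations

P = 257  # assumption for this sketch: a prime just above 255, so every byte is a field element


def lagrange_eval(points, x):
    """Evaluate at x the unique polynomial of degree < len(points) through points, mod P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+)
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(data, k=3, m=3):
    """Split data into k data fragments and m parity fragments (lists of ints)."""
    padded = data + bytes(-len(data) % k)  # zero-pad to a multiple of k
    fragments = [[] for _ in range(k + m)]
    for offset in range(0, len(padded), k):
        stripe = padded[offset:offset + k]
        points = list(enumerate(stripe))  # data byte i is the polynomial's value at x = i
        for x in range(k):
            fragments[x].append(stripe[x])  # data fragments hold the bytes unchanged
        for x in range(k, k + m):
            fragments[x].append(lagrange_eval(points, x))  # parity = value at x = k..k+m-1
    return fragments


def decode(available, length, k=3):
    """Rebuild the original bytes from any k fragments.

    available maps fragment index -> fragment contents; length is the original size.
    """
    chosen = sorted(available)[:k]
    out = bytearray()
    for s in range(len(available[chosen[0]])):
        points = [(i, available[i][s]) for i in chosen]
        for x in range(k):
            out.append(lagrange_eval(points, x))  # recover the data byte at x = 0..k-1
    return bytes(out[:length])


message = b"hello, erasure coding"
fragments = encode(message)
# Every choice of 3 surviving fragments out of 6 must reproduce the message.
for survivors in combinations(range(6), 3):
    available = {i: fragments[i] for i in survivors}
    assert decode(available, len(message)) == message
```

Note that a parity value in this toy field can be 256, which does not fit in one byte. Real Reed-Solomon implementations avoid this by using the field GF(2^8), where every symbol is exactly one byte.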
How we handle it
At the time of writing, our drive pools consist of 6 drives each, divided into 3 data drives and 3 parity drives.
A parity fragment (or drive) can stand in for any of the data fragments. If a data drive fails, our software automatically reads from a parity drive instead, regenerates the original data, and sends it to the customer. It is seamless and fast.
With this layout, any 3 of the 6 drives are enough to rebuild the data, so fatal data loss would in theory require losing 4 drives out of 6 at the same time. But we do not wait for that to happen. When a drive fails, we immediately start regenerating its data onto a new drive.
Our software is designed to handle any hardware failure and keep your data safe and available at all times.
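The "4 out of 6" figure can be checked by brute force. The short sketch below assumes the 3 data + 3 parity layout described above and that any 3 remaining drives are enough to rebuild the data; the names `survives` and `count_outcomes` are hypothetical helpers for this example.

```python
from itertools import combinations

TOTAL_DRIVES = 6  # pool size stated in the text
NEEDED = 3        # any 3 fragments rebuild the data (3 data + 3 parity)


def survives(failed_drives):
    """Return True if enough drives remain to rebuild the data."""
    return TOTAL_DRIVES - len(set(failed_drives)) >= NEEDED


def count_outcomes():
    """Map number of failed drives -> (survivable patterns, fatal patterns)."""
    table = {}
    for n_failed in range(TOTAL_DRIVES + 1):
        patterns = list(combinations(range(TOTAL_DRIVES), n_failed))
        ok = sum(survives(p) for p in patterns)
        table[n_failed] = (ok, len(patterns) - ok)
    return table


for n_failed, (ok, fatal) in count_outcomes().items():
    print(f"{n_failed} failed: {ok} survivable patterns, {fatal} fatal")
```

Every pattern with up to 3 failed drives is survivable, and every pattern with 4 or more is fatal, which is why quick regeneration onto a new drive matters.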
The number?
We do not yet have an official durability number, like 99.9…..%. The actual math behind this number is quite hard and needs input data that we are still collecting. But we are certain that we have really strong protections in place, and once we do calculate it, it will be very, very close to 100%.
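To give a feel for the kind of math involved, here is a deliberately simplified sketch. It is not our durability calculation. It assumes drives fail independently, ignores the fact that failed drives are rebuilt, and takes the per-drive failure probability `p` as a made-up input, since the real input data is exactly what we are still collecting. The function name `pool_loss_probability` is hypothetical.

```python
from math import comb


def pool_loss_probability(p, total=6, needed=3):
    """Probability that more drives fail than the pool can tolerate.

    p is the chance that a single drive fails within some time window.
    Simplifying assumptions: failures are independent and nothing is repaired.
    """
    max_tolerated = total - needed  # 3 failures are survivable in a 3 + 3 pool
    return sum(
        comb(total, f) * p**f * (1 - p) ** (total - f)
        for f in range(max_tolerated + 1, total + 1)
    )


# The input below is purely illustrative, not a measured failure rate.
hypothetical_p = 0.01
print(pool_loss_probability(hypothetical_p))
```

A real durability figure also has to account for how quickly data is regenerated after a failure, correlated failures, and measured drive failure rates, which is why the simple formula above is only a starting point.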