Dear Axe,
In traditional disk-level RAID, data is spread across the disks of a single node, so it cannot be recovered if the entire node fails. To prevent data loss, the storage system must also protect data redundantly across nodes. Erasure coding (EC) provides this protection by calculating parity fragments from the data.
When writing data, the distributed storage system divides the data into N data fragments (N is an even number) and calculates M parity fragments (M can be 2, 3, or 4) by using the EC encoding algorithm.
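To make the split-and-parity idea concrete, here is a minimal Python sketch. It is not the storage system's actual EC algorithm, which is not described here. Production EC with M = 2, 3, or 4 typically relies on a Reed-Solomon-style code. This sketch swaps in a single XOR parity fragment (effectively M = 1) only because it is the simplest way to show how a lost fragment can be rebuilt from the survivors. The function names and the zero-padding rule are assumptions made for illustration.

```python
from functools import reduce


def split_into_fragments(data: bytes, n: int) -> list:
    """Split data into n equal-size fragments, zero-padding the tail (assumed rule)."""
    size = -(-len(data) // n)  # ceiling division
    padded = data.ljust(size * n, b"\0")
    return [padded[i * size:(i + 1) * size] for i in range(n)]


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length fragments."""
    return bytes(x ^ y for x, y in zip(a, b))


def xor_parity(fragments: list) -> bytes:
    """Single XOR parity fragment: a simplified stand-in for real EC encoding."""
    return reduce(xor_bytes, fragments)


data = b"erasure coding demo"
fragments = split_into_fragments(data, 4)  # N = 4 data fragments
parity = xor_parity(fragments)             # 1 parity fragment (illustration only)

# Simulate losing data fragment 2 and rebuilding it from the survivors plus parity.
survivors = fragments[:2] + fragments[3:]
rebuilt = xor_parity(survivors + [parity])
assert rebuilt == fragments[2]
```

With M parity fragments instead of one, the same principle extends to recovering up to M lost fragments, at the cost of more involved arithmetic.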
Server-level security: the N data fragments and M parity fragments are stored on different nodes. If up to M nodes or up to M disks are faulty, the system can still read and write data properly, services are not interrupted, and no data is lost.
Cabinet-level security: the N data fragments and M parity fragments are stored in different cabinets. If up to M cabinets, or up to M nodes or M disks in different cabinets, are faulty, the system can still read and write data properly, services are not interrupted, and no data is lost.
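The two security levels follow the same rule: each fragment sits in its own failure domain (a node or a cabinet), and the data survives as long as no more than M fragments are lost. The Python sketch below models just that rule. The names `place_fragments` and `survives`, the fragment labels, and the node names are hypothetical, and the real placement logic of the storage system is likely more involved.

```python
def place_fragments(n: int, m: int, domains: list) -> dict:
    """Assign each of the n + m fragments to a distinct failure domain.

    A domain is a node (server-level) or a cabinet (cabinet-level).
    """
    if len(set(domains)) < n + m:
        raise ValueError("need at least n + m distinct failure domains")
    labels = [f"D{i}" for i in range(n)] + [f"P{i}" for i in range(m)]
    return dict(zip(labels, domains))


def survives(placement: dict, failed_domains: set, m: int) -> bool:
    """Data stays readable while at most m fragments are lost."""
    lost = sum(1 for domain in placement.values() if domain in failed_domains)
    return lost <= m


# 4+2 layout across six hypothetical nodes.
placement = place_fragments(4, 2, [f"node{i}" for i in range(1, 7)])

two_down = survives(placement, {"node1", "node2"}, m=2)             # within tolerance
three_down = survives(placement, {"node1", "node2", "node3"}, m=2)  # exceeds tolerance
print(two_down, three_down)
```

Passing cabinet names instead of node names models the cabinet-level case with no other change.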
With EC redundancy, the space utilization is about N/(N + M), so for a given M, a larger N gives higher space utilization. For example, 4+2 yields about 4/6, or roughly 67%. Data reliability is determined by M: a larger M means more simultaneous faults can be tolerated. For details about the redundancy ratios, see the EC redundancy ratios table at the end of this section. Considering both performance and reliability, the 4+2 redundancy ratio is recommended.
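If you want to compare ratios yourself, the N/(N + M) formula is easy to tabulate. The short Python sketch below does this. The helper name `ec_profile` is made up for this example. Apart from 4+2, the ratios shown are picked only to satisfy the constraints above (N even, M in 2 to 4) and are not taken from the redundancy ratios table.

```python
from fractions import Fraction


def ec_profile(n: int, m: int) -> dict:
    """Summarize an N+M erasure-coding layout using utilization = N / (N + M)."""
    if n <= 0 or m <= 0:
        raise ValueError("n and m must be positive")
    return {
        "ratio": f"{n}+{m}",
        "utilization": float(Fraction(n, n + m)),     # usable share of raw capacity
        "tolerated_faults": m,                         # fragments that may be lost
        "raw_per_usable": float(Fraction(n + m, n)),   # raw capacity per unit of data
    }


# Illustrative ratios only; consult the product's table for supported values.
for n, m in [(4, 2), (8, 2), (6, 3)]:
    p = ec_profile(n, m)
    print(f"{p['ratio']}: utilization {p['utilization']:.1%}, "
          f"tolerates {p['tolerated_faults']} faults, "
          f"{p['raw_per_usable']:.2f}x raw capacity per unit of data")
```

For 4+2 this gives a utilization of about 66.7% with tolerance for two faults, which matches the trade-off described above.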
Thanks.