Write hole protection for different RAID

Latest reply: May 31, 2021 17:19:00

Hello community members,

Today I will introduce the write hole to you. First I'll show you the write hole phenomenon, then describe how it appears in different RAID levels, and finally talk about how to avoid it.

"Write hole" phenomenon

The "write hole" effect can happen if a power failure occurs during a write. It can affect any redundant array type, including but not limited to RAID5, RAID6, and RAID1. After such a failure, it is impossible to determine which data blocks or parity blocks reached the disks and which did not. The parity then no longer matches the rest of the data in the stripe, and you cannot say with confidence which part is incorrect: the parity or one of the data blocks.
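To make the mismatch concrete, here is a minimal sketch of one RAID5-style stripe with two data blocks and XOR parity. The helper `xor_blocks` and the four-byte blocks are made up for illustration; a real controller works on much larger chunks.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equal-length blocks byte by byte (RAID5-style parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# A consistent stripe: two data blocks plus their parity.
d0, d1 = b"AAAA", b"BBBB"
parity = xor_blocks(d0, d1)
assert xor_blocks(d0, d1, parity) == bytes(4)  # XOR of all members is zero

# Update d0, but "lose power" before the matching parity write lands.
d0 = b"CCCC"
stripe_is_consistent = xor_blocks(d0, d1, parity) == bytes(4)
print(stripe_is_consistent)  # False
```

The check only says that the stripe is inconsistent. Nothing in the three blocks records whether the data or the parity is the stale part.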

Write hole in RAID5

"Write hole" is widely recognized to affect a RAID5, and most of the discussions of the "write hole" effect refer to RAID5. It is important to know that other array types are affected as well.

If the user data is not written completely, usually a filesystem corrects the errors during the reboot by replaying the transaction log. If a file system does not support journaling, the errors will still be corrected during the next consistency check.

If the parity (in RAID5) or the mirror copy (in RAID1) is not written correctly, the error goes unnoticed until one of the array member disks fails. When a disk fails, you replace it and start a RAID rebuild, and at that point one of the blocks is reconstructed incorrectly from the stale parity. If RAID recovery is needed only because of a controller failure, a parity mismatch doesn't matter, since all the data blocks are still readable.
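A small sketch of that rebuild scenario, using the same toy XOR model (block names and sizes are assumptions for illustration only). Note that the block that ends up corrupted is one that was never being written when the power failed.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length blocks byte by byte."""
    return bytes(x ^ y for x, y in zip(a, b))


old_d0, d1 = b"AAAA", b"BBBB"
stale_parity = xor_blocks(old_d0, d1)  # parity computed before the update
new_d0 = b"CCCC"                       # d0 was updated; the parity write was lost

# Later the disk holding d1 fails. Rebuild d1 from the surviving d0 and parity.
rebuilt_d1 = xor_blocks(new_d0, stale_parity)
print(rebuilt_d1 == d1)  # False
```

With the old `d0` the same calculation would have returned the correct `d1`, which is why the problem stays hidden until a rebuild actually needs the parity.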

A mismatch of parity or mirrored data can be recovered without user intervention, if at some later point a full stripe is written on a RAID5, or the same data block is written again in a RAID1. In such a case the old (incorrect) parity is not used, but new (correct) parity data would be calculated and then written. Also, new parity data would be written if you force the resynchronization of the array (this option is available for many RAID controllers and NAS).

Generally, a power failure during a write is rare, an uninterruptible power supply is cheap, and a stripe block is not that big. Hence, the probability of encountering a "write hole" in practice is small.

Write hole in RAID1

Similarly to a RAID5, the write hole effect can happen in a RAID1. Even if one disk is designated as "first" or "authoritative", and the write operations are arranged so that data is always written to this disk first, ensuring that it contains the latest copy of data, two difficulties still remain:

A hard disk can cache data itself. Caching may violate the write order arranged by the controller.

If the disk that was designated as first/authoritative fails, write holes may already be present on the second disk, and it would be impossible to find them without the data from the first disk.
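As a rough illustration of the first difficulty, the sketch below compares two mirror members block by block. The function name and the `v1`/`v2` contents are hypothetical; the point is that a comparison finds where the copies differ but not which copy is the newer one.

```python
def find_mismatches(disk_a, disk_b):
    """Return the indexes of blocks where the two mirror copies differ."""
    return [i for i, (a, b) in enumerate(zip(disk_a, disk_b)) if a != b]


# "v2" is the newer version of a block. Because of drive-level caching,
# block 0 reached only disk A, while block 2 reached only disk B.
disk_a = [b"v2", b"v1", b"v1"]
disk_b = [b"v1", b"v1", b"v2"]

mismatches = find_mismatches(disk_a, disk_b)
print(mismatches)  # [0, 2]
```

A rule such as "always copy from disk A" would fix block 0 correctly and silently roll block 2 back to the old version.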

Write hole in RAID6

Theoretically, the write hole phenomenon can also happen in a RAID6 consisting of a large number of member disks. A write hole in RAID5/RAID1 occurs when one of the member disks doesn't match the others; because RAID5/RAID1 have only single redundancy, it is impossible to tell which of the disks is bad. A write hole in RAID 6 occurs when two disks don't match the others simultaneously. Such a situation can happen, for example, if the power is turned off in the middle of a full stripe write.

Write hole in complex RAID types

Complex RAID types inherit a write hole vulnerability from those RAID types on which they are based.

RAID 10 inherits write hole from a RAID 1. If one of the mirrored copies has been written but the second one has not, it is impossible to know which of them is correct.

In a RAID 50, which can be represented as a set of RAID 5 arrays, write hole can occur in each of these arrays.

RAID 100 is vulnerable in the same way, and so is RAID 60, albeit with lower probability.

How to avoid the "Write hole"?

In order to completely avoid the write hole, you need to provide write atomicity. We call the operations which cannot be interrupted in the middle of the process "atomic". The "atomic" operation is either fully completed or is not done at all. If the atomic operation is interrupted because of external reasons (e.g. a power failure), it is guaranteed that a system stays either in original or in final state.

In a system which consists of several independent devices, natural atomicity doesn't exist. Variations in the characteristics of mechanical hard drives and the particularities of the data bus make the required synchronization impossible. In such cases, transactions are typically used. A transaction is a group of operations for which atomicity is provided artificially. However, providing transaction atomicity carries an expensive overhead. Hence, transactions are generally not used in conventional RAIDs.
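To show what such a transaction might look like and where the overhead comes from, here is a minimal write-ahead journal sketch. The class `JournaledStripe`, its fields, and the `crash_after` switch are all hypothetical; this is a toy model, not how any particular controller implements it.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


@dataclass
class JournaledStripe:
    """Toy model: 'blocks' stands for the member disks, 'journal' for a
    separate persistent log. Both are assumptions for illustration."""
    blocks: Dict[str, bytes]
    journal: List[Dict[str, bytes]] = field(default_factory=list)

    def write(self, updates: Dict[str, bytes], crash_after: Optional[int] = None) -> None:
        # Step 1: persist the whole intended update before touching the disks.
        self.journal.append(dict(updates))
        # Step 2: apply block by block; 'crash_after' simulates a power cut.
        for count, (name, data) in enumerate(updates.items()):
            if crash_after is not None and count >= crash_after:
                return  # power lost: journal entry stays, stripe is half-written
            self.blocks[name] = data
        # Step 3: every block landed, so the journal entry can be dropped.
        self.journal.pop()

    def recover(self) -> None:
        # Replay is idempotent: rewriting a block that already landed is harmless.
        for entry in self.journal:
            self.blocks.update(entry)
        self.journal.clear()

    def is_consistent(self) -> bool:
        return xor(self.blocks["d0"], self.blocks["d1"]) == self.blocks["p"]


stripe = JournaledStripe({"d0": b"AAAA", "d1": b"BBBB", "p": xor(b"AAAA", b"BBBB")})
# The new d0 is written, then power fails before the new parity is written.
stripe.write({"d0": b"CCCC", "p": xor(b"CCCC", b"BBBB")}, crash_after=1)
consistent_after_crash = stripe.is_consistent()
stripe.recover()
consistent_after_recovery = stripe.is_consistent()
print(consistent_after_crash, consistent_after_recovery)  # False True
```

Every update is written twice, once to the journal and once to its final place, which is the overhead mentioned above.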

One more option to avoid a write hole is to use ZFS, which is a hybrid of a filesystem and a RAID. ZFS uses "copy-on-write" to provide write atomicity. However, this technology requires a special type of RAID (RAID-Z) which cannot be reduced to a combination of common RAID types (RAID 0, RAID 1, or RAID 5).
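The copy-on-write idea can be sketched in a few lines. This is a heavily simplified model with made-up names (`CowStore`, `root`), not the actual ZFS on-disk layout: new data never overwrites live data, and a write becomes visible only through one final pointer update.

```python
class CowStore:
    """Toy copy-on-write store with a single root pointer."""

    def __init__(self, initial: bytes) -> None:
        self.blocks = {0: initial}  # block id -> contents, never overwritten in place
        self.root = 0               # the one pointer that names the live version
        self._next_id = 1

    def write(self, data: bytes, crash_before_commit: bool = False) -> None:
        new_id = self._next_id
        self._next_id += 1
        self.blocks[new_id] = data  # the new copy goes to unused space
        if crash_before_commit:
            return                  # the old root still points at intact old data
        self.root = new_id          # a single pointer update commits the write

    def read(self) -> bytes:
        return self.blocks[self.root]


store = CowStore(b"old")
store.write(b"half-done", crash_before_commit=True)
after_crash = store.read()
store.write(b"new")
after_commit = store.read()
print(after_crash, after_commit)  # b'old' b'new'
```

An interrupted write leaves only an unreferenced block behind, so a reader sees either the complete old version or the complete new one.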

How to reduce the negative effect of a "Write hole"?

In practice, the risk of losing data due to the write hole can be reduced to an acceptable level even for ordinary arrays, such as RAID 1 and RAID 5.

Supply uninterruptible power. The simplest option is an uninterruptible power supply (UPS) for the entire RAID. The second option is a Battery Backup Unit (BBU) connected directly to the RAID controller. The battery preserves the contents of the controller's write cache if a power failure occurs. All write operations that were in the cache and were not completed because of the power failure are carried out once the power returns. Note that a BBU protects only the controller cache, not the hard disks' own write caches.

Synchronize your array regularly. Synchronization is a process in which parity values (for a RAID 5) or other data providing redundancy (for RAID 6, RAID 7, or RAID DP) are recalculated. In a RAID1, the data from one disk is copied to the other during synchronization. Synchronization eliminates all the write holes accumulated during operation. Once synchronization completes, the redundant data exactly matches the user data. At the same time, synchronization detects bad sectors in rarely used areas of an array, because all the array sectors are read and written during the process. Modern hardware controllers usually let you synchronize an array on a schedule. RAIDs created using Windows cannot be synchronized on a schedule.
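A rough sketch of what a synchronization pass does for XOR parity is shown below. The stripe layout (a dict with `data` and `parity` keys) and the function names are assumptions made for this example.

```python
from functools import reduce


def parity_of(data_blocks):
    """XOR all data blocks of a stripe together."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*data_blocks))


def resync(stripes):
    """Recompute parity for every stripe; return indexes of stripes that were repaired."""
    repaired = []
    for i, stripe in enumerate(stripes):
        expected = parity_of(stripe["data"])
        if stripe["parity"] != expected:
            stripe["parity"] = expected
            repaired.append(i)
    return repaired


stripes = [
    {"data": [b"AAAA", b"BBBB"], "parity": parity_of([b"AAAA", b"BBBB"])},
    # Stripe 1 has a write hole: the data was updated, the parity is stale.
    {"data": [b"CCCC", b"BBBB"], "parity": parity_of([b"AAAA", b"BBBB"])},
]

first_pass = resync(stripes)
second_pass = resync(stripes)
print(first_pass, second_pass)  # [1] []
```

The pass trusts whatever is in the data blocks and rewrites the parity to match. It makes the redundancy consistent again, but it cannot tell whether the data blocks themselves hold the intended contents.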

If SSDs are used in the RAID, you can usually turn off the write cache and still get enough performance for your particular task. Turning off the write cache does not eliminate the write hole entirely, but it decreases both the probability of losing data and the amount of data that can be lost because of a power failure.

"Write hole" protection for Huawei 

When a system failure (such as a power failure) causes an incomplete write (as opposed to a write failure), some stripes, and even some of the parity data of those stripes, enter an uncertain state. Writing data into these stripes will encounter errors. This phenomenon is called a write hole.

Storing write failure data in the power failure protection zone and rewriting the data when appropriate can solve the write hole problem. The main protection scenarios are as follows:

  • For RAID 5, 6, and 50, write hole data is stored in the power failure protection zone and write hole protection does not need to be manually enabled.

  • When the power failure protection device is available, write hole data is recovered after a restart.

  • When the power failure protection device is unavailable, write hole data recorded in the power failure protection zone is lost after a restart.

  • When a drive of RAID 6 goes offline, write hole protection is supported.

  • When RAID 50 is partially degraded, write hole protection is supported and data will be restored by span. When RAID 50 is fully degraded, write hole protection is not supported.

  • RAID 0, 1, and 10 do not support write hole protection.

That's all for today. I hope it will be helpful to all of you! 



titusmahwe
Moderator Created May 31, 2021 11:24:35

Write holes are a big setback if you are not prepared for them. However the use of an atomic technique can prove vital.

little_fish
little_fish Created Jun 1, 2021 00:54:53 (0) (0)
Thank you for your support.  
Unicef
MVE Created May 31, 2021 12:58:46

Thanks for the post

little_fish
little_fish Created Jun 1, 2021 00:55:02 (0) (0)
 
wissal
MVE Created May 31, 2021 16:59:15

Very well explained

little_fish
little_fish Created Jun 1, 2021 00:55:12 (0) (0)
 
azkasaqib
Created May 31, 2021 17:18:50


Thanks for the post

little_fish
little_fish Created Jun 1, 2021 00:55:55 (0) (0)
haha  
azkasaqib
Created May 31, 2021 17:19:00

Write hole protection for different RAID-3953141-1

little_fish
little_fish Created Jun 1, 2021 00:56:06 (0) (0)
 
