Skip Navigation

How RAID system reliable? possible of raid system failure

I know how RAID work and prevent data lost from disks failures. I want to know is possible way/how easy to recover data from unfunctioned remaining RAID disks due to RAID controller failure or whole system failure. Can I even simply attach one of the RAID 1 disk to the desktop system and read as simple as USB disk? I know getting data from the other RAID types won't be that simple but is there a way without building the whole RAID system again. Thanks.

49 comments
  • For recovering hardware RAID: most guaranteed success is going to be a compatible controller with a similar enough firmware version. You might be able to find software that can stitch images back together, but that's a long shot and requires a ton of disk space (which you might not have if it's your biggest server)

    I've used dozens of LSI-based RAID controllers in Dell servers (of both PERC and LSI name brand) for both work and homelab, and they usually recover the old array to the new controller pretty well, and also generally have a much lower failure rate than the drives themselves (I find myself replacing the cache battery more often than the controller itself)

    Only twice out of the handful of times I went to a RAID controller from a different generation

    • first time from a mobi failed R815 (PERC H700) physically moving the disks to an R820 (PERC H710, might've been an H710P) and they were able to foreign import easily
    • Second time on homelab I went from an H710 mini mono to an H730P full size in the same chassis (don't do that, it was a bad idea), but aside from iDRAC being very pissed off, the card ran for years with the same RAID-1 array imported.

    As others have pointed out, this is where backups come into play. If you have to replace the server with one from a different generation, you run the risk that the drives won't import. At that point, you'd have to sanitize the super block of the array and re-initialize it as a new array, then restore from backup. Now, the array might be just fine and you never notice a difference (like my users that had to replace a failed R815 with an 820), but the result pattern is really to the extremes of work or fault with no in between.

    Standalone RAID controllers are usually pretty resilient and fail less often than disks, but they are very much NOT infallible as you are correct to assess. The advantage to software systems like mdadm, ZFS, and Ceph is that it removed the precise hardware compatibility requirements, but by no means does it remove the software compatible requirements - you'll still have to do your research and make sure the new version is compatible with the old format, or make sure it's the same version.

    All that's said, I don't trust embedded motherboard RAIDs to the same degree that I trust standalone controllers. A friend of mine about 8-10 years ago ran a RAID-0 on a laptop that got it's super block borked when we tried to firmware update the SSDs - stopped detecting the array at all. We did manage to recover data, but it needed multiple times the raw amount of storage to do so.

    • we made byte images of both disks in ddrescue to a server that had enough spare disk space
    • found a software package that could stitch together images with broken super blocks if we knew the order the disks were in (we did), which wrote a new byte images back to the server
    • copied the result again and turned it into a KVM VM to network attach and copy the data off (we could have loop mounted the disk to an SMB share and been done, but it was more fun and rewarding to boot the recovered OS afterwards as kind of a TAKE THAT LENOVO...we were younger)
    • took in total a bit over 3TB to recover the 2x500GB disks to a usable state - and took about a week of combined machine and human time to engineer and cook, during which my friend opted to rebuild his laptop clean after we had images captured - to one disk windows, one disk Linux, not RAID-0 this time :P
  • Keep multiple reliable (and tested) backups, if something fails restore a backup.

    Don't rely on any storage, RAID or anything else to be recoverable when something goes wrong.

  • I'm sure I've seen paid software that will detect and read data from several popular hardware controllers. Maybe there's something free that can do the same.

    For the future, I'd say that with modern copy on write filesystems, so long as you don't mind the long rebuild on power failures, software raid is fine for most people.

    I found this, which seems to be someone trying to do something similar with a drive array built with an Intel raid controller

    https://blog.bramp.net/post/2021/09/12/recovering-a-raid-5-intel-storage-matrix-on-linux-without-the-hardware/

    Note, they are using drive images, you should be too.

    • Frankly hardware raid is dead and was never great. Software raid is significantly better.

      • I think it had its uses in the past, specifically if it had the memory backup to prevent full array rebuilds and cached data loss on power failure.

        Also at the height of raid controller use (I would say 90s and 2000s) there probably was some compute savings by shifting the work to a dedicated controller.

        In modern day, completely agree.

49 comments