Rescuing Data from a Buffalo LinkStation with a failed RAID

Recently I was confronted with a Buffalo LinkStation which had a failed RAID0. The data on it was important, and the customer did not have any current backups. There were plenty of huge red flag warning signs suggesting that a disk was going bad, but the guy at the customer's location who doubled as IT happily ignored them. Buffalo's willingness to support this problem extended to replacing bad hard disks, since they consider all data on a failed RAID0 irrecoverably lost. That was not entirely unexpected to hear from a tier one support worker though.
I did a little digging and found out that those LinkStations use some fairly common tools under the hood. So I agreed to have a look, but made it clear that I might not be able to get anything back. The customer wanted me to try anyway, and I got a good amount of the data back. This article describes how.

Things that should be considered before you attempt to restore the data yourself

  1. Make sure that nothing writes to the disk from which you want to recover data. If you are recovering data from a RAID volume, make sure that there are no writes to any of the disks in the volume. Even if your data recovery is just about a simple deleted file, you will want to avoid writes to the disk, because any write could potentially overwrite recoverable data. Since some file systems write to the disk even when files are only read, you should avoid mounting the disk directly or mount it read only.
    If the hard drive is still in the computer/NAS it usually lives in, shut that system down or do not boot it the normal way. There may be programs or services that write to the affected drive. Starting that computer from a live CD is usually OK though.
  2. If it is clear that there is a serious mechanical or electronic defect, often identified by problem descriptions such as “sounds funny” or “smells funny”, ask the user to consider a data recovery lab before you start working on it yourself. Anything you do in such a situation could negatively affect the results of the lab.
  3. Always image or clone the disk you are working on and only use the clone going forward. If you suspect that you won't be able to read the original disk a second time, only work with copies of your image/clone. If your recovery gets more complex, it is easy to do something irreversible that will negatively affect your chances of getting the data back.
    Even if creating the images takes a lot of time, just let it run and do something else in the meantime. Having a means to go back to the original state is just basic cover-your-ass.

The Situation

I got the entire NAS box delivered. It was a Buffalo LinkStation DUO. It still booted up OK (in retrospect I should have cloned the disks before trying to boot it) and the RAID was shown as normal. All the shares were listed when I accessed it over SMB, but I could not open any of them. The web interface showed the RAID to be OK, but I did not see any values for total storage and free space, which these LinkStations usually show prominently in the web GUI. The disk settings showed both disks in the list, but the only information shown about them was their product number. This was very odd.

The disks were from Seagate, so I pulled them and put them in the USB dock on my desktop. I ran SeaTools and CrystalDiskInfo to display the SMART values. One of the disks was fine, the other had almost 4000 reallocated sectors and over 700 sectors that were defective but could not be reallocated. The problem was fairly obvious…
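
If you prefer to check those SMART values on Linux, smartctl from the smartmontools package shows the same attributes (a general check, not part of my original diagnosis; the device name is an assumption). The interesting attributes are Reallocated_Sector_Ct and Current_Pending_Sector:

smartctl -a /dev/sdb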

Cloning the disks with dd

Next I put those disks (one at a time) and two larger disks (also one at a time) into a Linux machine and cloned each original disk to the larger disk installed alongside it. The Linux tool dd does a great job at copying hard drives at block level and it is my go-to tool if I need a block-for-block image of a disk, especially if I need to work with that image.

The dd command is fairly simple and short, but it will run for quite some time (especially when there are bad sectors on one of the hard disks) and only gives you output if it encounters an error. The following example assumes that your source disk (the one from the NAS) is /dev/sdb and your target disk (the one you put in with it) is /dev/sdd:

Be absolutely certain that you selected the correct output before sending the command!

dd if=/dev/sdb of=/dev/sdd conv=noerror,sync bs=512

The options have the following meaning:

  • if: the input for dd. In this case the entire disk sdb.
  • of: the output for dd, this can be a block device or a file. Here the target is sdd. Be absolutely sure that you selected the correct output!
  • noerror: tells dd to continue even if errors are encountered
  • sync: tells dd to fill the block with zeros if an error is encountered, which means that the size of the original will be kept. Otherwise a block that could not be read would simply be skipped in the output and the next readable block would be written directly after the last readable one. This would mess with the file system structures.
  • bs: this specifies the block size. If the disk has bad sectors, try to make this match the size of the disk's sectors (in this case 512 bytes); otherwise pick a larger block size (64k is a good value) to speed up the process. If you pick a larger block size on a faulty drive, the entire block will be written as zeros whenever a sector can't be read.
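
If you would rather clone to an image file than directly to a second disk, the same command works with a regular file as the output (the path below is only an example; make sure the target file system has enough free space for the whole source disk):

dd if=/dev/sdb of=/mnt/storage/nas_disk1.img conv=noerror,sync bs=512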

Since the cloning process can take a lot of time and only gives you output when it encounters an error, you might want to get a status report from time to time. To do this, open a second terminal and find out the process ID of the running dd process with the following command:

pgrep -l '^dd'

Once you know the process ID you can run the following command to make the running dd print its current status (it goes to stderr, so it shows up in the first terminal); substitute 38862 with the process ID you got in the previous step:

kill -USR1 38862
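
If your dd comes from a reasonably recent GNU coreutils (8.24 or later), you can skip the signal trick entirely and let dd report its own progress:

dd if=/dev/sdb of=/dev/sdd conv=noerror,sync bs=512 status=progress

Alternatively, watch can send the signal for you at a fixed interval, so you get a fresh status line once a minute without retyping the command (again, substitute the process ID):

watch -n 60 kill -USR1 38862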

Getting the RAID back online

After the cloning was done, I stored the source disks safely and made sure both clones were in my Linux computer. The disks were using GPT and had 6 partitions each. The last partition was clearly the one where the data was stored; the other 5 partitions seemed to be system and swap partitions for the NAS. The system partitions were set up as a RAID1. That seems sensible, and I still have no idea why RAID0 is the default for the data partition.
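
To get that overview of the layout, a couple of standard commands are enough (the device names are assumptions and will depend on your machine):

lsblk /dev/sdb /dev/sdd
parted /dev/sdb print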

After looking at the general structure of the disks and partitions, I had a closer look at the data partitions. I first tried to simply assemble a new array from both partitions. mdadm will recognize existing RAID superblocks on the partitions and reassemble the already existing RAID:

mdadm -A /dev/md123 /dev/sdb6 /dev/sdd6
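
If the assembly succeeds, the array shows up as active in /proc/mdstat, which is a quick way to confirm it before going any further (a general sanity check, not a step from the original recovery):

cat /proc/mdstat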

If this works for you, you can skip ahead to mounting the XFS file system below.
But unfortunately there was no mdadm superblock to be found on the clone of the defective drive. I tried a bunch of different tools to locate the superblock, but I failed horribly.

So I decided to continue with more drastic measures (read: potentially destructive) and recreate the RAID by force. But first I needed more information on what settings the RAID was created with. For that I simply read out the RAID superblock on the data partition of the good disk:

mdadm --examine /dev/sdb6

The output looked similar to this (I got this output when I ran the examine on one of the cloned drives in a USB dock a few weeks later):

/dev/sdf6:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 73bfe225:b30b764b:faf5b20a:9ac09813 (local to host skelsrv)
  Creation Time : Thu Mar 20 23:54:11 2014
     Raid Level : raid0
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 2

    Update Time : Thu Mar 20 23:54:11 2014
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0
       Checksum : c04d5a9 - correct
         Events : 1

     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     1       8       22        1      active sync

   0     0       8       54        0      active sync
   1     1       8       22        1      active sync

Even though it is not exactly the output I had originally, it still shows all the relevant information:

  • Version: the metadata version of the RAID. It is important that this matches when you recreate the RAID.
  • RAID Level: in case you don’t already know which RAID level was used
  • Total Devices: in this case it was clear from the beginning, but with 4-bay NAS boxes the RAID might not use all the devices
  • Chunk Size: this one is also very important to get right
  • The order of the disks: you should specify the disks in the correct order in the create statement. In the output above, this disk is the second drive in the array.

Once you have all that information, you can create the RAID again:

mdadm --create /dev/md123 --assume-clean --level=0 --verbose --chunk=64 --raid-devices=2 --metadata=0.90 /dev/sdd6 /dev/sdb6
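
Before touching the file system, it is worth double-checking that the recreated array matches the values from the --examine output above, particularly the level, chunk size and device order (a general check, not something from the original write-up):

mdadm --detail /dev/md123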

Mounting the XFS file system

Once the RAID is recreated you will have to mount the partition to see if you can get any data back. I strongly recommend mounting the partition read only (for this to work you need XFS support in the kernel; the xfs_repair and related user-space tools used below come from the xfsprogs package on Debian):

mkdir /mnt/md123
mount -o ro -t xfs /dev/md123 /mnt/md123

If this works, great. If your disaster is as bad as the one I was dealing with, you might get an error message along the lines of: the partition can’t be mounted because of log corruption. I was not dissuaded and decided to ditch those pesky log entries, and with them all the files that had not been completely written to the NAS at the time of the crash. After all, I was pretty far along already, and getting back some or even most files seemed better than getting none of them. But before you try this, be aware that this form of repair pretty much drops the existing log and can cause further corruption to the file system:

xfs_repair -L /dev/md123
xfs_check /dev/md123
mount -o ro -t xfs /dev/md123 /mnt/md123
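
If you want to see what xfs_repair would do before letting it loose, it also has a no-modify mode that only reports the problems it finds without writing anything (a general precaution, not part of my original procedure):

xfs_repair -n /dev/md123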

This last mount command worked for me and I could copy files off the newly mounted RAID. But, as pretty much expected, some of the files were corrupt (you could copy them but not open them in their respective programs, or you could open them and the content was garbage). Lucky for me I didn’t have to figure out which files got corrupted, because that seems like a real pain in the ass.
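
For the actual copy, something like rsync is convenient because it preserves timestamps and can simply be restarted if it gets interrupted (the target path is just an example):

mkdir -p /srv/rescued-data
rsync -a /mnt/md123/ /srv/rescued-data/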

Anyway, once you have retrieved the data, make sure to stress the importance of backups before handing over the restored files.

If you do happen to have a convenient method to check over 700,000 files of different formats (MS Office, PDF, single saved e-mail messages from Outlook, JPG files, AutoCAD files and a bunch of others), please share. But even if you don’t, I hope that this article could help you.

6 thoughts on “Rescuing Data from a Buffalo LinkStation with a failed RAID”

  1. THANK YOU VERY MUCH!
    This saved my life.
    I searched so many forums and other stuff on the net, but this is the best “tutorial” that I found. It worked perfectly for me. All the other people that I asked told me that RAID0 can’t be restored after a crash. But this works.
    SO NICE, THX!
    Can I translate this into German and Italian?

    1. I have no problem at all with you translating this.
      Maybe link back to the article, but ultimately this is just a collection of info I got from other blogs and forums when I faced the problem.
      I am happy that it helped someone, and if you translate it, it might help somebody else.

  2. I have a problem with a TeraStation TS5400R which is configured as an array with RAID 0.

    Five days ago I got a problem: array1 failed,

    and below that in the drive table:

    disk 1 normal
    disk 2 iscsi drives normal

    I don’t know what happened there.

  3. Got a drive that fell over. No big distance, it fell over on the desk. You would have thought Buffalo made them more robust, but no: it makes a clicking sound continuously and does not allow access to the data, saying it’s ‘unavailable’ …
    Any ideas how I can recover the data or access this drive at all would be most appreciated.

    Ed

    1. First off, if the clicking started directly after a fall, then we are probably talking about physical damage caused by the read/write heads hitting the platter. At this point any attempt to keep running the disk (and recover data) could make the damage worse.

      So before you attempt anything, think about how important that data is to you. If the data is important and irreplaceable, I recommend using a professional data recovery lab for the restore.

      If you do want to risk trying to recover it yourself, the steps depend slightly on what kind of drive it is.
      If you are using a standard external USB drive, it should be OK to leave the disk in the USB enclosure. With a NAS drive, you will have to remove the drive from the enclosure and connect it to your computer directly.
      After that the steps are roughly this:
      1. Boot into any Linux with dd installed (most live CDs should be good enough). Check if the drive is recognized (lsblk on the command line will give you a list of the connected drives).
      2. Use dd to create a drive image (as described in the article). It may be easier for you to copy everything directly to another drive instead of an image file. The drive you are copying to needs to be the same size as or larger than the one you are trying to copy; it is not enough to have just enough space for the data you actually used.
      If you copy directly to another drive, everything on the target drive will be overwritten.
      3. If you had a NAS, put the target disk into it and start it.
      4. Boot back into your normal operating system and check if the data is accessible. If not, a file system repair might help, but it may also make everything worse. Depending on how much space you have and how important the data is, you may want to create a clone or image of the target disk before attempting any file system repair.
      5. Even if the data is accessible you should check all important files. There is a good chance that at least some of the data is corrupted.
      6. Choose a backup solution to avoid the issue in the future.

      This is only a rough guide; the exact steps will depend on what kind of Buffalo you have. Also, I am by no means a data recovery expert, these are essentially just steps that I have used before to recover data from failing hard disks when necessary.

  4. A good toolset for recovering disks: TestDisk for Linux, MacOSX, and Windoze, go to https://www.cgsecurity.org/.

    The testdisk debian package has 2 tools: testdisk and photorec.

    The testdisk executable (“Scan and repair disk partitions”) will evaluate a given disk volume, search for lost partitions, ignore missing/damaged partition tables, and skip bad hardware spots on the disk and show you what you’ve got or, if you have minimal damage and you are brave, recover the subject disk. I wouldn’t do the recovery operation unless all of the damage was “soft” i.e. not hardware-related. Even then, I would prefer using `photorec`.

    The photorec executable (“Recover lost files from harddisk, digital camera and cdrom”) will attempt to salvage all the good stuff to a target mounted directory (presumably in good shape and has oodles of space for resurrected disk directories and files). Like testdisk, it is good at avoiding the bad stuff. Note that it will automatically resurrect any logically deleted directories and files that it encounters (not always desirable but better too much than too little!).
