3 posts

Posted on 15 May 2014 @ 19:34
I originally posted this to Nexenta support, but it's not product-specific and their support is slow.

Please forgive me: I'm not a ZFS expert, so my lingo may be off.

Some time ago I had a VMware ESXi server with a Nexenta guest running iSCSI services for a few other VMware servers. This server had a 6-disk RAID 5 array with 1 hot spare for failover. Internally, VMware had 8 512 GB virtual drives assigned to Nexenta, which then chopped them up into 3 separate iSCSI drives.

At the end of November, that Nexenta instance hung (100%), and for some reason we couldn't get VMware to kill the instance, so we had to hard boot the machine. After it came back up, the two separate disk instances that I had allocated (2 sets of 4 of the 512 GB drives) both reported that one drive was suspect and wouldn't mount the drives. After spending some time on IRC in one of the ZFS channels and attempting multiple recoveries on these virtual drives, one guy had me issue a set of commands to destroy and recreate the drive arrays, which I did, and of course they came up blank. I stopped at that point. The underlying data is still there inside the vmdk's in some state. I talked to a few Sun people with limited experience, and they said it's toast and to give up, but unfortunately for me, I'm more persistent than that.

ESXi 5 -> Nexenta -> iSCSI -> shared to 2 other ESXi hosts

The server was a Supermicro box with 8 1 TB enterprise-class drives and 12 GB RAM. The client machines were Sun boxes with lots of CPU and RAM.

The data. This particular server was used by friends and family for hosting their ideas and little things (like my mom's little shop, my SBS server for email, and about 8 years of family content and documents). Some of it's backed up, some is not.

So what I would like to know is whether there are any tools out there to try to recover this type of array. The ZFS tools are limited, and I now need a way to do more than just create a new disk array: something that detects the array elements and then rebuilds the array as best it can, similar to the old FAT disk recovery utilities.

I have not touched the data since the crash, which happened Nov 30th. In the meantime I had to rebuild everything else on my small network from scratch.

I will be booting this machine up Sunday to copy the raw vmdk's to 2 new 3 TB drives so I have a copy to experiment with and attempt recovery on.
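
For the copy itself, something along these lines is roughly what I have in mind. It is only a minimal sketch in Python: the function names and the 4 MiB chunk size are my own placeholders, not part of any ZFS or VMware tool. The idea is to copy each vmdk in chunks, hash the bytes as they are read, and then re-read the copy to confirm it matches before I experiment on it.

```python
import hashlib

CHUNK = 4 * 1024 * 1024  # assumed read size: 4 MiB per chunk


def copy_with_digest(src_path, dst_path, chunk_size=CHUNK):
    """Copy src to dst in chunks; return the SHA-256 of the bytes read."""
    digest = hashlib.sha256()
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            block = src.read(chunk_size)
            if not block:
                break
            dst.write(block)
            digest.update(block)
    return digest.hexdigest()


def file_digest(path, chunk_size=CHUNK):
    """Return the SHA-256 of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def verified_copy(src_path, dst_path):
    """Copy, then re-read the destination and compare digests."""
    expected = copy_with_digest(src_path, dst_path)
    return expected == file_digest(dst_path)
```

If `verified_copy` returns False for any file, I would redo that copy before touching anything, and the originals stay read-only either way.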

Any help would be greatly appreciated.

1199 posts

Posted on 15 May 2014 @ 20:21
You run a six-disk RAID-Z array, but you give 8 virtual disks to the virtual machine? I do not understand.

To know what you can do in terms of recovery, you would need to boot the machine 'bare', that is without ESXi, look at the disk configuration and try to import a pool.

You speak about .vmdk files. Are you storing the ZFS data using file containers on a legacy filesystem? That would be a very bad setup, since it negates many of the protections that ZFS provides and makes recovery much more complicated.

And can it be true that you run hardware or software RAID5 with an Ext4 legacy filesystem on top, store .vmdk files on there, and give those to the virtual machine running ZFS? Is that what you have been doing?
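
As a rough illustration of "looking at the disk configuration", here is a minimal Python sketch that checks whether a raw disk or image still carries ZFS uberblocks in its first vdev label. It assumes the usual on-disk layout: 256 KiB labels, with the uberblock array in the second 128 KiB of each label, and the uberblock magic 0x00bab10c followed by version, txg, guid sum and timestamp as 64-bit fields. The function name and the returned dictionary keys are made up for this example, and it assumes the input is a raw image rather than a sparse or otherwise encoded container.

```python
import struct

LABEL_SIZE = 256 * 1024        # one vdev label
UB_ARRAY_OFFSET = 128 * 1024   # uberblock array starts here within a label
UB_SLOT = 1024                 # smallest slot size; larger slots are multiples
UB_MAGIC = 0x00BAB10C
UB_HEADER = struct.calcsize("5Q")  # magic, version, txg, guid_sum, timestamp


def scan_label(path, label_offset=0):
    """Return the uberblocks found in the label at label_offset."""
    with open(path, "rb") as f:
        f.seek(label_offset + UB_ARRAY_OFFSET)
        ring = f.read(LABEL_SIZE - UB_ARRAY_OFFSET)

    found = []
    for pos in range(0, len(ring) - UB_HEADER + 1, UB_SLOT):
        # The magic is written in the byte order of the host that
        # created the pool, so try both.
        for endian in ("<", ">"):
            magic, version, txg, _guid_sum, timestamp = struct.unpack_from(
                endian + "5Q", ring, pos
            )
            if magic == UB_MAGIC:
                found.append({
                    "offset": label_offset + UB_ARRAY_OFFSET + pos,
                    "version": version,
                    "txg": txg,
                    "timestamp": timestamp,
                })
                break
    return found
```

Calling `scan_label(path)` inspects the first label; passing `label_offset=256 * 1024` inspects the second one. Keep in mind that a pool that has been destroyed and recreated will show the uberblocks of the new pool here, so the transaction group numbers and timestamps are what tell you which pool you are looking at. Only run this against a copy, never the original.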

3 posts

Posted on 15 May 2014 @ 22:22
ZFS was running inside a VMware guest. As such, Nexenta and ZFS didn't have direct access to the underlying hardware. The short reason for this is that the server was already set up and running the ESXi 5 hypervisor 75 miles away, so I just downloaded the Nexenta software and loaded it remotely. The reason for the multiple virtual disks exposed to the actual Nexenta file system is that VMware, as configured, had a limit of 512 GB per disk allocation.

So let's assume that I just spin up the instance: those 8 disks look like physical disks to Nexenta, and that's how ZFS was using them.

So technically, from your standpoint, it's 8 disks allocated into 2 groups of 4, JBOD (as the underlying storage is already RAID).

I know this wasn't the best setup, in fact, I was going to just run it as an NFS setup but someone convinced me otherwise, and that person has since then moved on (and doesn't have the technical skills to fix this anyway).

In essence, I made a bad decision, but I'm just hoping there is some light at the end of the tunnel.