Hello,
I'm hoping someone can help with a self-inflicted issue:
- ZFSguru installation: I used a couple of SSDs to do a mirrored install, and also allocated space on them for swap, SLOG and L2ARC.
- Separate from these I had eight HDDs: I created two zpools, one using two HDDs in a mirror, and one using six HDDs in a RAID-Z2 config.
All was working as expected. I transferred a few TB from another server onto the two pools, and then left this server dormant (powered off) for a couple of months.
Recently I needed the two SSDs elsewhere, and I figured I could just remove them from the ZFSguru server (the system was powered off) and use them, thinking that my pools were safe.
Well... Today I started the server, made a new boot disk, and, needing access to my data, discovered I cannot import/attach my zpools anymore. They are listed as unavailable due to missing devices.
I gather that the cache partitions are/were in fact considered part of the pool and removing them was a bad idea. (Granted, it seems like a bad idea for ZFS to include the cache partitions in the pool, but that is not something I can change.)
So... my question is whether I have lost all my data, or whether there is something I can still do to recover the pools. I'm hoping for some brilliant ideas, but I realize the chances of recovery are slim...
Thanks very much for your time
hg
Here is the screen message after scanning for pools:
  pool: media-1
    id: 17079645613291355010
 state: UNAVAIL
status: One or more devices are missing from the system.
action: The pool cannot be imported. Attach the missing
        devices and try again.
   see: http://illumos.org/msg/ZFS-8000-6X
config:

        media-1           UNAVAIL  missing device
          raidz2-0        ONLINE
            gpt/MEDIA2    ONLINE
            gpt/MEDIA3    ONLINE
            gpt/MEDIA4    ONLINE
            gpt/MEDIA5    ONLINE
            gpt/MEDIA6    ONLINE
            gpt/MEDIA7    ONLINE
        cache
          3179422814501861256
          1157426471439342968

        Additional devices are known to be part of this pool, though their
        exact configuration cannot be determined.
  pool: media-0
    id: 6179961275973326286
 state: UNAVAIL
status: One or more devices are missing from the system.
action: The pool cannot be imported. Attach the missing
        devices and try again.
   see: http://illumos.org/msg/ZFS-8000-6X
config:

        media-0           UNAVAIL  missing device
          mirror-0        ONLINE
            gpt/MEDIA0    ONLINE
            gpt/MEDIA1    ONLINE
        cache
          14391042442005362899
          2268315163498519468

        Additional devices are known to be part of this pool, though their
        exact configuration cannot be determined.
------------
I think the issue might be not only the missing cache, but also the swap partitions...
What is ZFS Gurus' assessment?
hgeorges User 10 posts
CiPHER Developer 1199 posts

I think you can just import your pool. Have you tried this on the command line as root?

zpool import -m <poolname>
CiPHER Developer 1199 posts

Two questions though:
-> Is the pool version at least 19?
-> Are you running a modern ZFSguru version? I recommend you do.
karmantyu User 157 posts

Yes, on older ZFS versions a missing log device could kill the pool. However, on modern ZFS v5000 I have already pulled out log and cache devices without any big problems. So 'uname -a' and 'zpool get version <poolname>' output from the system would be useful here.
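Roughly, the commands I mean (pool names taken from your scan output above; note that 'zpool get version' only answers for pools that are actually imported):

uname -a                      # FreeBSD release / kernel the system is running
zpool upgrade -v              # versions and feature flags this ZFS implementation supports
zpool get version media-0     # on-disk pool version; shows '-' on a v5000 feature-flag pool
zpool get version media-1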
hgeorges User 10 posts

Thank you for the prompt reply. I tried a variety of commands yesterday and they didn't work. I must have tried "zpool import -m <poolname>" too, but hey, I tried again this morning and the command finished without an error message. Interesting!

The OS (ZFSguru) is a new install, as I mentioned briefly above. The pools, however, were created on one of the prior ZFSguru versions (perhaps 9, or more likely 10).

------
Powered by ZFSguru version 0.3.1
Running official system image 11.0.008 featuring FreeBSD 11.0-RELEASE-p1 with ZFS v5000.
Running Root-on-ZFS distribution.
------

[root@zfsguru ~]# uname -a
FreeBSD zfsguru.bsd 11.0-RELEASE-p1 FreeBSD 11.0-RELEASE-p1 #0: Mon Oct 3 09:27:08 UTC 2016 jason@zfsguru.com:/SHARED/obj/amd64/SHARED/sourcecode/sys/OFED-POLLING-ALTQ amd64

With the pools unavailable, the 'get version' command returned no pool info. So I ran 'import -m' against the first (media-0) pool, and now the pool changed from UNAVAIL to DEGRADED. Exciting!

[root@zfsguru ~]# zpool status media-0
  pool: media-0
 state: DEGRADED
status: One or more devices could not be opened. Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://illumos.org/msg/ZFS-8000-2Q
  scan: none requested
config:

        NAME                        STATE     READ WRITE CKSUM
        media-0                     DEGRADED     0     0     0
          mirror-0                  ONLINE       0     0     0
            gpt/MEDIA0              ONLINE       0     0     0
            gpt/MEDIA1              ONLINE       0     0     0
        logs
          mirror-1                  UNAVAIL      0     0     0
            17375710815694326626    UNAVAIL      0     0     0  was /dev/gpt/SSD-SLOG-00
            7031202486153265567     UNAVAIL      0     0     0  was /dev/gpt/SSD-SLOG-10
        cache
          14391042442005362899      UNAVAIL      0     0     0  was /dev/gpt/SSD-L2ARC-01
          2268315163498519468       UNAVAIL      0     0     0  was /dev/gpt/SSD-L2ARC-11

errors: No known data errors

[root@zfsguru ~]# zpool get version media-0
NAME     PROPERTY  VALUE    SOURCE
media-0  version   -        default

'get version' doesn't show anything...

What do I do from here? Re-create all those partitions and hope they'll be attached to the pool? I have the two SSDs back in the system, and can repartition and label them as before, but I doubt they'll get the same IDs.

Thanks again for your help
hgeorges User 10 posts

OK, I took a bold step and ran 'import -m' against the second pool, and now this one is back in a DEGRADED state too. Wow! Not sure what happened overnight!

[root@zfsguru ~]# zpool import -m media-1
[root@zfsguru ~]# zpool status media-1
  pool: media-1
 state: DEGRADED
status: One or more devices could not be opened. Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://illumos.org/msg/ZFS-8000-2Q
  scan: scrub repaired 0 in 2h53m with 0 errors on Sat Sep 19 20:04:07 2015
config:

        NAME                        STATE     READ WRITE CKSUM
        media-1                     DEGRADED     0     0     0
          raidz2-0                  ONLINE       0     0     0
            gpt/MEDIA2              ONLINE       0     0     0
            gpt/MEDIA3              ONLINE       0     0     0
            gpt/MEDIA4              ONLINE       0     0     0
            gpt/MEDIA5              ONLINE       0     0     0
            gpt/MEDIA6              ONLINE       0     0     0
            gpt/MEDIA7              ONLINE       0     0     0
        logs
          mirror-1                  UNAVAIL      0     0     0
            5134543617975030621     UNAVAIL      0     0     0  was /dev/gpt/SSD-SLOG-01
            8012737256149287713     UNAVAIL      0     0     0  was /dev/gpt/SSD-SLOG-11
        cache
          3179422814501861256       UNAVAIL      0     0     0  was /dev/gpt/SSD-L2ARC-00
          1157426471439342968       UNAVAIL      0     0     0  was /dev/gpt/SSD-L2ARC-10

errors: No known data errors
[root@zfsguru ~]#

Interesting and exciting!... How do I get the pools out of the DEGRADED state?

New edit: I just checked and both pools are accessible! That's great!

I did some additional searching on the internet and found this on Oracle's site: https://docs.oracle.com/cd/E53394_01/html/E54801/gbbba.html#SVZFSgbcdv

... You can resolve the log device failure in the following ways:
- Replace or recover the log device.
- Bring the log device back online.
  # zpool online storpool log device
- Reset the failed log device error condition.
  # zpool clear storpool
To recover from this error without replacing the failed log device, you can clear the error with the zpool clear command. In this scenario, the pool will operate in a degraded mode and the log records will be written to the main pool until the separate log device is replaced.
Consider using mirrored log devices to avoid the log device failure scenario. ...

If I read this right, I need to proceed with recreating all the previous SSD partitions (labeled as before) and all will get back to normal. Hopefully!

Thank you!
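Translating that Oracle example to my pools, I think the commands would look roughly like this (just a sketch from my reading, I have not run these yet; 'clear' should let a pool carry on without its separate log, while 'online' would only help if the original partitions were still present):

zpool clear media-0
zpool clear media-1
zpool online media-0 gpt/SSD-SLOG-00 gpt/SSD-SLOG-10   # only if the old log partitions still existed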
CiPHER Developer 1199 posts

You also have sLOG/ZIL disks, so that probably was the reason. You can try this to remove the SSDs from the pool:

zpool remove media-1 5134543617975030621
zpool remove media-1 8012737256149287713
zpool remove media-1 3179422814501861256
zpool remove media-1 1157426471439342968
zpool status media-1
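The same approach should work for the other pool; a sketch using the GUIDs from your media-0 status output above. Since its log devices form a mirror, some ZFS versions want you to remove the top-level mirror-1 vdev rather than the individual GUIDs:

zpool remove media-0 mirror-1                  # the mirrored log vdev as a whole
zpool remove media-0 14391042442005362899      # cache devices are removed one by one
zpool remove media-0 2268315163498519468
zpool status media-0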
hgeorges User 10 posts

Hello again. I want to let you know that everything is back to normal!

After my earlier post, I went ahead and recreated the original (what I thought to be the original) partitions, used the same labels, etc. I tried to attach them, but that didn't work, so I replaced the old references with the new ones instead (using the ZFSguru web menus).

A couple of questions from this exercise:
1) When I attached the L2ARC partitions, in the memory requirements column I got "more memory needed" (1.5 GB for one, 2 GB for the other). There is no more memory to add, so I'm going to leave that alone. Should I have any concerns?
2) In the disks section, APM is disabled on all disks. Is it OK if I enable it? (I want to spin down the disks when the server is not in use.) Is there any danger in doing that? Is there a setting better than others?

Thank you very much for your help
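For the record, the command-line equivalent of what I did through the web menus would be roughly the following. The device name (ada0) and partition sizes are only placeholders for whatever the original SSD layout was, and the disk is assumed to already carry a GPT scheme:

gpart add -t freebsd-zfs -l SSD-SLOG-00 -s 4G ada0      # recreate a log partition under its old label
gpart add -t freebsd-zfs -l SSD-L2ARC-01 -s 40G ada0    # recreate a cache partition under its old label
zpool replace media-0 17375710815694326626 gpt/SSD-SLOG-00   # swap the stale log GUID for the new partition
zpool remove media-0 14391042442005362899               # cache devices are simply removed...
zpool add media-0 cache gpt/SSD-L2ARC-01                # ...and added back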
DVD_Chef User 130 posts

You can also try to import damaged or stubborn pools in read-only mode; many times this will work and at least allow you to copy the data off. This also keeps any changes from being made and possibly making a bad situation worse.

zpool import -o readonly=on <poolname>
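In this thread's situation the read-only import would presumably still need -m because of the missing log devices, so the combined form would look something like this (untested here, just how the flags compose):

zpool import -o readonly=on -m <poolname>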
hgeorges User 10 posts

Thank you, DVD_Chef. That's one thing I didn't try: importing read-only. I'll keep that in mind.

I'm glad things came back to normal (I had almost given up after my first tries and Internet readings)... initially the outlook was pretty bleak, but following CiPHER's suggestion, somehow things worked out well.

Now all basic NAS functionality is working. I only need to understand some of the configuration options (among them my last posted questions) and figure out how to add more services. The menu is straightforward, but not everything runs seamlessly after installation.

Thank you
hg
hgeorges User 10 posts

Hello CiPHER, any chance you can look at my last questions and advise?

... A couple of questions from this exercise:
1) When I attached the L2ARC partitions, in the memory requirements column I got "more memory needed" (1.5 GB for one, 2 GB for the other). There is no more memory to add, so I'm going to leave that alone. Should I have any concerns?
2) In the disks section, APM is disabled on all disks. Is it OK if I enable it? (I want to spin down the disks when the server is not in use.) Is there any danger in doing that? Is there a setting better than others?

Thank you
CiPHER Developer 1199 posts

L2ARC uses up RAM memory, but only as the L2ARC is being filled! It doesn't use significant memory straight away. As the RAM gets full, the L2ARC devices simply cannot be filled with more data, and won't be. Nothing bad will happen; you just won't get the most out of your SSDs functioning as L2ARC, that's all. The ZFSguru page was created to remind people that using L2ARC consumes some RAM as well, and to give them a rough measure: 100GB of L2ARC is quite a lot and requires several GB of RAM to fill that partition up to 100GB.

As for APM: I suggest you have a look at /etc/rc.conf and adapt the file to your needs by tuning the relevant section. APM, or Advanced Power Management, controls the power-saving features of your hard drives.

Good luck! :)
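Purely as an illustration (the exact rc.conf variable names differ between ZFSguru versions, so treat this as the plain FreeBSD way rather than the ZFSguru way), APM and spin-down can also be set per disk with camcontrol; APM values of 127 and below permit spin-down, 128 and above do not:

camcontrol apm ada1 -l 127         # hypothetical data disk; allow power saving including spin-down
camcontrol standby ada1 -t 1800    # ask the drive to enter standby after 30 minutes idle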
hgeorges User 10 posts

Thank you, CiPHER.

On this last note, how do you suggest I reconcile the current APM settings in the web interface with what is in rc.conf? Configure APM in the web interface to the desired parameter, and use the same value in rc.conf?

I'll stop with the APM questions here, since this has gone beyond the original topic. Normally this should be moved to its own thread.

Thank you
CiPHER Developer 1199 posts

Yeah, feel free to open a new thread if you have more questions. But the functionality in the web interface for APM/AAM is very primitive: it does not get remembered after a reboot. Editing /etc/rc.conf is the better option, because it survives a reboot, or you can apply it manually by executing:

service zfsguru-tuning start