hgeorges
User

8 posts

Posted on 19 February 2017 @ 22:45 (edited 23:45)
Hello,
I'm hoping someone can help with a self-inflicted issue:
- ZFSguru installation: I used a couple of SSDs for a mirrored install, and also allocated space on them for swap, SLOG and L2ARC.
- Separate from these I had eight HDDs: I created two zpools, one using two HDDs in a mirror, and one using six HDDs in a RAIDZ2 config.

All was working as expected; I transferred a few TB from another server onto the two pools, and then left this server dormant (turned off) for a couple of months.

Recently I needed the two SSDs elsewhere, and I thought I could just remove them from the ZFSguru server (the system was turned off) and use them, assuming my pools were safe.

Well... today I started the server, made a new boot disk, and, needing access to my data, discovered I cannot import/attach my zpools anymore. They are listed as unavailable due to missing devices.
I figured that the cache partitions are/were in fact considered part of the pool and that removing them was a bad idea. (Well, it is a bad idea for ZFS to include the cache partitions in the pool, but that is not something I can change.)

So... my question is whether I have lost all my data, or whether there is something I can still do to recover the pools. I'm hoping for some brilliant ideas, but I realize the chances of recovery are slim...

Thanks very much for your time
hg

Here is the screen output after scanning for pools:
pool: media-1
id: 17079645613291355010
state: UNAVAIL
status: One or more devices are missing from the system.
action: The pool cannot be imported. Attach the missing
devices and try again.
see: http://illumos.org/msg/ZFS-8000-6X
config:

        media-1                  UNAVAIL  missing device
          raidz2-0               ONLINE
            gpt/MEDIA2           ONLINE
            gpt/MEDIA3           ONLINE
            gpt/MEDIA4           ONLINE
            gpt/MEDIA5           ONLINE
            gpt/MEDIA6           ONLINE
            gpt/MEDIA7           ONLINE
        cache
          3179422814501861256
          1157426471439342968

        Additional devices are known to be part of this pool, though their
        exact configuration cannot be determined.
pool: media-0
id: 6179961275973326286
state: UNAVAIL
status: One or more devices are missing from the system.
action: The pool cannot be imported. Attach the missing
devices and try again.
see: http://illumos.org/msg/ZFS-8000-6X
config:

        media-0                  UNAVAIL  missing device
          mirror-0               ONLINE
            gpt/MEDIA0           ONLINE
            gpt/MEDIA1           ONLINE
        cache
          14391042442005362899
          2268315163498519468

        Additional devices are known to be part of this pool, though their
        exact configuration cannot be determined.
------------
I think the issue might be not only the missing cache, but also the swap partitions...
What is ZFS Gurus' assessment?
CiPHER
Developer

1199 posts

Posted on 20 February 2017 @ 04:26
I think you can just import your pool.

Have you tried on the command line as root:

zpool import -m <poolname>

?
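The -m option allows a pool to be imported even though log devices are missing. A rough sketch of the whole sequence, using the pool names from your output:

zpool import                  # scan for importable pools
zpool import -m media-0       # import despite the missing log devices
zpool import -m media-1
zpool status media-0 media-1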
CiPHER
Developer

1199 posts

Posted on 20 February 2017 @ 04:27
Two questions though:
-> is the pool version at least 19?
-> are you running a modern ZFSguru version? I recommend you do.
karmantyu
User

131 posts

Posted on 20 February 2017 @ 08:06 (edited 08:10)
Yes, on older ZFS versions a missing log device could kill the pool.
However, on modern ZFS v5000 I've pulled out log and cache devices without any big problems.
So the output of 'uname -a' and 'zpool get version <poolname>' from this system would be useful here.
hgeorges
User

8 posts

Posted on 20 February 2017 @ 11:48 (edited 11:51)
Thank you for the prompt reply

I tried a variety of commands yesterday and they didn't work. I must have tried "zpool import -m <poolname>" too, but hey, I tried again this morning, and the command finished without an error message. Interesting!

The OS (ZFSguru) is a new install, as I mentioned briefly above. The pools, however, were created on one of the prior ZFSguru versions (most likely 9 or 10).
------
Powered by ZFSguru version 0.3.1
Running official system image 11.0.008 featuring FreeBSD 11.0-RELEASE-p1 with ZFS v5000.
Running Root-on-ZFS distribution.
------
[root@zfsguru ~]# uname -a
FreeBSD zfsguru.bsd 11.0-RELEASE-p1 FreeBSD 11.0-RELEASE-p1 #0: Mon Oct 3 09:27:08 UTC 2016 jason@zfsguru.com:/SHARED/obj/amd64/SHARED/sourcecode/sys/OFED-POLLING-ALTQ amd64

With the pools unavailable, the 'zpool get version' command returned no pool info.

So I ran 'zpool import -m' against the first (media-0) pool, and the pool changed from UNAVAIL to DEGRADED. Exciting!

[root@zfsguru ~]# zpool status media-0
pool: media-0
state: DEGRADED
status: One or more devices could not be opened. Sufficient replicas exist for
the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
see: http://illumos.org/msg/ZFS-8000-2Q
scan: none requested
config:

        NAME                      STATE     READ WRITE CKSUM
        media-0                   DEGRADED     0     0     0
          mirror-0                ONLINE       0     0     0
            gpt/MEDIA0            ONLINE       0     0     0
            gpt/MEDIA1            ONLINE       0     0     0
        logs
          mirror-1                UNAVAIL      0     0     0
            17375710815694326626  UNAVAIL      0     0     0  was /dev/gpt/SSD-SLOG-00
            7031202486153265567   UNAVAIL      0     0     0  was /dev/gpt/SSD-SLOG-10
        cache
          14391042442005362899    UNAVAIL      0     0     0  was /dev/gpt/SSD-L2ARC-01
          2268315163498519468     UNAVAIL      0     0     0  was /dev/gpt/SSD-L2ARC-11

errors: No known data errors
[root@zfsguru ~]# zpool get version media-0
NAME PROPERTY VALUE SOURCE
media-0 version - default
'get version' doesn't show anything...
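(From what I can tell, pools using feature flags just report '-' for the version property; if that is what the dash means here, the enabled features should be listable with something like:

zpool get all media-0 | grep feature@

but that is only my assumption.)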


What do I do from here?
Re-create all those partitions and hope they'll be attached to the pool? I have the two SSDs back in the system, and can repartition and label them as before, but I doubt they'll get the same ids.
Thanks again for your help
hgeorges
User

8 posts

Posted on 20 February 2017 @ 12:00 (edited 12:16)
OK. I took a bold step and ran 'zpool import -m' against the second pool, and this one is now also back in a DEGRADED state. Wow! Not sure what happened overnight!
[root@zfsguru ~]# zpool import -m media-1
[root@zfsguru ~]# zpool status media-1
pool: media-1
state: DEGRADED
status: One or more devices could not be opened. Sufficient replicas exist for
the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
see: http://illumos.org/msg/ZFS-8000-2Q
scan: scrub repaired 0 in 2h53m with 0 errors on Sat Sep 19 20:04:07 2015
config:

        NAME                      STATE     READ WRITE CKSUM
        media-1                   DEGRADED     0     0     0
          raidz2-0                ONLINE       0     0     0
            gpt/MEDIA2            ONLINE       0     0     0
            gpt/MEDIA3            ONLINE       0     0     0
            gpt/MEDIA4            ONLINE       0     0     0
            gpt/MEDIA5            ONLINE       0     0     0
            gpt/MEDIA6            ONLINE       0     0     0
            gpt/MEDIA7            ONLINE       0     0     0
        logs
          mirror-1                UNAVAIL      0     0     0
            5134543617975030621   UNAVAIL      0     0     0  was /dev/gpt/SSD-SLOG-01
            8012737256149287713   UNAVAIL      0     0     0  was /dev/gpt/SSD-SLOG-11
        cache
          3179422814501861256     UNAVAIL      0     0     0  was /dev/gpt/SSD-L2ARC-00
          1157426471439342968     UNAVAIL      0     0     0  was /dev/gpt/SSD-L2ARC-10

errors: No known data errors
[root@zfsguru ~]#
Interesting and exciting!... How do I get the pools out of the DEGRADED state?

New edit:
I just checked and both pools are accessible! That's great!
Did some additional searching on the internet and found this on Oracle's site:
https://docs.oracle.com/cd/E53394_01/html/E54801/gbbba.html#SVZFSgbcdv

...
You can resolve the log device failure in the following ways:

Replace or recover the log device.

Bring the log device back online.

# zpool online storpool log device
Reset the failed log device error condition.

# zpool clear storpool
To recover from this error without replacing the failed log device, you can clear the error with the zpool clear command. In this scenario, the pool will operate in a degraded mode and the log records will be written to the main pool until the separate log device is replaced.

Consider using mirrored log devices to avoid the log device failure scenario.
...

If I read this right, I need to recreate all the previous SSD partitions (labelled as before) and everything will get back to normal.
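My rough plan for the first SSD is below; the partition sizes, swap label and device name (ada8) are guesses, since I only have the labels from the zpool output, not the original layout:

gpart create -s gpt ada8
gpart add -t freebsd-swap -l SSD-SWAP-0 -s 4G ada8     # swap slice (size assumed)
gpart add -t freebsd-zfs -l SSD-SLOG-00 -s 4G ada8     # SLOG for media-0
gpart add -t freebsd-zfs -l SSD-SLOG-01 -s 4G ada8     # SLOG for media-1
gpart add -t freebsd-zfs -l SSD-L2ARC-00 -s 40G ada8   # L2ARC for media-1 (size assumed)
gpart add -t freebsd-zfs -l SSD-L2ARC-01 ada8          # L2ARC for media-0, remaining space

and the matching labels on the second SSD.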
Hopefully! Thank you!
CiPHER
Developer

1199 posts

Posted on 20 February 2017 @ 16:17
You also have sLOG/ZIL disks, so that was probably the reason. You can try this to remove the SSDs from the pool:

zpool remove media-1 5134543617975030621
zpool remove media-1 8012737256149287713
zpool remove media-1 3179422814501861256
zpool remove media-1 1157426471439342968

zpool status media-1
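(If ZFS refuses to remove the individual log GUIDs because they are part of a log mirror, removing the whole log vdev by its name may be needed instead, something like:

zpool remove media-1 mirror-1

I'm not sure which of the two this ZFS version wants.)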
hgeorges
User

8 posts

Posted on 20 February 2017 @ 16:37 (edited 16:38)
Hello again. I want to let you know that everything is back to normal!

After my earlier post, I went ahead and recreated the original partitions (or what I thought to be the original layout), using the same labels etc.
I tried to attach them, but that didn't work, so instead I replaced the old references with the new devices (using the ZFSguru web menus).
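For the record, I believe the command-line equivalent of what the web menus did is roughly the following for media-0 (the GUIDs are the old missing entries from 'zpool status', the labels are my new partitions; this is my reading of it, not something I ran verbatim):

zpool replace media-0 17375710815694326626 gpt/SSD-SLOG-00
zpool replace media-0 7031202486153265567 gpt/SSD-SLOG-10
zpool remove media-0 14391042442005362899 2268315163498519468
zpool add media-0 cache gpt/SSD-L2ARC-01 gpt/SSD-L2ARC-11

and the same again for media-1.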

A couple of questions from this exercise:
1) When I attached the L2ARC partitions, in the memory requirements column I got "more memory needed" (1.5 GB for one, 2 GB for the other). There is no more memory to add, so I'm going to leave that alone. Should I have any concerns?

2) In the disks section, APM is disabled on all disks. Is it OK if I enable it? (I want to spin down the disks when the server is not in use.) Is there any danger in doing that? Is one setting better than another?

Thank you very much for your help
DVD_Chef
User

128 posts

Posted on 20 February 2017 @ 18:14
You can also try to import damaged or stubborn pools in read-only mode; this will often work and at least allow you to copy the data off. It also prevents any changes from being written that could make a bad situation worse.

zpool import -o readonly=on <poolname>
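If log devices are missing as well, the two options can be combined, something along the lines of:

zpool import -m -o readonly=on <poolname>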
hgeorges
User

8 posts

Posted on 20 February 2017 @ 22:22
Thank you, DVD_Chef. That's one thing I didn't try... importing r/o. I'll keep that in mind...

I'm glad things came back to normal (I had almost given up after my first attempts and internet reading)... initially the outlook was pretty bleak, but following CiPHER's suggestion, somehow things worked out well.

Now all basic NAS functionality is working - I only need to understand some of the configuration options (among them my last posted questions) and figure out how to add more services. The menu is straightforward, but not everything runs seamlessly after installation.
Thank you
hg
hgeorges
User

8 posts

Posted on 21 February 2017 @ 23:57
Hello CIPHER,
any chance you can look at my last questions, and advise:
...
A couple of questions from this exercise:
1) When I attached the L2ARC partitions, in the memory requirements column I got "more memory needed" (1.5 GB for one, 2 GB for the other). There is no more memory to add, so I'm going to leave that alone. Should I have any concerns?

2) In the disks section, APM is disabled on all disks. Is it OK if I enable it? (I want to spin down the disks when the server is not in use.) Is there any danger in doing that? Is one setting better than another?
Thank you
CiPHER
Developer

1199 posts

Posted on 22 February 2017 @ 00:05 (edited 00:06)
L2ARC uses up RAM memory, but only as the L2ARC is being filled! It doesn't use significant memory straight away. As the RAM gets full, the L2ARC devices simply 'cannot' be filled with more data and won't. Nothing bad will happen; you just won't get the most out of your SSDs to function as L2ARC, that's all.

The ZFSguru page was also created to remind people that using L2ARC consumes some RAM as well, and to give them a rough measure: 100 GB of L2ARC is quite a lot, and filling that partition up to 100 GB also requires several GB of RAM.
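As a very rough illustration (these numbers are general ballpark figures, not something measured on your system): if the average cached record is around 8 KiB and each L2ARC entry keeps on the order of a couple of hundred bytes of header in RAM, then 100 GB of L2ARC is roughly 12-13 million records, which works out to a few GB of headers - hence the 'several GB' guideline.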


As for APM: I suggest you have a look at /etc/rc.conf and adapt the file to your needs by tuning the following section:

# APM or Advanced Power Management controls power saving features of your hard
# drives. To disable the dreaded 'headparking' feature, it is common to enable
# APM to a setting of 254, to disable headparking altogether.
# Caution: enabling APM may cause harddrive spindown to stop functioning!
zfsguru_tuning_apm_enable="YES"
zfsguru_tuning_apm_disks="ada0 ada1"
zfsguru_tuning_apm_level="254"
# To enable your disks to spindown whenever inactive for a number of seconds,
# configure both the timer (in seconds) and the disks you want to be spun down.
zfsguru_tuning_spindown_enable="NO"
zfsguru_tuning_spindown_disks="ada0 ada1 ada2 ada3"
zfsguru_tuning_spindown_timer="7200"
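To check whether a drive has actually accepted the APM level, camcontrol can show the current setting; something like this (the grep pattern is just to pick out the relevant lines):

camcontrol identify ada0 | grep -i 'power management'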


Good luck! :)
hgeorges
User

8 posts

Posted on 24 February 2017 @ 15:48
Thank you, CiPHER

On this last note, how do you suggest I reconcile the current APM settings in the web interface with what is in rc.conf? Configure APM in the web interface to the desired value, and use the same value in rc.conf?
I'll stop with the APM questions here, since this has gone beyond the original topic. Normally this should be moved to its own thread.

Thank you
CiPHER
Developer

1199 posts

Posted on 24 February 2017 @ 17:15
Yeah feel free to open a new thread if you have more questions.

But the functionality in the web interface for APM/AAM is very primitive - the setting is not remembered after a reboot. Editing /etc/rc.conf is the better option because it does persist across reboots, and it is also applied when manually executing:

service zfsguru-tuning start