
70 posts

Posted on 14 September 2011 @ 14:22

I wanted to delete a duplicated media directory containing approx. 8K files. Deletion takes ages, and the larger the files, the longer it takes.
The system is close to idle and top shows:

last pid: 28065; load averages: 0.04, 0.12, 0.10 up 4+13:44:27 12:21:46
36 processes: 1 running, 35 sleeping
CPU: 0.0% user, 0.0% nice, 1.4% system, 0.8% interrupt, 97.7% idle
Mem: 3376K Active, 5516K Inact, 7435M Wired, 856K Cache, 424M Free
Swap: 10G Total, 28M Used, 10G Free

PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
27863 root 1 45 0 5824K 452K zio->i 2 1:22 0.59% rm
1288 root 1 44 0 6912K 356K select 0 3:13 0.00% powerd
1200 www 1 44 0 13352K 340K kqread 0 0:13 0.00% lighttpd
1164 root 1 44 0 20964K 704K select 0 0:08 0.00% nmbd
1321 root 1 44 0 26124K 212K select 1 0:05 0.00% sshd
1125 root 4 76 0 5824K 248K rpcsvc 0 0:03 0.00% nfsd
28013 root 1 44 0 9372K 1504K CPU3 3 0:02 0.00% top
1013 root 1 44 0 6916K 400K select 1 0:02 0.00% syslogd
1330 root 1 44 0 7972K 356K nanslp 1 0:02 0.00% cron
21285 root 1 44 0 9368K 732K select 2 0:02 0.00% screen
1168 root 1 44 0 28616K 608K select 1 0:02 0.00% smbd
1137 root 1 76 0 20260K 192K rpcsvc 1 0:01 0.00% rpc.lockd
1036 root 1 44 0 7972K 272K select 0 0:01 0.00% rpcbind
27842 root 1 44 0 9368K 216K pause 1 0:01 0.00% screen
27838 ssh 1 44 0 38056K 668K select 0 0:01 0.00% sshd
1131 root 1 44 0 275M 268K select 0 0:01 0.00% rpc.statd

Any suggestions? I have searched, but the suggestions were a driver fault, or disabling NCQ...

70 posts

Posted on 14 September 2011 @ 14:47
When the slowness occurs, I also see the state tx->tx on many processes.

806 posts

Posted on 14 September 2011 @ 14:48
You are using de-duplication? How large is the filesystem you have deduped? Do you have an L2ARC device?

If you have a large filesystem with deduplication enabled, even removing a file will take a long time if the dedup tables do not fit into memory. My advice is to look carefully at whether you really need dedup. It is often much more sensible to buy another disk instead of enabling dedup, since dedup requires a LOT of memory AND an L2ARC cache SSD for large datasets; otherwise the HDDs themselves are needed to look things up in the dedup tables.
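To see whether dedup is active and how big the dedup table (DDT) has grown, something like the following should work ("tank" is a placeholder pool/dataset name):

```shell
# "tank" is a placeholder; substitute your own pool and dataset names.
zfs get dedup tank                              # is dedup enabled for new writes?
zpool list -o name,size,allocated,dedupratio tank   # dedup ratio achieved so far
zpool status -D tank                            # DDT entry count and per-entry sizes
```

The last command reports the number of DDT entries plus the in-core and on-disk bytes per entry, which is what you need to estimate the memory footprint.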

Try looking at gstat -a output (or the I/O monitor on the Disks page if your disks are properly formatted with a label). If you see your disks doing a lot of I/O, then ZFS is using your disks for the dedup table lookups, which will be VERY slow!
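On FreeBSD this check is just:

```shell
# Watch per-disk activity while the rm is running (FreeBSD).
# -a limits the output to providers that are actually busy.
gstat -a
# High ops/s and %busy on an otherwise idle system point at DDT
# lookups and updates being serviced from the disks.
```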

Deduplication is quite hyped, but it is not really that attractive a feature for many home users. I use it on a small dataset with mostly identical data; there the required RAM is very modest and the impact on performance is minor. But many people enable it on their TB+ filesystems, and that is where performance totally crashes, even with 8GiB RAM.

70 posts

Posted on 14 September 2011 @ 15:01
Jason, seems like you're spot on. For the 'fun' of having dedup available, I enabled it to see how it performs.
I have now turned it off, but the dataset that is already deduplicated is still there, and that is why I am cleaning the system and deleting it.
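Worth noting: turning dedup off only affects new writes; blocks already recorded in the DDT stay deduplicated until they are freed or rewritten. A sketch with placeholder names:

```shell
# "tank/media" is a placeholder dataset name.
zfs set dedup=off tank/media      # new writes now bypass the DDT

# Already-deduped blocks remain referenced in the DDT. Copying the
# data to a fresh dataset rewrites it as ordinary blocks:
zfs create tank/media2
cp -Rp /tank/media/ /tank/media2/
```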

70 posts

Posted on 19 September 2011 @ 17:17
Ok, I have a big problem and I really hope somebody can help me:

I had dedup turned on on my media storage, containing approx. 8 TB of data. I am now cleaning the system so I can back up and then re-create the pool with the correct ashift. My problem is that during deletion, the system seems to stop working due to full memory and swap.

I have to manually unplug the power and restart the server. This happens every time.

So, does anybody have a suggestion on how I can back up all my data, and make sure it all arrives on the backup, without ZFS dying on me?
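One approach that avoids deleting individual files on the deduped dataset is to snapshot it and stream it to the backup storage; the pool and dataset names below are placeholders:

```shell
# Placeholders: source pool "tank", backup pool "backup".
zfs snapshot tank/media@migrate
zfs send tank/media@migrate | zfs receive backup/media

# Reads do not need DDT lookups, so sending is far gentler than rm,
# which must update the DDT for every freed block. With dedup=off on
# the receiving pool, the data lands as ordinary blocks.

# Since the plan is to re-create the pool anyway, destroying the whole
# pool afterwards sidesteps per-file DDT updates entirely:
# zpool destroy tank               # only after verifying the copy!
```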

70 posts

Posted on 28 October 2011 @ 10:21

This must be specific to the FreeBSD ZFS implementation. Since my last post, I have been migrating all my files to external storage. The server still hangs on file deletion, and I have 1 TB left to go. On Nexenta things were slow, but it never froze the system like ZFSguru (FreeBSD) does. No more dedup for me.

Storage is (was) cheap! Time is money...

806 posts

Posted on 28 October 2011 @ 15:15
I don't have an explanation for why Nexenta or other Solaris platforms would be faster for you, but you do need tons of memory to use de-duplication. I suggest you do not use dedup except for very small datasets, on SSDs for example. Enabling dedup on multi-terabyte datasets without at least 64GiB of memory plus an L2ARC available is asking for trouble.

For home users, adding an extra disk is a much better solution than saving a little bit of space with dedup, because of the enormous impact on performance if you do not have tons of memory available.

252 posts

Posted on 28 October 2011 @ 16:51
Depending on the data, compression can be a much better alternative to dedup.
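For instance (dataset name is a placeholder; lzjb is the cheap codec on ZFS of this vintage, while gzip trades more CPU for a better ratio):

```shell
zfs set compression=lzjb tank/media   # "tank/media" is a placeholder
zfs get compressratio tank/media      # ratio achieved so far
```

Unlike dedup, compression needs no lookup tables, so it costs some CPU per write but no extra RAM.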

70 posts

Posted on 30 October 2011 @ 01:07
I guess I've learned the hard way.

On Nexenta, things weren't any faster, but it didn't freeze on me like FreeBSD does. I'll be done with dedup very soon... *sigh*

55 posts

Posted on 21 December 2014 @ 11:05

If I have a large pool (8 × 3 TB disks in RAIDZ3) but I enabled deduplication on just a small dataset in this pool, to store ISOs and VMs (approx. 100 GB), will this affect the performance of the rest of the pool? Right now I get a 1.44x dedup ratio. I googled around and, after running zpool status -D tank and doing some math on the numbers I got, I figured out I am only using about 80 MB of RAM for deduplication - so I should be fine, right? :P

dedup: DDT entries 520077, size 1165 on disk, 164 in core

bucket              allocated                       referenced
______   ______________________________   ______________________________
refcnt   blocks   LSIZE   PSIZE   DSIZE   blocks   LSIZE   PSIZE   DSIZE
------   ------   -----   -----   -----   ------   -----   -----   -----
     1     381K   47,6G   34,5G   34,6G     381K   47,6G   34,5G   34,6G
     2    91,9K   11,5G   7,76G   7,71G     196K   24,5G   16,6G   16,5G
     4    34,5K   4,31G   3,09G   3,08G     139K   17,4G   12,5G   12,4G
     8      239   29,9M   22,9M   23,0M    2,19K    280M    216M    217M
    16      284   35,5M   33,3M   33,4M    6,49K    831M    784M    786M
    32       41   5,12M   4,52M   4,54M    1,42K    182M    158M    159M
    64        7    896K     28K   63,9K      826    103M   3,23M   7,36M
   512        1    128K      4K   9,12K      763   95,4M   2,98M   6,80M
 Total     508K   63,5G   45,4G   45,4G     728K   91,0G   64,7G   64,6G
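As a sanity check on that math, the in-core DDT footprint is simply entries × in-core bytes per entry, using the figures from the dedup line above:

```shell
entries=520077    # DDT entries, from the dedup summary line
in_core=164       # bytes per entry in core
on_disk=1165      # bytes per entry on disk

echo "core: $(( entries * in_core / 1024 / 1024 )) MiB"
echo "disk: $(( entries * on_disk / 1024 / 1024 )) MiB"
```

That works out to roughly 81 MiB in core (and ~577 MiB on disk), which is on the order of 80 MB rather than 8 MB, but still harmless on a machine with several GiB of RAM.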
