Becky_181
User

24 posts

Posted on 4 November 2014 @ 01:53
CiHPER,

Thanks for your insight on my other post in the other section.

Decided to ask the rest of my questions about the new build here.

What we are intending to do is use the case below

http://www.norcotek.com/item_detail.php?categoryid=1&modelno=RPC-4224

It has a backplane serving the 24 disks, so it requires a SAS controller card added to your motherboard with a minimum of 6 Mini-SAS ports to plug into the backplane. As we are trying to semi future-proof this new build, we would also require a controller with 6 external SAS ports to enable the addition of a second 24-drive chassis, as below:

http://www.norcotek.com/item_detail.php?categoryid=8&modelno=ds-24d

or

http://www.norcotek.com/item_detail.php?categoryid=8&modelno=ds-24e

We know we can get the server chassis locally, but we haven't seen the additional storage cases around yet; we have been told they should be no problem though.

So, looking at these chassis and storage add-ons, what would you recommend as a controller card that won't cost half the price of a house? LOL

We will be using a full server motherboard with twin AMD chips, or possibly Intel, with about 12 cores each and a minimum of 64 GB of RAM (32 GB per chip).

Any advice on this build is gratefully appreciated, as it is meant to run long into this century.

As you suggested in the other post, we will be looking at 20 HDDs of 4 TB in 4 vdevs, with the remaining 4 slots used for SSDs to speed things up with L2ARC and such.

Cheers

Becky

CiPHER
Developer

1199 posts

Posted on 4 November 2014 @ 14:16
So, you're looking at a 24-slot Norco, with the option of a separate casing providing an additional 24 slots for HDDs, netting 48 HDDs total?

That is a big build. But also a cool build. :)


Pool vdev options
You have lots of opportunities:

1. 10+10+10+10 (40 drives total in 4 vdevs), leaving 8 SAS ports unused. The SSDs are best put on the AHCI chipset ports and thus not mounted in the SAS chassis. This way you still have TRIM support, which you lose if you connect via SAS.

2. 10+10+10+10+10 (50 drives total in 5 vdevs): 48 on the SAS controllers and 2 on the onboard controller. You should have 6 onboard chipset ports, and you can mix these without problem, though the 2 extra drives would sit outside the SAS chassis and have to be connected directly to the motherboard.

3. Use vdevs of 19 disks in RAID-Z3 for additional efficiency. This is also an optimal configuration for 4K disks and is more efficient than 2x10 disks in RAID-Z2 (20 disks with 4 parity, versus 19 disks with 3 parity). That is 38 disks total, leaving 10 more ports which you can use for one additional vdev of 10 disks in RAID-Z2. Your vdevs do not need to have the same disk count, disk size or redundancy level! With this 'golden' configuration you have all 48 SAS ports occupied and only 3+3+2=8 parity disks, meaning only 16.6% overhead.

Option 1 = 40 disks of which 8 parity disks = 20% overhead
Option 2 = 50 disks of which 10 parity disks = 20% overhead
Option 3 = 48 disks of which 8 parity disks = 16.6% overhead

All options are 4K optimal in terms of performance and space efficiency. You can use 'ashift=12' (meaning 4K optimized) on your pools without losing lots of space.

Options 1 and 3 look best to me, since option 2 means having two disks outside of the chassis, which I don't think you would want.
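
To make option 3 concrete, creating the pool from the command line could look roughly like this (a minimal sketch: 'tank' and the gpt/C1-xx and gpt/C2-xx labels are placeholder names, and ZFSguru's web interface can build the same layout for you):

# Ask FreeBSD ZFS for 4K-aligned (ashift=12) vdevs before creating the pool.
sysctl vfs.zfs.min_auto_ashift=12

# One pool with two 19-disk RAID-Z3 vdevs plus one 10-disk RAID-Z2 vdev.
zpool create tank \
    raidz3 gpt/C1-01 gpt/C1-02 gpt/C1-03 gpt/C1-04 gpt/C1-05 \
           gpt/C1-06 gpt/C1-07 gpt/C1-08 gpt/C1-09 gpt/C1-10 \
           gpt/C1-11 gpt/C1-12 gpt/C1-13 gpt/C1-14 gpt/C1-15 \
           gpt/C1-16 gpt/C1-17 gpt/C1-18 gpt/C1-19 \
    raidz3 gpt/C1-20 gpt/C1-21 gpt/C1-22 gpt/C1-23 gpt/C1-24 \
           gpt/C2-01 gpt/C2-02 gpt/C2-03 gpt/C2-04 gpt/C2-05 \
           gpt/C2-06 gpt/C2-07 gpt/C2-08 gpt/C2-09 gpt/C2-10 \
           gpt/C2-11 gpt/C2-12 gpt/C2-13 gpt/C2-14 \
    raidz2 gpt/C2-15 gpt/C2-16 gpt/C2-17 gpt/C2-18 gpt/C2-19 \
           gpt/C2-20 gpt/C2-21 gpt/C2-22 gpt/C2-23 gpt/C2-24

This can also be split over stages: create the pool with the first vdev now and 'zpool add' the other vdevs later as the drives arrive.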


Beware of expanders!
I am not in favor of SAS expanders and advise you to steer clear of them if possible. You avoid a shitload of compatibility and performance issues, and it doesn't cost you much extra to do without such witchcraft. It can work well, but it can also introduce a lot of problems. With careful planning you can do without.

One of the external casings you linked to has an internal expander; whereas the other has 'dumb' SAS ports (6 of them). I suggest the latter!


SAS controllers
This brings us to the controller part. With 48 ports total, that means 12 Mini-SAS connectors (each connector carries 4 drives).

The IBM M1015 is a common SAS controller offering 2 Mini-SAS connectors that each serve 4 disks (so 2x4 = 8 disks total). More expensive controllers have 4 Mini-SAS connectors (16 disks total). So with three of those larger controllers you have the total of 12 Mini-SAS connectors (= 48 disks). You will need 6 of those connectors to be external, so you can hook them up to the second casing.

PCI-express slots are another issue. Normal consumer boards have only 16 PCI-express lanes coming from the CPU, which can be split into two x8 slots. Often a third PCI-express slot is possible, but then the three slots will run in x8/x4/x4 mode. Best is to have all three slots run in x8 mode.

If you use a server-grade chipset you have more PCI-express lanes available, and if you have a dual-socket motherboard with two processors you have twice the number of PCI-express lanes. But beware: you can only use them if you actually have both sockets populated! With only one CPU in a dual-socket mobo, you cannot use all PCI-express slots!
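
Once the system is assembled, it is worth checking that each controller actually negotiated the link width you expect. A minimal sketch on FreeBSD (the mps0/mps1 device names are just examples for LSI-based HBAs):

# List PCI devices with their capabilities and look at the HBA entries
# (e.g. mps0, mps1); the negotiated width shows up as e.g. "link x8(x8)".
pciconf -lc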


Solid State Disks
Whichever of the three options you choose, you should have at least 2 SSDs which you can connect to the motherboard AHCI ports. With at least 2 SSDs you can mirror them for the operating system, run a striped L2ARC and a mirrored sLOG. This can be done quite easily in ZFSguru by partitioning the SSDs into several parts.

Important: you should have a large amount of overprovisioning on your SSDs when using L2ARC! This means a 120GB SSD should only be used as an 80GB SSD, leaving 40GB or more untouched. This is called overprovisioning. Server-grade SSDs have this space already reserved, so an 80GB SSD may in reality be 128GB large with only 80GB visible. With consumer-grade SSDs you have to do this yourself, simply by not partitioning the whole SSD.

I strongly recommend 'protected' SSDs like the Intel 320, Intel S3500/S3700 or Crucial M500/MX100. If you want the best, choose the Intel 320 (not its successor); if you want cheaper, choose the Crucial MX100. The Intel 320 is better/safer for sLOG; there is virtually no difference for L2ARC or the OS part. The Intel 320 has a fully protected 192K SRAM buffercache, the others have only partial protection. Should be good enough though.
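
To illustrate the do-it-yourself overprovisioning (a minimal sketch: device names, labels and sizes are just examples for a pair of 120GB SSDs on ada1/ada2, and ZFSguru's web interface can do the same partitioning for you):

# Partition the first SSD, deliberately leaving ~40GB unallocated as
# overprovisioning. Repeat the same layout on the second SSD (ada2).
gpart create -s gpt ada1
gpart add -t freebsd-zfs -l ssd1-os    -s 20G ada1   # OS pool (mirrored)
gpart add -t freebsd-zfs -l ssd1-slog  -s 4G  ada1   # sLOG (mirrored)
gpart add -t freebsd-zfs -l ssd1-l2arc -s 56G ada1   # L2ARC (striped)

# Attach the sLOG as a mirror and the L2ARC as cache (cache is always striped).
zpool add tank log mirror gpt/ssd1-slog gpt/ssd2-slog
zpool add tank cache gpt/ssd1-l2arc gpt/ssd2-l2arc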


Processor: AMD or Intel?
Generally Intel is preferred because of its faster cores and lower latency. AMD provides cheaper cores, but they are less powerful. It all depends on the price: ZFS can use multiple cores very efficiently, so if AMD provides at least twice the number of cores for the same price as an Intel build, you can consider it.


ECC RAM
For this kind of serious build, ECC memory is a must! You should not do without. Beware of the difference between ECC unbuffered (UDIMM) and ECC registered (RDIMM): generally UDIMM is slightly faster, but with registered modules you can fit more memory.


Test your build thoroughly!
Before you begin using your build, you should test it thoroughly! In particular, you should test the failure of hard drives by pulling cables, and know what to do in case of a drive failure.
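
As an example of such a drill (a rough sketch; the pool name 'tank' and the device names are hypothetical), pull a drive's cable while the pool is running and walk through the replacement:

# See how the pool reports the missing disk.
zpool status tank
# Insert the replacement drive, then replace the failed member with it
# (here da5 is the failed disk and da24 the newly inserted one).
zpool replace tank da5 da24
# Watch the resilver until the pool is back to ONLINE.
zpool status -v tank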


Label your disks!
It is strongly recommended to label your disks. This means you format each disk with a disk label like 'C1-22', meaning chassis 1, disk 22, and number the drive bays on the chassis to match. That way you know which disk in the web interface is which physical disk in the chassis. Very important to do this correctly!

If your casing has per-port HDD LEDs, you can blink the LED of a port by reading from that disk only, and thus find the physical drive that corresponds to the case number.
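
Both ideas in command form (a sketch only; the da21 device number and the label are made up, and ZFSguru can also apply the labels from its web interface):

# Give the disk destined for chassis 1, bay 22 a matching GPT label.
gpart create -s gpt da21
gpart add -t freebsd-zfs -l C1-22 da21

# Later, to locate that physical drive, read from it for a while so its
# activity LED blinks, and see which bay lights up.
dd if=/dev/gpt/C1-22 of=/dev/null bs=1M count=4000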


Good luck! :)

Oh and: if you do build this awesome beast, be sure to take pictures and post them here! People looooove to see pics! And you know: PICS OR IT DIDN'T HAPPEN! :D
CiPHER
Developer

1199 posts

Posted on 4 November 2014 @ 16:06
Oh and one more thing: did you consider using 6TB disks instead of 4TB?

The WD60EZRX (WD Green 6TB) is my favourite at the moment. It uses the latest generation of 1.2TB platters, meaning it can reach up to 175MB/s throughput per disk; astonishing for a 5400rpm disk. It is also very power efficient: it uses only around 3 watts to keep up to 6TB of data available. This means you can forget about spindown and just run them 24/7.

With higher-capacity disks you may also not need the second casing as quickly, so potentially lower cost, as you need fewer SAS/SATA links.

I know WD Green gets a lot of negative press, but it's my favourite drive. You *DO* need to run WDIDLE.EXE on them before putting them into use, however, to set the headparking timer to 300 seconds. Headparking is a useful feature, but by default it is set too aggressively. By setting it to 300 seconds (the maximum, I think) you still have the benefits of headparking without the excessive headparking which may shorten lifespan and cause minor delays.
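
From memory the procedure looks like the snippet below, run from a DOS boot stick with the Green attached; treat the exact switches as an assumption and check the tool's own help first (on Linux, idle3ctl is a commonly used alternative):

REM Report the current idle3 (headparking) timer, then set it to the
REM 300-second maximum.
WDIDLE3 /R
WDIDLE3 /S300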
Becky_181
User

24 posts

Posted on 5 November 2014 @ 00:04
CiPHER,

Thanks for the reply and all the good info. Like you, we have been sticking to the WD Green drives, though we had some problems with them which, thanks to the SMART info in ZFSguru, I now know were caused by bad airflow in the drive bays; moving fans about has solved that.

Will check the price of the 6TB drives locally, as the 4TB are now under $200 each at our local computer supplier, who gives us home-builder discounts.

We are looking at a full server mobo for the extras it will give us and for the longevity of the build, and as we intend to buy the HDDs in lots of 10, the RAID setups you discussed will work perfectly.

When we start this build, I think for a bit of showing off we will have to not only take pics but also post a vid on YouTube showing how good ZFSguru is.

Cheers

Becky
CiPHER
Developer

1199 posts

Posted on 5 November 2014 @ 00:10
A video on Youtube is one of the best things you can do to give ZFSguru a warm heart!

I remember you saying that you got to know about ZFSguru due to the video victorB put on Youtube? It's great that users share their experiences and i hope more users get to know ZFSguru if you put your video online. Be sure to share the URL once you're there! :)
DVD_Chef
User

120 posts

Posted on 6 November 2014 @ 00:01
Sounds like an interesting build, and it should give you a lot of space! What is the primary use for this storage server going to be? If space is your primary concern, then the 10-drive vdevs will work well, at the expense of maximum performance. Remember that from an IOPS perspective, each RAIDZ# vdev acts basically like a single drive, and the slowest drive in the vdev at that. SSDs for cache and log drives will help, depending on your workload. ZFSguru has a very nice destructive benchmarking tool that will test different disk layout options and graph the performance each gives.

Lots of memory is key, as ZFS can then store file metadata in RAM, which speeds up browsing and accessing files from the client perspective. I was bitten by this issue with a previous 15-drive build where the motherboard chosen did not allow installation of enough RAM. As this server had a lot of smaller files accessed by multiple clients, the arc_meta value required was greater than the available memory. This severely limited performance, and the "speed" of the system from the client perspective.

Should be a fun build, just plan and test thoroughly.
CiPHER
Developer

1199 posts

Posted on 6 November 2014 @ 01:03
"Remember that from an IOPS perspective, each RAIDZ# vdev acts basically like a single drive, and the slowest drive in the vdev at that."

That is correct. But remember that this is only true for reading; not for writing. ZFS makes random writes behave as sequential writes. So it is only the random reads that are troublesome.

To cope with this problem, enough RAM is a good thing. But L2ARC caching can help 'extend' RAM. The L2ARC is not as fast as real RAM, but still a lot faster than the slow disk with its very high seek latency.

Like processors having L1, L2 and L3 cache, each layer starts small and fast and becomes bigger but slower. I like to think of it like this:

CPU L1 -> CPU L2 -> CPU L3 -> RAM -> SSD L2ARC -> Disk

So the L1 layer starts at 64 kilobytes but is extremely fast (more than 100 gigabytes a second), where the RAM can be 64 gigabytes but already slower (10GB/s), and the SSD can be even bigger but again slower (1GB/s when pairing two SSDs). Finally the disk layer is the slowest: it can be 64 terabytes but is extremely slow for random reads, around 0.2 megabytes a second (60 IOps * 4K requests).

So you see, the more caching layers 'above' the disk layer, the less you hit that extremely slow disk seek performance. The L2ARC SSD is a way of getting more 'RAM' albeit a tad slower, but still way faster than the disk layer.

Like DVD_Chef said, caching metadata is very cool and it is something that I'm very 'hot' on. I myself use 12GiB of my RAM just for the metadata. Tuning the arc_meta limit like DVD_Chef said is highly recommended, because by default I think it's only about 1 gigabyte or so. That would mean a lot of disk accesses just for the metadata.
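
To give an idea of what that tuning involves (a minimal sketch; the exact sysctl names can vary between FreeBSD/ZFS versions, so verify them on your own system):

# Check how much of the ARC is currently used for metadata and what
# the current limit is (both values are in bytes).
sysctl kstat.zfs.misc.arcstats.arc_meta_used
sysctl vfs.zfs.arc_meta_limit

# Raise the limit persistently, e.g. to 12 GiB, via a loader tunable
# in /boot/loader.conf, then reboot.
echo 'vfs.zfs.arc_meta_limit="12884901888"' >> /boot/loader.conf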

Having all metadata on RAM and/or L2ARC SSD cache is a good thing; it allows quick searches and means the disks get mostly used for sequential access, which they are very good at. Only random reads for data files is still possible in this case, like for Virtual Machine images which you store on the pool. If you store mainly large files, then metadata caching is basically all you really need.
DVD_Chef
User

120 posts

Posted on 6 November 2014 @ 20:47 (edited at 23:46:09)
As CiPHER said, having SSD L2ARC to offload the actual disk data cache to really helps. My latest servers have 16x 4TB SAS drives for data and 128G of memory, of which 80G is set as the arc_meta_limit value. As my arc_meta usage fluctuates between 50-75G, this keeps it all in RAM and allows my cached metadata hit rate to be over 90%. It really speeds up my rsync jobs, as their heavy read operations are fulfilled mostly from the metadata cache.
Becky_181
User

24 posts

Posted on 7 November 2014 @ 20:04
Basically it is a media server. The one we have now is running 10 x 2 TB WD Greens on a Gigabyte 890FX-UD5 mobo with a 64-bit quad-core AMD Phenom II X4 945 processor and 16 GB of RAM.

However, the storage even now with ZFS on it is more than half used, whereas under WHS 2011 we were down to only a couple of TB left with a hardware RAID 5 setup on 6 of the disks and software RAID 5 on the other 4. This has meant we can continue to use this setup for about a year or two before the new build needs to be ready, and it will enable us to do it right, starting with 10 x 4 TB WD Greens and the SSDs for the L2ARC and logs etc. as CiPHER has suggested. Sticking to the 4 TB as they are locally half the price of the 6 TB ones.

Also we can add vdevs in groups of ten drives as able.

Stage 1 will be the basic build: the case, mobo, 2 or maybe 4 CPUs (yet to be decided), a minimum of 64 GB of RAM, and 10 drives.

Stage 2 will add the second 10 drives.

Stage 3 will add the storage expansion case and 10 more drives.

Stage 4 will add the final 10 drives.

So far the costing is:

40 x 4 TB drives @ $200 each

Server case: $450

Storage expansion case: $400

We are allowing $500 to $600 for a suitable server mobo,
$1000 for 2 x CPUs,
and up to about $500 for the RAM, using RDIMMs.

The complete build to the end of stage 4 will take about 2 years, allowing time to cover each stage's costs.

Cheers

Becky
DVD_Chef
User

120 posts

Posted on 7 November 2014 @ 22:45
Is this going to be just for media storage, or will it be doing media transcoding on the fly to multiple clients, or running virtual machines as well? Others can chime in here, but it sounds like multiple CPUs would really be overkill for just a storage box, as the processor load is minimal. You might be able to save a few dollars on processors and get more RAM instead, and lower the power consumption and heat created.
Becky_181
User

24 posts

Posted on 14 November 2014 @ 13:08
Hiya DVD_Chef,

The build will mainly be for storage of multimedia; as for transcoding, that will depend on future items we buy. At the moment it will serve a LAN of 6 computers, a couple of Android devices, and at times a Chromebook.

The idea is that all devices can access the multimedia at the same time without causing any lag at all.

Also, depending on how it performs, it may end up becoming extra storage for some games, to save on added drives, power requirements and heat from my son's and my gaming computers. We have a gigabit LAN with a throughput that, when needed, can run at a constant 500 to 800 Mb/s. Not lightning fast, but it should be suitable for the games we intend to place on it, as they are very disk-space hungry once the add-ons are included, but not heavy on disk access when playing. It will also give us the redundancy of the ZFS filesystem, so we won't lose data and then have to re-download add-ons, which is a pain and time consuming.

At the moment our game rigs each have a 3 TB drive for games, so any system drive failures or problems don't usually affect the data on it.

Lastly, this is a project we want to build simply so we can say we did it. We aren't full-on geeks in that we are self-taught, and we have found over the years that we can do stuff that those with the usual training are blind to, because it has been drummed into them that you can't do this or that.

For us it is all about a shared interest, something my son and I can do together and have some fun times with. Sounds like a typical hobby to me. Which brings into play the 'why not' argument: do we need it as bold or as expensive as this? Maybe not, but why not spend the time and money to have the pleasure and fun of doing it, and see the looks people give when they see it, lol.

My last job in the military was looking after my unit's computers. My qualification for the job was that I could fix them and keep them running in the bush with all the dust and heat and other crap, as well as being the unit's help desk when in base.

The best moment of the job was sitting in a tent with 2 Toshiba laptops pulled apart on a folding table, taking them from being unusable, due to a CD drive failure on one and an A: drive failure on the other, with the hard drives needing the required OS image installed again. Simple fix: pull the A: drives out and swap them, so we now had one fully functional computer and one that would work but had no CD or A: drive to store stuff to. As they were identical units, all I did was swap hard drives while re-imaging them. In the middle of all this, in walks the CO of the Brigade Signals unit and the officer responsible for all the IT in the Brigade. They had no idea what was happening; they thought the computers were wrecked, but 15 minutes later they were astounded to see both back together and working.

What should have happened was that my unit waited till we got back to base and arranged for these to go to Toshiba to be fixed, going without them for the remaining time in the field. The qualified geeks in the brigade didn't even know how to open the laptops up. After that, an added part of my job was to teach the geeks some simple fault-finding and how to replace or mix and match bits to do similar things in the field.

What made this all the better was that, as far as the military was concerned, I was a clerk and a driver, not a geek, and as I was a reservist I got paid $72 a day tax-free when at base and $100 a day when in the field. How many people get to say they were paid that sort of money to do their hobby, got to teach so-called qualified people stuff they should know, and have a record of never having a computer go down for more than an hour in the field?

I like to shock people this way LOL

Cheers

Becky
CiPHER
Developer

1199 posts

Posted on 14 November 2014 @ 20:22
Cool story, Becky!

I have similar experiences, but on the other side of the fence, I guess.

In class (high school) I was known as the tech guy, always reading tech books instead of following the lecture. But in computer class we were being taught things like Word and Excel, things I know almost nothing about. And this girl, who was non-technical, actually taught me a thing or two about creating tables, if I remember correctly.

She was stunned that I did not know that sort of thing. And she was quite adept at finding the correct tool in the maze of menus.

The moral of the story, I guess, is that we can all learn from each other, and every human being brings something unique to the table. So-called experts must still acknowledge that they can learn quite a lot from other people. I think you can learn something from every other human being; it just takes the right circumstances.

Anyway, I just had to think of this when I read your story. :D
Becky_181
User

24 posts

Posted on 15 November 2014 @ 20:03
Way too true, CiPHER. My mom died at 89 and she always tried to learn something new every day, and I tried to teach my kids to look at things in different ways, not always approaching a problem from just one direction. It is why my son and I work so well with computers: we both approach things from different places, and I am never too proud to ask for his help.

Which is one thing lots of people forget: even the smartest people in the world had to ask for help at some stage to learn all they know.
aaront
User

75 posts

Posted on 13 January 2015 @ 22:23
IMO, putting the board and everything into a Norco can be more 'fun' than I like to have personally. It's not that it's impossible, it's just a PITA.

Check out Supermicro; my current prod box is from them and it's been great.

Hardware (about $4300):
Supermicro SC846B 4U 920W RPS chassis
X9DRH-7TF-O (dual onboard 10GBase-T)
2x Xeon E5-2609
6x 16GB DDR3-1333 (should have gotten 8 to balance out the slots, whatever)
2x LSI 9211-8i
Drives (about $5300):
10x Seagate 4TB SAS Constellation ST4000NM0023 (using 8 as 4 mirrored pairs, 2 as spares; I like maxing out at 8 drives in the pool so I can move to a single LSI in a whitebox if things go bad)
2x Intel S3700 200GB for ZIL/L2ARC. I know the size is overkill for the ZIL, but the performance is sick
4x Samsung 840 EVO 1TB. This will be a RAID10 pool just for NFS export for running VMs
downloadski
User

17 posts

Posted on 22 January 2015 @ 06:54 (edited at 06:57:03)
I have servers with 10x 4TB drives in a RAID-Z2 vdev per pool, both Seagate (5900 rpm) and Hitachi (7200 rpm).

I do not have multiple vdevs in a single pool, so that's 2 servers and 40 disks in 4 pools. This way I can spin down half the disks when I play a movie, and I only have 1 server on at that time.
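
For reference, spinning a SATA disk down by hand on FreeBSD can be done with camcontrol (a rough sketch; ada2 is an example device, and ZFSguru also offers spindown settings in its web interface):

# Spin the disk down right now.
camcontrol standby ada2
# Or tell the drive to spin itself down after 30 minutes of inactivity.
camcontrol idle ada2 -t 1800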

Avoid 7200 rpm drives if noise is an issue; they also simply heat up more, so you have to cool more.
Such a pool delivers close to 1 GB/sec when you benchmark it in the ZFSguru interface. In real transfers I have seen up to 450 MByte/sec over a single 10GbE link (testing from a RAM drive; the max from my PC, which has 4x 3TB in RAID 5, is about 350 MB/sec).

I use Xeon E3-1220 CPUs (v1 and v2) and see just over 50% load on one core at these transfer rates.
But I only feed 1 player with data, which is never more than about 50 Mbps I think (the Blu-ray maximum data rate for a disc).
downloadski
User

17 posts

Posted on 17 April 2015 @ 09:14
I have now got zfs send and receive working great over 10GbE.
Transferring an 18TB pool averaged 880 MB/sec :)
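
For anyone curious, the bare-bones form of such a replication run looks roughly like this (pool, dataset and host names are made up; people often add mbuffer or a lighter ssh cipher to keep a 10GbE link saturated):

# Snapshot the source recursively, then stream it to the other server
# over ssh and receive it into the destination pool.
zfs snapshot -r tank/media@migrate1
zfs send -R tank/media@migrate1 | ssh otherhost zfs receive -F tank2/media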
zoid
User

18 posts

Posted on 3 May 2015 @ 20:17 (edited at 20:40:45)
Hi everybody!
Hope it's OK to ask about tuning and pool build recommendations in this topic =)
I'm building my new media storage: lots of big media files (mostly 3-40GB each) and media transcoding on-the-fly to multiple users.
What's onboard:
24-bay server case (with 2 additional internal 2.5" spots, so 26 bays)
Supermicro X10DAC
2x Xeon E5-2620 v3
Adaptec ASR-72405 (as a SATA hub, lol)
64 GB DDR4-2133 ECC Reg (8 x 8GB)
12x 5TB HGST Deskstar NAS (7200 rpm, at 38-41°C while copying my previous 10TB pool), 128 MB cache (the cheapest $ per TB of all the Deskstar NAS drives)
2x 120GB Kingston HyperX 3K SSDs (will check whether they are attached to the motherboard's AHCI ports or not)

Right now I'm testing the config, and write performance is not so great for my 5x 5TB RAID-Z1 (another 7 drives are on their way to me): 30-33 MB/s on writes =((( Any ideas why?
Read performance from the previous pool of 4x 3TB WD Red in RAID-Z is OK, so the write speed of the new pool is my bottleneck (110-120 MB/s of reading for 3-5 seconds, then idle while the cached data is written, I suppose).

I dunno, do I have to optimize my new data pool for 4K (I didn't when making the pool)? I didn't find any info about 4K support for my 5TB HDDs. I DID make the 4K optimization for my mirrored SSDs (used only for the operating system). ZFSguru (v0.2.0b9 with FreeBSD 10.1-001 STABLE, which I have to reinstall because for some reason it's not compatible with the latest versions of Plex Media Server that I want to use) shows me that all the disks (SSDs, old and new HDDs) use 512-byte sectors. Is that OK, or should I make some changes?
I'm not going to use cache devices because of the large amount of RAM and the big-file usage. Is that OK, or can I improve overall performance with cache devices in my case?
I'm wondering what the best pool is that I can make, given 24 HDDs in the future and the 12 HDDs I'll have in 2 days. I'm too lazy to check SMART often, so I'm thinking of using 2 vdevs (11x HDD in RAID-Z3 + 1 hot spare each), but that's not optimal, as I read in CiPHER's post about 20% versus 16.6% overhead =) Available space is important to me, but I don't think I can use all of 16x 5TB within a few years (8x 5TB for next year, maybe 2, is OK) =)
Maybe I don't have to stick to 4K and I can use my disks more optimally?
For example, 2 vdevs (12 HDDs in RAID-Z2 or Z3 without hot spares)?
I would appreciate any advice you Gurus can give me =)

Thanks in advance.
CiPHER
Developer

1199 posts

Posted on 3 May 2015 @ 20:53
Hi zoid,

I would request you create a separate thread about this. Please create one and i'll respond to your inquiries.

Have a nice day!
zoid
User

18 posts

Posted on 3 May 2015 @ 21:16
Hi CiPHER,
sure, thanks