计算化学公社

Title: DIY NAS Notes - Lab Team Edition

Author: Entropy.S.I    Time: 2024-6-6 22:44
Last edited by Entropy.S.I on 2024-6-17 21:59

DIY NAS Notes - Lab Team Edition
June-2024 by ア熵增焓减ウ | yult-entropy@qq.com | entropylt@163.com

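Last month I deployed a NAS for my own lab and set up the accompanying network, running optical fiber to all 15 desks in the lab for team collaboration. Some implementation details are recorded here.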


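1 Introduction

As before, this build is fully DIY and uses a lot of second-hand hardware; the hard drives, of course, are still bought new. The configuration is as follows:

This configuration has a lot in common with last September's "miniHPC storage node". The main changes are:

Our university's finance policy is fairly flexible, so all of the hardware above was purchased by ourselves and reimbursed against invoices. The NAS host cost ~36,000 CNY and the remaining parts cost ~6,800 CNY.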



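2 Photos
▼ 12 WD Ultrastar DC HC560 20TB SATA HDDs

▼ The CPU, motherboard and DIMM trio

▼ An old friend: the chassis, made by Shenzhen 君名创达/零夏壹度

▼ Interior view of the NAS

▼ The PCIe AIC trio: SSD, 40G/56G Eth/IB NIC, and HBA card

▼ Rear view of the lower drive cage

▼ Deployment complete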




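3 Performance tests
3.1 ZFS tuning

TrueNAS Scale 24.04 further improves the default ARC configuration policy, so the ARC-related parameters no longer need manual tuning, but the L2ARC parameters still do. Edit /etc/modprobe.d/zfs.conf and add the following 3 lines: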

options zfs l2arc_noprefetch=0

options zfs l2arc_write_boost=10000000000

options zfs l2arc_write_max=10000000000

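With these settings, in effect all data written to the HDD array is also written to L2ARC, which maximizes the L2ARC hit rate. The SSD assigned to L2ARC has a very long write endurance, so even with the "brute-force" policy above it should last a long time in a lab team-collaboration scenario.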
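A minimal sketch for checking, after a reboot, whether the three options actually took effect. It assumes the usual Linux layout in which the loaded zfs module exposes its parameters under /sys/module/zfs/parameters; the helper names parse_zfs_options and read_live_value are made up for this illustration.

```python
from pathlib import Path

# The three lines from /etc/modprobe.d/zfs.conf, as given in this post.
CONF_TEXT = """\
options zfs l2arc_noprefetch=0
options zfs l2arc_write_boost=10000000000
options zfs l2arc_write_max=10000000000
"""


def parse_zfs_options(text):
    """Return {parameter: value} from 'options zfs name=value ...' lines."""
    params = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3 or fields[0] != "options" or fields[1] != "zfs":
            continue  # skip blank lines, comments and other modules
        for item in fields[2:]:
            name, _, value = item.partition("=")
            params[name] = value
    return params


def read_live_value(name, root="/sys/module/zfs/parameters"):
    """Read the value currently in effect, or None if it cannot be read."""
    try:
        return (Path(root) / name).read_text().strip()
    except OSError:
        return None


wanted = parse_zfs_options(CONF_TEXT)
for name, value in wanted.items():
    live = read_live_value(name)
    if live is None:
        status = "not readable (zfs module not loaded on this machine?)"
    elif live == value:
        status = "ok"
    else:
        status = f"differs, live value is {live}"
    print(f"{name}: wanted {value} -> {status}")
```

On a machine without ZFS the script still runs and simply reports the parameters as not readable.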


3.2 Local read/write performance (same parameters as the previous installment)

▼ 64GiB file, 128KiB blocks, sequential write

test_64g: (g=0): rw=write, bs=(R) 128KiB-128KiB, (W) 128KiB-128KiB, (T) 128KiB-128KiB, ioengine=libaio, iodepth=4
fio-3.33
Starting 1 thread
test_64g: Laying out IO file (1 file / 65536MiB)
Jobs: 1 (f=1): [W(1)][100.0%][w=1492MiB/s][w=11.9k IOPS][eta 00m:00s]
test_64g: (groupid=0, jobs=1): err= 0: pid=94414: Mon May 20 07:26:30 2024
  write: IOPS=14.2k, BW=1770MiB/s (1856MB/s)(64.0GiB/37023msec); 0 zone resets
    slat (usec): min=12, max=10981, avg=67.06, stdev=76.89
    clat (usec): min=2, max=11532, avg=214.23, stdev=190.06
     lat (usec): min=27, max=12736, avg=281.29, stdev=244.49
    clat percentiles (usec):
     |  1.00th=[   50],  5.00th=[   65], 10.00th=[   69], 20.00th=[   81],
     | 30.00th=[   97], 40.00th=[  119], 50.00th=[  145], 60.00th=[  176],
     | 70.00th=[  215], 80.00th=[  330], 90.00th=[  498], 95.00th=[  603],
     | 99.00th=[  783], 99.50th=[  906], 99.90th=[ 1319], 99.95th=[ 1598],
     | 99.99th=[ 3163]
   bw (  MiB/s): min=  561, max= 4501, per=100.00%, avg=1775.79, stdev=604.47, samples=73
   iops        : min= 4492, max=36020, avg=14206.21, stdev=4835.95, samples=73
  lat (usec)   : 4=0.01%, 50=0.95%, 100=30.67%, 250=41.57%, 500=16.85%
  lat (usec)   : 750=8.64%, 1000=0.99%
  lat (msec)   : 2=0.30%, 4=0.02%, 10=0.01%, 20=0.01%
  cpu          : usr=8.21%, sys=44.18%, ctx=165362, majf=0, minf=25
  IO depths    : 1=0.1%, 2=0.1%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,524288,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=4

Run status group 0 (all jobs):
  WRITE: bw=1770MiB/s (1856MB/s), 1770MiB/s-1770MiB/s (1856MB/s-1856MB/s), io=64.0GiB (68.7GB), run=37023-37023msec
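As a sanity check on how the figures in this output relate to each other, the short sketch below recomputes the bandwidth and IOPS from the transferred size, the runtime and the issued write count. All input numbers are copied from the fio output above; only the variable names are invented here.

```python
MIB = 1024 ** 2

size_bytes = 64 * 1024 ** 3   # io=64.0GiB
runtime_s = 37023 / 1000      # run=37023msec
block_bytes = 128 * 1024      # bs=128KiB
issued_writes = 524288        # issued rwts: total=0,524288,0,0

# The issued write count times the block size should equal the file size.
assert issued_writes * block_bytes == size_bytes

bw_mib_s = size_bytes / runtime_s / MIB   # binary units, fio's "MiB/s"
bw_mb_s = size_bytes / runtime_s / 1e6    # decimal units, fio's "MB/s"
iops = issued_writes / runtime_s

print(f"{bw_mib_s:.0f} MiB/s, {bw_mb_s:.0f} MB/s, {iops / 1000:.1f}k IOPS")
```

The result agrees with the reported 1770MiB/s (1856MB/s) and 14.2k IOPS, and also shows why fio prints two bandwidth numbers: one is in binary and the other in decimal units.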

▼ 64GiB file, 8KiB blocks, random read

test_64g: (g=0): rw=randread, bs=(R) 8192B-8192B, (W) 8192B-8192B, (T) 8192B-8192B, ioengine=libaio, iodepth=32
fio-3.33
Starting 1 thread
Jobs: 1 (f=1): [r(1)][100.0%][r=599MiB/s][r=76.7k IOPS][eta 00m:00s]
test_64g: (groupid=0, jobs=1): err= 0: pid=99657: Mon May 20 07:32:17 2024
  read: IOPS=37.5k, BW=293MiB/s (307MB/s)(64.0GiB/223892msec)
    slat (usec): min=3, max=731, avg=25.39, stdev= 7.39
    clat (usec): min=2, max=3090, avg=828.26, stdev=92.02
     lat (usec): min=30, max=3284, avg=853.65, stdev=94.70
    clat percentiles (usec):
     |  1.00th=[  429],  5.00th=[  635], 10.00th=[  758], 20.00th=[  807],
     | 30.00th=[  824], 40.00th=[  840], 50.00th=[  848], 60.00th=[  857],
     | 70.00th=[  873], 80.00th=[  881], 90.00th=[  898], 95.00th=[  906],
     | 99.00th=[  930], 99.50th=[  938], 99.90th=[ 1090], 99.95th=[ 1516],
     | 99.99th=[ 1582]
   bw (  KiB/s): min=265136, max=597856, per=99.95%, avg=299596.60, stdev=34686.06, samples=447
   iops        : min=33148, max=74732, avg=37449.28, stdev=4335.72, samples=447
  lat (usec)   : 4=0.01%, 50=0.01%, 100=0.01%, 250=0.01%, 500=2.21%
  lat (usec)   : 750=7.42%, 1000=90.20%
  lat (msec)   : 2=0.16%, 4=0.01%
  cpu          : usr=5.11%, sys=94.88%, ctx=778, majf=0, minf=77
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=100.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
     issued rwts: total=8388608,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=32

Run status group 0 (all jobs):
   READ: bw=293MiB/s (307MB/s), 293MiB/s-293MiB/s (307MB/s-307MB/s), io=64.0GiB (68.7GB), run=223892-223892msec
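With a fixed queue depth, IOPS and average latency are tied together by Little's law (requests in flight = throughput × latency). The sketch below, which assumes the queue stayed full at iodepth=32 for the whole run, estimates IOPS from the average total latency reported above and compares it with fio's own figure. The 128KiB case uses the numbers from the 256GiB random-read test further down in this post.

```python
def littles_law_iops(iodepth, avg_latency_us):
    """Estimate IOPS from queue depth and average per-request latency."""
    return iodepth / (avg_latency_us * 1e-6)


# (label, iodepth, avg "lat" in usec, IOPS reported by fio)
cases = [
    ("64GiB, 8KiB random read", 32, 853.65, 37.5e3),
    ("256GiB, 128KiB random read", 32, 2692.77, 11.9e3),
]

for label, depth, lat_us, reported in cases:
    estimate = littles_law_iops(depth, lat_us)
    error = abs(estimate - reported) / reported
    print(f"{label}: estimated {estimate:.0f} IOPS, "
          f"reported {reported:.0f}, relative difference {error:.2%}")
```

Both estimates land within 1% of the reported values, so at this queue depth the IOPS figure is essentially a restatement of the per-request latency.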

▼ 64GiB file, 8KiB blocks, random read/write (7:3)

test_64g: (g=0): rw=randrw, bs=(R) 8192B-8192B, (W) 8192B-8192B, (T) 8192B-8192B, ioengine=libaio, iodepth=32
fio-3.33
Starting 1 thread
Jobs: 1 (f=1): [m(1)][99.7%][r=292MiB/s,w=126MiB/s][r=37.4k,w=16.2k IOPS][eta 00m:01s]
test_64g: (groupid=0, jobs=1): err= 0: pid=110803: Mon May 20 07:39:01 2024
  read: IOPS=20.4k, BW=159MiB/s (167MB/s)(44.8GiB/287677msec)
    slat (usec): min=2, max=31874, avg=30.16, stdev=29.48
    clat (usec): min=42, max=34647, avg=1064.52, stdev=264.45
     lat (usec): min=46, max=34708, avg=1094.68, stdev=270.28
    clat percentiles (usec):
     |  1.00th=[  537],  5.00th=[  799], 10.00th=[  889], 20.00th=[  938],
     | 30.00th=[  963], 40.00th=[  988], 50.00th=[ 1012], 60.00th=[ 1057],
     | 70.00th=[ 1139], 80.00th=[ 1221], 90.00th=[ 1319], 95.00th=[ 1385],
     | 99.00th=[ 1532], 99.50th=[ 1663], 99.90th=[ 3195], 99.95th=[ 4883],
     | 99.99th=[ 8160]
   bw (  KiB/s): min=131472, max=346192, per=100.00%, avg=163396.66, stdev=20779.36, samples=574
   iops        : min=16434, max=43274, avg=20424.61, stdev=2597.47, samples=574
  write: IOPS=8749, BW=68.4MiB/s (71.7MB/s)(19.2GiB/287677msec); 0 zone resets
    slat (usec): min=5, max=17227, avg=37.67, stdev=29.73
    clat (usec): min=2, max=34660, avg=1063.67, stdev=267.21
     lat (usec): min=40, max=34758, avg=1101.34, stdev=274.34
    clat percentiles (usec):
     |  1.00th=[  537],  5.00th=[  799], 10.00th=[  889], 20.00th=[  938],
     | 30.00th=[  963], 40.00th=[  988], 50.00th=[ 1012], 60.00th=[ 1057],
     | 70.00th=[ 1139], 80.00th=[ 1221], 90.00th=[ 1319], 95.00th=[ 1385],
     | 99.00th=[ 1532], 99.50th=[ 1663], 99.90th=[ 3195], 99.95th=[ 4948],
     | 99.99th=[ 8029]
   bw (  KiB/s): min=57341, max=150080, per=100.00%, avg=70040.59, stdev=8930.77, samples=574
   iops        : min= 7167, max=18760, avg=8754.97, stdev=1116.33, samples=574
  lat (usec)   : 4=0.01%, 50=0.01%, 100=0.01%, 250=0.01%, 500=0.67%
  lat (usec)   : 750=3.31%, 1000=42.34%
  lat (msec)   : 2=53.40%, 4=0.21%, 10=0.07%, 20=0.01%, 50=0.01%
  cpu          : usr=6.22%, sys=92.91%, ctx=34861, majf=0, minf=31
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=100.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
     issued rwts: total=5871656,2516952,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=32

Run status group 0 (all jobs):
   READ: bw=159MiB/s (167MB/s), 159MiB/s-159MiB/s (167MB/s-167MB/s), io=44.8GiB (48.1GB), run=287677-287677msec
  WRITE: bw=68.4MiB/s (71.7MB/s), 68.4MiB/s-68.4MiB/s (71.7MB/s-71.7MB/s), io=19.2GiB (20.6GB), run=287677-287677msec
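The 7:3 mix in the caption can be confirmed from the "issued rwts" line. The sketch below takes the read and write counts from the output above and checks both the ratio and that the total matches a 64GiB file split into 8KiB blocks. The exact fio options are not shown in this post, so how the mix was configured is not assumed here.

```python
reads, writes = 5871656, 2516952   # issued rwts: total=5871656,2516952,0,0
block_bytes = 8192                 # bs=8192B
file_bytes = 64 * 1024 ** 3        # 64GiB test file

total = reads + writes
read_fraction = reads / total

# Every block of the file is touched exactly once in total.
assert total * block_bytes == file_bytes

print(f"read share {read_fraction:.2%}, write share {1 - read_fraction:.2%}")
```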

▼ 256GiB file, 128KiB blocks, random read

test_256g: (g=0): rw=randread, bs=(R) 128KiB-128KiB, (W) 128KiB-128KiB, (T) 128KiB-128KiB, ioengine=libaio, iodepth=32
fio-3.33
Starting 1 thread
Jobs: 1 (f=1): [r(1)][100.0%][r=1409MiB/s][r=11.3k IOPS][eta 00m:00s]
test_256g: (groupid=0, jobs=1): err= 0: pid=135893: Mon May 20 10:31:24 2024
  read: IOPS=11.9k, BW=1485MiB/s (1557MB/s)(256GiB/176514msec)
    slat (usec): min=16, max=19931, avg=82.63, stdev=154.27
    clat (usec): min=3, max=23760, avg=2610.14, stdev=975.50
     lat (usec): min=37, max=23821, avg=2692.77, stdev=993.63
    clat percentiles (usec):
     |  1.00th=[ 1336],  5.00th=[ 1631], 10.00th=[ 1811], 20.00th=[ 2040],
     | 30.00th=[ 2212], 40.00th=[ 2376], 50.00th=[ 2540], 60.00th=[ 2704],
     | 70.00th=[ 2868], 80.00th=[ 3097], 90.00th=[ 3392], 95.00th=[ 3654],
     | 99.00th=[ 4293], 99.50th=[ 4686], 99.90th=[18744], 99.95th=[20317],
     | 99.99th=[21890]
   bw (  MiB/s): min= 1197, max= 1954, per=100.00%, avg=1487.26, stdev=168.17, samples=353
   iops        : min= 9577, max=15633, avg=11897.77, stdev=1345.29, samples=353
  lat (usec)   : 4=0.01%, 50=0.01%, 100=0.01%, 250=0.01%, 500=0.01%
  lat (usec)   : 750=0.01%, 1000=0.01%
  lat (msec)   : 2=18.16%, 4=79.87%, 10=1.72%, 20=0.19%, 50=0.06%
  cpu          : usr=1.97%, sys=40.62%, ctx=409538, majf=0, minf=526
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=100.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
     issued rwts: total=2097152,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=32

Run status group 0 (all jobs):
   READ: bw=1485MiB/s (1557MB/s), 1485MiB/s-1485MiB/s (1557MB/s-1557MB/s), io=256GiB (275GB), run=176514-176514msec
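To put this local figure next to the network side, the sketch below converts the reported 1557MB/s into Gbit/s and compares it with a nominal 40 Gbit/s link, the lower of the two speeds named for the NIC in the photo caption. This is only a unit conversion under the assumption of the nominal line rate; protocol overhead and the actual client link speeds are not covered by this post and are ignored.

```python
def mb_per_s_to_gbit_per_s(mb_per_s):
    """Convert decimal megabytes per second to gigabits per second."""
    return mb_per_s * 8 / 1000


LINK_GBIT = 40.0  # nominal line rate, an assumption for this comparison

local_read = mb_per_s_to_gbit_per_s(1557)  # READ: bw=1485MiB/s (1557MB/s)
print(f"local random read: {local_read:.1f} Gbit/s, "
      f"{local_read / LINK_GBIT:.0%} of a {LINK_GBIT:.0f} Gbit/s link")
```

Under these assumptions the array's local 128KiB random-read throughput is above what a single 10 Gbit/s link could carry and well below the 40 Gbit/s nominal rate.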

▼ 256GiB file, 128KiB blocks, random read/write (7:3)

test_256g: (g=0): rw=randrw, bs=(R) 128KiB-128KiB, (W) 128KiB-128KiB, (T) 128KiB-128KiB, ioengine=libaio, iodepth=32
fio-3.33
Starting 1 thread
Jobs: 1 (f=1): [m(1)][99.1%][r=2338MiB/s,w=1004MiB/s][r=18.7k,w=8034 IOPS][eta 00m:02s]
test_256g: (groupid=0, jobs=1): err= 0: pid=153195: Mon May 20 10:43:04 2024
  read: IOPS=6527, BW=816MiB/s (856MB/s)(179GiB/224864msec)
    slat (usec): min=28, max=24261, avg=136.55, stdev=189.30
    clat (usec): min=2, max=85533, avg=3323.94, stdev=3062.59
     lat (usec): min=35, max=86460, avg=3460.49, stdev=3194.26
    clat percentiles (usec):
     |  1.00th=[  938],  5.00th=[  971], 10.00th=[  988], 20.00th=[ 1020],
     | 30.00th=[ 1074], 40.00th=[ 1156], 50.00th=[ 1303], 60.00th=[ 1631],
     | 70.00th=[ 5735], 80.00th=[ 6456], 90.00th=[ 7242], 95.00th=[ 7963],
     | 99.00th=[12911], 99.50th=[15008], 99.90th=[19792], 99.95th=[23200],
     | 99.99th=[40109]
   bw (  KiB/s): min=270848, max=2747392, per=99.91%, avg=834751.64, stdev=818904.12, samples=449
   iops        : min= 2116, max=21464, avg=6521.32, stdev=6397.68, samples=449
  write: IOPS=2799, BW=350MiB/s (367MB/s)(76.8GiB/224864msec); 0 zone resets
    slat (usec): min=18, max=4789, avg=31.63, stdev=23.80
    clat (usec): min=143, max=81540, avg=3327.94, stdev=3073.95
     lat (usec): min=164, max=81620, avg=3359.57, stdev=3079.39
    clat percentiles (usec):
     |  1.00th=[  938],  5.00th=[  971], 10.00th=[  988], 20.00th=[ 1020],
     | 30.00th=[ 1074], 40.00th=[ 1156], 50.00th=[ 1303], 60.00th=[ 1631],
     | 70.00th=[ 5735], 80.00th=[ 6456], 90.00th=[ 7242], 95.00th=[ 7963],
     | 99.00th=[12911], 99.50th=[15008], 99.90th=[19792], 99.95th=[23200],
     | 99.99th=[42206]
   bw (  KiB/s): min=102656, max=1202944, per=99.91%, avg=357940.76, stdev=350755.27, samples=449
   iops        : min=  802, max= 9398, avg=2796.24, stdev=2740.26, samples=449
  lat (usec)   : 4=0.01%, 50=0.01%, 100=0.01%, 250=0.01%, 500=0.01%
  lat (usec)   : 750=0.01%, 1000=13.61%
  lat (msec)   : 2=47.74%, 4=0.92%, 10=36.03%, 20=1.60%, 50=0.09%
  lat (msec)   : 100=0.01%
  cpu          : usr=2.98%, sys=31.77%, ctx=557496, majf=0, minf=25
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=100.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
     issued rwts: total=1467743,629409,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=32

Run status group 0 (all jobs):
   READ: bw=816MiB/s (856MB/s), 816MiB/s-816MiB/s (856MB/s-856MB/s), io=179GiB (192GB), run=224864-224864msec
  WRITE: bw=350MiB/s (367MB/s), 350MiB/s-350MiB/s (367MB/s-367MB/s), io=76.8GiB (82.5GB), run=224864-224864msec
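For a side-by-side view of the five local tests, the "Run status" lines can be pulled into a small table. The sketch below is one way to do that with a regular expression. The summary lines are copied from the outputs in this section, while the pattern and the helper name parse_run_status are illustrative and assume bandwidth in MiB/s and transferred size in GiB, which holds for these runs but not for fio output in general.

```python
import re

# (test label, "Run status" line copied from this section)
RUN_STATUS = [
    ("64GiB 128KiB seq write", "WRITE: bw=1770MiB/s (1856MB/s), 1770MiB/s-1770MiB/s (1856MB/s-1856MB/s), io=64.0GiB (68.7GB), run=37023-37023msec"),
    ("64GiB 8KiB randread", "READ: bw=293MiB/s (307MB/s), 293MiB/s-293MiB/s (307MB/s-307MB/s), io=64.0GiB (68.7GB), run=223892-223892msec"),
    ("64GiB 8KiB randrw 7:3", "READ: bw=159MiB/s (167MB/s), 159MiB/s-159MiB/s (167MB/s-167MB/s), io=44.8GiB (48.1GB), run=287677-287677msec"),
    ("64GiB 8KiB randrw 7:3", "WRITE: bw=68.4MiB/s (71.7MB/s), 68.4MiB/s-68.4MiB/s (71.7MB/s-71.7MB/s), io=19.2GiB (20.6GB), run=287677-287677msec"),
    ("256GiB 128KiB randread", "READ: bw=1485MiB/s (1557MB/s), 1485MiB/s-1485MiB/s (1557MB/s-1557MB/s), io=256GiB (275GB), run=176514-176514msec"),
    ("256GiB 128KiB randrw 7:3", "READ: bw=816MiB/s (856MB/s), 816MiB/s-816MiB/s (856MB/s-856MB/s), io=179GiB (192GB), run=224864-224864msec"),
    ("256GiB 128KiB randrw 7:3", "WRITE: bw=350MiB/s (367MB/s), 350MiB/s-350MiB/s (367MB/s-367MB/s), io=76.8GiB (82.5GB), run=224864-224864msec"),
]

PATTERN = re.compile(
    r"(READ|WRITE): bw=([\d.]+)MiB/s.*?io=([\d.]+)GiB.*?run=(\d+)-\d+msec"
)


def parse_run_status(line):
    """Extract direction, bandwidth, size and runtime from one summary line."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError(f"unrecognized run status line: {line!r}")
    direction, bw, io, run_ms = match.groups()
    return {
        "dir": direction,
        "bw_mib_s": float(bw),
        "io_gib": float(io),
        "run_s": int(run_ms) / 1000,
    }


rows = [(label, parse_run_status(line)) for label, line in RUN_STATUS]

print(f"{'test':<26} {'dir':<5} {'MiB/s':>8} {'GiB':>7} {'seconds':>9}")
for label, r in rows:
    print(f"{label:<26} {r['dir']:<5} {r['bw_mib_s']:>8.1f} "
          f"{r['io_gib']:>7.1f} {r['run_s']:>9.1f}")
```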

3.3 Read/write performance over an SMB mount on a Windows PC

4 Tutorial excerpt (sanitized)



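Author: sss668800    Time: 2024-6-8 16:26
Is ZFS deduplication enabled? It is fairly heavy on memory and CPU, although a small research group whose members' data are independent of each other doesn't particularly need it.

Is 12*20T enough, with no plan to upgrade? In a rack, 6U would fit exactly 36 drive bays, and in fact 4U already gives 24 bays; your 800mm depth is plenty. It feels like the motherboard/CPU section on top wastes rack space.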


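Author: Entropy.S.I    Time: 2024-6-8 23:12
sss668800 posted on 2024-6-8 16:26
Is ZFS deduplication enabled? It is fairly heavy on memory and CPU, although a small research group whose members' data are independent of each other doesn't particularly need it.

Is 12*20T enough ...

Dedup is currently the biggest failure among ZFS features, and I firmly refuse to use it.

I don't plan for "capacity expansion" on a NAS. Expansion is asking for trouble, and only those who have done it know how troublesome it is. Expansion is only suited to distributed storage architectures.

I won't consider second-hand barebone chassis, nor any chassis that cannot be made quiet. I already bought this same chassis once last September, and I think it is very well suited to office use.

It is not placed in a machine room, so the rack space it consumes does not matter. Our office will not hold many devices that need rack mounting, and the current equipment would be fine even without a rack. Our own compute is all in the HPC machine room.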
Author: 希望先生    Time: 2024-6-10 09:53
Since its connector could not be found on the market, a soldering approach was used
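Author: sss668800    Time: 2024-6-10 21:31
希望先生 posted on 2024-6-10 09:53
Since its connector could not be found on the market, a soldering approach was used

In principle the fan in a switch PSU is the same as the fan in an ordinary PSU, the 2-pin kind, so you can buy a 2-pin to 3-pin or a 2-pin to 4-pin adapter. Of course, soldering it yourself also works:

(image attachment)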

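Author: Entropy.S.I    Time: 2024-6-10 22:01
sss668800 posted on 2024-6-10 21:31
In principle the fan in a switch PSU is the same as the fan in an ordinary PSU, the 2-pin kind, so you can buy a 2-pin to 3-pin or a 2-pin to 4-pin adapter. Of course ...

The PSU model of this switch is PSR250-12A, the version OEM-manufactured by Dongguan 毓华. The fan connector is shown in the picture; I asked many fan vendors and they all said they did not recognize this connector.
(image attachment)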

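Author: sss668800    Time: 2024-6-11 18:58
Entropy.S.I posted on 2024-6-10 22:01
The PSU model of this switch is PSR250-12A, the version OEM-manufactured by Dongguan 毓华. The fan connector is shown in the picture; I asked many fan vendors and they all said they did not recognize this connector.
...

I haven't seen this 3-pin one either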
Author: fantexi113    Time: 2024-6-13 14:54
The CX4121A NIC is also quite good; I wonder how well passthrough works with TrueNAS




Welcome to 计算化学公社 (http://bbs.keinsci.com/) Powered by Discuz! X3.3