Disk Performance Testing and Kernel Parameter Tuning
IOPS: the number of disk read/write operations per second.
Bandwidth (BW): disk throughput per second (the volume of data read/written).
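The two metrics are related by the block size: BW = IOPS × block size. A quick shell sanity check of the figures reported by fio below (about 20.1k IOPS at a 4 KiB block size):

```shell
# BW = IOPS * block size -- sanity-check the fio figures that follow
iops=20100                            # ~20.1k IOPS, as fio reports below
bs=4096                               # 4 KiB block size in bytes
bw_mib=$(( iops * bs / 1024 / 1024 )) # integer MiB/s
echo "${bw_mib} MiB/s"                # ~78 MiB/s, consistent with fio's 78.7 MiB/s
```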
Test machine: an Alibaba Cloud ECS instance, IP 192.168.99.181.
Test command:

[root@iZbp1ejqbafrtk1zjch3rzZ testio]# fio -direct=1 -iodepth=64 -rw=read -ioengine=libaio -bs=4k -size=10G -numjobs=1 -name=./fio.test
./fio.test: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
fio-3.7
Starting 1 process
./fio.test: Laying out IO file (1 file / 10240MiB)
Jobs: 1 (f=1): [R(1)][100.0%][r=76.0MiB/s,w=0KiB/s][r=19.7k,w=0 IOPS][eta 00m:00s]
./fio.test: (groupid=0, jobs=1): err= 0: pid=: Thu Aug 17 14:05:43 2023
read: IOPS=20.1k, BW=78.7MiB/s (82.5MB/s)(10.0GiB/msec)
The same test inside a container:
[root@iZbp1ejqbafrtk1zjch3rzZ testio]# docker run --rm -it centos-fio:v3.7 bash
[root@cfc3b /]# fio -direct=1 -iodepth=64 -rw=read -ioengine=libaio -bs=4k -size=10G -numjobs=1 -name=./fio.test
./fio.test: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
fio-3.7
Starting 1 process
./fio.test: Laying out IO file (1 file / 10240MiB)
Jobs: 1 (f=1): [R(1)][100.0%][r=78.4MiB/s,w=0KiB/s][r=20.1k,w=0 IOPS][eta 00m:00s]
./fio.test: (groupid=0, jobs=1): err= 0: pid=22: Thu Aug 17 06:17:20 2023
read: IOPS=20.1k, BW=78.7MiB/s (82.5MB/s)(10.0GiB/msec)
slat (usec): min=5, max=5582, avg= 7.49, stdev= 6.44
clat (usec): min=115, max=21187, avg=3168.01, stdev=3363.51
lat (usec): min=123, max=21193, avg=3175.58, stdev=3363.35
clat percentiles (usec):
| 1.00th=[ 510], 5.00th=[ 578], 10.00th=[ 619], 20.00th=[ 693],
| 30.00th=[ 775], 40.00th=[ 865], 50.00th=[ 947], 60.00th=[ 1156],
| 70.00th=[ 6259], 80.00th=[ 8029], 90.00th=[ 8356], 95.00th=[ 8455],
| 99.00th=[ 9634], 99.50th=[10421], 99.90th=[12780], 99.95th=[13829],
| 99.99th=[15795]
bw ( KiB/s): min=76672, max=, per=100.00%, avg=80606.35, stdev=10317.80, samples=260
iops : min=19168, max=61556, avg=20151.58, stdev=2579.45, samples=260
lat (usec) : 250=0.01%, 500=0.79%, 750=25.93%, 1000=27.70%
lat (msec) : 2=10.99%, 4=2.72%, 10=31.18%, 20=0.70%, 50=0.01%
cpu : usr=3.16%, sys=22.20%, ctx=, majf=0, minf=94
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
issued rwts: total=,0,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=64
Run status group 0 (all jobs):
READ: bw=78.7MiB/s (82.5MB/s), 78.7MiB/s-78.7MiB/s (82.5MB/s-82.5MB/s), io=10.0GiB (10.7GB), run=-msec
Sequential write on the host disk:
[root@iZbp1ejqbafrtk1zjch3rzZ testio]# fio -direct=1 -rw=write -ioengine=libaio -bs=4k -size=1G -numjobs=1 -name=fio_test_host-write.log
fio_test_host-write.log: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
fio-3.7
Starting 1 process
fio_test_host-write.log: Laying out IO file (1 file / 1024MiB)
Jobs: 1 (f=1): [W(1)][100.0%][r=0KiB/s,w=22.8MiB/s][r=0,w=5829 IOPS][eta 00m:00s]
fio_test_host-write.log: (groupid=0, jobs=1): err= 0: pid=: Thu Aug 17 16:12:28 2023
write: IOPS=5686, BW=22.2MiB/s (23.3MB/s)(1024MiB/46102msec)
slat (usec): min=4, max=2052, avg= 7.39, stdev= 6.65
clat (usec): min=2, max=32309, avg=167.76, stdev=216.62
lat (usec): min=108, max=32325, avg=175.23, stdev=217.08
clat percentiles (usec):
| 1.00th=[ 113], 5.00th=[ 116], 10.00th=[ 119], 20.00th=[ 122],
| 30.00th=[ 126], 40.00th=[ 129], 50.00th=[ 133], 60.00th=[ 137],
| 70.00th=[ 145], 80.00th=[ 155], 90.00th=[ 180], 95.00th=[ 243],
| 99.00th=[ 1090], 99.50th=[ 1467], 99.90th=[ 2638], 99.95th=[ 3458],
| 99.99th=[ 5800]
bw ( KiB/s): min=17784, max=25664, per=99.94%, avg=22730.53, stdev=1803.00, samples=92
iops : min= 4446, max= 6416, avg=5682.57, stdev=450.80, samples=92
lat (usec) : 4=0.01%, 100=0.01%, 250=95.18%, 500=2.28%, 750=0.84%
lat (usec) : 1000=0.52%
lat (msec) : 2=0.96%, 4=0.18%, 10=0.04%, 50=0.01%
cpu : usr=2.52%, sys=6.44%, ctx=, majf=0, minf=30
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=0,,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
WRITE: bw=22.2MiB/s (23.3MB/s), 22.2MiB/s-22.2MiB/s (23.3MB/s-23.3MB/s), io=1024MiB (1074MB), run=46102-46102msec
Disk stats (read/write):
vdb: ios=1/, merge=0/159, ticks=1/40726, in_queue=40727, util=98.77%
Sequential write from a single container:
[root@c56d6c977ff2 /]# fio -direct=1 -rw=write -ioengine=libaio -bs=4k -size=1G -numjobs=1 -name=/tmp/fio_test_write.log
/tmp/fio_test_write.log: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
fio-3.7
Starting 1 process
/tmp/fio_test_write.log: Laying out IO file (1 file / 1024MiB)
Jobs: 1 (f=1): [W(1)][100.0%][r=0KiB/s,w=23.6MiB/s][r=0,w=6052 IOPS][eta 00m:00s]
/tmp/fio_test_write.log: (groupid=0, jobs=1): err= 0: pid=27: Thu Aug 17 07:58:17 2023
write: IOPS=5673, BW=22.2MiB/s (23.2MB/s)(1024MiB/46203msec)
slat (usec): min=7, max=548, avg=10.34, stdev= 3.58
clat (usec): min=5, max=14359, avg=160.71, stdev=188.91
lat (usec): min=112, max=14380, avg=171.14, stdev=189.35
clat percentiles (usec):
| 1.00th=[ 112], 5.00th=[ 116], 10.00th=[ 118], 20.00th=[ 121],
| 30.00th=[ 124], 40.00th=[ 127], 50.00th=[ 131], 60.00th=[ 135],
| 70.00th=[ 141], 80.00th=[ 151], 90.00th=[ 176], 95.00th=[ 231],
| 99.00th=[ 889], 99.50th=[ 1254], 99.90th=[ 2540], 99.95th=[ 3490],
| 99.99th=[ 6194]
bw ( KiB/s): min=17904, max=26128, per=100.00%, avg=23283.91, stdev=1565.60, samples=90
iops : min= 4476, max= 6532, avg=5820.98, stdev=391.40, samples=90
lat (usec) : 10=0.01%, 50=0.01%, 100=0.01%, 250=95.56%, 500=2.32%
lat (usec) : 750=0.85%, 1000=0.44%
lat (msec) : 2=0.65%, 4=0.13%, 10=0.04%, 20=0.01%
cpu : usr=0.56%, sys=11.99%, ctx=, majf=0, minf=28
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=0,,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
WRITE: bw=22.7MiB/s (23.8MB/s), 22.7MiB/s-22.7MiB/s (23.8MB/s-23.8MB/s), io=1024MiB (1074MB), run=45034-45034msec
[root@c56d6c977ff2 /]# ll -lh /tmp/
total 1.1G
-rw-r--r-- 1 root root 1.0G Aug 17 07:58 fio_test_write.log.0.0
Four containers writing simultaneously:




Compared with a single container writing to disk, performance drops noticeably.
For example, to limit pod4's write BW to 10 MB/s:
echo "253:16 " > /sys/fs/cgroup/blkio/system.slice/docker-b91b7fbb9c8db82cfca89a9ec34802c81c9618a254ecc8f.scope/blkio.throttle.write_bps_device
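The value written has the form `major:minor bytes_per_second`. A sketch of how the rule is assembled (the numbers `253:16` for `/dev/vdb` follow the example above; verify your own device's numbers first):

```shell
# The rule format is "major:minor bytes_per_second".
# Look up the device's major:minor first (shown in the MAJ:MIN column), e.g.:
#   lsblk -o NAME,MAJ:MIN /dev/vdb
bps=$((10 * 1024 * 1024))   # 10 MB/s expressed as bytes per second
rule="253:16 ${bps}"
echo "$rule"
# Then, as root, write the rule into the container's blkio cgroup:
#   echo "$rule" > /sys/fs/cgroup/blkio/system.slice/docker-<id>.scope/blkio.throttle.write_bps_device
```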
The effect is shown below:

Conclusion: in direct mode, these kernel (cgroup blkio) parameters effectively limit a container's disk I/O.
Continuing the experiment: use buffered mode instead and observe the disk throughput.

Disk bandwidth is no longer constrained by the 10 MB/s limit.
Simplified write paths for the two modes:
Direct mode: the process's write() system call goes straight through the kernel -> block layer -> disk driver -> disk.
Buffered mode (the default): the process writes -> page cache in memory -> kernel flusher thread (pdflush) -> disk.
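The difference between the two paths can be seen with `dd`, which exposes both via `oflag` (a minimal sketch; `O_DIRECT` requires filesystem support, so the direct run may fail on e.g. tmpfs):

```shell
# Write the same data twice: once buffered, once with O_DIRECT
f=./dd-io-test.bin
# Buffered (default): data lands in the page cache first and is flushed later
dd if=/dev/zero of="$f" bs=1M count=16 2>&1 | tail -n1
# Direct: O_DIRECT bypasses the page cache and goes straight to the block layer
{ dd if=/dev/zero of="$f" bs=1M count=16 oflag=direct 2>&1 || true; } | tail -n1
rm -f "$f"
```

The buffered run usually reports a much higher apparent rate, because dd returns as soon as the data is in the page cache.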
As the results above show, buffered writes escape the limit. cgroup v2 fixes this problem, but many CentOS systems still run cgroup v1.
Separately, after migrating a process from bare metal into a cloud container, its file writes jitter badly: they take 4-7x longer in the container than on bare metal.
Dirty pages: data that has been written into the page cache but not yet flushed to disk.
When dirty pages accumulate and reach the dirty_ratio threshold, the process's writes are paused while the pdflush kernel thread is woken to flush the dirty pages to disk; only then can the writes continue, which is what lengthens write times.
Observe how the dirty-page count changes before and after a write:



The kernel's dirty-page parameters live under /proc/sys/vm:
dirty_background_ratio: dirty pages as a percentage of the node's available memory, default 10%; above this, the pdflush kernel thread is woken to flush data to disk.
dirty_background_bytes: the same threshold expressed in bytes.
dirty_ratio: default 20%; above this, processes doing buffered I/O block on write until their dirty data has been flushed to disk.
dirty_bytes: the same threshold expressed in bytes.
dirty_expire_centisecs: in hundredths of a second, default 3000 (30 s): the longest a dirty page may stay in memory before being flushed.
dirty_writeback_centisecs: in hundredths of a second, default 500 (5 s): the flusher thread wakes every 5 s to write out dirty pages.
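All of these can be read directly from /proc/sys/vm without root; a quick way to dump the current values:

```shell
# Dump the current dirty-page tunables from /proc/sys/vm
for p in dirty_background_ratio dirty_ratio \
         dirty_expire_centisecs dirty_writeback_centisecs; do
    printf '%-26s %s\n' "$p" "$(cat /proc/sys/vm/$p)"
done
# To change one persistently, add a line such as
#   vm.dirty_background_ratio = 5
# to /etc/sysctl.conf and apply it with: sysctl -p
```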
Start a container limited to 1 GB of memory and simulate writing a 10 GB file:
docker run --rm -it -v /var/lib/container/testio:/tmp -m=1024m centos-fio:v3.7 bash
[root@e28001b6576f /]# dd if=/dev/random of=/tmp/dirty-1.log bs=1M count=10240
Observe the dirty-page changes, shown in the figure below:
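The dirty-page curve in the figures was sampled from /proc/meminfo; the same observation can be reproduced with:

```shell
# One sample of the current dirty/writeback page-cache state
grep -E '^(Dirty|Writeback):' /proc/meminfo
# Continuous view while dd runs in another terminal:
#   watch -n1 "grep -E '^(Dirty|Writeback):' /proc/meminfo"
```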

Elapsed time: 49 seconds.

The operating system page size is 4096:
[root@iZbp1ejqbafrtk1zjch3rzZ testio]# dumpe2fs /dev/vda1 | grep “Block size”
dumpe2fs 1.42.9 (28-Dec-2013)
Block size: 4096
[root@iZbp1ejqbafrtk1zjch3rzZ testio]# getconf PAGE_SIZE
4096
The captured dirty-page count corresponds to roughly 1000 MB, confirming that the container's memory limit is in effect.
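The conversion from page count to bytes uses the 4096-byte page size confirmed above. For example, with a hypothetical `nr_dirty` sample of 256000 pages from /proc/vmstat (a value on the order of what the figure showed):

```shell
nr_dirty=256000   # sample nr_dirty value from /proc/vmstat, in pages (hypothetical)
page_size=4096    # confirmed above with getconf PAGE_SIZE
mb=$(( nr_dirty * page_size / 1024 / 1024 ))
echo "${mb} MB"   # -> 1000 MB, on the order of the 1 GB container limit
```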

Next, set the container environment's dirty_bytes and dirty_background_bytes to very small values,

then start the container and observe the dirty-page count and the time taken to write the 10 GB file:


The elapsed time is far above the previous 49 seconds.
So when allocating memory to pods on a container platform, budget not only for the application itself but also for the page cache that file I/O consumes.
Recommended kernel parameter settings:
| Category | Parameter | Recommended value |
| --- | --- | --- |
| kernel | shmall | shmmax/4096 |
| kernel | shmmax | MEM*80% |
| kernel | shmmni | 4096 |
| kernel | sem | 250 32000 100 128 |
| kernel | core_uses_pid | 1 |
| file | fs.file-max | |
| file | soft nproc | |
| file | hard nproc | |
| file | soft nofile | |
| file | hard nofile | |
| net | net.ipv4.ip_local_port_range | 32768 65500 |
| net | net.core.rmem_default | |
| net | net.core.rmem_max | |
| net | net.core.wmem_max | |
| net | net.core.wmem_default | |
| net | net.ipv4.tcp_timestamps | 0 |
| tcp | tcp_syn_retries | 2 |
| tcp | tcp_synack_retries | 1 |
| tcp | tcp_keepalive_time | 600 |
| tcp | tcp_keepalive_probes | 3 |
| tcp | tcp_fin_timeout | 10 |
| tcp | tcp_keepalive_intvl | 20 |
| vm | dirty_ratio | 30~50% |
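The rows with concrete values can be collected into an /etc/sysctl.conf fragment (a sketch only: the blank rows are workload-dependent and omitted, shmall/shmmax depend on installed memory, and the `net.ipv4.` prefixes for the tcp rows are the standard sysctl key names, assumed here):

```
# /etc/sysctl.conf fragment -- apply with: sysctl -p
kernel.shmmni = 4096
kernel.sem = 250 32000 100 128
kernel.core_uses_pid = 1
net.ipv4.ip_local_port_range = 32768 65500
net.ipv4.tcp_timestamps = 0
net.ipv4.tcp_syn_retries = 2
net.ipv4.tcp_synack_retries = 1
net.ipv4.tcp_keepalive_time = 600
net.ipv4.tcp_keepalive_probes = 3
net.ipv4.tcp_keepalive_intvl = 20
net.ipv4.tcp_fin_timeout = 10
vm.dirty_ratio = 30
```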