Methodology
sequentially writing a 5GB file with cache --
cat /dev/urandom | tee /dev/stdout | tee /dev/stdout| tee /dev/stdout| tee /dev/stdout| tee /dev/stdout| tee /dev/stdout | tee /dev/stdout| tee /dev/stdout| tee /dev/stdout | dd of=random iflag=fullblock bs=1M count=5120
Next write without cache --
rm random
sync; echo 3 > /proc/sys/vm/drop_caches
cat /dev/urandom | tee /dev/stdout | tee /dev/stdout| tee /dev/stdout| tee /dev/stdout| tee /dev/stdout| tee /dev/stdout | tee /dev/stdout| tee /dev/stdout| tee /dev/stdout | dd of=random iflag=fullblock bs=1M count=5120 oflag=direct
Read without cache --
sync; echo 3 > /proc/sys/vm/drop_caches
dd if=random of=/dev/null iflag=direct bs=1M
Read with cache --
dd if=random of=/dev/null bs=1M
Repeated read with cache --
sync; echo 3 > /proc/sys/vm/drop_caches
for i in {1..10}; do dd if=random of=/dev/null bs=1M; sleep 1; done
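The throughput figures reported below come from dd's final status line. As a minimal sketch, assuming GNU coreutils dd (which prints a line such as "... copied, 2.3 s, 2.3 GB/s" on stderr), a small helper can pull out just the rate. The function name dd_rate is made up for illustration and is not part of the original procedure.

```shell
# dd_rate: read dd's status output on stdin and print only the throughput.
# Assumes the GNU dd summary format: "<bytes> bytes (...) copied, <time> s, <rate>".
dd_rate() {
    # The rate is the last ", "-separated field of the line containing "copied".
    awk -F', ' '/copied/ { print $NF }'
}
```

With this in place, the repeated-read loop could be written as `for i in {1..10}; do dd if=random of=/dev/null bs=1M 2>&1 | dd_rate; sleep 1; done`, which prints one rate per run.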
FS format parameters and mount options --
Two benchmarks are run for XFS: one with rmapbt=0 and one with rmapbt=1.
These are the XFS format parameters --
mkfs.xfs -f -m rmapbt=0,reflink=0
mkfs.xfs -f -m rmapbt=1,reflink=0
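Before benchmarking, it may be worth confirming which features the filesystem was actually created with. The sketch below assumes that xfs_info prints feature flags in the form name=value (for example rmapbt=1), as recent xfsprogs versions do. The helper name xfs_feature and the mount point /mnt/bench are hypothetical.

```shell
# xfs_feature: read xfs_info output on stdin and print the value of one feature flag.
# $1 is the feature name, e.g. rmapbt or reflink.
xfs_feature() {
    # Take the first "name=digits" match and keep only the part after "=".
    grep -o "$1=[0-9]*" | head -n 1 | cut -d= -f2
}

# Example (hypothetical mount point):
#   xfs_info /mnt/bench | xfs_feature rmapbt
```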
ext4 is formatted with options optimized either for large files only or for both large and small files.
ext4 format options optimized for large files --
mkfs.ext4 -m 1 -O none,dir_index,extent,^flex_bg,^bigalloc,has_journal,large_file,sparse_super2,^uninit_bg
ext4 format options optimized for both large and small files --
mkfs.ext4 -g 256 -G 4 -J size=100 -m 1 -C 2097152 -O none,bigalloc,extent,flex_bg,has_journal,large_file,sparse_super2,^uninit_bg,dir_index,dir_nlink,^sparse_super,^sparse_super2
xfs mount options --
mount -o logbufs=8,logbsize=256k,noquota,noatime
ext4 mount options (when formatted with the large-file options) --
mount -o noquota,noatime,data=writeback,journal_async_commit,inode_readahead_blks=32768,max_batch_time=10000000
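Mount options can be silently ignored or overridden, so it can help to check that a given option is really active on the mounted filesystem. This is a rough sketch: has_opt is a hypothetical helper that tests for one option in a comma-separated option string, and /mnt/bench is an assumed mount point.

```shell
# has_opt: succeed if option $2 appears in the comma-separated option list $1.
has_opt() {
    # Wrap both sides in commas so only whole options match.
    case ",$1," in
        *",$2,"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example (hypothetical mount point; findmnt is part of util-linux):
#   has_opt "$(findmnt -no OPTIONS /mnt/bench)" noatime && echo "noatime is active"
```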
Benchmark results
XFS with rmapbt on vs. off on NVMe --
Without rmapbt
sequentially writing a 5GB file with cache --
2.3 GB/s
Next write without cache --
1.7 GB/s
Read without cache --
2.2 GB/s
Read with cache --
2.9 GB/s
Repeated read with cache --
2.8 GB/s
17.3 GB/s
16.2 GB/s
16.2 GB/s
16.2 GB/s
16.2 GB/s
16.3 GB/s
16.3 GB/s
16.1 GB/s
16.3 GB/s
With rmapbt
sequentially writing a 5GB file with cache --
2.4 GB/s
Next write without cache --
1.7 GB/s
Read without cache --
2.2 GB/s
Read with cache --
2.8 GB/s
Repeated read with cache --
2.8 GB/s
16.4 GB/s
16.5 GB/s
16.5 GB/s
16.5 GB/s
16.4 GB/s
16.5 GB/s
16.4 GB/s
16.5 GB/s
16.5 GB/s
Sequential read/write operations with rmapbt on/off in XFS
XFS (rmapbt=0) --
sequentially writing a 1GB file with cache --
116 MB/s,112 MB/s
Next write without cache --
105 MB/s
Read without cache --
104 MB/s
Read with cache --
104 MB/s
Read with cache again --
13.8 GB/s
Repeated read with cache --
This was done after formatting and sequentially writing a 1GB file with cache.
105 MB/s (first read, after dropping caches), then 17.4, 17.3, 17.3, 17.3, 17.3, 17.4, 17.4, 17.4, 17.4 GB/s
Avg (runs 2-10): 17.36 GB/s
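The average above covers only the nine cached runs; the first run, taken right after dropping caches, is left out. A minimal sketch of that calculation is shown here, with avg_cached as a hypothetical helper name.

```shell
# avg_cached: average a comma-separated list of results, skipping the first value
# (the cold run taken right after dropping caches).
avg_cached() {
    printf '%s\n' "$1" | awk -F',' '{
        for (i = 2; i <= NF; i++) sum += $i   # fields 2..NF are the cached runs
        printf "%.2f\n", sum / (NF - 1)
    }'
}

avg_cached '105,17.4,17.3,17.3,17.3,17.3,17.4,17.4,17.4,17.4'   # prints 17.36
```

The same helper applies unchanged to the other repeated-read series in this document.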
XFS (rmapbt=1) --
sequentially writing a 1GB file with cache --
115 MB/s
Repeated read with cache --
This was done after formatting and sequentially writing a 1GB file with cache.
106 MB/s (first read, after dropping caches), then 13.9, 13.8, 13.4, 13.9, 13.7, 14.1, 13.9, 13.8, 14.0 GB/s
Avg (runs 2-10): 13.83 GB/s
Sequential read/write operations on ext4
ext4 (optimized for large files) --
sequentially writing a 1GB file with cache --
112 MB/s,112 MB/s
Next write without cache --
104 MB/s
Read without cache --
105 MB/s
Read with cache --
105 MB/s
Read with cache again --
11.2 GB/s
Repeated read with cache --
This was done after formatting and sequentially writing a 1GB file with cache.
108 MB/s (first read, after dropping caches), then 11.4, 11.3, 11.2, 13.7, 12.5, 12.3, 12.4, 12.2, 12.3 GB/s
Avg (runs 2-10): 12.14 GB/s
ext4 (optimized for both large and small files) --
sequentially writing a 1GB file with cache --
115 MB/s
Repeated read with cache --
This was done after formatting and sequentially writing a 1GB file with cache.
103 MB/s (first read, after dropping caches), then 11.8, 12.0, 12.0, 11.9, 12.8, 12.8, 12.8, 12.9, 12.9 GB/s
Avg (runs 2-10): 12.43 GB/s
Conclusion --
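For NVMe/SSD, XFS with rmapbt enabled is the way to go for sequential operations on large files: in the runs above it performed on par with rmapbt=0 (repeated cached reads of about 16.4-16.5 GB/s vs. about 16.1-16.3 GB/s on most runs). It is also better than ext4, even for small-file operations (benchmark to be published later).
For HDD storage, XFS without rmapbt (rmapbt=0) performs best: repeated cached reads averaged about 17.4 GB/s, vs. about 13.8 GB/s with rmapbt=1 and about 12.1-12.4 GB/s on ext4.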
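To put the gap between the two XFS configurations in the 1GB runs into relative terms, the reported averages (roughly 17.36 GB/s with rmapbt=0 and 13.83 GB/s with rmapbt=1) can be compared as a percentage. This is only an illustrative sketch; pct_gain is a hypothetical helper name.

```shell
# pct_gain: percentage by which result $1 exceeds baseline $2 (same unit for both).
pct_gain() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", (a - b) / b * 100 }'
}

pct_gain 17.36 13.83   # prints 25.5
```

By this measure, rmapbt=0 is about 25% faster than rmapbt=1 on repeated cached reads in those runs.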