Using LINSTOR to Tune DRBD for Write Performance

This article helps you identify and tune DRBD settings through LINSTOR to improve write performance.

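These instructions use LINSTOR®, but the same configuration keys can be used directly in DRBD®'s resource (.res) files in their appropriate sections: disk-flushes, md-flushes, al-extents, and al-updates belong in the disk section, while max-buffers and max-epoch-size belong in the net section.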

Disable flushes. Whether this is acceptable in production depends on the application and the underlying storage. There are real deployments where it is acceptable, and it is fine for benchmarking:

WARNING: Disabling flushes is fine for quick benchmarking. In production, you should only disable device flushes when running DRBD on devices with a battery-backed write cache (BBWC). Most storage controllers can automatically disable the write cache and switch to write-through mode when the battery is depleted. Enabling that feature is strongly recommended.

$ linstor controller drbd-options --disk-flushes=no

$ linstor controller drbd-options --md-flushes=no
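The commands above apply to every resource the controller manages. For a benchmark, it may be safer to limit the change to a single resource definition. The following is a minimal sketch of that idea, assuming the `linstor resource-definition drbd-options` subcommand accepts the same option names; the function name, the `LINSTOR_CMD` override, and the resource name `bench0` are hypothetical and only there for illustration.

```shell
# Sketch: disable flushes for ONE resource definition instead of controller-wide.
# LINSTOR_CMD is a hypothetical override so the commands can be previewed,
# for example with LINSTOR_CMD="echo linstor".
# Usage: disable_flushes_for RESOURCE_NAME
disable_flushes_for() {
    res="$1"
    ${LINSTOR_CMD:-linstor} resource-definition drbd-options --disk-flushes=no "$res" || return 1
    ${LINSTOR_CMD:-linstor} resource-definition drbd-options --md-flushes=no "$res"
}

# Example (hypothetical resource name):
# disable_flushes_for bench0
```

Scoping the change this way keeps the flush behavior of all other resources untouched while you measure.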

Increase the amount of memory DRBD uses for temporary buffers:

$ linstor controller drbd-options --max-buffers=10000

Increase the epoch size. This will only have an effect if you are writing with very high concurrency:

$ linstor controller drbd-options --max-epoch-size=10000

Try using a larger number of smaller devices and aggregating their performance. This is easily achieved by creating more resource definitions, for example eight volumes of 500 GiB instead of one volume of 4 TiB. Doing so can remove bottlenecks caused by the concurrency and throughput limits of a single device.
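As a rough illustration of the split described above, the sketch below creates several small resources in a loop. The resource name prefix, the replica count, and the `LINSTOR_CMD` override are assumptions made for this example; adjust the size and placement to your own cluster.

```shell
# Sketch: create COUNT small resources instead of one large one.
# LINSTOR_CMD is a hypothetical override; set LINSTOR_CMD="echo linstor"
# to print the commands without running them.
# Usage: create_split_resources PREFIX COUNT SIZE REPLICAS
create_split_resources() {
    prefix="$1"; count="$2"; size="$3"; replicas="$4"
    i=0
    while [ "$i" -lt "$count" ]; do
        name="${prefix}${i}"
        # One resource definition with a single volume of the given size.
        ${LINSTOR_CMD:-linstor} resource-definition create "$name" || return 1
        ${LINSTOR_CMD:-linstor} volume-definition create "$name" "$size" || return 1
        # Let LINSTOR pick the nodes that hold the replicas.
        ${LINSTOR_CMD:-linstor} resource create "$name" --auto-place "$replicas" || return 1
        i=$((i + 1))
    done
}

# Example (hypothetical names): eight 500 GiB resources with two replicas each.
# create_split_resources bench 8 500GiB 2
```

The resulting devices still have to be aggregated by the consumer, for example by striping across them or by spreading the workload over them.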

Tune the activity log. If you would like to understand the theory, the activity log is described in the DRBD User Guide:

DRBD 9 User Guide - 16.3. The Activity Log

Or just try one of these commands:

$ linstor controller drbd-options --al-updates=no

$ linstor controller drbd-options --al-extents=65534

WARNING: Setting al-updates=no completely disables the activity log. Without an activity log, DRBD needs a full sync to recover after an unexpectedly lost Primary node (hard reboot, kernel panic, and so on).
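After changing several options, it can help to check which values the controller actually holds. The sketch below filters the output of `linstor controller list-properties` for DRBD-related entries. It assumes the property keys carry a `DrbdOptions/` prefix; the helper name `filter_drbd_options` is hypothetical.

```shell
# Sketch: show only the DRBD-related controller properties.
# Reads a LINSTOR property table on stdin and keeps rows whose key
# contains "DrbdOptions/" (assumed key prefix).
filter_drbd_options() {
    grep 'DrbdOptions/' || true
}

# Only query the controller when the linstor client is installed.
if command -v linstor >/dev/null 2>&1; then
    linstor controller list-properties | filter_drbd_options
fi
```

An empty result would suggest that no DRBD options are set at the controller level.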

Edited 2020-12-14 – DJV