Understanding and Mitigating DRBD Two-phase Commit Timeout Messages
If you notice two-phase commit timeout messages in your logs, this might be a side effect of a network “forgetting” TCP sessions that appear to be “idle” for too long.
These messages look like this:
[...] drbd <resource-name>: Two-phase commit 1234567890 timeout
If the output of a drbdadm status command shows that at least one of your peers is still “connected” but does not report “Preparing remote state change” for the two-phase commit, you are likely affected by the issue described here.
If all peers report “Preparing remote state change” with the same number as in the two-phase commit message, then whatever you are affected by is not the issue described here.
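For example, assuming a resource named r0, and a system that logs kernel messages to the systemd journal, you could check both with:
journalctl -k | grep 'Two-phase commit'
drbdadm status r0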
If your (virtual?) network has some “idle-timeout” and forgets TCP sessions that are idle for too long, this can cause “100% packet loss” on a TCP session that otherwise still looks established.
DRBD® uses two TCP sessions per replication connection between peers. One is mainly for “bulk data”, the other for “control” and acknowledgments.
On the “control” session, DRBD Ping and PingAck packets will be exchanged at regular intervals, even when otherwise idle.
If the “control” session is still responsive, and TCP reports no errors, DRBD will usually keep both sockets of the replication connection open.
Several state changes in DRBD are “cluster wide state changes”. These are implemented as two-phase commits, and are initially communicated via the “data” session, because they must not overtake data requests.
If those TCP sessions are “dead” on the network, but still look OK locally, there will not be any reply, and the two-phase commit sequence will run into a timeout.
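You can see this from the local side: because the sockets still look fine locally, a socket listing will continue to show them as established (“ESTAB”) even after the network has silently dropped the session. For example, assuming your replication connection uses TCP port 7789 (the port is configured per resource, so yours might differ):
ss -tn '( sport = :7789 or dport = :7789 )'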
You can mitigate timeout issues with the DRBD two-phase commit sequence by telling TCP to exchange keepalive probes on the data socket frequently enough that the network will not consider the session “idle”.
Since DRBD version 9.0.13 (released in 2018), DRBD enables TCP keepalive probes on the DRBD data socket, so the TCP stack does this for you.
The relevant values for idle time, interval, and count are the system-wide sysctl settings net.ipv4.tcp_keepalive_time, net.ipv4.tcp_keepalive_intvl, and net.ipv4.tcp_keepalive_probes.
See also the tcp(7) man page.
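You can inspect the current system-wide defaults with sysctl, for example:
sysctl net.ipv4.tcp_keepalive_time net.ipv4.tcp_keepalive_intvl net.ipv4.tcp_keepalive_probes
On most Linux systems, these default to 7200 seconds, 75 seconds, and 9 probes, which is usually far too slow to keep a session from looking “idle” to such a network.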
Since DRBD version 9.1.16 (released in 2022), the drbd_transport_tcp kernel module has parameters that let you set DRBD-specific values, distinct from the system-wide sysctl settings.
The module parameters are keepcnt, keepidle and keepintvl.
You can explicitly set them at module load time. One way to persist those settings, so that they take effect whenever the module is loaded, is to write an options line to a DRBD TCP transport kernel module configuration file, by entering the following command:
echo "options drbd_transport_tcp keepidle=23 keepintvl=23 keepcnt=9" | \
sudo tee /etc/modprobe.d/drbd_transport_tcp.conf
You can also verify and change these settings at runtime:
echo 23 | sudo tee /sys/module/drbd_transport_tcp/parameters/keepidle /sys/module/drbd_transport_tcp/parameters/keepintvl
echo 9 | sudo tee /sys/module/drbd_transport_tcp/parameters/keepcnt
grep ^ /sys/module/drbd_transport_tcp/parameters/keep*
A value of “0” means to use the system wide defaults from sysctl.
Changed settings are effective only for TCP sockets established after the change. To apply to existing replication connections, you have to disconnect and reconnect those.
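For example, assuming a resource named r0, you could cycle its connections like this (be aware that this briefly interrupts replication for that resource):
sudo drbdadm disconnect r0
sudo drbdadm connect r0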
📝 NOTE: In DRBD versions 9.2.13 and 9.2.14, these parameters were given default values, but the defaults were larger than intended by a factor of HZ (that is, 100, 250, or 1000, depending on the environment). With 9.2.15, the DRBD developers fixed that, and the defaults are now the values suggested above:
- Start to send keepalive probes after the socket has been idle for 23s.
- Repeat keepalive probes every 23s.
- If you fail to get a response nine times in a row, consider the TCP session failed.
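To check which DRBD version you are running, and therefore whether this note applies to you, you can, for example, inspect:
cat /proc/drbd
drbdadm --version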
Alternatively, you could fix your network so that it does not forget idle TCP sessions, and fix applications that routinely keep too many idle TCP sessions open.
You can observe DRBD TCP sockets with tcpdump or other packet tracers.
When idle, the “control” socket will still exchange DRBD Ping and PingAck packets every DRBD ping-int interval.
With the active keepalive settings described earlier, the idle data sockets should still exchange TCP keepalive probes every 23s.
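For example, assuming your peer’s replication address is 192.168.10.2 and the connection uses TCP port 7789, a capture such as the following should show the periodic keepalive probes (small, ACK-only segments) on the otherwise idle data socket:
sudo tcpdump -ni any 'host 192.168.10.2 and tcp port 7789'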
Written by LE, 2025/10/14.
Reviewed by MAT, 2025/10/15.