Troubleshooting a hung DRBD Proxy

For example: the proxy does not respond to drbd-proxy-ctl commands, does not stop gracefully, etc.

We may not be able to resolve this right away, but the following information is worth gathering while the proxy is in this state:

  • Run a ‘strace’ on the running pids on both nodes
  • ‘ss -tuna’ grepping for the DRBD ports. It’s the send and receive buffer information which is of interest here
  • ‘cat /proc/<pid>/stack
  • ‘pstree | grep drbd-proxy’ This should show how many running proxy threads there are
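The checks above can be collected in one pass with a small script. This is only a sketch under assumptions: the process name "drbd-proxy" is taken from the pstree command above, the function names and output file names are made up for illustration, the port pattern must come from your own DRBD configuration, and the 10 second strace window is an arbitrary choice.

```shell
#!/bin/sh
# Sketch: gather diagnostics from a hung DRBD Proxy. Run as root on both nodes.
# Assumption: the proxy process is named "drbd-proxy" (override with PROXY_NAME).
PROXY_NAME="${PROXY_NAME:-drbd-proxy}"

# Print the number of threads in a process: one /proc/<pid>/task entry per thread.
count_threads() {
    ls "/proc/$1/task" 2>/dev/null | wc -l
}

# collect_proxy_diagnostics OUT_DIR PORT_PATTERN  (hypothetical helper)
#   OUT_DIR       directory to write the captured output to
#   PORT_PATTERN  extended regex matching your DRBD ports in 'ss' output
collect_proxy_diagnostics() {
    out_dir="$1"
    port_pattern="$2"
    mkdir -p "$out_dir" || return 1

    # Socket state; the Recv-Q and Send-Q columns are the interesting part.
    ss -tuna | grep -E -- "$port_pattern" > "$out_dir/ss.txt"

    # Process tree, as in the checklist above.
    pstree | grep "$PROXY_NAME" > "$out_dir/pstree.txt"

    for pid in $(pgrep -x "$PROXY_NAME"); do
        # Kernel stack of the process (normally readable by root only).
        cat "/proc/$pid/stack" > "$out_dir/stack.$pid.txt" 2>&1

        # Thread count for this PID.
        count_threads "$pid" > "$out_dir/threads.$pid.txt"

        # Trace system calls on all threads for 10 seconds, then detach.
        timeout 10 strace -f -p "$pid" -o "$out_dir/strace.$pid.txt"
    done
    return 0
}
```

Usage might look like ‘collect_proxy_diagnostics /tmp/proxy-diag ':7788'’ after sourcing the script, where both the directory and the port are placeholders for your own values.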

In this incident, pstree revealed 143 proxy threads. Proxy is not tested with more than 12 threads, so the issue was most likely a threading bug. We pinned Proxy to 4 threads by setting “DRBD_PROXY_CPU_COUNT=4” in /etc/default/drbdproxy.
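As a sketch, the relevant line in /etc/default/drbdproxy would look like the fragment below. The variable name and value are the ones used in this incident; the comments are illustrative, and it is an assumption here that the proxy has to be restarted before the new value is picked up.

```shell
# /etc/default/drbdproxy
# Limit DRBD Proxy to 4 threads (value used in this incident).
DRBD_PROXY_CPU_COUNT=4
```

After the change, ‘pstree | grep drbd-proxy’ can be run again to check that the thread count has dropped.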

Reviewed 2020/12/01 - DGT