2-node Pacemaker Clusters: Quorum, Fencing, and Recovery Considerations
This article describes how quorum functions in 2-node Pacemaker clusters, outlines fencing requirements, and provides guidance for recovery scenarios. It also covers important configuration options and recommended practices for maintaining cluster availability and data integrity.
2-node clusters present challenges that 3-node and larger clusters do not, including:
- Preventing data divergence and split-brain when quorum cannot be established by a true majority of nodes
- Avoiding fencing loops where the nodes alternate fencing each other
- Preventing a dual fencing event where both nodes fence each other simultaneously
- Ensuring Pacemaker resources run only where intended, preventing IP conflicts, data corruption, and duplicate service instances
- Recovering from a cluster-wide outage with one surviving node
The two_node: 1 setting in /etc/corosync/corosync.conf enables specialized quorum behavior for 2-node clusters.
When creating a new 2-node cluster, both pcs cluster setup and crm cluster init automatically configure two_node: 1.
With two_node: 1 defined, Corosync automatically enables the wait_for_all quorum option.
With wait_for_all, both nodes must be up and running and connected before the cluster becomes quorate for the first time.
Only then does Pacemaker start services.
When a node is lost, the two_node: 1 setting grants quorum to the last surviving node, allowing it to continue running services or take over from the failed peer.
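Assuming Corosync is already running, you can confirm the effective quorum state. On a 2-node cluster with two_node: 1, the Flags line of the output normally lists 2Node and WaitForAll, and Quorate once both nodes have joined:
# Display the votequorum state, including the quorum flags
corosync-quorumtool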
When a 2-node cluster loses communication between nodes, each node sees the other as unresponsive. With STONITH configured, the first node to detect the loss will fence the other. After a successful fence, the fenced node reboots.
📝 NOTE: Pacemaker does not add any delay or randomization to fencing operations by default.
Without fencing, a communication failure between nodes leads to data divergence (with replicated data) or data corruption (with shared storage).
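The exact STONITH configuration depends on your environment. As an illustrative sketch only, an IPMI-based fence resource for one node (the fence agent, credentials, and addresses are placeholders) could be created as follows, with a matching resource for the other node:
# pcs - example fence resource for node1 (fence_ipmilan and its options are illustrative)
pcs stonith create fence_node1 fence_ipmilan ip=<node1_bmc_ip> username=<user> password=<pass> pcmk_host_list=node1
# crmsh
crm configure primitive fence_node1 stonith:fence_ipmilan params ip=<node1_bmc_ip> username=<user> password=<pass> pcmk_host_list=node1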
With fencing, wait_for_all also plays an important role.
For example, a simple network outage without wait_for_all in a 2-node cluster can lead to the following events:
- A recently fenced (rebooted) node starts Pacemaker (pacemaker.service) automatically after booting.
- Pacemaker determines that its peer is unresponsive because network communication is still down.
- The recently fenced node now fences its peer.
- The steps above repeat, this time from the perspective of the opposite peer node.
- The fencing loop continues indefinitely until the network outage is resolved or an administrator intervenes.
📝 NOTE: The no-quorum-policy=ignore setting does not override wait_for_all. Corosync evaluates wait_for_all before Pacemaker begins managing the cluster.
Although the issue is rare, you can take further steps to prevent nodes from fencing each other simultaneously. It is also possible to tolerate very brief communication interruptions, thus avoiding unnecessary fencing events.
The pcmk_delay_base parameter on each STONITH resource adds a grace period before the fence action executes.
A short delay gives transient network issues a chance to resolve before triggering a fence.
# pcs - set a 5-second base delay on a fence resource
pcs stonith update <fence_resource_id> pcmk_delay_base=5s
# crmsh
crm resource param <fence_resource_id> set pcmk_delay_base 5s
Setting priority-fencing-delay delays fencing of the node that is currently running the most active resources.
This gives the active node a larger window to fence its peer first, rather than both nodes fencing each other simultaneously.
Set priority-fencing-delay to at least twice the value of pcmk_delay_base.
# pcs
pcs property set priority-fencing-delay=10s
# crmsh
crm configure property priority-fencing-delay=10s
💡 TIP: Rather than randomizing fencing delays between nodes, the combination of pcmk_delay_base and priority-fencing-delay staggers them deterministically. The node running the most active resources gets a predictable head start when fencing its peer, eliminating the risk of a simultaneous “double fence” event while also increasing the likelihood that running services remain uninterrupted. No randomization is needed, just a calculated difference in how soon each node can fence the other.
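For priority-fencing-delay to favor the intended node, give your most important resources a nonzero priority meta attribute. The resource ID below is a placeholder and 10 is an arbitrary example value:
# pcs - raise the priority of an important resource so its node wins the fencing race
pcs resource meta <resource_id> priority=10
# crmsh
crm resource meta <resource_id> set priority 10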
When DRBD® is configured with fencing resource-and-stonith in the net section and the crm-fence-peer.9.sh handler, DRBD coordinates with Pacemaker during fencing events.
Example DRBD resource configuration with fencing:
resource r0 {
handlers {
fence-peer "/usr/lib/drbd/crm-fence-peer.9.sh";
after-resync-target "/usr/lib/drbd/crm-unfence-peer.9.sh";
}
net {
fencing resource-and-stonith;
}
...
}
The fencing sequence from the DRBD perspective:
- The surviving node detects its peer is unresponsive.
- The crm-fence-peer.9.sh handler runs on the surviving node, setting a -INFINITY location constraint in Pacemaker that blocks the peer from future promotion (see the commands after this list). DRBD suspends I/O on the surviving node until fencing resolves.
- The fence agent reboots the failed node. The surviving node confirms the fence succeeded, resumes I/O, and continues as Primary.
- The fenced node reboots and rejoins the cluster. DRBD on the rebooted node connects to the surviving node and begins resyncing.
- After resync completes, the crm-unfence-peer.9.sh handler runs on the rebooted node (triggered by after-resync-target). It removes the location constraint, making the rebooted node eligible for promotion again.
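While a fencing event is outstanding, you can list the location constraints to see the entry the handler added. The constraint name is generated by the handler and typically begins with drbd-fence-by-handler:
# pcs - list location constraints and look for the handler-generated entry
pcs constraint location
# crmsh - the constraint appears in the full configuration output
crm configure show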
The following kernel log output (dmesg) from the rebooted node shows each DRBD disk state change as it rejoins.
The fenced node attaches its local disk as Consistent (data is intact regardless of clean or unclean shutdown, but no peer contact yet) and connects to the surviving node.
The surviving node has UpToDate data, so the fenced node marks itself as Outdated.
drbd r0/0 drbd0: disk( Consistent -> Outdated )
drbd r0/0 drbd0: pdsk( DUnknown -> UpToDate ) repl( Off -> WFBitMapT )
DRBD begins resyncing from the surviving node.
The fenced node transitions to Inconsistent during resync, then to UpToDate after resync completes.
drbd r0/0 drbd0: disk( Outdated -> Inconsistent )
drbd r0/0 drbd0: disk( Inconsistent -> UpToDate )
After reaching UpToDate, DRBD calls the after-resync-target handler (crm-unfence-peer.9.sh), which removes the location constraint from Pacemaker:
drbd r0 <peer>: helper command: /sbin/drbdadm unfence-peer
drbd r0 <peer>: helper command: /sbin/drbdadm unfence-peer exit code 0
At this point, the fenced node is fully resynced and eligible for promotion again. Pacemaker leaves resources on the surviving node unless a higher-scoring placement exists.
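If you want this stay-in-place behavior to be explicit rather than a side effect of default scores, one common approach (optional, shown here only as a sketch) is to set a default resource stickiness:
# pcs - prefer keeping resources where they currently run
pcs resource defaults update resource-stickiness=100
# crmsh
crm configure rsc_defaults resource-stickiness=100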
With proper fencing configured, there is never a need to manually force DRBD to become UpToDate or to rely on automatic split-brain recovery handlers such as after-sb-0pri, after-sb-1pri, or after-sb-2pri, which can have dangerous consequences.
In 2-node clusters, majority-based quorum in Pacemaker and the DRBD built-in quorum feature are unavailable. Fencing is essential to prevent split-brain and data divergence.
For production clusters with STONITH configured:
- Use no-quorum-policy=stop (the Pacemaker default).
- Leave two_node: 1 in the quorum section of your Corosync configuration.
If you configure corosync.conf manually or through a template, verify that the two_node: 1 setting is present:
corosync-cmapctl | grep two_node
The property is also present in the quorum section of /etc/corosync/corosync.conf:
quorum {
provider: corosync_votequorum
two_node: 1
}
With two_node: 1, the fencing behavior during a node failure is as follows:
- A failed node is fenced after it is confirmed to be unavailable.
- The surviving node keeps quorum.
- If the surviving node is actively running services, services continue to run on the surviving node.
- If the surviving node is not actively running services, services start on the surviving node.
- The wait_for_all setting prevents resources from starting on rebooted nodes until both nodes are online.
📝 NOTE: During a full cluster outage with only one surviving node, wait_for_all prevents resources from starting. See the Recovering a single node after a disaster section.
In clusters without STONITH, no-quorum-policy=ignore is often used so the surviving node can continue running resources after the peer goes offline:
# pcs
pcs property set no-quorum-policy=ignore
# crmsh
crm configure property no-quorum-policy=ignore
The no-quorum-policy=ignore setting tells Pacemaker to disregard quorum loss.
However, it does not override wait_for_all, which operates at the Corosync layer below Pacemaker.
After a full cluster outage, wait_for_all (implied by two_node: 1) still prevents each node from becoming quorate until both nodes have joined, even with ignore set.
After both nodes have joined and the wait_for_all condition is satisfied, ignore changes the runtime behavior:
- After a node failure, the surviving node continues running resources without checking quorum. This is the intended behavior for clusters without STONITH.
- With STONITH enabled, fencing loops can happen because a rebooted node does not wait for quorum before attempting to fence its peer.
In contrast, with the no-quorum-policy=stop (the default) and two_node: 1 settings, the surviving node keeps quorum after losing its peer and handles the failure without needing to ignore quorum.
⚠️ WARNING: Only use no-quorum-policy=ignore in clusters where STONITH is disabled. With STONITH enabled, use no-quorum-policy=stop (the default) instead.
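Because stop is the Pacemaker default, no change is normally required. If the property was previously set to ignore, set it back explicitly:
# pcs
pcs property set no-quorum-policy=stop
# crmsh
crm configure property no-quorum-policy=stop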
If only one node is available (the peer is permanently offline or has been decommissioned), the surviving node will not start resources because of wait_for_all.
Take the following steps to manually bring the surviving node into service.
First, cancel the wait_for_all hold so Corosync grants quorum to the single node:
corosync-cmapctl -s quorum.cancel_wait_for_all u8 1
⚠️ WARNING: Only run this command if you are certain the peer is down and will not come back with newer data.
If STONITH is enabled, Pacemaker might attempt to fence the offline peer before starting resources. If the fence agent cannot reach the peer, for example, because the peer node hypervisor or baseboard management controller (BMC) is also down, fencing will fail and resources will not start.
In this case, confirm the peer is truly offline, then manually mark the node as fenced:
# pcs - confirm peer is already down (skips fence agent)
pcs stonith confirm <peer_hostname>
# stonith_admin - confirm peer is already down (works without pcs or crmsh)
stonith_admin --confirm <peer_hostname>
If the fence device or management interface is still reachable, you can actively fence the peer instead:
# pcs
pcs stonith fence <peer_hostname>
# crmsh
crm node fence <peer_hostname>
# stonith_admin (works without pcs or crmsh)
stonith_admin --reboot <peer_hostname>
After these steps, Pacemaker should start resources on the surviving node.
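For example, confirm that the node reports quorum and that resources are running:
# pcs
pcs status
# crmsh or plain Pacemaker
crm_mon -1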
Originally created by MAT (based on original content by LE) - 2022-07-21
Reviewed by DJV 2022-07-25
Updated by RR (and reviewed by DJV) - 2026-04-15