Node Availability

When enabled, auto-failover automatically fails over any node identified as unresponsive or unavailable.

Configuring Node Availability

Only Full Administrators and Cluster Administrators can enable or disable auto-failover. Auto-failover is disabled by default. When enabled, auto-failover handles all services except the Index Service.

Configuring Node Availability with the UI

The Node Availability panel is as follows:

[Image: auto-failover settings in the Node Availability panel]

The following three checkboxes are provided:

  • Enable auto-failover after x seconds for up to y event: Once the timeout period, set here as x seconds, has elapsed, an unresponsive or malfunctioning node is failed over, up to the limit of actionable events set here as y. Replica copies of data, indexes, or query engines are promoted to active on other nodes, as appropriate. Note that this feature can only be used when three or more nodes are present in the cluster. See Automatic Failover for more information.

  • Enable auto-failover for sustained data disk read/write failures after z seconds: Once the timeout period, set here as z seconds, has elapsed, a node is failed over if it has experienced sustained data disk read/write failures. This checkbox can only be checked if Enable auto-failover after x seconds for up to y event has also been checked.

  • Enable auto-failover for server groups: Enables server-group failover. This checkbox can only be checked if Enable auto-failover after x seconds for up to y event has also been checked. It should only be checked if three or more server groups have been established, and capacity is available to absorb the combined load of all potentially failed-over groups. For information on groups and Server Group Awareness, see Groups.

The Node Availability screen contains the following additional option, which is available for Ephemeral Buckets:

[Image: Ephemeral Buckets reprovisioning option]

Checking this checkbox ensures that if a node containing active Ephemeral Buckets becomes unavailable, its replicas on other nodes are promoted to active status as appropriate, to avoid data loss. Note, however, that this may leave the cluster in an unbalanced state, requiring a rebalance.

Configuring Node Availability with the CLI

The automatic failover settings can be changed using the setting-autofailover command in couchbase-cli.
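
As a rough sketch, an invocation might look like the following. The host, credentials, and values are placeholders, and the flag names are assumed from the couchbase-cli setting-autofailover reference, so check them against the installed version. The script only prints the command unless RUN=1 is set in the environment.

```shell
#!/bin/sh
# Sketch only: enable auto-failover with a 72-second timeout and at most 2 events.
# Host, credentials, and values are placeholders; flag names are assumed from the
# couchbase-cli setting-autofailover reference and may vary by version.
CLUSTER=${CLUSTER:-10.142.180.103:8091}

# Assemble the command as the positional parameters so it can be printed or run.
set -- couchbase-cli setting-autofailover \
  -c "$CLUSTER" -u Administrator -p password \
  --enable-auto-failover 1 \
  --auto-failover-timeout 72 \
  --max-failovers 2

cmd="$*"
if [ "${RUN:-0}" = 1 ]; then
  "$@"            # actually contact the cluster
else
  echo "$cmd"     # dry run: show the command that would be executed
fi
```

Running the script with RUN=1 executes the command against the cluster named in CLUSTER; without it, the command is only echoed for review.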

Configuring Node Availability with the REST API

The automatic failover settings can be changed using the REST API /settings/autoFailover endpoint.

Below is an example of changing the automatic failover settings using the REST API:

curl -i -X POST -u Administrator:password http://10.142.180.103:8091/settings/autoFailover \
  -d 'enabled=true&timeout=72' \
  -d 'failoverServerGroup=true&maxCount=2' \
  -d 'failoverOnDataDiskIssues[enabled]=true&failoverOnDataDiskIssues[timePeriod]=89'
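
When requests like the one above are scripted, it can help to assemble the form body in one place and to reject combinations the UI would not allow, such as enabling data-disk failover while auto-failover itself is disabled. The following is a minimal sketch under that assumption. build_autofailover_body is a hypothetical helper, the parameter names are those used in the example above, and failoverServerGroup is left out for brevity.

```shell
#!/bin/sh
# Sketch: build the form body for POST /settings/autoFailover, refusing to
# enable data-disk failover unless auto-failover itself is enabled.
# build_autofailover_body is a hypothetical helper, not part of any Couchbase tool.

# usage: build_autofailover_body ENABLED TIMEOUT MAX_COUNT DISK_ENABLED DISK_PERIOD
build_autofailover_body() {
  if [ "$1" != true ] && [ "$4" = true ]; then
    echo "data-disk failover requires auto-failover to be enabled" >&2
    return 1
  fi
  printf 'enabled=%s&timeout=%s&maxCount=%s' "$1" "$2" "$3"
  if [ "$4" = true ]; then
    # Only send the disk-failure settings when that option is switched on.
    printf '&failoverOnDataDiskIssues[enabled]=true'
    printf '&failoverOnDataDiskIssues[timePeriod]=%s' "$5"
  fi
  printf '\n'
}

body=$(build_autofailover_body true 72 2 true 89) || exit 1
echo "$body"

# To apply the settings (not executed here):
# curl -i -X POST -u Administrator:password \
#   http://10.142.180.103:8091/settings/autoFailover -d "$body"
```

Keeping the check in the helper means an invalid combination fails locally, before any request reaches the cluster.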