Rebalance

Rebalance is a process of re-distributing data and indexes among the available nodes.

Understanding Rebalance

When one or nodes have been added to or removed from a cluster, rebalance redistributes data and indexes among the available nodes, and the Cluster Map is appropriate updated and distributed to clients. The process can occur while the cluster continues to run and service requests: reads and writes continue to be made to an existing structure, while the data is moved in the background among nodes.

The rebalance is performed differently, depending on which services are deployed. The principal characteristics per service are described below. In cases where multiple services are resident on a node that is being added or removed, the rebalance behavior will be the combination of the per service characteristics.

Rebalancing the Data Service

As explained in Clusters and Availability, data is logically partitioned across the cluster-nodes running the Data Service. On rebalance, vBuckets are redistributed evenly among those nodes. If nodes are removed, each vBucket from the removed node is replicated to the new active node for that vBucket. A special case of this a swap rebalance where the number of nodes coming into the cluster is equal to the number of nodes leaving the cluster, ensuring that data is only moved between these nodes. Once vBucket data-transfer is complete, the new vBucket is made active, and operations are directed to it as appropriate. This process ensures uninterrupted data-access for applications.

vBucket data-transfer occurs sequentially: therefore, if rebalance stops for any reason, it can be restarted from the point at which it was stopped.

Rebalancing the Index Service

The Index Service maintains a cluster-wide set of index definitions and metadata, which allows the redistribution of indexes and index replicas from ejected nodes to nodes that continue as part of the cluster. However, indexes that reside on non-ejected nodes are unaffected by rebalance.

When it occurs, the redistribution taken account of nodes' CPU and RAM utilization, and achieves the best resource-balance possible. The redistribution does not move indexes or replicas: instead, it rebuilds them in their new locations, using the latest data from the Data Service. If more index replicas exist than can be handled by the number of existing nodes, replicas are dropped: the numbers are automatically made up subsequently, if additional Index Service nodes are added to the cluster.

During rebalance, no index node is removed until index-building has completed on alternative nodes. This ensures uninterrupted access to indexes.

Rebalancing the Search Service

The Search Service automatically partitions its indexes across all Search nodes in the cluster, ensuring that during rebalance the distribution across all nodes is balanced.

Rebalancing the Query Service

The addition or removal of Query Service nodes during rebalance is immediately effective: an added node is immediately available to serve queries; while a removed node is immediately unavailable, such that ongoing queries are interrupted, requiring the handling of errors or timeouts at application-level.