Cross Data Center Replication (XDCR)

Cross Data Center Replication (XDCR) allows data to be replicated across clusters that are potentially located in different data centers.

Introduction to XDCR

Cross Data Center Replication (XDCR) replicates data between clusters: this provides protection against data center failure, and also provides high-performance data-access for globally distributed, mission-critical applications.

XDCR replicates data from a specific bucket on the source cluster to a specific bucket on the target cluster. Data from the source bucket is pushed to the target bucket by means of an XDCR agent, running on the source cluster, using the Database Change Protocol. Any bucket (Couchbase or Ephemeral) on any cluster can be specified as a source or a target for one or more XDCR definitions.

Cross Data Center Replication differs from intra-cluster replication in the following, principal ways:

As indicated by their respective names, intra-cluster replication replicates data across the nodes of a single cluster; while Cross Data Center Replication replicates data across multiple clusters, each potentially in a different data center.
Whereas intra-cluster replication creates only replica vBuckets, XDCR creates only active vBuckets, which duly become available for the serving of data on the target cluster.
Whereas intra-cluster replication is configured and performed with reference to only a single bucket (to which all active and replica vBuckets will correspond), XDCR requires two buckets to be administrator-specified, for a replication to occur: one is the bucket on the source cluster, which provides the data to be replicated; the other is the bucket on the target cluster, which receives the replicated data.
Whereas intra-cluster replication is configured at bucket-creation, XDCR is configured following the creation of both the source and target buckets.

The starting, stopping, and pausing of XDCR all occur independently of whatever intra-cluster replication is in progress on either the source or target cluster. While running, XDCR continuously propagates mutations from the source to the target bucket. The source and target clusters can have different node-configurations. The source and target buckets can have different intra-cluster replication settings.

Tools and Procedures for Managing XDCR

Prior to XDCR management, source and target clusters should be appropriately prepared, as described in Prepare for XDCR. Then, XDCR is managed in three stages:

Define a reference to a remote cluster, which will be the target for Cross Data Center Replication. See Create a Reference.
Define and start a replication, which continuously transfers mutations from a specified source bucket to a specified target bucket. See Create a Replication.
Monitor the ongoing replication, pausing and resuming the replication if and when appropriate. See Monitor a Replication, Pause a Replication, and Resume a Replication.

Couchbase provides three options for managing these stages, which are by means of:

Couchbase Web Console, which provides a graphical user interface for interactive configuration and management of replications.
CLI, which provides commands and flags that allow replications to be managed from the command line.
REST API, which underlies both the Web Console and CLI, and can be expressed either as a curl command on the command line, or within a program or script.

For procedures that cover all main XDCR management tasks, performed with all three of the principal tools, see XDCR Management Overview.

XDCR Direction and Topology

XDCR allows replication to occur between source and target clusters in either of the following ways:

Unidirectionally: The data contained in a specified source bucket is replicated to a specified target bucket. Although the replicated data on the source could be used for the routine serving of data, it is in fact intended principally as a backup, to support disaster recovery.

Bidirectionally: The data contained in a specified source bucket is replicated to a specified target bucket; and the data contained in the target bucket is, in turn, replicated back to the source bucket. This allows both buckets to be used for the serving of data, which may provide faster data-access for users and applications in remote geographies.

Note that XDCR provides only a single basic mechanism from which replications are built: this is the unidirectional replication. A bidirectional topology is created by implementing two unidirectional replications, in opposite directions, between two clusters; such that a bucket on each cluster functions as both source and target.

Used in different combinations, unidirectional and bidirectional replication can support complex topologies; an example being the ring topology, where multiple clusters each connect to exactly two peers, so that a complete ring of connections is formed:

Note that when a bucket is specified as the source for an XDCR replication, all data in the bucket is replicated. Thus, if replication is started between source and target buckets that initially contain different data-sets, the replication-process eventually establishes a complete data-superset within each bucket.

XDCR Filtering

Filtering Expressions can be used in XDCR replications. Each is a regular expression that is applied to the document keys on the source cluster: those document keys returned by the filtering process correspond to the documents that will be replicated to the target. For information, See XDCR Filtering.

XDCR Payloads

XDCR only replicates data: it does not replicate views or indexes. Views and indexes can only be replicated manually, or by administrator-provided automation: when the definitions are pushed to the target server, the views and indexes are regenerated there.

When encountered on the source cluster, non-UTF-8 encoded document IDs are automatically filtered out of replication: they are therefore not transferred to the target cluster. For each such ID, the warning output xdcr_error.* is written to the log files of the source cluster.

XDCR Conflict Resolution

In some cases, especially when bidirectionally replicated data is being modified by applications in different locations, conflicts may arise: meaning that the data of one or more documents has been differently modified more or less simultaneously, requiring resolution. XDCR provides options for conflict resolution, based on either revision ID or timestamp, whereby conflicted data can be saved consistently on source and target. For more information, See XDCR Conflict Resolution.

XDCR-Based Data Recovery

In the event of data-loss, the cbrecovery tool can be used to restore data. The tool accesses remotely replicated buckets, previously created with XDCR, and copies appropriate subsets of their data back onto the original source cluster.

By means of intra-cluster replication, Couchbase Server allows one or more replicas to be created for each vBucket on the cluster. This helps to ensure continued data-availability in the event of node-failure.

However, if multiple nodes within a single cluster fail simultaneously, one or more active vBuckets and all their replicas may be affected; meaning that lost data cannot be recovered locally.

In such cases, provided that a bucket affected by such failure has already been established as a source bucket for XDCR, the lost data may be retrieved from the bucket defined on the remote server as the corresponding replication-target. This retrieval is achieved from the command-line, by means of the Couchbase cbrecovery tool.

For a sample step-by-step procedure, see Recover Data with XDCR.

XDCR Security

XDCR configuration requires that the administrator provide a username and password appropriate for access to the target cluster. When replication occurs, the password is automatically supplied, along with the data. By default, XDCR transmits both password and data in non-secure form. Optionally however, a secure connection can be enabled between clusters, in order to secure either password alone, or both password and data.

Note that if the password received by the destination cluster requires authentication by an LDAP server, the destination cluster communicates with the LDAP server in plain text, using saslauthd. This is described in Configure saslauthd.

A secure XDCR connection is enabled either by SCRAM-SHA or by TLS — depending on the administrator-specified connection-type, and the server-version of the destination cluster. Use of TLS involves certificate management: for information on preparing and using certificates, see Manage Certificates.

Two administrator-specified connection-types are possible:

Half Secure: Secures the specified password only: it does not secure data. The password is secured by hashing with SCRAM-SHA, when the destination cluster is running Couchbase Enterprise Server 5.5 or later; and by TLS encryption, when the destination cluster is running a pre-5.5 Couchbase Enterprise Server. The root certificate of the destination cluster must be provided, for a successful TLS connection to be achieved.
Full Secure: Handles both authentication and data-transfer via TLS.

For step-by-step procedures, see Secure a Replication.

XDCR Advanced Settings

The performance of XDCR can be fine-tuned, by means of configuration-settings, specified when a replication is defined. These settings modify compression, source and target nozzles (worker threads), checkpoints, counts, sizes, network usage limits, and more. For detailed information, see XDCR Advanced Settings.

XDCR Bucket Flush

The flush operation deletes data on a local bucket: this operation is disabled if the bucket is currently the source for an ongoing replication. If the target bucket is flushed during replication, the bucket becomes temporarily inaccessible, and replication is suspended.

If either a source or a target bucket needs to be flushed after a replication has been started, the replication must be deleted, the bucket flushed, and the replication then recreated.

XDCR and Expiration

Buckets and documents have a TTL setting, which determines the maximum expiration times of individual items. This is explained in detail in Expiration. For specific information on how TTL is affected by XDCR, see the section Bucket-Expiration and XDCR.

Monitoring XDCR

Couchbase Server provides the ability to monitor ongoing XDCR replications, by means of the Couchbase Web Console. Detailed information is provided in Monitor a Replication.