XDCR Filtering

XDCR Filtering allows a limited subset of documents to be replicated from the source bucket.

Configuring XDCR Filtering

Full, Cluster, and XDCR Administrators can configure XDCR filtering.

The filtering expression is a regular expression that is applied to document keys on the source cluster: only documents whose keys match the expression are replicated to the target.

Multiple filtering expressions can be used by a single replication: combine them with the OR operator (|), as follows: filterExpression0|filterExpression1. For example, the expression airline|hotel would match both unitedairline and marriotthotel.
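As a rough illustration of how an ORed expression selects keys, the following sketch applies airline|hotel to a handful of made-up keys. It uses Python's re module as a stand-in for the filter's regular expression engine, which is an assumption: the replication itself does not run Python, and the key names and the is_replicated helper are hypothetical.

```python
import re

# Hypothetical document keys, for illustration only.
keys = ["unitedairline", "hotel_10025", "landmark_42", "airport_1254"]

# Two filtering expressions combined with the OR operator.
filter_expression = "airline|hotel"
pattern = re.compile(filter_expression)

def is_replicated(key: str) -> bool:
    """Return True if the expression matches anywhere in the key."""
    return pattern.search(key) is not None

replicated = [k for k in keys if is_replicated(k)]
print(replicated)  # ['unitedairline', 'hotel_10025']
```

Because the expression is not anchored, it matches anywhere in the key, which is why unitedairline is selected even though airline appears at the end.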

Filtering is established when a replication is created, by means of the Couchbase Web Console, CLI, or REST API. Once established, the filtering expression for a replication cannot be modified: the replication must be deleted, and a new replication created.

It is important to avoid conditions where two replications to the same destination overlap partially or fully. An overlap wastes machine resources, since a single key is replicated multiple times. With overlapping filtering expressions, there is no guarantee which of the two replications will deliver an overlapping key to the destination first.
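One informal way to look for overlap before creating two replications is to test both expressions against a sample of existing keys. The sketch below does this with Python's re module as a stand-in for the filter engine; the overlapping_keys helper and the sample keys are hypothetical. A sample-based check can reveal an overlap, but an empty result does not prove that none exists.

```python
import re

def overlapping_keys(expr_a, expr_b, sample_keys):
    """Return the sample keys that both filtering expressions match."""
    a = re.compile(expr_a)
    b = re.compile(expr_b)
    return [k for k in sample_keys if a.search(k) and b.search(k)]

# Hypothetical sample of source-bucket keys.
sample = ["airline_10", "airline_hotel_7", "hotel_3", "route_99"]

overlap = overlapping_keys("airline", "hotel", sample)
print(overlap)  # ['airline_hotel_7']
```

Any key in the result would be sent by both replications, so one of the expressions should be tightened (for example by anchoring it with ^) before the replications are created.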

Filtering does not impact conflict resolution and can be used with either the revision-based or the timestamp-based conflict resolution policy. Filtering does not affect the ability to pause and resume replications.

XDCR Regular Expressions

The following JavaScript regular expressions (RegExes) can be used for XDCR filtering. Note that regular expressions are case-sensitive: a lowercase 'a' is distinct from an uppercase 'A'. You can enclose a range of characters in square brackets, to match against all of those characters.

Expression Description

[tT]here

Matches against 'There' and 'there'

[ ]

Encloses a set of characters to match against. A range of characters can be given as two characters separated by a - character.

[0-9]

Matches any digit.

[A-Z]

Matches any uppercase alpha character.

[A-Za-z0-9]

Matches any alphanumeric character.

^

Matches beginning of input. If the multi-line flag is set to true, also matches immediately after a line-break character. For example, /^A/ does not match the 'A' in "an A", but does match the 'A' in "An E".

When '^' appears as the first character inside a character set, it has a different meaning: it acts as a "not" character and complements the set. For example, [^0-9] matches against any character that is not a digit.
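The character sets and the two uses of '^' described above can be tried out on sample keys. This is a minimal sketch that uses Python's re module as a stand-in for the filter engine; the keys and the select helper are hypothetical.

```python
import re

# Hypothetical document keys, for illustration only.
keys = ["There_1", "there_2", "other_3", "9lives", "An E", "an A"]

def select(expression, candidates):
    """Return the candidates in which the expression matches somewhere."""
    pattern = re.compile(expression)
    return [k for k in candidates if pattern.search(k)]

# '^' outside a set anchors the match to the beginning of the key.
starts_with_there = select(r"^[tT]here", keys)
starts_with_upper_a = select(r"^A", keys)

# '^' as the first character inside a set negates the set.
starts_with_non_digit = select(r"^[^0-9]", keys)

# A key made up entirely of alphanumeric characters.
only_alnum = select(r"^[A-Za-z0-9]+$", keys)

print(starts_with_there)    # ['There_1', 'there_2']
print(starts_with_upper_a)  # ['An E']
print(only_alnum)           # ['9lives']
```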

Ranges can be used to specify a group of characters. The following shortcuts are also available:

Expression Description

.

Matches against any character.

\d

Matches against a digit [0-9].

\D

Matches against a non-digit [^0-9].

\s

Matches against a whitespace character (such as a tab, space, or line-feed character).

\S

Matches against a non-whitespace character.

\w

Matches against an alphanumeric character or an underscore [a-zA-Z_0-9].

\W

Matches against any character that is neither alphanumeric nor an underscore [^a-zA-Z_0-9].

\xhh

Matches against the character whose hexadecimal code is hh (two hexadecimal digits). This is typically used for control characters.

\uhhhh

Matches against the Unicode character whose hexadecimal code is hhhh (four hexadecimal digits).

Note that since the backslash character is used to denote a specific search expression, a double backslash (\\) must be entered when the backslash is the search target.
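The shortcuts and the double-backslash rule can be checked with a few short tests. The sketch below uses Python's re module as a stand-in for the filter engine; the found helper and the sample strings are hypothetical. Python raw strings (r"...") are used so that each backslash in the listing is exactly one backslash in the expression.

```python
import re

def found(expression, text):
    """Return True if the expression matches somewhere in the text."""
    return re.search(expression, text) is not None

has_digit = found(r"\d", "user_7")          # '7' is a digit
all_word_chars = found(r"^\w+$", "user_7")  # letters, digits, underscore
has_non_word = found(r"\W", "user_7")       # no such character here
has_space = found(r"\s", "user 7")          # the space is whitespace

# \x41 and \u0041 both denote the character 'A'.
hex_match = found(r"\x41", "A")
unicode_match = found(r"\u0041", "A")

# A key containing a literal backslash: the 8-character string dir\file.
key_with_backslash = "dir\\file"
# The expression needs two backslashes to match one literal backslash.
has_backslash = found(r"\\", key_with_backslash)

print(has_digit, all_word_chars, has_non_word, has_space)
print(hex_match, unicode_match, has_backslash)
```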

To match against repeated occurrences of a character or expression, you can use the following quantifiers.

Expression Description

*

Matches against zero or more occurrences of the previous character or expression.

+

Matches against one or more occurrences of the previous character or expression.

?

Matches zero or one occurrence of the previous character or expression.

{n}

Matches exactly n occurrences of the previous character or expression.

{n,m}

Matches from n to m occurrences of the previous character or expression.

{n,}

Matches at least n occurrences of the previous character or expression.
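The quantifiers can be compared side by side on short sample keys. This sketch uses Python's re module as a stand-in for the filter engine; like JavaScript, Python writes the counted forms with curly braces. The matches helper and the keys are hypothetical.

```python
import re

def matches(expression, key):
    """Return True if the expression matches somewhere in the key."""
    return re.search(expression, key) is not None

# Each entry: (expression, key, whether a match is expected).
cases = [
    (r"^ab*c$", "ac", True),                  # * allows zero occurrences
    (r"^ab*c$", "abbbc", True),               # ... or many
    (r"^ab+c$", "ac", False),                 # + needs at least one
    (r"^ab?c$", "abc", True),                 # ? allows zero or one
    (r"^ab?c$", "abbc", False),               # ... but not two
    (r"^user_\d{3}$", "user_123", True),      # exactly three digits
    (r"^user_\d{3}$", "user_12", False),
    (r"^user_\d{2,3}$", "user_12", True),     # two or three digits
    (r"^user_\d{2,3}$", "user_1234", False),
    (r"^user_\d{2,}$", "user_1234", True),    # at least two digits
    (r"^user_\d{2,}$", "user_1", False),
]

results = [matches(expr, key) for expr, key, _ in cases]
expected = [want for _, _, want in cases]
print(results == expected)  # True
```

The anchors ^ and $ are included so that each count applies to the whole key. Without them, an expression such as \d{3} would also match a key that contains four or more digits.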

You can provide text to replace all or part of your search string. To do this, group matches together by enclosing them in parentheses, so that they can be referenced in the replacement. To reference a matched group, use $n, where n is the number of the group, starting from 1.
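Grouping and back-references can be illustrated as follows. This sketch shows only the general regular-expression mechanism, not an XDCR feature, and it uses Python's re module, which spells the reference to group n as \n (or \g<n>) rather than $n. The key format is hypothetical.

```python
import re

# Two groups: a lowercase prefix and a numeric suffix.
pattern = re.compile(r"([a-z]+)_(\d+)")

match = pattern.search("airline_10")
prefix = match.group(1)  # first parenthesized group
number = match.group(2)  # second parenthesized group

# Swap the two groups in the replacement text.
# Python writes \2 and \1 where JavaScript would write $2 and $1.
swapped = pattern.sub(r"\2-\1", "airline_10")

print(prefix, number, swapped)  # airline 10 10-airline
```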