cbtransfer
Transfers data between clusters and to or from files.
Syntax
The basic syntax is:
cbtransfer [options] source destination
The following are syntax examples:
cbtransfer http://SOURCE:8091 /backups/backup-42
cbtransfer /backups/backup-42 http://DEST:8091
cbtransfer /backups/backup-42 couchbase://DEST:8091
cbtransfer http://SOURCE:8091 http://DEST:8091
cbtransfer file.csv http://DEST:8091
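As a rough illustration of how the options, source, and destination fit together, the following Python sketch assembles a cbtransfer command as an argument list. The helper name build_cbtransfer_command is hypothetical and is not part of the tool; it uses only options that appear in the examples on this page (-b, -B, -u, -p).

```python
import shlex


def build_cbtransfer_command(source, destination, bucket_source=None,
                             bucket_destination=None, username=None,
                             password=None):
    """Return an argument list of the form: cbtransfer [options] source destination."""
    command = ["cbtransfer"]
    # Options come first, following the basic syntax shown above.
    if bucket_source is not None:
        command += ["-b", bucket_source]
    if bucket_destination is not None:
        command += ["-B", bucket_destination]
    if username is not None:
        command += ["-u", username]
    if password is not None:
        command += ["-p", password]
    command += [source, destination]
    return command


if __name__ == "__main__":
    cmd = build_cbtransfer_command(
        "http://SOURCE:8091", "/backups/backup-42",
        bucket_source="default", username="Administrator", password="password",
    )
    # Print a shell-quoted form for inspection only; nothing is executed here.
    print(" ".join(shlex.quote(part) for part in cmd))
```

On a host where cbtransfer is installed, the returned list could be passed to subprocess.run(cmd, check=True). Building a list instead of a single string avoids shell quoting problems with paths such as the Mac OS X install directory.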
Description
The cbtransfer tool is the underlying, generic data transfer tool upon which cbbackup and cbrestore are built.
It is a lightweight extract-transform-load (ETL) tool that transfers data between clusters and to and from files. The source and destination parameters are similar to URLs or file paths.
The cbtransfer tool takes a snapshot of memory at the time it is invoked.
Only keys in that snapshot are extracted; data that is added, edited, or deleted while cbtransfer is running is not included.
The most important use of this tool is transferring data from a Couchbase Server node that is no longer running to a cluster that is running.
The cbbackup, cbrestore, and cbtransfer tools do not communicate with external IP addresses for server nodes outside of a cluster.
Backup, restore, or transfer operations are performed on data from a node within a Couchbase Server cluster.
The tools communicate only with nodes from a node list obtained within a cluster.
If you install Couchbase Server with a default IP address, you cannot use an external hostname to access it.
Couchbase Server does not transfer design documents.
To back up a design document, use cbbackup to store the information and cbrestore to read it back into memory.
The tool is at the following location:
Operating system | Location |
---|---|
Linux | /opt/couchbase/bin/ |
Windows | C:\Program Files\Couchbase\Server\bin\ |
Mac OS X | /Applications/Couchbase Server.app/Contents/Resources/couchbase-core/bin/ |
Options
The following are the command options:
Parameters | Description |
---|---|
| | Command line help. |
-b | Single named bucket from a source cluster to transfer. |
-B, --bucket-destination | Single named bucket on the destination cluster that receives the transfer. You can transfer to a bucket with a different name than your source bucket; if you do not provide a name, it defaults to the same name as the bucket-source. |
| | Transfer only items that match the vbucket ID. |
| | Transfer only items with keys that match the |
| | No actual transfer occurs, just the validation of parameters, files, connectivity, and configurations. |
-u | REST username for the source cluster or the server node. |
-p | REST password for the cluster or the server node. |
| | Number of concurrent worker threads performing the transfer. Memcached uses 75% of the number of cores reported by the system, with a minimum of 4 cores (see note 2). |
| | Verbose logging; provide more verbosity. |
-x | Provide extra, uncommon configuration parameters. |
| | Transfer from a single server node in a source cluster. This single server node is a source node URL. |
| | Transfer only from source vbuckets in this state, such as |
| | Transfer only to destination vbuckets in this state, such as |
| | Perform this operation on transfer. |
| | Export a |
1: A regular expression (regex) specifies a set of strings that match it. The functions in Python's re module let you check whether a particular string matches a given regular expression. For a complete reference, see https://docs.python.org/2/library/re.html.
2: To tune the number of memcached worker threads, do one of the following:
- Export MEMCACHED_NUM_CPUS=<number of threads you want> before starting Couchbase Server.
- Use the -t <number> command line argument.
- Specify it in the configuration file that is read during startup. Be aware that when started from the full server, this file is regenerated every time, and you will lose the modifications.
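Because the key filter described in note 1 follows Python regular expression syntax, a pattern can be tried against sample keys before it is used in a transfer. The sketch below is only an illustration: filter_keys is a hypothetical helper, and it assumes an unanchored search, which this page does not confirm. Anchoring the pattern with ^ and $ makes the intent explicit either way.

```python
import re


def filter_keys(keys, pattern):
    """Return the keys that match pattern under Python regex rules."""
    compiled = re.compile(pattern)
    # Assumption: the pattern may match anywhere in the key (re.search).
    # Use ^ and $ in the pattern to require a match on the whole key.
    return [key for key in keys if compiled.search(key)]


if __name__ == "__main__":
    # Sample keys borrowed from the examples on this page, plus one made-up key.
    sample_keys = ["pymc40", "pymc16", "re-fdeea652a89ec3e9", "user:1"]
    print(filter_keys(sample_keys, r"^pymc\d+$"))
    print(filter_keys(sample_keys, r"^re-"))
```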
The following are extra, specialized command options for the cbtransfer -x parameter.
-x options | Description |
---|---|
| | Maximum backoff time during the rebalance period. |
| | Transfer this number of bytes per batch. |
| | Transfer this number of documents per batch. |
| | Split the backup file on the destination cluster if it exceeds this size in MB. |
| | By default, enable conflict resolution. |
| | For value 1, transfer only data from a backup file or cluster. |
| | For value 1, transfer only design documents from a backup file or cluster. Default: 0. |
| | Maximum number of sequential retries if the transfer fails. |
| | For value 0, display extended fields for stdout output. |
| | 0 or 1, where 1 retries transfer after a NOT_MY_VBUCKET message. Default: 1. |
| | Number of bytes for every TCP/IP batch transferred. |
| | For value 1, rehash the partition IDs of each item. Rehashing is required when transferring data between clusters with a different number of partitions, such as when transferring data from a Mac OS X server to a non-Mac OS X cluster. |
| | Number of batches transferred before updating the progress bar in the console. |
| | Number of batches transferred before emitting progress information in the console. |
| | By default, start |
| | Transfer documents with metadata. Default: 1. The value of |
| | For value 1, restore data in the uncompressed mode. |
Examples
Example for transferring data between nodes:
To transfer data from a non-running node to a running cluster:
cbtransfer couchstore-files://COUCHSTORE_BUCKET_DIR couchbase://HOST:PORT --bucket-destination=DESTINATION_BUCKET
cbtransfer couchstore-files:///opt/couchbase/var/lib/couchbase/data/default couchbase://10.5.3.121:8091 --bucket-destination=foo
The response shows 10000 total documents transferred in 1088 batches.
[####################] 100.0% (10000/10000 msgs)
bucket: bucket_name, msgs transferred...
: total | last | per sec
batch : 1088 | 1088 | 554.8
byte : 5783385 | 5783385 | 3502156.4
msg : 10000 | 10000 | 5230.9
done
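When a transfer is scripted, the summary rows in this response can be turned into numbers. The following Python sketch is one possible way to do that; parse_summary is a hypothetical helper, and the row layout it expects (name, then total, last, and per-second values separated by pipes) is inferred only from the sample response shown here.

```python
import re

# One summary row looks like: "batch : 1088 | 1088 | 554.8"
ROW = re.compile(r"\b(batch|byte|msg)\s*:\s*(\d+)\s*\|\s*(\d+)\s*\|\s*([\d.]+)")


def parse_summary(output):
    """Map each row name to its total, last, and per-second values."""
    summary = {}
    for name, total, last, per_sec in ROW.findall(output):
        summary[name] = {
            "total": int(total),
            "last": int(last),
            "per_sec": float(per_sec),
        }
    return summary


if __name__ == "__main__":
    sample = (
        "batch : 1088 | 1088 | 554.8\n"
        "byte : 5783385 | 5783385 | 3502156.4\n"
        "msg : 10000 | 10000 | 5230.9\n"
    )
    stats = parse_summary(sample)
    # Average documents per batch, derived from the two totals.
    print(stats["msg"]["total"] / stats["batch"]["total"])
```

For the sample response, the totals work out to roughly 9.2 documents per batch (10000 documents in 1088 batches).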
Example for sending data to the standard output:
To send all the data from a node to the standard output:
cbtransfer http://10.5.2.37:8091/ stdout:
set pymc40 0 0 10
0000000000
set pymc16 0 0 10
0000000000
set pymc9 0 0 10
0000000000
set pymc53 0 0 10
0000000000
set pymc34 0 0 10
0000000000
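The stdout output above resembles memcached text-protocol set commands: a header line followed by the value. As a sketch under that assumption, the following Python code reads such a stream back into tuples. The function name parse_set_commands is hypothetical, and the reading of the header fields as key, flags, expiration, and value length in bytes is an assumption based on the memcached protocol, not something this page states.

```python
import io


def parse_set_commands(stream):
    """Yield (key, flags, expiration, value) from a binary stream of set commands."""
    while True:
        header = stream.readline()
        if not header:
            break  # end of stream
        parts = header.split()
        if len(parts) != 5 or parts[0] != b"set":
            continue  # skip anything that is not a set header line
        key = parts[1].decode()
        flags, expiration, length = int(parts[2]), int(parts[3]), int(parts[4])
        value = stream.read(length)  # read exactly the announced byte count
        stream.readline()  # consume the line ending that follows the value
        yield key, flags, expiration, value


if __name__ == "__main__":
    sample = b"set pymc40 0 0 10\r\n0000000000\r\nset pymc16 0 0 10\r\n0000000000\r\n"
    for item in parse_set_commands(io.BytesIO(sample)):
        print(item)
```

Reading the value by its announced length, instead of splitting on line breaks, keeps values that themselves contain newlines intact.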
Example for importing/exporting csv files:
The cbtransfer tool is also used to import and export csv files.
Data is imported into Couchbase Server as documents, and documents are exported from the server as comma-separated values.
Design documents associated with vBuckets are not included.
In these examples, the following records are in the default bucket, where re-fdeea652a89ec3e9 is the document ID, the flags are 0, the expiration is 0, and the CAS value is 4271152681275955. The value is the JSON document starting with "{""key"".......
re-fdeea652a89ec3e9, 0, 0, 4271152681275955, "{""key"":""re-fdeea652a89ec3e9"", ""key_num"":4112, ""name"":""fdee c3e"", ""email"":""fdee@ea.com"", ""city"":""a65"", ""country"":""2a"", ""realm"":""89"", ""coins"":650.06, ""category"":1, ""achievements"":[77, 149, 239, 37, 76],""body"":""xc4ca4238a0b923820d ....... ""}" ......
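To make the column layout concrete, the following Python sketch reads a record of this shape and decodes the value column as JSON. It is an illustration only: parse_records is a hypothetical helper, the sample row is a shortened, made-up version of the record above, and the code assumes the file has no header row. The skipinitialspace setting is needed because the example shows a space after each comma.

```python
import csv
import io
import json

# Shortened, hypothetical record in the same column order as the example:
# document ID, flags, expiration, CAS, value.
SAMPLE = (
    're-fdeea652a89ec3e9, 0, 0, 4271152681275955, '
    '"{""key"":""re-fdeea652a89ec3e9"", ""coins"":650.06}"\n'
)


def parse_records(text):
    """Yield one dict per CSV row, decoding the value column as JSON."""
    # skipinitialspace lets the quoted value be recognized after ", ".
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    for doc_id, flags, expiration, cas, value in reader:
        yield {
            "id": doc_id,
            "flags": int(flags),
            "expiration": int(expiration),
            "cas": int(cas),
            "value": json.loads(value),  # doubled quotes are already unescaped
        }


if __name__ == "__main__":
    for record in parse_records(SAMPLE):
        print(record["id"], record["cas"], record["value"]["coins"])
```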
This example exports these items to a .csv file.
All items are transferred from the default bucket (-b default) available at the node http://host:8091 and written to the ./data.csv file.
If a different bucket is provided for the -b option, all items are exported from that bucket.
Credentials are required for the cluster when exporting items from a bucket in the cluster.
cbtransfer http://[host]:8091 csv:./data.csv -b default -u Administrator -p password
The following example response is similar to that in other cbtransfer scenarios:
[####################] 100.0% (10000/10000 msgs)
bucket: default, msgs transferred...
: total | last | per sec
batch : 1053 | 1053 | 550.8
byte : 4783385 | 4783385 | 2502156.4
msg : 10000 | 10000 | 5230.9
2013-05-08 23:26:45,107: mt warning: cannot save bucket design on a CSV destination
done
The preceding output shows 1053 batches of data transferred at 550.8 batches per second. The tool outputs "cannot save bucket design…" to indicate that no design documents were exported.
To import information from a .csv file to a named bucket in a cluster:
cbtransfer /data.csv http://[hostname]:[port] -B bucket_name -u Administrator -p password
If the .csv file is not correctly formatted, the following error displays during import:
w0 error: fails to read from csv file, .....