node-mongosync v0.1.1
https://github.com/crumbjp/momonger/tree/master/node-mongosync
node-mongosync
Replicates data from a MongoDB replica set to another cluster.
Features
When using MongoDB, we sometimes want a real-time copy of a cluster. For example, I want fresh data in the staging environment.
However, MongoDB does not offer a way to do this.
A MongoDB replica set maintains an OpLog collection for its own replication.
node-mongosync reads this OpLog and applies it to another cluster.
node-mongosync also supports replication from multiple sources.
+---------+                             +-----------------+
| RepSet1 | - node-mongosync process -> |                 |
+---------+                             |                 |
                                        | MongoDB cluster |
+---------+                             |                 |
| RepSet2 | - node-mongosync process -> |                 |
+---------+                             +-----------------+
Using tailable cursor
The OpLog collection is created as a capped collection.
Capped collections support tailable cursors, which let a client read newly appended documents efficiently.
node-mongosync uses a tailable cursor to follow the OpLog.
Bulk operation
node-mongosync writes data to the destination cluster using ordered bulk operations.
Command operations are the exception.
Before executing a command operation, node-mongosync executes all pending bulk operations and waits for them to complete.
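The ordering rule above can be sketched as follows. This is only an illustration, not node-mongosync's actual code: the function name `planSteps` and the shape of the returned steps are hypothetical. It relies only on the `op` field of OpLog entries (`'i'` insert, `'u'` update, `'d'` delete, `'c'` command, `'n'` no-op).

```javascript
// Group OpLog entries into ordered steps: consecutive writes are collected
// into one bulk step, and any pending bulk step is emitted before a command.
// `planSteps` and the step objects are hypothetical names for illustration.
function planSteps(entries) {
  const steps = [];
  let pending = [];

  const flush = () => {
    if (pending.length > 0) {
      steps.push({ kind: 'bulk', ops: pending });
      pending = [];
    }
  };

  for (const entry of entries) {
    if (entry.op === 'c') {
      flush(); // all earlier writes must be applied before the command
      steps.push({ kind: 'command', op: entry });
    } else if (entry.op === 'i' || entry.op === 'u' || entry.op === 'd') {
      pending.push(entry);
    }
    // Other entries (for example 'n' no-ops) are skipped in this sketch.
  }
  flush();
  return steps;
}

const exampleSteps = planSteps([
  { op: 'i' }, { op: 'u' }, { op: 'c' }, { op: 'n' }, { op: 'd' },
]);
console.log(exampleSteps.map((s) => s.kind).join(',')); // prints "bulk,command,bulk"
```

A real implementation would execute each bulk step against the destination and wait for it to finish before running the following command step.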
Restart
node-mongosync always records the timestamp of the last applied OpLog entry in a collection on the destination cluster, as specified in the config.
After a restart, node-mongosync can resume from that recorded position.
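A minimal sketch of how resuming from a saved timestamp could work is shown below. It is not taken from node-mongosync: the `{ t, i }` timestamp representation (seconds plus an ordinal within that second) and the function names are assumptions made for illustration.

```javascript
// Compare two OpLog timestamps, modelled here (hypothetically) as
// { t: seconds, i: ordinal within that second }.
function compareTs(a, b) {
  return a.t !== b.t ? a.t - b.t : a.i - b.i;
}

// Return the entries that still need to be applied after a restart.
// `lastApplied` is the saved checkpoint, or null when none exists yet.
function entriesAfterCheckpoint(entries, lastApplied) {
  if (!lastApplied) {
    return entries.slice();
  }
  return entries.filter((entry) => compareTs(entry.ts, lastApplied) > 0);
}

const oplog = [
  { ts: { t: 100, i: 1 }, op: 'i' },
  { ts: { t: 100, i: 2 }, op: 'u' },
  { ts: { t: 101, i: 1 }, op: 'd' },
];
console.log(entriesAfterCheckpoint(oplog, { t: 100, i: 1 }).length); // prints 2
```

Because the checkpoint is stored on the destination, the position survives a restart of the sync process itself.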
MongoDB version
The OpLog format may change between MongoDB versions.
I am not aware of any MongoDB guideline on whether OpLog format compatibility is preserved across versions.
node-mongosync's behavior on each supported MongoDB version is verified by tests and manual confirmation.
MongoDB 2.6.X ~ 3.0.X
Compatibility appears to be almost perfect. I have been replicating a production-level, TB-class cluster from 2.6.X to 3.0.X for a year.
Quick start
1. Download and extract MongoDB from https://www.mongodb.org/downloads#production
2. Prepare directory
$ mkdir -p /tmp/mongosync_test
$ cd /tmp/mongosync_test
$ mkdir data1 data2 tmp
3. Start source mongod (localhost:27017)
$ mongod --dbpath ./data1 --logpath ./tmp/1.log --port 27017 --replSet rs --fork
$ mongo <<<"rs.initiate({_id: 'rs', members: [ {_id: 1, host:'localhost:27017'}]})"
4. Start destination mongod (localhost:27018)
$ mongod --dbpath ./data2 --logpath ./tmp/2.log --port 27018 --fork
5. Start mongosync
$ npm install node-mongosync
$ echo "{
  name: 'mongosync_test',
  src: {
    type : 'replset',
    hosts : ['localhost:27017'],
  },
  dst: {
    host : 'localhost',
    port : 27018
  },
  options: {
    loglv: 'verbose',
    targetDB: {
      'test': 'test2',
      '*': false
    },
    syncIndex: {
      create: true,
      drop: false
    },
    syncCommand: {
      '*': true,
      dropDatabase: false,
    },
  }
}" > test.conf
$ node ./node_modules/node-mongosync/index.js -c test.conf
Sync only the 'test' database, as the 'test2' database:
targetDB: {
  'test': 'test2',
  '*': false
}
Sync createIndex but not dropIndex:
syncIndex: {
  create: true,
  drop: false
}
Sync all commands except 'dropDatabase':
syncCommand: {
  '*': true,
  dropDatabase: false,
}
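The way these rules combine can be sketched as below. This is an illustration consistent with the examples above, not node-mongosync's actual lookup code: the function names `resolveTargetDb` and `isEnabled` are hypothetical, and the sketch assumes that an explicit key takes precedence over the '*' wildcard.

```javascript
// Resolve the destination database name for a source database.
// Returns null when the database should not be synced.
// Assumption: an explicit key wins over the '*' wildcard.
function resolveTargetDb(targetDB, sourceDb) {
  const hasExplicit = Object.prototype.hasOwnProperty.call(targetDB, sourceDb);
  const rule = hasExplicit ? targetDB[sourceDb] : targetDB['*'];
  if (typeof rule === 'string') {
    return rule; // renamed on the destination
  }
  if (rule === true) {
    return sourceDb; // same name on the destination
  }
  return null; // false or missing: skip this database
}

// Decide whether a named command (or index operation) should be synced.
function isEnabled(rules, name) {
  const hasExplicit = Object.prototype.hasOwnProperty.call(rules, name);
  return Boolean(hasExplicit ? rules[name] : rules['*']);
}

const targetDB = { test: 'test2', '*': false };
const syncCommand = { '*': true, dropDatabase: false };

console.log(resolveTargetDb(targetDB, 'test'));      // prints "test2"
console.log(resolveTargetDb(targetDB, 'other'));     // prints null
console.log(isEnabled(syncCommand, 'drop'));         // prints true
console.log(isEnabled(syncCommand, 'dropDatabase')); // prints false
```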
Test sync (from another terminal)
1. Basic write operation
$ mongo localhost:27017/test <<<'
for(i=0;i<10;i++){
db.tmp.save({a:i})
}
db.tmp.update({a:0}, {$set: {b: 0}})
db.tmp.update({a:{$gt:5}}, {$set: {b: 5}}, {multi: 1})
db.tmp.remove({a:4})
'
WriteResult({ "nInserted" : 1 })
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
WriteResult({ "nMatched" : 20, "nUpserted" : 0, "nModified" : 20 })
WriteResult({ "nRemoved" : 1 })
Check the sync process terminal:
[verbose]: test2.tmp: i:10, u:21, d:1, U: 0
The changes are synced to the destination:
$ mongo localhost:27018/test2 <<<"db.tmp.find()"
{ "_id" : ObjectId("56ce862e9e230530d689b4ed"), "a" : 0, "b" : 0 }
{ "_id" : ObjectId("56ce862e9e230530d689b4ee"), "a" : 1 }
{ "_id" : ObjectId("56ce862e9e230530d689b4ef"), "a" : 2 }
{ "_id" : ObjectId("56ce862e9e230530d689b4f0"), "a" : 3 }
{ "_id" : ObjectId("56ce862e9e230530d689b4f2"), "a" : 5 }
{ "_id" : ObjectId("56ce862e9e230530d689b4f3"), "a" : 6, "b" : 5 }
{ "_id" : ObjectId("56ce862e9e230530d689b4f4"), "a" : 7, "b" : 5 }
{ "_id" : ObjectId("56ce862e9e230530d689b4f5"), "a" : 8, "b" : 5 }
{ "_id" : ObjectId("56ce862e9e230530d689b4f6"), "a" : 9, "b" : 5 }
2. createIndex operation
$ mongo localhost:27017/test <<<'db.tmp.createIndex({a: 1})'
Check the sync process terminal:
[info]: createIndex test2.tmp { a: 1 } { name: 'a_1' }
The index is synced to the destination:
$ mongo localhost:27018/test2 <<<"db.tmp.stats().indexSizes"
{ "_id_" : 8176, "a_1" : 8176 }
3. dropIndex operation
$ mongo localhost:27017/test <<<'db.tmp.dropIndex("a_1")'
Check the sync process terminal:
[info]: Skip dropIndex { deleteIndexes: 'tmp', index: 'a_1' }
The drop is not synced to the destination:
$ mongo localhost:27018/test2 <<<"db.tmp.stats().indexSizes"
{ "_id_" : 8176, "a_1" : 8176 }
4. command operation
$ mongo localhost:27017/test <<<'
db.tmp.drop()
db.dropDatabase()
'
Check the sync process terminal:
[info]: command { drop: 'tmp' }
[info]: Skip command {dropDatabase: 1}
Config fields
field | type | format | description
---|---|---|---
name | string | | The source ReplSet ID. Must be unique when replicating from multiple sources.
src | hash | mongo-info | Source MongoDB cluster. Must be a replica set (type: 'replset').
dst | hash | mongo-info | Destination MongoDB cluster. node-mongosync saves the timestamp of the last applied OpLog entry here so that it can resume after a restart. Use the database and collection fields to choose where it is saved; the default is database: 'mongosync', collection: 'last'.
options | hash | options |
mongo-info
field | type | format | description
---|---|---|---
type | string | standalone, replset, mongos | standalone is default.
hosts | string array | hostname:port | Required when type is not standalone.
host | string | | Required when type is standalone.
port | integer | | Required when type is standalone.
authdbname | string | |
user | string | |
password | string | |
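The requirements in the mongo-info table can be checked with a small helper like the one below. It is a sketch only: `validateMongoInfo` is a hypothetical function, not part of node-mongosync, and it covers just the rules stated in the table (default type, and which of hosts, host and port are required).

```javascript
// Validate a mongo-info hash against the rules in the table above.
// Returns a list of problems; an empty list means the hash looks valid.
// `validateMongoInfo` is a hypothetical helper, not a node-mongosync API.
function validateMongoInfo(info) {
  const errors = [];
  const type = info.type === undefined ? 'standalone' : info.type;

  if (!['standalone', 'replset', 'mongos'].includes(type)) {
    errors.push(`unknown type: ${type}`);
  } else if (type === 'standalone') {
    if (typeof info.host !== 'string') {
      errors.push('host is required when type is standalone');
    }
    if (!Number.isInteger(info.port)) {
      errors.push('port is required when type is standalone');
    }
  } else if (!Array.isArray(info.hosts) || info.hosts.length === 0) {
    errors.push('hosts is required when type is not standalone');
  }
  return errors;
}

// The src and dst hashes from the quick start both pass.
console.log(validateMongoInfo({ type: 'replset', hosts: ['localhost:27017'] }).length); // prints 0
console.log(validateMongoInfo({ host: 'localhost', port: 27018 }).length);              // prints 0
```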
options
field | type | format | description
---|---|---|---
loglv | string | debug, trace, verbose, info, error | info is default.
targetDB | hash | from: to | Required. from is the source database name and to is the destination database name. '*': true syncs every database to a database of the same name.
syncIndex | hash | create: boolean, drop: boolean | Whether to sync createIndex and dropIndex. {create: false, drop: false} is default.
syncCommand | hash | command: boolean | '*' means all commands.
bulkInterval | integer | | 1000 is default.
bulkLimit | integer | | 5000 is default.