castor-client v0.1.5
Castor: NodeJS Cassandra Client
Cassandra client library for NodeJS, using the native binary protocol.
Key features:
- Using Cassandra's binary protocol
- Object oriented data access
- Automatic input encoding
- Connection pooling
- Using promises
- Native support for all Cassandra datatypes
Castor reads the data structure automatically, so it knows how to encode the
input. This also improves security: you are well protected against injection
attacks when you stick with the get, set and del methods rather than sending
raw queries using the query method.
Installation
Install using npm install castor-client
Connecting
var Castor = require('castor-client');
var db = new Castor('localhost', 'keyspace');
Retrieving data
Data is retrieved using the get function. You can call this function directly
after connecting to the database. There is no need to wait for an event, as
queries are automatically queued and executed when the connection is ready.
Every call to get requires a table name as its parameter. The list of fields
you want to retrieve can be specified using the fields function. All fields are
returned if none are specified. The query is executed when you call the then
function, or when you use the execute function to get the raw promise. You can
use the filter function multiple times.
Iterating over the resultset can be done in two ways. You can iterate using the
valid, current and next functions, or you can use the toArray function to get
the whole resultset as an array.
db.get('user')
.fields(['user_id', 'birthdate'])
.filter('username', 'John Doe')
.orderBy('username', 'asc')
.limit(10)
.allowFiltering(true)
.then(function(rows) {
// Basic iteration.
while (rows.valid()) {
var row = rows.current();
console.log(row.user_id);
rows.next();
}
// Before we can loop again, we must call the rewind() function.
rows.rewind();
// Using the toArray() function.
rows.toArray().forEach(function(row) {
console.log(row.user_id);
});
// Get the 'username' column as an array.
var usernames = rows.getColumn('username');
// Get the row count.
var rowCount = rows.count();
}).fail(function(error) {
console.log(error);
});
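The cursor-style iteration described above can be illustrated with a small stand-in object. This is only a sketch of the documented behaviour (valid, current, next, rewind, toArray, getColumn, count), not Castor's actual implementation. The RowCursor name and its internals are hypothetical.

```javascript
// Hypothetical stand-in for a Castor resultset; not the library's own code.
function RowCursor(rows) {
  this.rows = rows;
  this.position = 0;
}
// True while the cursor points at an existing row.
RowCursor.prototype.valid = function() { return this.position < this.rows.length; };
// The row at the current cursor position.
RowCursor.prototype.current = function() { return this.rows[this.position]; };
// Advance the cursor by one row.
RowCursor.prototype.next = function() { this.position++; };
// Move the cursor back to the first row.
RowCursor.prototype.rewind = function() { this.position = 0; };
// Copy of all rows, independent of the cursor position.
RowCursor.prototype.toArray = function() { return this.rows.slice(); };
RowCursor.prototype.count = function() { return this.rows.length; };
RowCursor.prototype.getColumn = function(name) {
  return this.rows.map(function(row) { return row[name]; });
};

var cursor = new RowCursor([{ user_id: 'a' }, { user_id: 'b' }]);
var firstPass = [];
while (cursor.valid()) {
  firstPass.push(cursor.current().user_id);
  cursor.next();
}
// The cursor is now past the last row, so a second loop would do nothing.
var exhausted = !cursor.valid();
// rewind() makes the rows available to a cursor loop again.
cursor.rewind();
```

The sketch shows why rewind() is needed before looping a second time: the cursor position is state on the resultset, whereas toArray() does not depend on it.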
Updating and inserting data
There is no real difference between UPDATE and INSERT queries in Cassandra. It
is possible to insert data using an UPDATE query. But the UPDATE query does not
allow us to insert rows with only the primary key, and the INSERT query does
not allow us to use increments / decrements. In Castor, you always use the set
function for both cases.
db.set('user')
.field('firstname', 'John')
.field('lastname', 'Doe')
.then(function() {
console.log('updated');
}).fail(function(error) {
console.log(error);
});
Counter columns can be updated using the incr and decr functions.
db.set('user_logins')
.field('user_id', user_id)
.incr('logins')
.execute();
Generate UUID
In Cassandra, it's common to use UUIDs for identifying rows. Castor provides a
simple way to generate a UUID matching the UUID version 4 standard. To generate
a UUID, use the uuid function.
var user_id = db.uuid();
db.set('user')
.field('user_id', user_id)
.field('username', 'John')
.execute();
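When UUIDs arrive from elsewhere (for example from a request parameter), it can help to check that they have the version 4 layout before using them in a filter. The sketch below is not part of Castor. The isUuidV4 helper is hypothetical, and Node's built-in crypto.randomUUID() (Node.js 14.17 or later) stands in for db.uuid(), on the assumption that the value is handled as a string.

```javascript
var crypto = require('crypto');

// A version 4 UUID has '4' as the version digit and 8, 9, a or b as the
// first digit of the variant group.
var UUID_V4 = /^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/i;

// Hypothetical helper: true when value is a string in UUID v4 format.
function isUuidV4(value) {
  return typeof value === 'string' && UUID_V4.test(value);
}

// Stand-in for db.uuid(); uses Node's own random UUID generator.
var sample = crypto.randomUUID();
console.log(sample, isUuidV4(sample));
```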
Deleting data
Deleting data can be done using the del method.
db.del('user')
.filter('user_id', user_id)
.then(function() {
console.log('deleted');
}).fail(function(error) {
console.log(error);
});
In Cassandra, it is also possible to delete just a few fields from the row.
db.del('user')
.fields(['firstname', 'lastname'])
.filter('user_id', user_id)
.execute();
Consistency
The desired consistency can be provided using the consistency function. This
function is available on get, set, del and query.
db.get('user')
.consistency(db.CONSISTENCY_ONE)
.then(function(rows) {
console.log(rows.toArray());
});
The following options are available:
CONSISTENCY_ANY (not applicable on get)
CONSISTENCY_ONE
CONSISTENCY_TWO
CONSISTENCY_THREE
CONSISTENCY_QUORUM
CONSISTENCY_ALL
CONSISTENCY_LOCAL_QUORUM
CONSISTENCY_EACH_QUORUM
CONSISTENCY_LOCAL_ONE
Joins
The join function allows you to easily include values derived from another
table through a foreign key. Be aware that Cassandra does not support joins:
this function actually runs another query for each row. Use it with great care,
as it can have detrimental effects on performance when used on large
resultsets.
db.get('user')
.fields(['user_id', 'username'])
.join('user_id', 'post.user_id', ['title'])
.then(function(rows) {
while (rows.valid()) {
var row = rows.current();
console.log(row.username + ' has post ' + row.title);
rows.next();
}
});
Joins do not multiply the number of rows like SQL does. For each row and field,
the join can provide at most one value. When no matching row in the right table
can be found, a null is provided. So if a user in the example above has
multiple posts in the table post, the column title will only contain the first
post title returned from the database.
The join function accepts an optional fourth argument which is used as fieldname prefix. This is useful when using multiple joins.
db.get('user')
.fields(['user_id', 'username'])
.join('user_id', 'post.user_id', ['title', 'image_id'], 'post_')
.join('post_image_id', 'image.image_id', ['data'])
.then(function(rows) {
while (rows.valid()) {
var row = rows.current();
console.log(row.username + ' has post ' + row.post_title);
rows.next();
}
});
Joins are allowed to use fields from the preceding joins.
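The join semantics described above (at most one value per row and field, null when nothing matches, optional fieldname prefix, later joins using fields from earlier ones) can be sketched on plain in-memory arrays. This is an illustration only. joinFirstMatch is a hypothetical helper and not how Castor performs joins, which run a query per row.

```javascript
// Illustration only: mimics the documented join behaviour on in-memory arrays.
function joinFirstMatch(leftRows, leftField, rightRows, rightField, fields, prefix) {
  prefix = prefix || '';
  return leftRows.map(function(left) {
    // Only the first matching right-hand row is used; rows are never multiplied.
    var match = rightRows.find(function(right) {
      return right[rightField] === left[leftField];
    });
    var result = Object.assign({}, left);
    fields.forEach(function(field) {
      // A missing match yields null for every joined field.
      result[prefix + field] = match ? match[field] : null;
    });
    return result;
  });
}

var users = [{ user_id: 1, username: 'john' }, { user_id: 2, username: 'jane' }];
var posts = [
  { user_id: 1, title: 'First post', image_id: 10 },
  { user_id: 1, title: 'Second post', image_id: 11 }
];
var images = [{ image_id: 10, data: 'aGVsbG8=' }];

// The first join adds post_title and post_image_id; the second join uses the
// prefixed post_image_id field produced by the first.
var joined = joinFirstMatch(users, 'user_id', posts, 'user_id', ['title', 'image_id'], 'post_');
joined = joinFirstMatch(joined, 'post_image_id', images, 'image_id', ['data']);
console.log(joined);
```

In this sketch the user with two posts still yields a single row carrying only the first post, and the user without posts gets null for every joined field.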
Token-based iteration
Rows are identified by tokens in Cassandra. A token is a hash of the primary key
value, represented as a 64-bit signed integer (-2^63 to 2^63-1).
Use the includeToken function to include the token in the resultset.
Tokens can be used to iterate through the whole column family. Iteration can be
done by using WHERE token(field) > 234 (in CQL). In Castor, this filter can be
added with the fromToken function.
Rows are ordered by their token and returned in that order. Iterating can be
done by combining fromToken with limit. Queries without fromToken will always
start with the first rows in the column family, thus with token -2^63.
The following example will iterate the user table row by row.
function fetchRow(token) {
var query = db.get('user')
.fields(['user_id'])
.includeToken()
.limit(1);
if (typeof token !== 'undefined') {
query.fromToken(token);
}
query.then(function(rows) {
if (rows.valid()) {
var user = rows.current();
console.log('Got user ' + user.user_id);
fetchRow(user.token);
}
else {
console.log('done');
}
}).done();
};
fetchRow();
Tokens are not unique for rows in tables with multiple fields in the primary key. The example above only works when the primary key has one field (which is likely "user_id"). Do not use tokens for iterating wide tables (tables with multiple columns in the primary key).
The token values are returned as strings and accepted in that format by the
fromToken function. The application should not make any assumptions about the
token format.
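Fetching one row per query is slow for larger tables. The same pattern can fetch rows in batches by raising the limit and continuing from the token of the last row in each batch. The sketch below uses only functions shown in this document. The fetchAll helper, its parameters and the batch size are assumptions for illustration, and the same restriction to single-field primary keys applies.

```javascript
// Hypothetical helper: walks the user table in batches of batchSize rows and
// calls onRow for each row. Returns a promise that resolves when done.
function fetchAll(db, batchSize, onRow, token) {
  var query = db.get('user')
    .fields(['user_id'])
    .includeToken()
    .limit(batchSize);
  if (typeof token !== 'undefined') {
    // Continue after the last token seen; the token is passed back unchanged.
    query.fromToken(token);
  }
  return query.execute().then(function(rows) {
    var batch = rows.toArray();
    batch.forEach(onRow);
    if (batch.length < batchSize) {
      // A short (or empty) batch means the end of the table was reached.
      return;
    }
    return fetchAll(db, batchSize, onRow, batch[batch.length - 1].token);
  });
}
```

It could be called as fetchAll(db, 100, function(user) { console.log(user.user_id); }), with a fail handler attached to the returned promise.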
Promises
Query results can be retrieved as promises using the execute function. These
promises can be consumed by other promise libraries like "Q". By using execute,
you can return the promise to include the query in a chain of 'thenables'.
Q.when(true).then(function() {
return db.get('user').execute();
}).then(function(users) {
doSomethingWith(users);
return db.get('posts').execute();
}).then(function(posts) {
doSomethingWith(posts);
}).fail(function(error) {
console.log(error);
});
The advantage of this workflow is that you can specify an error handler (the
fail function) once for all queries in the chain.
When not using chains, you can directly call then after execute.
db.get('users').execute().then(function(rows) { });
As shown in many examples above, the then function is also directly available
on the query specification. The then function automatically calls the execute
function and delegates the call to the then function of the promise. That means
that the example above is the same as:
db.get('users').then(function(rows) { });
This can only be done when using then. You cannot call fail in this way. You
still have to use execute when you want to use fail without using then.
db.get('users').execute().fail(function(error) { });
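The delegation described above can be sketched with a small query object that exposes then but no fail. This is a simplified illustration using native promises (which offer catch instead of Q's fail), not Castor's code. QuerySketch and its run callback are hypothetical.

```javascript
// Hypothetical query object: holds a function that produces the result.
function QuerySketch(run) {
  this.run = run;
}
// execute() returns the raw promise; a thrown error becomes a rejection.
QuerySketch.prototype.execute = function() {
  var run = this.run;
  return new Promise(function(resolve) {
    resolve(run());
  });
};
// then() calls execute() and delegates to the promise's own then().
QuerySketch.prototype.then = function(onFulfilled, onRejected) {
  return this.execute().then(onFulfilled, onRejected);
};

var query = new QuerySketch(function() { return ['row1', 'row2']; });
var failing = new QuerySketch(function() { throw new Error('query failed'); });

// Shortcut form: then() directly on the query object.
query.then(function(rows) { console.log(rows.length + ' rows'); });

// No error-handler shortcut exists on the query object, so execute() is
// needed to reach the promise (catch here plays the role of Q's fail).
failing.execute().catch(function(error) { console.log(error.message); });
```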
Column specifications
You can get a specification of the columns in the resultset using the
getColumns function. All available functions can be found in the following
example.
db.get('users').then(function(rows) {
rows.getColumns().forEach(function(column) {
column.getKeyspace();
column.getTablename();
column.getName();
// Get specification as string (e.g. "user_id <uuid>").
column.toString();
// Get type specification.
var type = column.getType();
type.getType();
type.getTypeName();
// Valuetype for keys used in maps.
type.getKeyType().getTypeName();
// Valuetype for values used in lists, sets and maps.
type.getValueType().getTypeName();
if (type.getType() == type.VARCHAR) {
// ...
}
});
});
The getType function on the type specification returns an integer. This can be
compared to one of the following constants:
CUSTOM
ASCII
BIGINT
BLOB
BOOLEAN
COUNTER
DECIMAL
DOUBLE
FLOAT
INT
TEXT
TIMESTAMP
UUID
VARCHAR
VARINT
TIMEUUID
INET
COLLECTION_LIST
COLLECTION_MAP
COLLECTION_SET
Retrieving database schema
The database schema can be retrieved with the db.schema() function. You can provide a column family name. The whole keyspace will be returned if you omit this parameter.
The schema is read after connecting to the database and is then cached in memory.
If you need to reload it, you can use the reloadSchema function, which returns
a promise that is resolved without a value when the schema is reloaded. All new
queries (except raw queries via query) are queued until the new schema is
loaded.
Using multiple keyspaces
It is possible to use multiple keyspaces, but you always need to start the
connection with a specific keyspace. You can get a new client with the use
function. The new client will use the same connection as the parent client.
var Castor = require('castor-client');
var db1 = new Castor('localhost', 'keyspace1');
var db2 = db1.use('keyspace2');
// Execute a query on keyspace1.
db1.get('user').execute();
// Execute a query on keyspace2.
db2.get('user').execute();
When switching to a different keyspace, the new client will load its schema.
This is a performance hit for applications that use multiple keyspaces with
identical schemas. In this case you can pass true as the second argument. This
will bypass the schema loading. The new client will use a reference to its
parent's schema.
var db2 = db1.use('keyspace2', true);
db2.get('user').execute();
Executing raw queries
Instead of using get, set and del, you may also use query to execute raw
queries. This is not the recommended way to use Castor, since there is no
sanitizing available for user input.
db.query('SELECT * FROM user').then(function(rows) {
console.log(rows.toArray());
});
Development
A Vagrant provisioning file is included to run Cassandra on VirtualBox. This can be used to run the unit tests and for development of other Cassandra applications. Use the commands below to set up this environment.
git clone https://github.com/mauritsl/node-castor-client.git
cd node-castor-client
vagrant up
vagrant ssh
You are now logged into the virtual machine, where you can run the tests with the following commands:
sudo su
cd /data
npm test
You can access the CQL shell from within the virtual machine by typing cqlsh.
Other applications can connect to this database on host 192.168.11.11 at
port 9042 (default).