0.10.0 • Published 8 years ago

banana-split v0.10.0

License
MIT

Banana Split - early alpha version

A small split testing (also called A/B testing and multivariate testing) library for Node.js using MongoDB for storage.

What it does

  • Stores all data in MongoDB, including experiments, participants, and events
  • Allocates a random variation for each participant in an experiment
  • Tracks conversion events generated by participants, e.g. "signup", "upgraded", "clicked-button"
  • Calculates the conversion rates for each variation in a given (experiment, event) pair

What it doesn't do

  • No UI - there's no admin panel or dashboard of any kind
  • No HTTP server - it doesn't have an HTTP API (although it would be easy to write one if you wish)

Getting Started

Add banana-split to your Node.js project:

npm install banana-split --save

In your Node.js code, initialize the module as follows:

// set up a mongodb connection with mongoose
var mongoose = require('mongoose');
var db = mongoose.createConnection("mongodb://localhost:27017/myappdata");

var bananaSplit = require('banana-split')({
  db: db, 
  mongoose: mongoose
});

Add an experiment...

bananaSplit.initExperiment({
  name: 'buttonColor',
  variations: ['red', 'green']
});

Enroll a user with ID 'user-1' and IP address '127.0.0.1' in the experiment...

bananaSplit.participate({
  experiment: 'buttonColor', // must match the name passed to initExperiment()
  user: 'user-1',
  ip: '127.0.0.1'
}, function (err, variation) {
  // variation will now be either 'red' or 'green'
});
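If the rest of your code uses promises, the callback form above can be wrapped. This is a minimal sketch: participateAsync is a hypothetical helper and not part of banana-split. It assumes only the callback signature shown above, (err, variation).

```javascript
// Hypothetical helper (not part of banana-split): wraps the callback-style
// participate() call in a Promise. `bananaSplit` is the initialized module.
function participateAsync(bananaSplit, options) {
  return new Promise(function (resolve, reject) {
    bananaSplit.participate(options, function (err, variation) {
      if (err) {
        reject(err);
      } else {
        resolve(variation);
      }
    });
  });
}

// Usage, assuming the 'buttonColor' experiment has been initialized:
//
// participateAsync(bananaSplit, {
//   experiment: 'buttonColor',
//   user: 'user-1',
//   ip: '127.0.0.1'
// }).then(function (variation) {
//   // render the page using variation ('red' or 'green')
// });
```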

Track a couple of events by this user...

bananaSplit.trackEvent({
  user: 'user-1',
  ip: '127.0.0.1',
  name: 'signup'
});
bananaSplit.trackEvent({
  user: 'user-1',
  ip: '127.0.0.1',
  name: 'click-button'
});

Later, after many users have participated and generated events, calculate the results with...

bananaSplit.getResult({
  experiment: 'buttonColor',
  event: 'signup' // the name of a tracked event
}, function (err, result) {
  // put code to deal with result here
});

Here's an example of the kind of result returned by getResult():

TODO: example

Interpret each confidence interval as follows: there's a 95% chance that the true conversion rate lies within the range

conversionRate ± confidenceInterval

i.e. we are 95% confident that the conversion rate for the "red" variation is:

51.5% ± 3.0%
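To make the arithmetic concrete, here is a minimal sketch of how a rate and a 95% interval of this shape can be computed. It assumes the common normal-approximation (Wald) formula; banana-split's own calculation is not documented here and may differ. The function name and the counts (549 conversions out of 1066 participants) are hypothetical, chosen to land close to the figures above.

```javascript
// Normal-approximation (Wald) 95% interval for a conversion rate.
// Assumed formula for illustration only; banana-split may compute its
// confidenceInterval differently.
function conversionSummary(conversions, participants) {
  if (participants <= 0) {
    throw new Error('participants must be positive');
  }
  var rate = conversions / participants;
  var z = 1.96; // two-sided 95% critical value of the standard normal
  var halfWidth = z * Math.sqrt(rate * (1 - rate) / participants);
  return { conversionRate: rate, confidenceInterval: halfWidth };
}

// Hypothetical counts for one variation.
var summary = conversionSummary(549, 1066);
console.log(
  (summary.conversionRate * 100).toFixed(1) + '% ± ' +
  (summary.confidenceInterval * 100).toFixed(1) + '%'
);
```

The interval narrows in proportion to the square root of the number of participants, so halving its width takes roughly four times as many participants.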

Anonymous and signed in users

Banana-split doesn't distinguish between anonymous and signed-in users. In case it helps, this is the method I'm using to handle anonymous users:

1. New anonymous visitor hits landing page

  • Generate a user ID for the anonymous visitor and place it in session storage.
  • Participate in any appropriate experiments on this landing page using this user ID.
  • Render the page based on the variations.

2. Anonymous visitor signs up for new account

  • Use the generated user ID as their new permanent user ID.

3. Anonymous visitor signs in to an existing account

  • The temporary user ID is no longer interesting. To avoid adding noise to the data, I opt out this temporary user using the following function:

bananaSplit.optOut({
  user: '54dded1e5287fcd4a5717c04'
});
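The three steps above can be sketched roughly as follows. This is only an illustration: the `session` object, its `visitorId` key, and both helper functions are hypothetical, and the only banana-split call used is optOut() as shown above.

```javascript
var crypto = require('crypto');

// Step 1: reuse the visitor's ID if the session already has one, otherwise
// generate a random 24-character hex ID and remember it.
// `session` is assumed to be a plain per-visitor object, e.g. one provided
// by a session middleware; `visitorId` is a hypothetical key.
function getOrCreateVisitorId(session) {
  if (!session.visitorId) {
    session.visitorId = crypto.randomBytes(12).toString('hex');
  }
  return session.visitorId;
}

// Step 3: the visitor signed in to an existing account, so opt out the
// temporary ID and switch the session over to the account's ID.
function switchToExistingAccount(bananaSplit, session, accountUserId) {
  if (session.visitorId && session.visitorId !== accountUserId) {
    bananaSplit.optOut({ user: session.visitorId });
  }
  session.visitorId = accountUserId;
}
```

Step 2 needs no extra code in this sketch: on signup, the ID already in the session simply becomes the account's permanent ID.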

More about getResult()

Filtering: only one participant from each IP address

The getResult() function filters out all but the first user from a given IP address. This is to:

  1. Eliminate a lot of new 'users' who are generated when an existing user signs out.
  2. Prevent many requests from one IP address from adding noise, e.g. search engine bots or other web scrapers will only be counted once each.
  3. Prevent many users from one IP address from skewing the results, e.g. if many users joined from a single IP address there's a good chance they all belong to the same family or organization and may share a certain bias.

The data for all these users is stored in MongoDB, so the filtering can be changed after the data has been gathered. The behavior above is based on my own intuition and needs. If you'd like it to be different, please let me know and I can add an option.
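As an illustration of the filtering rule described above, here is a minimal sketch that keeps only the first participant per IP address. It is not the library's implementation: the record shape ({ user, ip, variation }) and the assumption that the array is already ordered by participation time are hypothetical.

```javascript
// Keep only the first participant seen for each IP address.
// Assumes `participants` is ordered by participation time (hypothetical).
function firstParticipantPerIp(participants) {
  var seen = Object.create(null); // IPs already encountered
  return participants.filter(function (p) {
    if (seen[p.ip]) {
      return false;
    }
    seen[p.ip] = true;
    return true;
  });
}

// Hypothetical records: two visitors share one IP address.
var filtered = firstParticipantPerIp([
  { user: 'user-1', ip: '10.0.0.1', variation: 'red' },
  { user: 'user-2', ip: '10.0.0.1', variation: 'green' },
  { user: 'user-3', ip: '10.0.0.2', variation: 'green' }
]);
console.log(filtered.map(function (p) { return p.user; }));
```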

WARNING: Scaling for large websites

As the number of participants and events increases, calls to getResult() will become more expensive. It would make sense to calculate this incrementally instead of re-calculating it from scratch each time.
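One possible shape for such an incremental calculation is sketched below. It is not implemented in banana-split: createTally and its methods are hypothetical names, the counters live in memory rather than in MongoDB, and the sketch ignores the per-IP filtering and any de-duplication of repeated events.

```javascript
// Hypothetical running tally: counts are updated as participants and
// conversions arrive, so reading a conversion rate is a constant-time
// lookup instead of a scan over all stored participants and events.
function createTally(variations) {
  var counts = {};
  variations.forEach(function (v) {
    counts[v] = { participants: 0, conversions: 0 };
  });
  return {
    addParticipant: function (variation) {
      counts[variation].participants += 1;
    },
    addConversion: function (variation) {
      counts[variation].conversions += 1;
    },
    conversionRate: function (variation) {
      var c = counts[variation];
      return c.participants === 0 ? 0 : c.conversions / c.participants;
    }
  };
}

var tally = createTally(['red', 'green']);
tally.addParticipant('red');
tally.addParticipant('red');
tally.addConversion('red');
console.log(tally.conversionRate('red'));
```

In a real deployment the counters would need to be persisted, for example as a MongoDB document updated with the $inc operator.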

State of development

I'm using this in production for Readlang, but it's still immature. If you decide to use it, I'd love to hear from you. Please report issues and suggestions for improvements on the issues page.
