1.0.0 • Published 5 years ago

recos v1.0.0

Recos

An easy-to-use recommendation engine and npm module based on collaborative filtering, built on top of Node.js and Redis. The engine uses the Jaccard coefficient to determine the similarity between users and k-nearest-neighbors to create recommendations. This module is useful for anyone with users, a store of products/movies/items, and the desire to let those users rate items and receive recommendations based on similar users. Recos takes care of all the recommendation and rating logic. It can be paired with any database, as it does not keep track of any user/item information besides a unique ID.

This repo is heavily based on Guy Morita's previous work, https://github.com/guymorita/recommendationRaccoon. It has been completely refactored to use async/await instead of a mix of callbacks and promises. It also doesn't allow dislikes (though you can unlike by giving the rating a negative value, the overall rating cannot be lower than 0 for the moment), but instead allows valued ratings. For example, some actions on your app/website might have a greater "liking" value than others.

Requirements

  • Node.js 7.6.x
  • Redis

Quickstart

Recos keeps track of the ratings and recommendations from your users. It does not need to store any metadata about the user or product aside from an ID. To get started:

Install Recos:

npm install recos

Setup Redis:

If local:

npm install redis
redis-server

If Redis is remote, or you need to customize the connection settings, set the REDIS_URL environment variable (read via process.env.REDIS_URL):

redis://h:<password>@<url>:<port>

Require recos:

const recos = require('recos')

Add in ratings & Ask for recommendations:

recos.rated('garyId', 'movieId').then(() => {
  return recos.rated('garyId', 'movie2Id')
}).then(() => {
  return recos.rated('chrisId', 'movieId')
}).then(() => {
  return recos.recommendFor('chrisId', 10)
}).then((recs) => {
  console.log('recs', recs)
  // results will be an array of x ranked recommendations for chris
  // in this case it would contain movie2
})

using async/await:

(async () => {
  await recos.rated('garyId', 'movieId')
  await recos.rated('garyId', 'movie2Id')
  await recos.rated('chrisId', 'movieId')
  // return the result so it reaches the .then() below
  return recos.recommendFor('chrisId', 10)
})().then((recs) => {
  console.log('recs', recs)
  // recs will be an array of ranked recommendations for chris
  // in this case it would contain movie2
})

config

// these are the default values but you can change them
recos.config.nearestNeighbors = 5 // number of neighbors you want to compare a user against
recos.config.className = 'movie' // prefix for your items (used for redis)
recos.config.numOfRecsStore = 30 // number of recommendations to store per user

Full Usage

Inputs

Likes:

recos.rated('userId', 'itemId').then(() => {
})
// after a user rates an item, the rating data is immediately
// stored in Redis in various sets for the user/item, then the similarity,
// Wilson score and recommendations are updated for that user.
recos.rated('userId', 'itemId', options).then(() => {
})
// available options are:
{
  updateRecs: false
    // this skips the update sequence for this rating
    // and greatly speeds up inputting large amounts of data.
    // however, no recommendations are computed as a result of this rating.
    // if you fire a rating with updateRecs on, it will only update
    // recommendations for that user.
    // default === true
}
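
One way to combine the two modes when importing existing data is sketched below. It is only an illustration: `bulkRate` is a hypothetical helper, not part of recos, and it assumes that a single rating per user with updateRecs left on is enough to build that user's recommendations after the bulk load. It takes the engine as a parameter so it works with anything exposing `rated(userId, itemId, options)`.

```javascript
// Hypothetical helper (not part of recos): load most ratings with
// updateRecs disabled, holding back each user's final rating so it can
// be submitted last with the default options (updates on).
async function bulkRate (engine, ratings) {
  // position of each user's last rating in the input array
  const lastIndex = new Map()
  ratings.forEach(([userId], i) => lastIndex.set(userId, i))

  const heldBack = []
  for (let i = 0; i < ratings.length; i++) {
    const [userId, itemId] = ratings[i]
    if (lastIndex.get(userId) === i) {
      heldBack.push([userId, itemId])
    } else {
      await engine.rated(userId, itemId, { updateRecs: false })
    }
  }

  // one updating call per user, after all other data is in place
  for (const [userId, itemId] of heldBack) {
    await engine.rated(userId, itemId)
  }
}
```

With recos this would be called as `bulkRate(recos, [['garyId', 'movieId'], ['garyId', 'movie2Id'], ['chrisId', 'movieId']])`.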

Recommendations

recos.recommendFor('userId', numberOfRecs).then((results) => {
  // returns a ranked array of itemIds which represent the top recommendations
  // for that individual user based on knn.
  // numberOfRecs is the number of recommendations you want to receive.
  // asking for recommendations queries the 'recommendedZSet' sorted set for the user.
  // the movies in this set were calculated in advance when the user last rated
  // something.
  // ex. results = ['batmanId', 'supermanId', 'chipmunksId']
})
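
Because only IDs come back, the results usually need to be joined with your own data store. The sketch below is an illustration under assumptions: `recommendedItems` and `catalog` are hypothetical names, and the catalog is modeled as an in-memory Map standing in for whatever database you use.

```javascript
// Hypothetical helper (not part of recos): turn recommended IDs into
// full records by looking them up in your own store.
// `engine` is anything exposing recommendFor(userId, count), e.g. recos.
// `catalog` is a Map of itemId -> record, standing in for a real database.
async function recommendedItems (engine, catalog, userId, count) {
  const ids = await engine.recommendFor(userId, count)
  return ids
    .map((id) => catalog.get(id))
    // drop IDs that no longer exist in the catalog
    .filter((item) => item !== undefined)
}
```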

recos.mostSimilarUsers('userId').then((results) => {
  // returns an array of the 'similarityZSet' ranked sorted set for the user which
  // represents their ranked similarity to all other users given the
  // Jaccard Coefficient. the value is between -1 and 1. -1 means that the
  // user is the exact opposite, 1 means they're exactly the same.
  // ex. results = ['garyId', 'andrewId', 'jakeId']
})

recos.leastSimilarUsers('userId').then((results) => {
  // same as mostSimilarUsers but the opposite.
  // ex. results = ['timId', 'haoId', 'phillipId']
})

User Statistics

Ratings:

recos.bestRated().then((results) => {
  // returns an array of the 'scoreboard' sorted set which represents the global
  // ranking of items based on the Wilson Score Interval. in short it represents the
  // 'best rated' items based on the ratio of likes/dislikes and cuts out outliers.
  // ex. results = ['iceageId', 'sleeplessInSeattleId', 'theDarkKnightId']
})

recos.worstRated().then((results) => {
  // same as bestRated but in reverse.
})
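
For reference, the lower bound of the Wilson score interval can be computed as in the sketch below. This is the general formula for a count of positive ratings out of a total, not necessarily how recos applies it to valued ratings; the function name and the default z of 1.96 (a 95% interval) are assumptions made for illustration.

```javascript
// Lower bound of the Wilson score interval for `positive` positive
// ratings out of `total` ratings. z = 1.96 corresponds to 95% confidence.
function wilsonLowerBound (positive, total, z = 1.96) {
  if (total === 0) return 0
  const p = positive / total
  const z2 = z * z
  const centre = p + z2 / (2 * total)
  const margin = z * Math.sqrt((p * (1 - p) + z2 / (4 * total)) / total)
  return (centre - margin) / (1 + z2 / total)
}
```

The bound rewards volume as well as ratio: an item with 9 positive ratings out of 10 scores higher than an item with a single positive rating out of 1, which is how outliers with very few ratings get pushed down.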

Rated lists and counts:

recos.mostRated().then((results) => {
  // returns an array of the 'mostLiked' sorted set which represents the global
  // number of likes for all the items. does not factor in dislikes.
})

recos.mostDisrated().then((results) => {
  // same as mostRated but the opposite.
})

recos.ratedBy('itemId').then((results) => {
  // returns an array which lists all the users who rated that item.
})

recos.ratedCount('itemId').then((results) => {
  // returns the number of users who have rated that item.
})

recos.allWatchedFor('userId').then((results) => {
  // returns an array of all the items that user has rated or disrated.
})

Recommendation Engine Components

Jaccard Coefficient for Similarity

There are many ways to gauge the likeness of two users. The original implementation of Recos used the Pearson Coefficient, which is good for measuring discrete values in a small range (e.g. 1-5 stars). However, to optimize for quicker calculations and a simpler interface, Recos instead uses the Jaccard Coefficient, which is useful for measuring binary rating data (e.g. like/dislike). Many top companies, such as YouTube, have gone this route because users were primarily rating things 4-5 or 1. The choice to use Jaccard instead of Pearson was largely inspired by David Celis, who designed Recommendable, the top recommendation engine on Rails. The Jaccard Coefficient also pairs very well with Redis, which is able to union/diff sets of likes/dislikes in O(N).
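
As a minimal illustration of the idea, the plain Jaccard coefficient of two users' rated items is the size of the intersection divided by the size of the union. The sketch below runs in memory on arrays of item IDs; the function name is hypothetical, and the real engine computes this with Redis set operations and may weight ratings differently.

```javascript
// Plain Jaccard coefficient of two lists of item IDs:
// |A intersect B| / |A union B|, a value between 0 and 1.
function jaccard (a, b) {
  const setA = new Set(a)
  const setB = new Set(b)
  if (setA.size === 0 && setB.size === 0) return 0
  let shared = 0
  for (const item of setA) {
    if (setB.has(item)) shared++
  }
  return shared / (setA.size + setB.size - shared)
}

// gary rated two movies, chris rated one of them
console.log(jaccard(['movieId', 'movie2Id'], ['movieId'])) // 0.5
```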

K-Nearest Neighbors Algorithm for Recommendations

To deal with large user bases, it's essential to make optimizations that don't involve comparing every user against every other user. One way to deal with this is the K-Nearest Neighbors algorithm, which lets you compare a user only against their 'nearest' neighbors. After a user's similarity is calculated with the Jaccard Coefficient, a sorted set is created which represents how similar that user is to every other user. The top users from that list are considered their nearest neighbors. Recos uses a default value of 5, but this can easily be changed based on your needs.
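
The neighbor-based step can be sketched in memory as follows. This is a simplified illustration, not the engine's actual code: `recommend` and `overlap` are hypothetical names, ratings are treated as plain sets of item IDs, and candidate items are scored by summing the similarity of the neighbors who rated them, which is an assumption about the weighting.

```javascript
// Simplified k-nearest-neighbors recommendation over in-memory data.
// ratingsByUser maps userId -> array of rated item IDs.
function recommend (ratingsByUser, userId, k = 5, limit = 10) {
  const mine = new Set(ratingsByUser[userId] || [])

  // Jaccard-style overlap between the target user and another user
  const overlap = (items) => {
    const theirs = new Set(items)
    let shared = 0
    for (const item of mine) if (theirs.has(item)) shared++
    const union = mine.size + theirs.size - shared
    return union === 0 ? 0 : shared / union
  }

  // keep only the k most similar users
  const neighbors = Object.keys(ratingsByUser)
    .filter((other) => other !== userId)
    .map((other) => ({ other, score: overlap(ratingsByUser[other]) }))
    .filter((n) => n.score > 0)
    .sort((x, y) => y.score - x.score)
    .slice(0, k)

  // score unseen items by the summed similarity of neighbors who rated them
  const scores = new Map()
  for (const { other, score } of neighbors) {
    for (const item of ratingsByUser[other]) {
      if (!mine.has(item)) scores.set(item, (scores.get(item) || 0) + score)
    }
  }

  return [...scores.entries()]
    .sort((x, y) => y[1] - x[1])
    .slice(0, limit)
    .map(([item]) => item)
}

const data = { garyId: ['movieId', 'movie2Id'], chrisId: ['movieId'] }
console.log(recommend(data, 'chrisId')) // [ 'movie2Id' ]
```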

Redis

When combined with hiredis, Redis can get/set at ~40,000 operations/second using 50 concurrent connections without pipelining. In short, Redis is extremely fast at set math and is a natural fit for a recommendation engine of this scale. Redis is integral to many top companies, such as Twitter, which uses it for its Timeline (replacing Memcached).

Run tests

npm test