3.0.0 • Published 4 years ago

knn-recommender v3.0.0

knn-recommender

A pure JavaScript implementation of a K-nearest-neighbour collaborative filtering recommender, primarily for like/dislike User-Item matrices. You can also use the recommender for Item-Item characteristics matrices. The library lets you provide "You liked this, you might also like this" or "items similar to this item" recommendations.

This library should run in both Node and browser environments. It is an experimental implementation intended for fairly small matrices (~1000 users). If you are looking for a higher-performing (and properly threaded) library, I'd recommend checking out recommendationRaccoon.

The recommender takes a user-item matrix of size X x Y in which the first column (X0) holds the user IDs and the first row (Y0) holds the item labels. The cells in the matrix are expected to contain either -1 (dislike), 0 (no rating given) or 1 (like). The recommender uses these ratings to calculate the similarity of the users in the matrix based on Jaccard similarity.

Example of a possible User-Item matrix:

[
    ['emptycorner', 'item 1', 'item 2', 'item 3', 'item 4',
        'item 5', 'item 6', 'item 7'],
    ['user 1', 1, -1, 0, 0, -1, 1, 0],
    ['user 2', 1, -1, 0, 1, -1, 0, 0]
]
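
The layout described above (a header row of item labels, a first column of user ids, and cells limited to -1, 0 or 1) can be checked before the matrix is handed to the recommender. The following is only a minimal sketch: validateUserItemMatrix is a hypothetical helper that is not part of knn-recommender, and whether the library performs similar checks itself is not stated here.

```javascript
// Hypothetical helper, not part of knn-recommender: verifies the matrix layout
// described in this README (header row, id column, ratings of -1, 0 or 1).
function validateUserItemMatrix(matrix) {
    if (!Array.isArray(matrix) || matrix.length === 0) {
        throw new Error('matrix must be a non-empty array of rows');
    }
    const width = matrix[0].length;
    matrix.forEach((row, rowIndex) => {
        if (row.length !== width) {
            throw new Error(`row ${rowIndex} has ${row.length} cells, expected ${width}`);
        }
        if (rowIndex === 0) {
            return; // the header row holds the item labels, not ratings
        }
        // skip the first cell of each row, it holds the user id
        row.slice(1).forEach((cell, cellIndex) => {
            if (cell !== -1 && cell !== 0 && cell !== 1) {
                throw new Error(`invalid rating ${cell} for ${row[0]} / ${matrix[0][cellIndex + 1]}`);
            }
        });
    });
    return true;
}
```

Calling it with the example matrix above returns true; a row of the wrong length or a cell such as 2 makes it throw.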

Example of a possible Item-Item characteristics matrix:

[
    ['emptycorner', 'characteristic 1', 'characteristic 2', 'characteristic 3'],
    ['item 1', 1, 0, 1],
    ['item 2', 1, 1, 0]
]

Jaccard similarity counts the ratings that two users have in common and divides that count by the number of items that at least one of the two users has rated. Items that neither user has rated are not considered when calculating the similarity.

So if user X has the rating vector (1, -1, 0, 0, -1, 1, 0) and user Y has (1, -1, 0, 1, -1, 0, 0), their Jaccard similarity is (1+1+1) / 5 = 3/5. The numerator 3 is the number of items both users rated identically (items 1, 2 and 5), and the denominator 5 is the number of items that at least one of the two users has either liked or disliked (items 1, 2, 4, 5 and 6). Based on this we could recommend item number 6 to user Y, as Y hasn't expressed any preference for it and X has liked it.
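
The calculation above can be reproduced with a few lines of plain JavaScript. This is a sketch of the described formula, not the library's own code: jaccardSimilarity is a hypothetical name, and counting an item where the two users disagree (1 versus -1) in the denominator only is an assumption drawn from the description.

```javascript
// Sketch of the similarity described above; jaccardSimilarity is a
// hypothetical helper and not part of the knn-recommender API.
function jaccardSimilarity(ratingsA, ratingsB) {
    let agreeing = 0;      // items both users rated identically
    let ratedByEither = 0; // items at least one of the users rated

    for (let i = 0; i < ratingsA.length; i++) {
        const a = ratingsA[i];
        const b = ratingsB[i];
        if (a === 0 && b === 0) {
            continue; // neither user rated this item, so it is ignored
        }
        ratedByEither++;
        if (a === b) {
            agreeing++;
        }
    }
    // Assumption: users without any ratings get similarity 0
    return ratedByEither === 0 ? 0 : agreeing / ratedByEither;
}

const userX = [1, -1, 0, 0, -1, 1, 0];
const userY = [1, -1, 0, 1, -1, 0, 0];

console.log(jaccardSimilarity(userX, userY)); // 0.6, i.e. 3/5
```

The same function also covers characteristics rows that contain only 0's and 1's: (1, 1) compared with (1, 0) gives 1/2.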

This recommender can also work with only non-recommendations and recommendations (0's and 1's), so it's not necessary to provide dislikes (-1).

Installation

npm install --save knn-recommender

You can also download the JavaScript source, knn-recommender.js, directly from this repository. Alternatively, you can download the minified version compiled for npm distribution, which should run in both browser and Node environments.

Basic usages

Basic usage for User-Item matrices

If you have the User-Item matrix available, you can initialize the recommender for all users.

import KNNRecommender from 'knn-recommender';

const kNNRecommender = new KNNRecommender([
    ['emptycorner', 'item 1', 'item 2', 'item 3', 'item 4',
        'item 5', 'item 6', 'item 7'],
    ['user 1', 1, -1, 0, 0, -1, 1, 0],
    ['user 2', 1, -1, 0, 1, -1, 0, 0]
])
kNNRecommender.initializeRecommender().then(() => {
    const userRecommendations = kNNRecommender.generateNNewUniqueRecommendationsForUserId('user 2')
    console.log(`new recommendation for user 2 ${userRecommendations[0].itemId}`)
})

Or you can fill the matrix with items and users one by one and initialize the recommender only for certain users. Initializing the recommender for a single user is significantly faster, so you should do that if you only need recommendations for that particular user.

import KNNRecommender from 'knn-recommender';

const kNNRecommender = new KNNRecommender(null)
kNNRecommender.addNewItemToDataset('item 1')
kNNRecommender.addNewItemToDataset('item 2')
kNNRecommender.addNewEmptyUserToDataset('user 1')
kNNRecommender.addNewEmptyUserToDataset('user 2')

kNNRecommender.addDislikeForUserToAnItem('user 1', 'item 1')
kNNRecommender.addLikeForUserToAnItem('user 1', 'item 2')
kNNRecommender.addDislikeForUserToAnItem('user 2', 'item 1')

kNNRecommender.initializeRecommenderForUserId('user 2')

const user2Recommendations = kNNRecommender.generateNNewUniqueRecommendationsForUserId('user 2')

//should print 'item 2'
console.log(`new recommendation for user 2 ${user2Recommendations[0].itemId}, (recommender user id: ${user2Recommendations[0].recommenderUserId}, similarity with the recommender: ${user2Recommendations[0].similarityWithRecommender})`)

kNNRecommender.addNewItemToDataset('item 3')
kNNRecommender.addLikeForUserToAnItem('user 2', 'item 3')

kNNRecommender.initializeRecommenderForUserId('user 1')

const user1Recommendations = kNNRecommender.generateNNewUniqueRecommendationsForUserId('user 1', {amountOfDesiredNewRecommendations: 1})

//should print 'item 3'
console.log(`new recommendation for user 1 ${user1Recommendations[0].itemId}`)

Basic usage for Item-Item characteristics matrices

import KNNRecommender from 'knn-recommender';

const kNNRecommender = new KNNRecommender(null)

kNNRecommender.addNewItemCharacteristicToDataset('characteristic 1')
kNNRecommender.addNewItemCharacteristicToDataset('characteristic 2')
kNNRecommender.addNewEmptyItemAsRowToDataset('item 1')
kNNRecommender.addNewEmptyItemAsRowToDataset('item 2')

kNNRecommender.addCharacteristicForItem('item 1', 'characteristic 1')
kNNRecommender.addCharacteristicForItem('item 1', 'characteristic 2')

//you can also remove characteristics like this:
//kNNRecommender.removeCharacteristicForItem('item 1', 'characteristic 2')

kNNRecommender.addCharacteristicForItem('item 2', 'characteristic 1')

kNNRecommender.initializeRecommenderForItemId('item 2')

const similarItemsForItem2 = kNNRecommender.getNNearestNeighboursForItemId('item 2', 1)

//should print item 1
console.log(`most similar item with item 2 is ${similarItemsForItem2[0].otherRowId}`)
//should print 0.5
console.log(`similarity score between item 1 and item 2 is ${similarItemsForItem2[0].similarity}`)

kNNRecommender.addNewItemCharacteristicToDataset('characteristic 3')
kNNRecommender.addCharacteristicForItem('item 2', 'characteristic 3')

kNNRecommender.initializeRecommenderForItemId('item 1')

const similarItemsForItem1 = kNNRecommender.getNNearestNeighboursForItemId('item 1', 1)

//should print item 2
console.log(`most similar item with item 1 is ${similarItemsForItem1[0].otherRowId}`)

Sidenote: If you use Node without Babel, you have to import the module like this:

const KNNRecommender = require('knn-recommender');
const kNNRecommender = new KNNRecommender.default(null)
kNNRecommender.addNewItemToDataset('item 1')
...

API

The API methods are described primarily for user-item matrices, but there are similar convenience methods for item-item characteristics matrices as well. You can see the usage of these methods in the item-item characteristics matrix example above.


Each entry lists the method followed by its arguments, return value, description and an example.

KNNRecommender
    Arguments: matrix: Array<Array<string or number>> or null
    Returns: void
    Description: This constructor takes an X x Y user-item matrix (or item-item characteristics matrix) as its argument. The X0 column represents the user ids and Y0 the item labels. The cells in the matrix are expected to contain either -1 (dislike), 0 (no rating given) or 1 (like). The matrix can be null, in which case you can use the addNewItemToDataset and addNewUserToDataset methods to initialize the matrix.
    Example: const kNNRecommender = new KNNRecommender([['emptycorner', 'item 1', 'item 2', 'item 3', 'item 4','item 5', 'item 6', 'item 7'], ['user 1', 1, -1, 0, 0, -1, 1, 0], ['user 2', 1, -1, 0, 1, -1, 0, 0]])

initializeRecommender
    Arguments: no arguments
    Returns: Promise<boolean>
    Description: Initializes the recommender for all users based on the provided user-item matrix so we can start asking it for recommendations. If you add new items or users to the matrix and want the updates to affect the recommendations, you need to run this initialization again. This initialization is a heavy (roughly O(n^3) + O(n * log(n))) operation. The method returns a Promise that resolves to true when the initialization has completed successfully.
    Example: kNNRecommender.initializeRecommender().then(() => {...

initializeRecommenderForUserId
    Arguments: userId: string
    Returns: void
    Description: (Re-)initializes the recommender for one userId only. This is significantly faster than initializing the recommender for all users, so you should use this method when possible.
    Example: kNNRecommender.initializeRecommenderForUserId('user 1')

generateNNewRecommendationsForUserId
    Arguments: userId: string, {amountOfDesiredNewRecommendations: number = 1, amountOfDesiredNearestNeighboursToUse: number = 3, excludingTheseItems = []}
    Returns: Array<Recommendation>
    Description: Tries to generate the desired amount of new recommendations for a user based on what similar users have liked. The method starts with the most similar user and collects all of their likes for items on which the current user hasn't expressed a preference yet. If the desired amount of recommendations hasn't been reached, it proceeds to the second most similar user and so on. The method might add the same recommendation twice if an item has been recommended by several similar users. If you want these potential duplicate recommendations excluded, use the method generateNNewUniqueRecommendationsForUserId instead. You can also give a list of item ids (strings) to be excluded from the results as an argument. Returns an array containing the recommendations, or an empty array if no recommendations can be generated from the data, e.g. [{itemId: 'item 1', recommenderUserId: 'user 3', similarityWithRecommender: 0.6}, {itemId: 'item 1', recommenderUserId: 'user 2', similarityWithRecommender: 0.4}, {itemId: 'item 3', recommenderUserId: 'user 2', similarityWithRecommender: 0.4}, null]
    Example: const userRecommendations = kNNRecommender.generateNNewRecommendationsForUserId('user 3', { amountOfDesiredNewRecommendations: 3, amountOfDesiredNearestNeighboursToUse: 2 }); console.log(`${userRecommendations[0].itemId} ${userRecommendations[0].recommenderUserId} ${userRecommendations[0].similarityWithRecommender}`)

generateNNewUniqueRecommendationsForUserId
    Arguments: userId: string, {amountOfDesiredNewRecommendations: number = 1, amountOfDesiredNearestNeighboursToUse: number = 3, excludingTheseItems = []}
    Returns: Array<Recommendation>
    Description: Tries to generate the desired amount of new recommendations for a user based on what similar users have liked. The method starts with the most similar user and collects all of their likes for items on which the current user hasn't expressed a preference yet. If the desired amount of recommendations hasn't been reached, it proceeds to the second most similar user and so on. The method doesn't add the same recommendation twice even if it would be recommended by several users. If you want these potential duplicate recommendations included, use the method generateNNewRecommendationsForUserId instead. You can also give a list of item ids (strings) to be excluded from the results as an argument. Returns an array containing the recommendations, or an empty array if no recommendations can be generated from the data, e.g. [{itemId: 'item 1', recommenderUserId: 'user 3', similarityWithRecommender: 0.6}, {itemId: 'item 3', recommenderUserId: 'user 2', similarityWithRecommender: 0.4}, null]
    Example: const userRecommendations = kNNRecommender.generateNNewUniqueRecommendationsForUserId('user 3', { amountOfDesiredNewRecommendations: 3, amountOfDesiredNearestNeighboursToUse: 2 }); console.log(`${userRecommendations[0].itemId} ${userRecommendations[0].recommenderUserId} ${userRecommendations[0].similarityWithRecommender}`)

getNNearestNeighboursForUserId
    Arguments: userId: string, amountOfDesiredNeighbours: number = -1
    Returns: Array<Similarity>
    Description: Returns a sorted list of the n most similar users to the given userId. The elements in the list are objects of the form {otherRowId, similarity}, e.g. [{otherRowId: 'User 2', similarity: 0.53}, {otherRowId: 'User 3', similarity: 0.4}]
    Example: const user1ToOtherUsersArray = kNNRecommender.getNNearestNeighboursForUserId('user 1')

getAllRecommendationsForUserId
    Arguments: userId: string
    Returns: Array<string or number>
    Description: Gets all the recommendations for a certain user id. You can use this method together with getNNearestNeighboursForUserId to manually generate recommendations for one user based on the recommendations of other users. Returns e.g. ['user 1', 1, 0, -1, 0]
    Example: const allUserRecommendations = kNNRecommender.getAllRecommendationsForUserId('user 1')

addLikeForUserToAnItem
    Arguments: userId: string, itemId: string
    Returns: void
    Description: Updates the liking value for a certain user and item. NOTE: This method does not invoke an automatic recalculation of the user similarities. You need to trigger that manually, if you wish, by running the initializeRecommender method.
    Example: kNNRecommender.addLikeForUserToAnItem('user 1', 'item 2')

addDislikeForUserToAnItem
    Arguments: userId: string, itemId: string
    Returns: void
    Description: Updates the disliking value for a certain user and item. NOTE: This method does not invoke an automatic recalculation of the user similarities. You need to trigger that manually, if you wish, by running the initializeRecommender method.
    Example: kNNRecommender.addDislikeForUserToAnItem('user 1', 'item 2')

addNewUserToDataset
    Arguments: userRow: Array<string or number>
    Returns: void
    Description: Adds a new user row to the data set. NOTE: This method does not invoke an automatic recalculation of the user similarities. You need to trigger that manually, if you wish, by running the initializeRecommender method.
    Example: kNNRecommender.addNewUserToDataset(['user x', 1, 0, -1, ...])

addNewEmptyUserToDataset
    Arguments: userId: string
    Returns: void
    Description: Convenience method to add an empty user to the data set with only a user id. All the recommendations are initialized with 0. NOTE: This method does not invoke an automatic recalculation of the user similarities. You need to trigger that manually, if you wish, by running the initializeRecommender method.
    Example: kNNRecommender.addNewEmptyUserToDataset('user 3')

addNewItemToDataset
    Arguments: itemId: string
    Returns: void
    Description: Adds a new item to the user-item matrix and initializes all user recommendations with value 0 for the new item. NOTE: This method does not invoke an automatic recalculation of the user similarities. You need to trigger that manually, if you wish, by running the initializeRecommender method.
    Example: kNNRecommender.addNewItemToDataset('item 8')
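
To make the described behaviour of generateNNewUniqueRecommendationsForUserId more concrete, here is a rough sketch of the neighbour walk in plain JavaScript. It is an illustration under assumptions, not the library's actual implementation: sketchUniqueRecommendations is a hypothetical function, the neighbour list is assumed to be already sorted by descending similarity (the shape returned by getNNearestNeighboursForUserId), and the similarity value 0.6 is taken from the Jaccard example earlier in this README.

```javascript
// Sketch only: mirrors the documented behaviour, not the library's real code.
// matrix uses the README layout; neighbours is [{otherRowId, similarity}, ...]
// sorted from the most similar user to the least similar one.
function sketchUniqueRecommendations(matrix, userId, neighbours, amount = 1) {
    const itemLabels = matrix[0];
    const rowsById = new Map(matrix.slice(1).map((row) => [row[0], row]));
    const userRow = rowsById.get(userId);
    const recommendations = [];
    const alreadyRecommended = new Set();
    if (!userRow) {
        return recommendations; // unknown user: nothing to recommend
    }

    for (const { otherRowId, similarity } of neighbours) {
        const neighbourRow = rowsById.get(otherRowId);
        if (!neighbourRow) {
            continue;
        }
        for (let i = 1; i < itemLabels.length; i++) {
            if (recommendations.length >= amount) {
                return recommendations;
            }
            const itemId = itemLabels[i];
            // recommend items the neighbour liked and the user has not rated yet
            if (neighbourRow[i] === 1 && userRow[i] === 0 && !alreadyRecommended.has(itemId)) {
                alreadyRecommended.add(itemId);
                recommendations.push({
                    itemId,
                    recommenderUserId: otherRowId,
                    similarityWithRecommender: similarity
                });
            }
        }
    }
    return recommendations;
}

const exampleMatrix = [
    ['emptycorner', 'item 1', 'item 2', 'item 3', 'item 4', 'item 5', 'item 6', 'item 7'],
    ['user 1', 1, -1, 0, 0, -1, 1, 0],
    ['user 2', 1, -1, 0, 1, -1, 0, 0]
];
const recommendationsForUser2 = sketchUniqueRecommendations(
    exampleMatrix, 'user 2', [{ otherRowId: 'user 1', similarity: 0.6 }], 1
);
console.log(recommendationsForUser2[0].itemId); // item 6
```

Dropping the alreadyRecommended check would give the behaviour described for generateNNewRecommendationsForUserId, where the same item can appear once per recommending neighbour.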

Performance

Performance tests were run on a 2014 MacBook Pro with a 2.2 GHz Quad-Core Intel Core i7 processor and 16 GB of 1600 MHz DDR3 memory.

Initialization times with different size matrices were as listed here:

| Matrix size (users x items) | Initialization time |
| --- | --- |
| 100 x 100 | 40 ms |
| 500 x 500 | 2.8 s |
| 1000 x 1000 | 23 s |
| 1000 x 50 | 1.2 s |
| 50 x 1000 | 80 ms |
| 50 x 10000 | 0.5 s |

Adding more users increases the initialization time sharply, and much more so than adding items: in the measurements above, 1000 users x 50 items took 1.2 s while 50 users x 1000 items took 80 ms. So if you find a way to divide your users into clusters, you can reduce the number of user rows needed to provide recommendations for a certain user and thus provide recommendations faster. Or you can initialize the recommender only for individual users.
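
One way to act on the clustering idea is to build a smaller matrix that contains only the rows of one cluster and pass that to the constructor. The sketch below assumes you already know which user ids belong together; selectUserRows is a hypothetical helper that is not part of knn-recommender, and how the clusters are formed is left open.

```javascript
// Hypothetical helper, not part of knn-recommender: keeps the header row and
// only the rows of the users in the given cluster.
function selectUserRows(matrix, userIds) {
    const wanted = new Set(userIds);
    const header = matrix[0];
    const clusterRows = matrix.slice(1).filter((row) => wanted.has(row[0]));
    return [header, ...clusterRows];
}

const fullMatrix = [
    ['emptycorner', 'item 1', 'item 2', 'item 3'],
    ['user 1', 1, -1, 0],
    ['user 2', 1, 0, 1],
    ['user 3', -1, 1, 0],
    ['user 4', 0, 1, 1]
];

// Assumption: users 1 and 2 were placed in the same cluster by some other means
const clusterMatrix = selectUserRows(fullMatrix, ['user 1', 'user 2']);
console.log(clusterMatrix.length); // 3 (header row + 2 user rows)
```

The resulting clusterMatrix has the same layout as the full matrix, so it can be given to new KNNRecommender(...) in place of the full matrix.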

Contact

Ohto Rainio

Licence

MIT