1.0.2 • Published 11 months ago

nosnaplet v1.0.2


noSnaplet (Snaplet for MongoDB)

Introduction

I wanted a tool similar to Snaplet for MongoDB to anonymize production data for development environments in order to comply with GDPR. This package allows you to import, anonymize, and manage your MongoDB data in a way that ensures privacy and data protection across different environments.

This package is still under development, so there may be bugs, especially in the inter-collection link functionality.

Documentation

nosnaplet documentation site - COMING SOON.

Getting started

To get started with noSnaplet, follow these steps:

  1. Install the MongoDB Database Tools, which provide mongoexport

      Download the MongoDB Database Tools

  2. Install noSnaplet

      npm install -g nosnaplet

  3. Run the noSnaplet CLI with the following command:

      npx nosnaplet fakesnap

Commands

Test your connection to the database

  npx nosnaplet tryconnect

You will be prompted to insert your MongoDB connection URI.

Example prompt:

Enter the connection URI of the mongo database to replicate (production data). Please include '?authSource=admin' in the URI. Here's an example: mongodb://root:example@localhost:27017/?authSource=admin. Enter URI : mongodb://userdb:superpassword@mongo.nosnaplet.dev:27017/?authSource=admin
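The prompt asks for '?authSource=admin' in the URI. As a minimal sketch, a URI can be checked for that parameter before it is pasted into the prompt. The helper name hasAuthSource is hypothetical and is not part of the noSnaplet CLI.

```typescript
// Hypothetical helper, not part of the noSnaplet CLI.
// Returns true when the URI's query string sets authSource to the expected value.
function hasAuthSource(uri: string, expected: string = "admin"): boolean {
  const queryStart = uri.indexOf("?");
  if (queryStart === -1) {
    return false; // no query string at all
  }
  const params = new URLSearchParams(uri.slice(queryStart + 1));
  return params.get("authSource") === expected;
}

console.log(hasAuthSource("mongodb://root:example@localhost:27017/?authSource=admin")); // true
console.log(hasAuthSource("mongodb://root:example@localhost:27017/")); // false
```

The check only looks at the query string, so it does not validate the host, the credentials, or whether the server is reachable; npx nosnaplet tryconnect remains the way to test the connection itself.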

Run export from MongoDB only

  npx nosnaplet snapshot

Insert your MongoDB connection URI when prompted.

Upon completion, you will see: Schemas and links written to .schema-directory

Two folders are created: .output-directory and .schema-directory.

Their structure is described in the File Structure section.

Import and anonymize into MongoDB only

  npx nosnaplet faked

You will be prompted to insert your MongoDB connection URI. Upon completion, you will see the following message:

All document links have been updated.
Folder .schema-directory deleted.
Folder .output-directory deleted.
Closing MongoDB connection.

The logs will also indicate which databases received anonymized data.

If you change directory after taking a snapshot of the production database, copy the .output-directory and .schema-directory folders generated by the snapshot command into your current directory before running this command.

Run all

You can execute both the snapshot and anonymization processes in one step using the following command:

  npx nosnaplet fakesnap

This command runs both npx nosnaplet snapshot (which exports the data) and npx nosnaplet faked (which imports and anonymizes the data) sequentially. It's a convenient way to streamline the process without having to run the commands separately.

File Structure

.
├── .output-directory/
│   ├── db1/
│   │   ├── collection1.json
│   │   └── collection2.json
│   ├── db2/
│   │   └── collection1.json
│   └── ...
└── .schema-directory/
    ├── interfaces.ts
    └── links.json

In .output-directory, each database has its own folder containing one JSON file per collection with that collection's data.
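To inspect what a snapshot contains, the layout shown above can be walked with Node's fs module. This is a small sketch that assumes the <database>/<collection>.json layout; the function name listSnapshot is hypothetical and not something noSnaplet provides.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Maps each database folder to the collection names found inside it.
// Assumes the layout <root>/<database>/<collection>.json.
function listSnapshot(root: string): Record<string, string[]> {
  const result: Record<string, string[]> = {};
  for (const entry of fs.readdirSync(root, { withFileTypes: true })) {
    if (!entry.isDirectory()) continue; // skip stray files at the top level
    const dbDir = path.join(root, entry.name);
    result[entry.name] = fs
      .readdirSync(dbDir)
      .filter((file) => file.endsWith(".json"))
      .map((file) => path.basename(file, ".json"))
      .sort();
  }
  return result;
}

// Only inspect the folder if a snapshot has actually been taken here.
if (fs.existsSync(".output-directory")) {
  console.log(listSnapshot(".output-directory"));
}
```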

interfaces.ts

If you need to customize the types or structure of your data, you can modify the interfaces.ts file located in the .schema-directory. This file defines the schema of your MongoDB collections.

Example

Here is a section of the original interfaces.ts file:

import { ObjectId } from 'mongodb';

interface MS_nosnapletcode_prospectpulse {
  entrepriseprospects: {
    _id: ObjectId;
    nom: string;
  }
}

Customization

You can customize the schema to fit your needs. For instance, if you want to change the type of the nom field from string to number, you would modify the interface as shown below.

import { ObjectId } from 'mongodb';

interface MS_nosnapletcode_prospectpulse {
  entrepriseprospects: {
    _id: ObjectId;
    nom: number;
  }
}

Supported Types

The following types are supported within the interfaces.ts file:

  • string: Represents text data.
  • number: Represents numeric values, including integers and floats.
  • boolean: Represents true/false values.
  • Date: Represents date and time values.
  • ObjectId: Represents a MongoDB ObjectId, used as a unique identifier for documents.
  • Array: Represents an array of values, which can be of any type (e.g., string[], number[]).
  • Nested Objects: You can define nested objects with their own properties and types.
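To illustrate how the supported types combine, here is a sketch of an interface that uses each of them once. The interface name MS_example_db, the collection name customers, and every field name are hypothetical and chosen only for illustration; in a real interfaces.ts these names come from your snapshot.

```typescript
import { ObjectId } from 'mongodb';

// Hypothetical interface: names and fields are illustrative only.
interface MS_example_db {
  customers: {
    _id: ObjectId;      // MongoDB ObjectId
    nom: string;        // text
    age: number;        // integer or float
    actif: boolean;     // true/false
    createdAt: Date;    // date and time
    tags: string[];     // array of strings
    adresse: {          // nested object with its own typed properties
      ville: string;
      codePostal: string;
    };
  };
}
```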

links.json

The links.json file located in the .schema-directory defines the relationships between different collections across databases in your MongoDB setup. It serves as a mapping that helps you maintain referential integrity when importing or anonymizing data. Essentially, it specifies how documents in one collection are linked to documents in another collection through fields that act as references, typically using ObjectId values.

Structure of links.json

The links.json file contains a links array of link objects. Each link object describes a relationship between a field in one collection and a target collection, which may be in another database.

Here is a breakdown of the fields within each link object:

  • field: The name of the field in the source collection that holds the reference to another collection.
  • fromDatabase: The name of the source database where the reference field is located.
  • fromCollection: The name of the source collection where the reference field is located.
  • toDatabase: The name of the target database that contains the collection you are linking to.
  • toCollection: The name of the target collection that the field is referencing.
  • type: The type of the reference, typically ObjectID, indicating that the field holds MongoDB ObjectId values that reference documents in the target collection.

Example of links.json

Here is an example of the content of a links.json file:

{
  "links": [
    {
      "field": "createdBy",
      "fromDatabase": "database1",
      "fromCollection": "entrepriseprospects",
      "toDatabase": "database2",
      "toCollection": "users",
      "type": "ObjectID"
    },
    {
      "field": "organisation",
      "fromDatabase": "database1",
      "fromCollection": "entrepriseprospects",
      "toDatabase": "database2",
      "toCollection": "organisations",
      "type": "ObjectID"
    }
  ]
}
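If you edit links.json by hand, it can help to check its shape before running the import. The sketch below types one link entry from the field list above and rejects entries with missing or non-string properties. The names Link and parseLinks are hypothetical; noSnaplet's own validation, if any, is not documented here.

```typescript
// Shape of one entry, taken from the documented field list.
interface Link {
  field: string;
  fromDatabase: string;
  fromCollection: string;
  toDatabase: string;
  toCollection: string;
  type: string;
}

const LINK_KEYS = [
  "field",
  "fromDatabase",
  "fromCollection",
  "toDatabase",
  "toCollection",
  "type",
] as const;

// Hypothetical helper: parses the text of links.json and throws on malformed entries.
function parseLinks(json: string): Link[] {
  const parsed: unknown = JSON.parse(json);
  const links =
    typeof parsed === "object" && parsed !== null
      ? (parsed as { links?: unknown }).links
      : undefined;
  if (!Array.isArray(links)) {
    throw new Error('links.json must contain a "links" array');
  }
  return links.map((entry, index) => {
    for (const key of LINK_KEYS) {
      if (typeof entry?.[key] !== "string") {
        throw new Error(`links[${index}].${key} must be a string`);
      }
    }
    return entry as Link;
  });
}

const example = JSON.stringify({
  links: [
    {
      field: "createdBy",
      fromDatabase: "database1",
      fromCollection: "entrepriseprospects",
      toDatabase: "database2",
      toCollection: "users",
      type: "ObjectID",
    },
  ],
});

console.log(parseLinks(example)[0].toCollection); // users
```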

Explanation of the first link:

  • Field: createdBy
  • From Database: database1
  • From Collection: entrepriseprospects
  • To Database: database2
  • To Collection: users
  • Type: ObjectID

This link indicates that the createdBy field in the entrepriseprospects collection of the database1 database references a document in the users collection of the database2 database. The reference is an ObjectId.

How links.json Is Used

After anonymizing and importing data, the script can use this file to update the ObjectId fields in the source collection to point to the correct documents in the target collection.
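noSnaplet's actual update logic is not documented here. As a rough in-memory sketch of the idea only, assume that imported target documents end up with new identifiers and that a mapping from old to new identifiers is available (idMap, a hypothetical name). References in the source collection could then be rewritten as follows. Identifiers are plain strings here to keep the example free of dependencies; real data would use ObjectId values.

```typescript
type Doc = Record<string, unknown>;

// Returns a new array. Documents whose reference is found in idMap are copied
// with the new identifier; all other documents are returned as-is.
function relinkDocuments(docs: Doc[], field: string, idMap: Map<string, string>): Doc[] {
  return docs.map((doc) => {
    const current = doc[field];
    if (typeof current !== "string") return doc; // field missing or not a string id
    const replacement = idMap.get(current);
    return replacement === undefined ? doc : { ...doc, [field]: replacement };
  });
}

// Illustrative data only.
const prospects: Doc[] = [
  { _id: "p1", nom: "Acme", createdBy: "old-user-1" },
  { _id: "p2", nom: "Globex", createdBy: "old-user-2" },
];
const userIdMap = new Map([["old-user-1", "new-user-1"]]);

console.log(relinkDocuments(prospects, "createdBy", userIdMap));
```

In this sketch a reference with no entry in the mapping is left unchanged, so dangling references stay visible rather than being silently dropped.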

Customization

You can add or modify entries in links.json if you need to define additional relationships between collections in your databases.

Ensure that the field, fromDatabase, fromCollection, toDatabase, toCollection, and type properties correctly reflect the schema and relationships in your MongoDB setup.
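One way to check this by hand is to compare each link against the databases and collections present in your snapshot. The sketch below is an assumption-laden illustration: findBrokenLinks is a hypothetical helper, and the snapshot map is built manually here rather than read from .output-directory.

```typescript
interface LinkEndpoints {
  fromDatabase: string;
  fromCollection: string;
  toDatabase: string;
  toCollection: string;
}

// `snapshot` maps each database name to the collection names it contains.
// Returns the links whose source or target collection is not in the snapshot.
function findBrokenLinks<T extends LinkEndpoints>(
  links: T[],
  snapshot: ReadonlyMap<string, readonly string[]>,
): T[] {
  const exists = (db: string, collection: string): boolean =>
    snapshot.get(db)?.includes(collection) ?? false;
  return links.filter(
    (link) =>
      !exists(link.fromDatabase, link.fromCollection) ||
      !exists(link.toDatabase, link.toCollection),
  );
}

// Illustrative data: database2 has no "organisations" collection.
const snapshot = new Map([
  ["database1", ["entrepriseprospects"]],
  ["database2", ["users"]],
]);
const links = [
  { field: "createdBy", fromDatabase: "database1", fromCollection: "entrepriseprospects", toDatabase: "database2", toCollection: "users", type: "ObjectID" },
  { field: "organisation", fromDatabase: "database1", fromCollection: "entrepriseprospects", toDatabase: "database2", toCollection: "organisations", type: "ObjectID" },
];

console.log(findBrokenLinks(links, snapshot).map((link) => link.field)); // [ 'organisation' ]
```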

License

This project is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International license (CC BY-NC-ND 4.0).

You are free to share and redistribute the material in any medium or format under the following terms:

  • Attribution: You must give appropriate credit, provide a link to the license, and indicate if changes were made.
  • NonCommercial: You may not use the material for commercial purposes.
  • NoDerivatives: If you remix, transform, or build upon the material, you may not distribute the modified material.

For the full license text, please visit CC BY-NC-ND 4.0 License.