1.2.9 • Published 5 years ago

json-ditto v1.2.9

License: ISC


Ditto

JSON (JavaScript Object Notation) is a lightweight data-interchange format. It is easy for humans to read and write, and easy for machines to parse and generate. When dealing with data integration problems, translating JSON from external formats so that it adheres to an internal representation becomes a vital task. Ditto was created to solve the issue of unifying external data representations.

Ditto parses a mapping file (see mapping rules) and produces a JSON output from the input data to match the output definition. Ditto has three main mapping steps as shown in the diagram below where the output of each step is fed as input to the next one:

  • _preMap: Start the pre-mapping process which runs before the main mapping process to transform the input data
  • _map: Start the unification process based on the manual mapping files. This is the step where the mapping file is read and the data is mapped accordingly
  • _postMap: Start the post-mapping process which will run after the main mapping is done and transform the mapping result (output)
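
The hand-off between the three steps can be pictured with a minimal sketch. The step bodies below are hypothetical stand-ins and not Ditto's internals; only the ordering, with each step consuming the previous step's output, reflects the description above.

```javascript
'use strict';

// Hypothetical stand-ins for the three steps; Ditto's real steps are driven
// by the mapping file and the plugins.
const preMap = (input) => ({ ...input, firstName: input.firstName.trim() });
const map = (input) => ({ name: input.firstName });
const postMap = (output) => ({ ...output, mapped: true });

// Each step receives the output of the previous one.
function runPipeline(input) {
    return [preMap, map, postMap].reduce((data, step) => step(data), input);
}

console.log(runPipeline({ firstName: '  Ada ' }));
```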

How to use Ditto

Ditto exposes a class that can be instantiated with a mapping file and/or a list of plugins. You can either call the unify method with the document you wish to unify, using the mappings and plugins passed to the constructor:

const Ditto = require('json-ditto');

// This is a valid JSON mapping file that follows the mapping rules below
const myCustomMappings = require('./myMappingFile');

// Create a new mapper that will always use the "myCustomMappings" file
const myCustomMapper = new Ditto(myCustomMappings);

// Call the unify function, which reads "documentToBeUnified" and transforms it via the mapping file
myCustomMapper.unify(documentToBeUnified).then((result) => {
    // ... work with the unified result here
});

or you can create a default instance and pass the document with the mappings to the unify function.

const Ditto = require('json-ditto');

// This is a valid JSON mapping file that follows the mapping rules below
const myCustomMappings = require('./myMappingFile');

// Create a default instance and call the unify function, which reads "documentToBeUnified" and transforms it via the mapping file passed alongside it
return new Ditto().unify(myCustomMappings, documentToBeUnified).then((result) => {
    // ... work with the unified result here
});

By default, you can use the built-in plugins provided with Ditto using the syntax defined below in your mapping. However, if you wish to register additional plugins you can use the following addPlugins method or you can pass the plugins to the constructor directly via new Ditto(myCustomMappings, myCustomPlugins).

Note: Adding plugins extends and will not overwrite the default plugins

addPlugins(plugins)

Add an extra set of plugins to the default ones

| Param | Type | Description |
| --- | --- | --- |
| plugins | Object | the extra plugins to be added to the default set of Ditto plugins |
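
As a rough illustration of what "extending the defaults" means, the sketch below merges two plain objects of functions. It is not Ditto's code: the `defaultPlugins` object and the `shout` plugin are hypothetical, and letting the default win on a name clash is an assumption made here, since this document does not say how clashes are resolved.

```javascript
'use strict';

// Hypothetical default plugin set; the real defaults ship with Ditto.
const defaultPlugins = {
    concatName: (...parts) => parts.join(' ')
};

// Returns a new plugin set containing the defaults plus the extras.
// The defaults object is not mutated; on a name clash the default is kept
// (an assumption of this sketch).
function addPlugins(current, extras) {
    return { ...extras, ...current };
}

const plugins = addPlugins(defaultPlugins, {
    shout: (text) => String(text).toUpperCase()
});

console.log(Object.keys(plugins).sort());
```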

_map

The _map function is the main processing step. It takes in the mapping file and processes the rules inside to transform the input object.

The mapping file has to be defined with specific rules that abide by the main mapping function. The mapping function contains the following methods:

processMappings(document, result, mappings)

Process the mappings file and map it to the actual values. It is the entry point to parse the mapping and input files.

| Param | Type | Description |
| --- | --- | --- |
| document | Object | the object document we want to map |
| result | Object | the object representing the result file |
| mappings | Object | the object representing the mappings between the target document and the mapped one |

applyTransformation(path, key)

Apply a transformation function on a path. A transformation is a function that is defined in the mapping file with the @ symbol and is declared in the plugins folder.

| Param | Type | Description |
| --- | --- | --- |
| path | String | the path to pass for the _.get to retrieve the value |
| key | String | the key of the result object that will contain the new mapped value |

An example plug-in:

'use strict';

const _ = require('lodash');

module.exports = function aggregateExperience(experience) {

    let totalExperience = 0;

    // Sum the duration of every entry that has a truthy experienceDuration
    _.each(experience.values, function(entry) {
        if (entry.experienceDuration) totalExperience += entry.experienceDuration;
    });

    // Return null instead of 0 when no duration was found
    return totalExperience === 0 ? null : totalExperience;
};

As you can see from the example above, a plug-in is nothing more than a simple JavaScript function that can take 0+ arguments. These arguments are passed in the mapping file and are separated by |. For example: @concatName(firstName|lastName). The following examples demonstrate the abilities of plug-ins and their definitions:

Arguments can be defined using any syntax accepted by the getValue function described below.

  • @concatName(firstName|lastName): call the function concatName with the values firstName and lastName extracted from their paths in the input file by calling the getValue() function on them
  • @concatName(firstName|>>this_is_my_last_name): call the function concatName with the firstName argument extracted from its path in the input file and the hard-coded value this_is_my_last_name, which is passed as a string
  • @concatName(firstName|lastName|*>>default): call the function concatName with three arguments: firstName, lastName and a default value. Default values are arguments that have a * prepended to them. Default arguments also follow the getValue syntax and can be either hard-coded or extracted from the input file or the result file.
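
To make the argument syntax concrete, here is a small sketch that splits a flat plug-in call into its function name and classified arguments. The `parsePluginCall` helper and the `kind` labels are hypothetical names used for illustration; this is not how Ditto parses mappings internally, and nested calls are deliberately not handled.

```javascript
'use strict';

// Split a flat plug-in call such as "@concatName(firstName|*>>default)" into
// its function name and classified arguments. Nested calls are not handled.
function parsePluginCall(expression) {
    const match = /^@(\w+)\((.*)\)$/.exec(expression);
    if (!match) return null;

    const [, name, rawArgs] = match;
    const args = rawArgs === '' ? [] : rawArgs.split('|').map((raw) => {
        const isDefault = raw.startsWith('*');          // "*" marks a default value
        const token = isDefault ? raw.slice(1) : raw;
        if (token.startsWith('>>')) return { kind: 'literal', value: token.slice(2), isDefault };
        if (token.startsWith('!')) return { kind: 'resultPath', value: token.slice(1), isDefault };
        return { kind: 'inputPath', value: token, isDefault };
    });

    return { name, args };
}

console.log(parsePluginCall('@concatName(firstName|lastName|*>>default)'));
```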

If you do not know in advance how many arguments will be passed, we recommend using the built-in JavaScript arguments keyword to extract the arguments and process them accordingly

function concatName() {
	return _.flatten(_.values(arguments)).join(' ').replace(/\s\s+/g,' ').trim();
}
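
The same idea can be written with rest parameters, which is a common alternative to the arguments keyword in current JavaScript. The sketch below is a Lodash-free variant written for illustration, not the plug-in shipped with Ditto.

```javascript
'use strict';

// Variadic plug-in written with rest parameters instead of `arguments`.
function concatName(...parts) {
    return parts
        .flat()                 // accept arrays of names as well as plain strings
        .join(' ')              // null/undefined entries become empty strings
        .replace(/\s\s+/g, ' ') // collapse the gaps left by empty entries
        .trim();
}

console.log(concatName('Ada', undefined, 'Lovelace')); // Ada Lovelace
```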

Functions in Functions

Sometimes you may want to pass the value of a function into another function as a parameter. You can do this easily by calling the function name inside the arguments. However, an important thing to note is that if an inner function call takes more than one parameter, the parameters have to be separated by a comma , rather than the usual |.

Examples:

"type"   : "@getLinkType(value|@getLinkService(value,service))",
"value"  : "@cleanURI(value|@getLinkType(value,@getLinkService(value,service)))",

Plugins Activation

The plugins are activated in the /ditto/plugins/plugins.js file by adding the plugin name (which corresponds exactly to the .js file name of the definition) to the exported plugins object. The plugin will be required and exported so that it can be used in the main mapping function in the interface.

'use strict';

module.exports = {
	aggregateExperience              : require('./aggregateExperience'),
	assignIds                        : require('./assignIds'),
	assignUniqueIds                  : require('./assignUniqueIds')
	....

getValue(path) ⇒ Object

Returns: Object - result the object representing the result file

| Param | Type | Description |
| --- | --- | --- |
| path | String | the path to pass for the _.get to retrieve the value |

This function will get the value using the _.get by inspecting the scope of the get. The getValue works on two scopes:

  • The input file which is the main file we wish to transform
  • The result file which is the file that contains the result of transforming the input file. Often, we need to reference the result file in our mapping. We do that by prepending the path with ! so it defines a local scope in the result object rather than the input document.

The formats acceptable by the getValue are:

  • Starts with !: This denotes that the context for the _.get is a previously extracted value in the result file
  • Starts with >>: This means a hard-coded value e.g., >>test -> test
  • Starts with @: This means that a function will be applied on the value before the @ sign
  • Starts with @!: We are passing a built-in JavaScript function that will be executed e.g., @!new Date()
  • Contains %: This denotes a casting function applied to the value using eval e.g., >>%true -> will be true as a boolean and not as a string
  • Contains ||: This means a fall-back to a value. The fall-back can be either a hard-coded value e.g., something_falsy||>>test -> test or a reference to a path
  • Contains ??: This means that we are applying a condition before we assign the value
  • Contains *: If prepended to the last parameter, this acts as a default value for the function and that value will be assigned automatically
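
The sketch below illustrates three of these formats (input paths, ! result paths, and >> hard-coded values with || fall-backs) in isolation. It is a simplified stand-in, not Ditto's getValue: `getPath` is a hypothetical helper that mimics a small subset of _.get, and plug-ins, casting, conditions and defaults are left out.

```javascript
'use strict';

// Tiny stand-in for _.get: supports "a.b[0].c" style paths only.
function getPath(object, path) {
    return path
        .replace(/\[(\d+)\]/g, '.$1')
        .split('.')
        .filter(Boolean)
        .reduce((value, key) => (value == null ? undefined : value[key]), object);
}

// Resolve a path against the input document or, with "!", the result so far.
function resolve(path, input, result) {
    if (path.includes('||')) {
        // Fall-back: return the first alternative that resolves to a truthy value.
        for (const option of path.split('||')) {
            const value = resolve(option, input, result);
            if (value) return value;
        }
        return undefined;
    }
    if (path.startsWith('>>')) return path.slice(2);                 // hard-coded value
    if (path.startsWith('!')) return getPath(result, path.slice(1)); // result scope
    return getPath(input, path);                                     // input scope
}

const input = { firstName: 'Ada', links: [{ url: 'https://example.com' }] };
const result = { fullName: 'Ada Lovelace' };

console.log(resolve('links[0].url', input, result));
console.log(resolve('!fullName', input, result));
console.log(resolve('nickname||>>nickname_not_found', input, result));
```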

Conditional Assignment

Conditional assignment is what happens when a ?? is present in the mapping path. This is very useful as it restricts assigning the value unless a condition is met.

Example: value: value??keys[0]#==#>>f5e32a6faaa7ead6ba201e8fa25733ee

This means that we want to assign the value path from the input document to the result document only if the keys[0] element (the first element in the keys array in the input document) is equal to the hard-coded string "f5e32a6faaa7ead6ba201e8fa25733ee"
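
A minimal sketch of how such a rule could be evaluated is shown below. The `conditionalValue` and `operand` helpers are hypothetical, only the == operator from the example is supported, and the comparison uses strict equality, which is an assumption of this sketch rather than documented Ditto behaviour.

```javascript
'use strict';

// Minimal path lookup standing in for _.get ("a.b[0]" style paths only).
const getPath = (object, path) => path
    .replace(/\[(\d+)\]/g, '.$1')
    .split('.')
    .reduce((value, key) => (value == null ? undefined : value[key]), object);

// A ">>" operand is a hard-coded string, anything else is an input path.
const operand = (token, input) =>
    (token.startsWith('>>') ? token.slice(2) : getPath(input, token));

// Evaluate "<path>??<left>#==#<right>": return the value only if the
// condition holds. Only the "==" operator from the example is supported.
function conditionalValue(rule, input) {
    const [valuePath, condition] = rule.split('??');
    const [left, operator, right] = condition.split('#');
    if (operator !== '==') throw new Error(`Unsupported operator: ${operator}`);
    return operand(left, input) === operand(right, input)
        ? getPath(input, valuePath)
        : undefined;
}

const rule = 'value??keys[0]#==#>>f5e32a6faaa7ead6ba201e8fa25733ee';
console.log(conditionalValue(rule, { value: 'kept', keys: ['f5e32a6faaa7ead6ba201e8fa25733ee'] }));
console.log(conditionalValue(rule, { value: 'dropped', keys: ['something-else'] }));
```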

Mapping Rules:

Mapping "flat" structures is straightforward. For example:

	"name"                    : "firstName",
	"nickname"                : "nickname||>>nickname_not_found",
	"fullName"                : "@concatName(firstName|lastName)",
	"fullNameDefault"         : "@concatName(firstName|*!fullName)",
	"fullNameDefaultHardcoded": "@concatName(firstName|lastName|*>>default)",
	"completeName"            : "@concatName(firstName|!fullName)",
	"displayName"             : "!fullName",
	"email": {
		"value": "email"
	}

Here we are parsing a flat structure directly and creating objects out of it. For example, we will now have the email value defined as an object email:{value:"test@email.com"} instead of what it was in the input file, email:"test@email.com"
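
The reshaping of email from a string into an object can be reproduced with a few lines of plain JavaScript. The `mapFlat` function below is a hypothetical, heavily reduced mapper that only understands top-level input keys and nested mapping objects; plug-ins and the !, >> and || syntaxes are omitted.

```javascript
'use strict';

// Walk a mapping object: string leaves are read as top-level input keys,
// nested objects produce nested output. Plug-ins, "!" and ">>" are omitted.
function mapFlat(input, mappings) {
    const output = {};
    for (const [key, rule] of Object.entries(mappings)) {
        output[key] = typeof rule === 'string' ? input[rule] : mapFlat(input, rule);
    }
    return output;
}

const mappings = { name: 'firstName', email: { value: 'email' } };
const input = { firstName: 'Ada', email: 'test@email.com' };

console.log(JSON.stringify(mapFlat(input, mappings)));
// {"name":"Ada","email":{"value":"test@email.com"}}
```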

However, things can become a bit more complex when we are trying to create complex structures like arrays or objects. Defining these structures requires various extra parameters in the mapping file:

  • output: This will define the output path type whether it is an array [] or an object {}
  • key: This is a required field only when the output is set to be an object {}, as assigned objects need to have a key defined
  • innerDocument: Since we are creating a "collection" we are most probably looping inside of a collection as well. The innerDocument property tells the mapper on which collection to loop. However, if the innerDocument is set to ! then this means that the innerDocument scope is the current input document root.
  • prerequisite (optional): This defines a condition that has to be met before a parsed result is pushed or assigned to the collection. The prerequisite works on the already extracted result, so it will be defined for example as !!innerResult.value, where innerResult is always taken in the context of the mapping
  • required (optional): Similar to prerequisite, this defines a condition that has to be met before the result is pushed. However, this examines the data after it has been processed, while the prerequisite works directly on the innerResult object.
  • requirements (optional): Similar to required, this works on the result after it has been assigned. For example, this can be a check to make sure that the resulting array or object contains unique values
  • mappings: This defines how you want to map each object that will be pushed to the collection. The mappings here are relative to the innerDocument path. For example, if the innerDocument is defined as experience and the mappings object has name: "companyName" that means that companyName is a property inside of experience object.

Mapping

"social_media_addresses": {
    "output": [],
    "innerDocument": "linksv2.values",
    "prerequisite": "!!innerResult.value",
    "requirements": ["@uniqueArray(!social_media_addresses|>>value)", "@transformTwitterHandle(!social_media_addresses)"],
    "mappings": {
        "value": "value??type#==#>>social"
    }
}

Result

"social_media_addresses": [{
    "value": "@ahmadaassaf"
}]
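
For readers who find the declarative form hard to follow, the sketch below spells out the same steps as hand-written JavaScript. The input document is hypothetical (only its shape is taken from the mapping), the function name is made up, and the Twitter handle transformation is not reproduced, so the output differs from the result shown above.

```javascript
'use strict';

// Hypothetical input; only the shape (linksv2.values with type/value) is
// taken from the mapping above.
const input = {
    linksv2: {
        values: [
            { type: 'social', value: 'https://twitter.com/example' },
            { type: 'website', value: 'https://example.com' },
            { type: 'social', value: 'https://twitter.com/example' }
        ]
    }
};

function mapSocialMediaAddresses(document) {
    const output = [];                                 // "output": []
    for (const item of document.linksv2.values) {      // "innerDocument"
        // "mappings": value is assigned only when type equals "social"
        const innerResult = { value: item.type === 'social' ? item.value : undefined };
        if (!innerResult.value) continue;              // "prerequisite"
        output.push(innerResult);
    }
    // "requirements": keep one entry per distinct value (uniqueness only;
    // the Twitter handle transformation is not reproduced here)
    const seen = new Set();
    return output.filter(({ value }) => !seen.has(value) && seen.add(value));
}

console.log(mapSocialMediaAddresses(input));
```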

Mapping FAQs:

  • How can I point Ditto to a deeply nested Object?

    Ditto uses Lodash's _.get, which means that you can pass any path in the form of a String or an Array e.g., a.b.c[0].d or [a, b, c[0], d]

  • How can I iterate over a nested Object?

    To iterate over a sub-path, you need to define an innerDocument. The inner document path is again parsed with _.get, so it can be as complex as needed. However, as the structure of Ditto requires that an innerDocument be defined when creating an array or Object of Objects, you can refer to the current document root with !

  • I see some paths prefixed with !. What does that mean?

    Sometimes you need to access already parsed values (as in values in your result file). This is seen for example when we are trying to create the keys array from the already generated ids. In that case, the ! prefixed path e.g., !links.values will refer to the already extracted links.values Object in the result file

  • If I want to extract data from multiple places for the same Object, how can I do that?

    Ditto allows you to specify multiple Objects as a parsing target. For example, if you are creating an Object and you want to have the values extracted from multiple places, then you define your values as an array of objects where each Object will have output, innerDocument, etc. (you can check the contacts.v2.js sample mapping file). However, if you are creating an Object without values then your direct mapping will be an array of Objects (check the test.js sample mapping file and see the social_links mapping)

  • If I am creating an Object of Objects, each Object should have a key. How can I define that?

    For an Object of Objects (that is, when the output is defined as {}) you need to define a key object. The key object is an array where you define the various targets that will be parsed as a key. The key is defined either as a relative path to the currently parsed Object or as a function call e.g., "key": "@generateId($key|degree)"

  • If I am iterating over an array or an object, can I have access to the array value and index, or the object key?

    Yes, definitely. These variables can be accessed via $value, which refers to the value of the object or the array element, or $key, which refers to the Object key or the array element index

  • In functions, how can I pass a string?

    In the same way we hard-code values by prepending >>, you can pass any String to the function e.g., @getImageLink(>>http://photo.com/|!fullName) where we pass the url http://photo.com as the first parameter

Check the test files for complete coverage of the use cases.

Plugins

base64Encode(input) ⇒ String

Base64-encode a text string

Kind: global function

| Param | Type |
| --- | --- |
| input | String |
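
As an illustration of what such a plug-in can look like in Node.js, here is a minimal sketch built on the standard Buffer API. It is not the source of Ditto's own base64Encode; treating the input as UTF-8 is an assumption.

```javascript
'use strict';

// Base64-encode a string with Node's Buffer API (input treated as UTF-8).
function base64Encode(input) {
    return Buffer.from(String(input), 'utf8').toString('base64');
}

console.log(base64Encode('hello')); // aGVsbG8=
```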

cleanEmail(source) ⇒ String

Normalize an email by converting it to all lowercase. This will be extended in the future with more robust email validation

Kind: global function

| Param | Type | Description |
| --- | --- | --- |
| source | String | the string we want to clean out |

cleanString(source) ⇒ String

Clean a string from special characters, HTML tags and trailing dots and commas

Kind: global function

| Param | Type | Description |
| --- | --- | --- |
| source | String | the string we want to clean out |

cleanURI(source) ⇒ String

Clean a URI from special characters, HTML tags and trailing dots and commas

Kind: global function

| Param | Type | Description |
| --- | --- | --- |
| source | String | the URI we want to clean out |

concatName() ⇒ String

Concatenate a string with one or more other strings separated by a space. Since we might be passing one or more (n) strings, we will use arguments

Kind: global function

concatString() ⇒ String

Concatenate a string with one or more other strings

Kind: global function

isValidString(str)

A string is considered valid if it is a string and is not empty

Kind: global function

| Param | Type |
| --- | --- |
| str | String |

concatWithComma() ⇒ String

Concatenate a string with one or more other strings and join them using a comma and a space.

Kind: global function

createURL(url, source) ⇒ String

Create a URL from the passed parameters

Kind: global function

| Param | Type | Description |
| --- | --- | --- |
| url | String | the main url base |
| source | String | the string to concatenate to the base url |

extractName(fullName, position) ⇒ String/Array

Extract the first name of a contact as it is a required field

Kind: global function
Returns: String/Array - Returns the extracted firstName or lastName as Strings or the middleName(s) as an array

| Param | Type | Description |
| --- | --- | --- |
| fullName | String | the contact fullname |
| position | String | the position of the name to extract (firstName, lastName, middleName) |

formatDate(date, format, isUtc) ⇒ String

Format a date according to parameters

Kind: global function

| Param | Type | Default | Description |
| --- | --- | --- | --- |
| date | Date |  |  |
| format | String |  | Format of the date. |
| isUtc | Boolean | true | If timezone should be utc or not |

generateCleanId() ⇒ String

Create an md5 hash based on concatenating the passed String values. Since we might be passing one or more (n) strings, we will use arguments

Kind: global function
Returns: String - result the concatenated cleaned string

generateFacebookImageLink(Facebook) ⇒ String

Generate a link for the Facebook profile photo based on the facebook ID

Kind: global function

| Param | Type | Description |
| --- | --- | --- |
| Facebook | String | profile ID |

generateId() ⇒ String

Create an md5 hash based on concatenating the passed String values. This function takes multiple arguments that are extracted via the arguments keyword

Kind: global function
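
The general technique (hash the concatenation of all arguments) can be sketched with Node's built-in crypto module. This is an illustration only and not Ditto's generateId: joining the values with an empty string, and doing no cleaning beforehand, are assumptions of the sketch.

```javascript
'use strict';

const crypto = require('crypto');

// Hash the concatenation of all arguments. Joining with an empty string is
// an assumption of this sketch.
function generateId(...values) {
    return crypto.createHash('md5').update(values.join('')).digest('hex');
}

console.log(generateId('a', 'b', 'c')); // 900150983cd24fb0d6963f7d28e17f72
```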

generateIdForLinks(source) ⇒ String

Create an md5 hash based on concatenating the passed String values for links. The function cleans the URIs before creating the MD5 hash

Kind: global function

| Param | Type | Description |
| --- | --- | --- |
| source | String | the URI we want to clean out |

generateIdFromLanguageCode(languageCode) ⇒ String

Language id generation is done on the value of the language. This function generates the id from a language ISO code by first looking up the language value and then generating the id from that value

Kind: global function

| Param | Type | Description |
| --- | --- | --- |
| languageCode | String | The language code |

generateUUID() ⇒ String

Create a random UUID value

Kind: global function

getCountryCode(countryCode, country) ⇒ String

Get the ISO3 country code from the ISO2 country code and the country name

Kind: global function
Returns: String - the ISO3 country code

| Param | Type | Description |
| --- | --- | --- |
| countryCode | String | the ISO2 country code |
| country | String | the country name |

getCountryName(countryCode) ⇒ String

Get the country name given the country ISO3 code provided

Kind: global function
Returns: String - The country name

| Param | Type | Description |
| --- | --- | --- |
| countryCode | String | The ISO3 Country Code |

getLanguageCode(source) ⇒ String

Get the language code and normalize the displayName of the language as well

Kind: global function
Returns: String - the language ISO code

| Param | Type | Description |
| --- | --- | --- |
| source | String | the language display name |

getLanguageFromCode(source) ⇒ String

Get the language displayName from Code

Kind: global function

| Param | Type | Description |
| --- | --- | --- |
| source | String | the language code |

getLinkService(source, service) ⇒ String

Identify the service provider of the link

Kind: global function

| Param | Type | Description |
| --- | --- | --- |
| source | String | the link URI we wish to examine |
| service | String | the link service name |

getLinkType(source) ⇒ String

Identify if the link is for a social website

Kind: global function

| Param | Type | Description |
| --- | --- | --- |
| source | String | the link URI we wish to examine |

getValueAtPath(object, path) ⇒ any

Simple wrapper for lodash get.

Kind: global function
Returns: any - The value returned or undefined.

| Param | Type | Description |
| --- | --- | --- |
| object | Object | The object to query. |
| path | Array \| String | Path of the property to get. |

Example

{a: {b: 1}} => ['a', 'b'] => 1

minBy(array, path) ⇒ any

Return the min value (numerical or by character code) in array. If only array is passed it is assumed that array is of numbers or strings. If path is passed it is assumed that array is of objects, and value that path resolves to is used.

Kind: global function
Returns: any - Min value or undefined.

| Param | Type | Description |
| --- | --- | --- |
| array | Array | Description. |
| path | string | Path to prop in object. |

Example

[1,2] => 1

normalizeString(source) ⇒ String

Normalize a string by removing special characters, HTML tags and trailing dots and commas, and lower-casing it

Kind: global function

| Param | Type | Description |
| --- | --- | --- |
| source | String | the string we want to clean out |

parseDate(date, month, day) ⇒ Date

Accepts a string or a Date object as input, checks its validity, and returns it either as a Date object or as null

Kind: global function

| Param | Type | Description |
| --- | --- | --- |
| date | String | the date we wish to transform |
| month | String | the month if found to be added to the parsed date |
| day | String | the day if found to be added to the parsed date |

parseString(source) ⇒ String

Convert a value into a string by concatenating it with an empty space. A known issue is that we will lose double precision when converting to a string (check the tests)

Kind: global function

| Param | Type | Description |
| --- | --- | --- |
| source | String | the string we wish to transform |

splitList(input) ⇒ String | null

Split a list of items into an array

Kind: global function

| Param | Type | Description |
| --- | --- | --- |
| input | String \| null | string to split |

uniqueArray(target) ⇒ Array

Ensure array elements are unique

Kind: global function

| Param | Type | Description |
| --- | --- | --- |
| target | Array | the target array whose elements we want to make unique |