1.0.15 • Published 2 years ago

text-essence v1.0.15

License
MIT

Text Essence


Remove non-alphanumeric characters from any Unicode string. Optionally, also remove diacritical marks.

Essence

TextEssence.essence returns the essence of a string: only its alphanumeric characters, converted to lower case:

const TextEssence = require('text-essence');
let city = TextEssence.essence('Saint-Étienne');
// 'saintétienne'

Optionally, you can also remove diacritical marks:

const TextEssence = require('text-essence');
let aggressiveTextEssence = new TextEssence({ removeDiacriticalMarks: true });
let city = aggressiveTextEssence.essence('Saint-Étienne');
// 'saintetienne'
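
To make the idea concrete, here is a minimal sketch of how an essence could be computed with plain JavaScript. It is not the text-essence source code: the function name sketchEssence, the use of Unicode normalization, and the exact character classes are assumptions chosen for illustration.

```javascript
// Hypothetical helper for illustration; not the text-essence implementation.
function sketchEssence(text, { removeDiacriticalMarks = false } = {}) {
  // NFD splits 'É' into 'E' plus a combining accent; NFC keeps it as one letter.
  let result = text.normalize(removeDiacriticalMarks ? 'NFD' : 'NFC');
  if (removeDiacriticalMarks) {
    // Drop the combining marks that NFD exposed.
    result = result.replace(/\p{M}/gu, '');
  }
  // Keep only Unicode letters and numbers, then convert to lower case.
  return result.replace(/[^\p{L}\p{N}]/gu, '').toLowerCase();
}

console.log(sketchEssence('Saint-Étienne'));
// 'saintétienne'
console.log(sketchEssence('Saint-Étienne', { removeDiacriticalMarks: true }));
// 'saintetienne'
```

The sketch relies on Unicode property escapes (\p{L}, \p{N}, \p{M}), which need a reasonably recent JavaScript engine. The real package may treat edge cases differently.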

Identical

TextEssence.identical lets you check whether two strings are essentially the same:

const TextEssence = require('text-essence');
let sadlyFalse = 'Saint-Étienne' === 'Saint–Étienne';
// false
let happilyTrue = TextEssence.identical('Saint-Étienne', 'Saint–Étienne');
// true

You can also ignore diacritical marks. This should increase recall (more spelling variants match) at the likely cost of precision (distinct words may be treated as the same):

const TextEssence = require('text-essence');
let aggressiveTextEssence = new TextEssence({ removeDiacriticalMarks: true });
let sadlyFalse = 'Saint-Étienne' === 'Saint-Etienne';
// false
let happilyTrue = aggressiveTextEssence.identical('Saint-Étienne', 'Saint-Etienne');
// true
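
The recall and precision trade-off can be illustrated with a small sketch. The helpers strip, stripMarks and sketchIdentical are hypothetical stand-ins written for this example, assuming that two strings count as identical when their essences are equal; they are not the package's own code.

```javascript
// Hypothetical helpers for illustration; not the text-essence implementation.
const strip = (text) =>
  text.normalize('NFC').replace(/[^\p{L}\p{N}]/gu, '').toLowerCase();
const stripMarks = (text) =>
  strip(text.normalize('NFD').replace(/\p{M}/gu, ''));

// Two strings are "identical" when their essences are equal.
const sketchIdentical = (a, b, essence = strip) => essence(a) === essence(b);

// Hyphen vs. en dash: punctuation differences disappear.
console.log(sketchIdentical('Saint-Étienne', 'Saint–Étienne')); // true
// Higher recall: accented and unaccented spellings now match.
console.log(sketchIdentical('Saint-Étienne', 'Saint-Etienne', stripMarks)); // true
// Lower precision: 'pâte' (dough) and 'pâté' are different French words.
console.log(sketchIdentical('pâte', 'pâté')); // false
console.log(sketchIdentical('pâte', 'pâté', stripMarks)); // true
```

Whether the aggressive comparison is appropriate depends on the data: it tends to suit matching user-entered place names, and is riskier where accents distinguish words.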

Essential Hash

TextEssence.essentialHash gives you the hash of the essence of a string:

const TextEssence = require('text-essence');
let hash = TextEssence.essentialHash('Saint-Étienne');
// 'bb35a91d6f2bd7ba807fdd240cc838bdd3b20fe1a6fdedba60d941fbe8d5c10f'

By default, it uses the sha256 algorithm, but you can pick a different one with the hashAlgorithm option:

const TextEssence = require('text-essence');
let sha1TextEssence = new TextEssence({ hashAlgorithm: 'sha1' });
let hash = sha1TextEssence.essentialHash('Saint-Étienne');
// '03f938949514a56c863252d3559d0fd92d40720e'
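
As a rough sketch of the idea, hashing an essence can be done with Node's built-in crypto module. The function sketchEssentialHash and its essence step are assumptions made for this example; because the package's exact normalization and encoding are not shown here, the digests it produces are not guaranteed to match the values above.

```javascript
const crypto = require('crypto');

// Hypothetical helper for illustration; not the text-essence implementation.
function sketchEssentialHash(text, hashAlgorithm = 'sha256') {
  // Reduce the string to lower-case letters and numbers first.
  const essence = text
    .normalize('NFC')
    .replace(/[^\p{L}\p{N}]/gu, '')
    .toLowerCase();
  // Hash the UTF-8 bytes of the essence and return a hex digest.
  return crypto.createHash(hashAlgorithm).update(essence, 'utf8').digest('hex');
}

const withHyphen = sketchEssentialHash('Saint-Étienne');
const withEnDash = sketchEssentialHash('Saint–Étienne');
console.log(withHyphen === withEnDash); // true
console.log(withHyphen.length); // 64
console.log(sketchEssentialHash('Saint-Étienne', 'sha1').length); // 40
```

Because essentially identical strings share a hash, such a digest can serve as a lookup key for deduplicating records.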