1.0.3 • Published 4 years ago

accentize v1.0.3

Weekly downloads
86
License
MIT
Repository
github

Accentize

Transform any string into an accentized regex that matches every accented variant of that string, for filtering, querying, searching, and similar tasks.

Installation

$ yarn add accentize

or with npm

$ npm install --save accentize

Usage

Turn a string into an accentized regex:

const accentize = require('accentize');

accentize("hello world")
// returns the regex: /\s*[hⓗhĥḣḧȟḥḩḫẖħⱨⱶɥ][eⓔeèéêềếễểẽēḕḗĕėëẻěȅȇẹệȩḝęḙḛɇɛǝ][lⓛlŀĺľḷḹļḽḻſłƚɫⱡꝉꞁꝇ][lⓛlŀĺľḷḹļḽḻſłƚɫⱡꝉꞁꝇ][oⓞoòóôồốỗổõṍȭṏōṑṓŏȯȱöȫỏőǒȍȏơờớỡởợọộǫǭøǿɔꝋꝍɵ]\s*\s*[wⓦwẁẃŵẇẅẘẉⱳ][oⓞoòóôồốỗổõṍȭṏōṑṓŏȯȱöȫỏőǒȍȏơờớỡởợọộǫǭøǿɔꝋꝍɵ][rⓡrŕṙřȑȓṛṝŗṟɍɽꝛꞧꞃ][lⓛlŀĺľḷḹļḽḻſłƚɫⱡꝉꞁꝇ][dⓓdḋďḍḑḓḏđƌɖɗꝺ]\s*/i

Test whether a string is an accented version of another:

const accentize = require('accentize');

let accentizedString = accentize("hello world") 
accentizedString.test("hello world") // true
accentizedString.test("hèllô wórld") // true
accentizedString.test("hêlló world") // true

Filter an array whose strings may contain accents:

const accentize = require('accentize');

let arrayToFilter = [{name: "João Luís"}, {name: "Mária Ríta"}, {name: "Ísis Môàna"}]
let accentizedString = accentize("joao luis")

let filteredArray = arrayToFilter.filter(user => user.name.match(accentizedString))
// [{"name":"João Luís"}]

With MongoDB queries:

Suppose you have a MongoDB collection of users that contains the following documents:

db.getCollection("users") 
// 
[
  {_id: "5ef689", name: "João Luís"},
  {_id: "5efkl9", name: "Mária Ríta"}, 
  {_id: "5ef6a8", name: "Ísis Môàna"}
]

You want to find the user named Ísis Môàna by searching for isis moana. Pass the accentized regex as the query value:

const accentize = require('accentize');

db.users.find({"name": accentize("isis moana")})
// {_id: "5ef6a8", name: "Ísis Môàna"}

db.users.find({"name": accentize("is oana", true)}) // partial input works when the second param (findAll) is true

Note the second argument, true, in the second example. It is the findAll param, which adds .* to the accentized regex so that the database also matches partial input, much like a SQL LIKE '%...%' query.
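Some setups cannot carry a RegExp object in the query, for example when the filter is serialized to JSON before it reaches the database. MongoDB also accepts the $regex operator with a pattern string and an $options string. The sketch below shows one way to convert a regex into that form. toRegexQuery is a hypothetical helper, not part of accentize, and the short regex literal only stands in for the much longer value accentize would return.

```javascript
// Hypothetical helper (not part of accentize): converts a RegExp into
// MongoDB's { $regex, $options } form so the filter can be serialized as JSON.
function toRegexQuery(regex) {
  return {
    $regex: regex.source,
    // Keep only flags that MongoDB's $options is known to accept (i, m, s, x);
    // JavaScript-only flags such as "g" are dropped.
    $options: regex.flags.replace(/[^imsx]/g, ''),
  };
}

// Stand-in for the regex returned by accentize("isis moana"),
// shortened here to keep the example readable.
const accentized = /[iíì]s[iíì]s\s*m[oôó][aàá]n[aàá]/i;

const filter = { name: toRegexQuery(accentized) };
console.log(JSON.stringify(filter));
```

The resulting filter object can be passed to find() in place of the RegExp itself.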

Why

The most common way to deal with accents is to strip them from the text first and then compare or process the result, like this:

let normalizedString = someFunctionToRemoveAccent("hèllô wórld") // returns "hello world"
normalizedString === "hello world" // true
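For comparison, here is a minimal sketch of what such an accent-removing function could look like. removeAccents is a hypothetical name; the sketch relies only on the standard String.prototype.normalize method and assumes that dropping Unicode combining marks is enough for your data.

```javascript
// Hypothetical helper: removes accents by decomposing each character
// and then dropping the combining diacritical marks.
function removeAccents(text) {
  return text
    .normalize('NFD')                 // "è" becomes "e" + combining grave accent
    .replace(/[\u0300-\u036f]/g, ''); // strip the combining marks
}

const normalizedString = removeAccents('hèllô wórld');
console.log(normalizedString === 'hello world'); // true
```

Letters such as ø or ł have no canonical decomposition, so this approach leaves them untouched.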

accentize does roughly the opposite: it transforms a regular string into a regex that matches any accented version of that string:

// ES6 modules:
// import accentize from 'accentize';
// CommonJS:
const accentize = require('accentize');

let accentizedString = accentize("hello world") 
// returns the regex: /\s*[hⓗhĥḣḧȟḥḩḫẖħⱨⱶɥ][eⓔeèéêềếễểẽēḕḗĕėëẻěȅȇẹệȩḝęḙḛɇɛǝ][lⓛlŀĺľḷḹļḽḻſłƚɫⱡꝉꞁꝇ][lⓛlŀĺľḷḹļḽḻſłƚɫⱡꝉꞁꝇ][oⓞoòóôồốỗổõṍȭṏōṑṓŏȯȱöȫỏőǒȍȏơờớỡởợọộǫǭøǿɔꝋꝍɵ]\s*\s*[wⓦwẁẃŵẇẅẘẉⱳ][oⓞoòóôồốỗổõṍȭṏōṑṓŏȯȱöȫỏőǒȍȏơờớỡởợọộǫǭøǿɔꝋꝍɵ][rⓡrŕṙřȑȓṛṝŗṟɍɽꝛꞧꞃ][lⓛlŀĺľḷḹļḽḻſłƚɫⱡꝉꞁꝇ][dⓓdḋďḍḑḓḏđƌɖɗꝺ]\s*/i

accentizedString.test("hello world") // true
accentizedString.test("hèllô wórld") // true
accentizedString.test("hêlló world") // true

With the accentized regex you can test any accented variant of a string while writing only its unaccented form!
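To make the idea concrete, the following is a minimal sketch of how such a regex could be assembled: every letter becomes a character class of its variants and every whitespace character becomes \s*. accentizeSketch, escapeRegExp and the heavily trimmed VARIANTS table are assumptions made for illustration; this is not the package's actual source.

```javascript
// Hypothetical, heavily trimmed variant table; the real package covers far more characters.
const VARIANTS = {
  a: 'aàáâãä',
  e: 'eèéêë',
  i: 'iìíîï',
  o: 'oòóôõö',
  u: 'uùúûü',
  c: 'cç',
};

// Escape characters that have a special meaning inside a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical sketch, not the package's source: one character class per letter.
function accentizeSketch(text) {
  const pattern = Array.from(text.toLowerCase())
    .map((ch) => {
      if (/\s/.test(ch)) return '\\s*';             // whitespace becomes optional
      if (VARIANTS[ch]) return `[${VARIANTS[ch]}]`;  // letter becomes a class of its variants
      return escapeRegExp(ch);                       // anything else matches literally
    })
    .join('');
  return new RegExp(pattern, 'i');                   // "i" makes the match case-insensitive
}

const re = accentizeSketch('joao luis');
console.log(re.test('João Luís'));  // true
console.log(re.test('Mária Ríta')); // false
```

Because the pattern has no ^ or $ anchors, it also matches when the text appears inside a longer string.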

Matching is case-insensitive too:

let stringToFind = "hÉllò Wôrld"
let accentizedString = accentize("hello world")
accentizedString.test(stringToFind) // true

Params

function accentize(stringToAccentize, findAll) { ... }

stringToAccentize: the string to turn into an accentized regex.

findAll: if true, the accentized regex includes the .* operator between words, so a partial query can still match a multi-word string.
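The effect of findAll can be illustrated with a small sketch that leaves accents aside. joinWords is a hypothetical helper, and the exact place where the package inserts .* is an assumption based on the description above: words are joined with \s* by default and with .* when findAll is true.

```javascript
// Hypothetical sketch of the findAll idea; accents are ignored to keep it short.
// Assumes the query contains only plain letters and spaces (nothing is escaped).
function joinWords(query, findAll) {
  const separator = findAll ? '.*' : '\\s*';
  const words = query.trim().split(/\s+/);
  return new RegExp(words.join(separator), 'i');
}

const strict = joinWords('is oana', false); // /is\s*oana/i
const loose = joinWords('is oana', true);   // /is.*oana/i

console.log(strict.test('Isis Moana')); // false
console.log(loose.test('Isis Moana'));  // true
```

With .* between the words, "is" and "oana" may be separated by any characters, which is why the partial query finds the full name.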