Spelt

A JavaScript English spellchecker, written in TypeScript.

Installation

Install the spellchecker via npm

npm i --save spelt

Install one of the dictionaries

British English dictionary

npm i --save spelt-gb-dict

American English dictionary

npm i --save spelt-us-dict

Canadian English dictionary

npm i --save spelt-ca-dict

Australian English dictionary

npm i --save spelt-au-dict

Usage

// import the library
import spelt from "spelt";
// import one of the dictionaries
import {dictionary} from "spelt-gb-dict";
// build the checker
const check = spelt({
	// the dictionary imported above (for example "gb" or "us")
	dictionary:dictionary,
	// when a correction is found within this distance,
	// we stop looking for others, which improves performance
	distanceThreshold:0.2
});

console.log(check("heve"));

The above code would output:

{
	// the raw input
	raw:"heve",
	// correct or not
	correct:false,
	// corrections array sorted by string distance
	corrections:[
		{
			// possible correction
			correction:"have",
			// distance from the input word
			distance:0.4
		},
		// .... other possible corrections
	]
}
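
As a rough illustration of consuming that result, the sketch below picks the closest suggestion. The `CheckResult` and `Suggestion` types and the `bestGuess` helper are hypothetical names, not part of spelt; the shape is inferred from the sample output above, and the `corrections` key name in particular is an assumption. The sample data is hand-written, not real spelt output.

```typescript
// Hypothetical result shape, inferred from the sample output above.
// The `corrections` key name is an assumption, not confirmed API.
interface Suggestion {
	correction: string;
	distance: number;
}

interface CheckResult {
	raw: string;
	correct: boolean;
	corrections: Suggestion[];
}

// Returns the input itself when it is correct or has no suggestions,
// otherwise the suggestion with the smallest distance.
function bestGuess(result: CheckResult): string {
	if (result.correct || result.corrections.length === 0) {
		return result.raw;
	}
	// Pick the minimum explicitly instead of relying on array order.
	const best = result.corrections.reduce((a, b) =>
		b.distance < a.distance ? b : a
	);
	return best.correction;
}

// Hand-written sample data for illustration only.
const sample: CheckResult = {
	raw: "heve",
	correct: false,
	corrections: [
		{ correction: "have", distance: 0.4 },
		{ correction: "hive", distance: 0.5 },
	],
};

console.log(bestGuess(sample)); // "have"
```

Falling back to the raw input keeps the helper safe to call on every word of a text, whether or not it was misspelt.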

How it works

String Distance

I've noticed that a lot of spellcheckers use the Levenshtein distance (LD). I don't think it's the appropriate solution, since it doesn't take swapping two letters into consideration.

For example:

1. The distance between abcde and abcxx is 2.
2. The distance between abcde and abced is also 2.

But in the first case we introduced two new letters and removed two, while in the second case we just swapped the e and d without introducing or removing any letter.

So in short, I don't see the Levenshtein distance as an appropriate solution for a spellchecker.

I've written my own string distance calculator, and you can find it here.
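
The Levenshtein figures in the example above can be reproduced with the sketch below. It is not the string distance calculator this package uses. The `editDistance` function is a hypothetical helper implementing textbook Levenshtein distance, with an optional flag that switches to the optimal string alignment variant, one well-known way of counting an adjacent swap as a single edit.

```typescript
// Textbook Levenshtein distance: insertions, deletions, substitutions.
// With allowTransposition = true this becomes the optimal string
// alignment variant, which also counts swapping two adjacent letters
// as a single edit. Illustration only, not this package's algorithm.
function editDistance(a: string, b: string, allowTransposition = false): number {
	const rows = a.length + 1;
	const cols = b.length + 1;
	// d[i][j] = distance between the first i chars of a and first j of b
	const d: number[][] = [];
	for (let i = 0; i < rows; i++) {
		const row = new Array<number>(cols).fill(0);
		row[0] = i;
		d.push(row);
	}
	for (let j = 0; j < cols; j++) d[0][j] = j;

	for (let i = 1; i < rows; i++) {
		for (let j = 1; j < cols; j++) {
			const cost = a[i - 1] === b[j - 1] ? 0 : 1;
			let best = Math.min(
				d[i - 1][j] + 1, // deletion
				d[i][j - 1] + 1, // insertion
				d[i - 1][j - 1] + cost // substitution or match
			);
			if (
				allowTransposition &&
				i > 1 &&
				j > 1 &&
				a[i - 1] === b[j - 2] &&
				a[i - 2] === b[j - 1]
			) {
				// adjacent swap counted as one edit
				best = Math.min(best, d[i - 2][j - 2] + 1);
			}
			d[i][j] = best;
		}
	}
	return d[a.length][b.length];
}

console.log(editDistance("abcde", "abcxx")); // 2
console.log(editDistance("abcde", "abced")); // 2
console.log(editDistance("abcde", "abced", true)); // 1
```

With transpositions allowed, the swapped pair scores 1 while the pair with two foreign letters still scores 2, which is the distinction the plain Levenshtein distance misses.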

Performance

Spellchecking a book: Processing H. G. Wells's novel The Time Machine (with thousands of misspellings introduced) took about 8 seconds, at a rate of 4K words/second.
Spellchecking Wikipedia list: Processing about 4 thousand words, all misspelt, took about 3.5 seconds, at a rate of 2.3K words/second.

This is not very impressive, but I'm working on it. However, it's far better than Norvig's spellchecker.

Accuracy

Running on Wikipedia's list with a distance threshold of 0, it was able to find the accurate correction within the first 5 suggestions in 85% of the cases.
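
For readers who want to measure this kind of figure themselves, here is a minimal sketch of a "correct answer within the first k suggestions" metric. Everything in it is an assumption made for illustration: the `topKAccuracy` helper, the `TestCase` and `Suggest` types, and the tiny hand-written suggestion table standing in for a real spellchecker. It is not the script used to produce the number above.

```typescript
// One misspelling with the correction a human would expect.
interface TestCase {
	misspelling: string;
	expected: string;
}

// Any function that returns suggestions ordered from best to worst.
type Suggest = (word: string) => string[];

// Fraction of cases whose expected word appears in the first k suggestions.
function topKAccuracy(cases: TestCase[], suggest: Suggest, k: number): number {
	if (cases.length === 0) return 0;
	let hits = 0;
	for (const c of cases) {
		const firstK = suggest(c.misspelling).slice(0, k);
		if (firstK.indexOf(c.expected) !== -1) hits++;
	}
	return hits / cases.length;
}

// Hand-written stand-in for a real spellchecker, for illustration only.
const fakeSuggestions: Record<string, string[]> = {
	heve: ["have", "hive"],
	teh: ["ten", "the"],
	wrod: ["rod"],
};
const fakeSuggest: Suggest = (word) => fakeSuggestions[word] || [];

const cases: TestCase[] = [
	{ misspelling: "heve", expected: "have" },
	{ misspelling: "teh", expected: "the" },
	{ misspelling: "wrod", expected: "word" },
];

// 1 of 3 cases hit at k = 1, 2 of 3 at k = 2.
console.log(topKAccuracy(cases, fakeSuggest, 1));
console.log(topKAccuracy(cases, fakeSuggest, 2));
```

To run it against a real checker, `fakeSuggest` would be replaced by a function that maps the checker's result to a plain list of suggested words.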

License

The MIT License