boltsearch v1.0.0
Welcome
Bolt is a fuzzy search library with a focus on speed and memory efficiency.
Features
- Fuzzy match
- Highlighting
- Multiple fields
- Multiple search terms
- Weighted search
- Multilingual support
- TypeScript support
- Zero dependency
- Tiny bundle
Demo
https://harrisonlo.github.io/boltsearch
Quickstart
npm
npm install boltsearch
yarn
yarn add boltsearch
Then try this simple example:
import { prepare, search, highlight } from 'boltsearch'
const words = ['lightning', 'bolt']
const prepared = words.map(word => ({ text: prepare(word) }))
const results = search('bo', prepared, { key: 'text' })
highlight(results[0]) // <b>bo</b>lt
API
Function | Description |
---|---|
prepare(string): prepared | returns the prepared object for search |
search(string, targets, options): results | returns an array of ranked results |
highlight(result, openTag?, closeTag?): string | returns an HTML string |
Search options (example usage)
{
key: string
keys: string[]
weights: number[]
threshold: number (0 - 100)
limit: number
}
Keys
Use dot notation to indicate nested fields.
{
keys: ['parent.child']
}
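To make the dot notation concrete, here is a minimal sketch of how a path such as 'parent.child' can be resolved against a plain object. The helper name getByPath is hypothetical and is not part of boltsearch; the library may resolve keys differently internally.

```javascript
// Hypothetical helper (not part of boltsearch): resolve a dot-notation
// key such as 'parent.child' against a plain object.
function getByPath(obj, path) {
  return path.split('.').reduce(
    // Stop descending as soon as a segment is missing.
    (value, segment) => (value == null ? undefined : value[segment]),
    obj
  )
}

const doc = { parent: { child: 'bolt' } }
console.log(getByPath(doc, 'parent.child')) // 'bolt'
console.log(getByPath(doc, 'parent.missing')) // undefined
```

A missing segment yields undefined rather than throwing, which is usually the safer behavior when documents in a list do not all share the same shape.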
Weights
Make sure there is an equal number of weights and keys.
{
keys: ['title', 'description'],
weights: [100, 10]
}
i.e. title is weighted 100 and description is weighted 10.
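One way to keep keys and weights in step is to derive both arrays from a single object. The sketch below does that; weightedOptions is a hypothetical helper, not a boltsearch function, and it only builds the keys and weights fields described above.

```javascript
// Hypothetical helper (not part of boltsearch): derive `keys` and
// `weights` from one object so the two arrays always have equal length.
function weightedOptions(weightsByKey) {
  const keys = Object.keys(weightsByKey)
  const weights = keys.map(key => weightsByKey[key])
  return { keys, weights }
}

const options = weightedOptions({ title: 100, description: 10 })
// options.keys    is ['title', 'description']
// options.weights is [100, 10]
console.log(options)
```

The resulting object can be spread into the search options alongside threshold or limit.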
Threshold
Set the minimum score of returned results. Scores range from 0 to 100.
{
threshold: 10
}
Limit
Set the maximum number of returned results.
{
limit: 50
}
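Conceptually, threshold and limit behave like a filter followed by a cut. The sketch below illustrates that idea on a plain array. It is not boltsearch's implementation: the score field name, the sample scores, and the order in which the two options are applied are all assumptions made for illustration.

```javascript
// Illustrative only: `score` is an assumed field name, and boltsearch may
// apply these options differently internally.
function applyThresholdAndLimit(scored, { threshold = 0, limit = Infinity } = {}) {
  return scored
    .filter(item => item.score >= threshold) // drop weak matches
    .sort((a, b) => b.score - a.score) // best match first
    .slice(0, limit) // keep at most `limit` results
}

// Made-up scores on the 0 to 100 scale.
const scored = [
  { text: 'boat', score: 40 },
  { text: 'bolt', score: 90 },
  { text: 'lightning', score: 5 },
]
console.log(applyThresholdAndLimit(scored, { threshold: 10, limit: 1 }))
// keeps only the 'bolt' entry
```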
Languages
- Latin-based characters are normalized by default.
- Chinese and Japanese punctuation marks are recognized for basic "tokenization."
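As a rough illustration of what these two points mean, the sketch below folds accented Latin characters with the standard String.prototype.normalize method and splits text on a handful of Chinese and Japanese punctuation marks. Both helpers are hypothetical; the exact rules boltsearch applies, including the lowercasing step and the set of punctuation marks, are assumptions here.

```javascript
// Illustrative only: one common way to fold accented Latin characters.
// boltsearch's own normalization rules may differ.
function foldLatin(text) {
  return text
    .normalize('NFD') // split 'é' into 'e' + combining accent
    .replace(/[\u0300-\u036f]/g, '') // remove the combining marks
    .toLowerCase() // assumption: matching is case-insensitive
}

// Illustrative only: split on a small, assumed set of CJK punctuation marks.
function splitOnCjkPunctuation(text) {
  return text.split(/[、。，！？；：]+/).filter(Boolean)
}

console.log(foldLatin('Crème Brûlée')) // 'creme brulee'
console.log(splitOnCjkPunctuation('稲妻、雷。')) // ['稲妻', '雷']
```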
Learn more
Under the hood
Instead of building a hashmap-like index of all your strings, as most other search libraries do, Bolt takes an alternative approach based on preparing char codes. In JavaScript, and thus in the browser, strings are first converted to char codes when they are compared. In the context of search, the cost of this conversion can quickly add up.
By preparing the char codes of search targets beforehand, Bolt eliminates a frequently repeated step in the V8 engine while still identifying fuzzy matches very quickly. It also uses much less memory, as each character maps to a single number. Try out the demo, where the 'Simple' example has over 775K characters and the 'Complex' example, with multiple keys, has over 200K characters.
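The idea of preparing once and comparing numbers afterwards can be sketched as follows. This is not boltsearch's actual implementation: the function names, the use of Uint16Array, and the simple in-order matching rule are assumptions chosen to show the general technique.

```javascript
// Illustrative sketch, not boltsearch's actual implementation.
// Prepare once: store each character as a number (one UTF-16 code unit).
function toCharCodes(text) {
  const codes = new Uint16Array(text.length)
  for (let i = 0; i < text.length; i++) codes[i] = text.charCodeAt(i)
  return codes
}

// Search many times: check whether every query code appears in the
// target in order (a basic subsequence-style fuzzy match).
function fuzzyMatches(queryCodes, targetCodes) {
  let q = 0
  for (let t = 0; t < targetCodes.length && q < queryCodes.length; t++) {
    if (targetCodes[t] === queryCodes[q]) q++
  }
  return q === queryCodes.length
}

const targets = ['lightning', 'bolt'].map(toCharCodes) // done up front
const query = toCharCodes('bt') // done once per search
console.log(targets.map(target => fuzzyMatches(query, target))) // [ false, true ]
```

Because the targets are converted only once, each search compares numbers against numbers. A real scorer would also rank matches, which this sketch leaves out.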
Trade-off
Bolt is optimized for large lists of small to medium-sized strings, ideally searched on the client side. Beyond a certain target size, the "char code" approach starts to incur more latency than the usual indexing approach. That said, the two approaches can be combined: the server returns a subcollection of documents and the client quickly ranks them using Bolt.