Boltsearch

Welcome

Bolt is a fuzzy search library with a focus on speed and memory efficiency.

Features

Fuzzy match
Highlighting
Multiple fields
Multiple search terms
Weighted search
Multilingual support
TypeScript support
Zero dependency
Tiny bundle

Demo

https://harrisonlo.github.io/boltsearch

Quickstart

npm

npm install boltsearch

yarn

yarn add boltsearch

Follow this simple example:

import { prepare, search, highlight } from 'boltsearch'

const words = ['lightning', 'bolt']

const prepared = words.map(word => ({ text: prepare(word) }))

const results = search('bo', prepared, { key: 'text' })

highlight(results[0]) // <b>bo</b>lt

API

Function	Description
prepare(string): prepared	returns the prepared object for search
search(string, targets, options): results	returns an array of ranked results
highlight(result, openTag?, closeTag?): string	returns an HTML string
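
To make the optional openTag and closeTag parameters concrete, here is a minimal sketch of how matched characters could be wrapped in tags. The wrapMatches helper and its positions argument are hypothetical and are not part of Boltsearch's API; the default tags are assumed from the `<b>bo</b>lt` output in the Quickstart.

```javascript
// Hypothetical helper for illustration only, not Boltsearch's highlight().
// Wraps each run of matched character positions in openTag/closeTag.
function wrapMatches(text, positions, openTag = '<b>', closeTag = '</b>') {
  const matched = new Set(positions)
  let out = ''
  let open = false
  for (let i = 0; i < text.length; i++) {
    const hit = matched.has(i)
    if (hit && !open) out += openTag // a run of matches starts here
    if (!hit && open) out += closeTag // the run ended on the previous char
    open = hit
    out += text[i]
  }
  return open ? out + closeTag : out
}

const defaultTags = wrapMatches('bolt', [0, 1]) // '<b>bo</b>lt'
const customTags = wrapMatches('bolt', [0, 3], '<mark>', '</mark>')
```

This sketch does not escape HTML in the input text, so it should only be fed trusted strings.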

Search options (example usage)

{
  key: string
  keys: string[]
  weights: number[]
  threshold: number (0 - 100)
  limit: number
}

Keys

Use dot notation to indicate nested fields.

{
  keys: ['parent.child']
}
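
As an illustration of what a dot-notation key refers to, the sketch below resolves 'parent.child' against a plain object. The getByPath helper is hypothetical and only shows the lookup; how Boltsearch resolves keys internally is not documented here.

```javascript
// Hypothetical helper, not part of Boltsearch: resolves a dot-notation key
// such as 'parent.child' against a plain object.
function getByPath(obj, path) {
  return path.split('.').reduce(
    // Stop descending as soon as an intermediate value is null or undefined
    (value, part) => (value == null ? undefined : value[part]),
    obj
  )
}

const nestedDoc = { parent: { child: 'bolt' } }
const found = getByPath(nestedDoc, 'parent.child') // 'bolt'
const missing = getByPath(nestedDoc, 'other.child') // undefined
```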

Weights

Make sure there is an equal number of weights and keys.

{
  keys: ['title', 'description'],
  weights: [100, 10]
}

Here, title is weighted 100 and description is weighted 10.
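
Since keys and weights are matched by position, a small guard can catch a mismatch before searching. The pairWeights function below is a hypothetical helper, not something Boltsearch provides, and it assumes the weight at index i applies to the key at index i.

```javascript
// Hypothetical guard, not part of Boltsearch: pairs each key with the weight
// at the same index and fails loudly when the two arrays differ in length.
function pairWeights(keys, weights) {
  if (keys.length !== weights.length) {
    throw new Error(`Expected ${keys.length} weights, got ${weights.length}`)
  }
  return keys.map((key, i) => ({ key, weight: weights[i] }))
}

const pairs = pairWeights(['title', 'description'], [100, 10])
// [{ key: 'title', weight: 100 }, { key: 'description', weight: 10 }]
```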

Threshold

Set the minimum score of returned results. Scores range from 0 to 100.

{
  threshold: 10
}

Limit

Set the maximum number of returned results.

{
  limit: 50
}
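
The combined effect of threshold and limit can be pictured as a filter followed by a cut. The sketch below assumes each result carries a numeric score field and that the threshold is inclusive; both are assumptions for illustration, and applyThresholdAndLimit is a hypothetical name rather than a Boltsearch function.

```javascript
// Hypothetical post-processing step, not Boltsearch's implementation.
// Assumes each item has a numeric `score` between 0 and 100.
function applyThresholdAndLimit(scored, { threshold = 0, limit = Infinity } = {}) {
  return scored
    .filter(item => item.score >= threshold) // drop results below the minimum score
    .sort((a, b) => b.score - a.score) // best score first
    .slice(0, limit) // keep at most `limit` results
}

const scoredItems = [
  { text: 'bolt', score: 90 },
  { text: 'lightning', score: 5 },
  { text: 'boat', score: 40 }
]
const kept = applyThresholdAndLimit(scoredItems, { threshold: 10, limit: 50 })
```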

Languages

Latin-based characters are normalized by default.
Chinese and Japanese punctuation marks are recognized for basic "tokenization."
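
For readers unfamiliar with these two steps, the sketch below shows one common way to do each in JavaScript: Unicode NFD decomposition to strip Latin diacritics, and splitting on a handful of CJK punctuation marks. Both functions are hypothetical, the set of punctuation marks is an assumption, and Boltsearch's own normalization may work differently.

```javascript
// Hypothetical helpers, not Boltsearch's implementation.

// Decompose accented letters (NFD), drop the combining marks, and lowercase.
function normalizeLatin(text) {
  return text
    .normalize('NFD')
    .replace(/[\u0300-\u036f]/g, '')
    .toLowerCase()
}

// Split on a few CJK punctuation marks: ideographic comma (U+3001) and
// full stop (U+3002), plus fullwidth comma, exclamation, and question marks.
function splitOnCjkPunctuation(text) {
  return text.split(/[\u3001\u3002\uff0c\uff01\uff1f]/).filter(Boolean)
}

const plain = normalizeLatin('Caf\u00e9') // 'cafe'
const tokens = splitOnCjkPunctuation('\u4f60\u597d\uff0c\u4e16\u754c\u3002')
```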

Learn more

Under the hood

Most search libraries build a hashmap-like index of all your strings. Bolt instead bases its search on prepared char codes. In JavaScript, and thus in the browser, strings are first converted to char codes when they are compared. In the context of search, the cost of this conversion can quickly add up.

By preparing the char codes of search targets beforehand, Bolt eliminates a frequently repeated step in the V8 engine while still identifying fuzzy matches very quickly. It also uses much less memory, since each character maps to a single number. Try out the demo, where the 'Simple' example has over 775K characters and the 'Complex' example, with multiple keys, has over 200K characters.
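
The idea of preparing char codes once and matching against them can be sketched as follows. This is a simplified illustration under stated assumptions, not Boltsearch's actual algorithm: toCodes and matchCodes are hypothetical names, the match is a plain in-order subsequence test, and no scoring is attempted.

```javascript
// Hypothetical sketch of the "prepared char codes" idea, not Boltsearch's internals.

// Convert a string to a typed array of UTF-16 char codes, one number per character.
function toCodes(text) {
  const codes = new Uint16Array(text.length)
  for (let i = 0; i < text.length; i++) codes[i] = text.charCodeAt(i)
  return codes
}

// Return the matched positions if every query code appears in order in the
// target, or null if the query is not a subsequence of the target.
function matchCodes(queryCodes, targetCodes) {
  const positions = []
  let q = 0
  for (let t = 0; t < targetCodes.length && q < queryCodes.length; t++) {
    if (targetCodes[t] === queryCodes[q]) {
      positions.push(t)
      q++
    }
  }
  return q === queryCodes.length ? positions : null
}

// Targets are converted once, up front; each search then compares numbers only.
const preparedTargets = ['lightning', 'bolt'].map(toCodes)
const queryCodes = toCodes('bt')
const hits = preparedTargets.map(target => matchCodes(queryCodes, target))
// hits[0] is null ('lightning' has no 'b'); hits[1] is [0, 3]
```

A Uint16Array stores two bytes per character, which is one way to keep the prepared form compact.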

Trade-off

Bolt is optimized for large lists of small to medium-sized strings, ideally searched on the client side. Beyond a certain target size, the "char code" approach starts to incur more latency than the usual indexing approach. That said, the two approaches can be combined: the server returns a subcollection of documents and the client quickly ranks them using Bolt.
