0.15.0 • Published 11 days ago

hunch v0.15.0

🔎 Hunch

Compiled search for your static Markdown files.

Quick links to the docs: all docs, configuration, query params, results types, indexing examples, using examples.

Hunch supports these search features:

  • Full text lookup
  • Exact phrase matching
  • Fuzzy search
  • Include matched words for highlighting
  • Return only partial snippet
  • Search specific fields
  • Return specific fields
  • Prefix search
  • Search suggestions
  • Boosting metadata properties
  • Ranking
  • Facet Limiting
  • Facet Matching
  • Pagination
  • Stop-Words
  • Sort by alternate strategy

Hunch compiles a search index to store as a JSON file, which you load and use wherever you want to perform search: AWS Lambda, Cloudflare Worker, even directly in the browser!

Install

The usual ways:

npm install hunch

Generate the index

Use it as a CLI tool:

hunch
# shorthand for
hunch --config hunch.config.js

Or use it in code:

import { generate } from 'hunch'
const index = await generate({
  input: './site',
  // other options
})

Query the index

You'll need to load the index from the generated JSON file. In environments with disk access, that could be as simple as:

import { readFile } from 'node:fs/promises'
const index = JSON.parse(await readFile('./dist/hunch.json', 'utf8'))
// or, in runtimes that support JSON modules with import attributes:
// import index from './dist/hunch.json' with { type: 'json' }
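In environments without disk access, such as the browser or a Cloudflare Worker, the same JSON file can be fetched over HTTP instead. The following is a minimal sketch using the standard `fetch` API; `loadIndex`, its `indexUrl` argument, and the injectable `fetchImpl` are hypothetical names used for illustration and are not part of Hunch.

```javascript
// Sketch: load the generated index over HTTP when there is no disk access.
// `fetchImpl` is injectable only so the function can be exercised with a stub.
async function loadIndex(indexUrl, fetchImpl = fetch) {
  const response = await fetchImpl(indexUrl)
  if (!response.ok) {
    throw new Error(`Failed to load index from ${indexUrl}: HTTP ${response.status}`)
  }
  // The index is plain JSON, so the standard body parser is enough.
  return response.json()
}

// Usage (the URL is an assumption about where you deploy the file):
// const index = await loadIndex('/hunch.json')
```

The resolved value can then be passed as `index` when creating the search instance.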

Then you create a search instance using Hunch, and query it:

import { hunch } from 'hunch'
const search = hunch({ index })
const results = search({ q: 'we get signal' })
/*
results = {
  items: [ ... ],
  page: { ... },
  facets: { ... },
}
*/

Overview

Many modern websites are backed by static Markdown files with some YAML-like metadata at the top, e.g. this file 2022-12-29/cats-and-dogs.md:

---
title: About Cats & Dogs
summary: Where I talk about pets.
published: 2022-12-29
tags: [ cats, dogs ]
series: Animals
---

Fancy words about cats and dogs.
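To make that file layout concrete, here is a minimal sketch that separates the metadata block from the body. It is not how Hunch reads files: `splitFrontMatter` is a hypothetical helper that only handles flat `key: value` lines and leaves every value as a raw string (so `tags` stays as the text `[ cats, dogs ]`).

```javascript
// Sketch: split a Markdown file into its metadata block and its body.
// Assumes the metadata is delimited by lines containing only `---`.
function splitFrontMatter(text) {
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?([\s\S]*)$/.exec(text)
  if (!match) return { metadata: {}, content: text }
  const metadata = {}
  for (const line of match[1].split(/\r?\n/)) {
    const separator = line.indexOf(':')
    if (separator === -1) continue // skip lines that are not `key: value`
    metadata[line.slice(0, separator).trim()] = line.slice(separator + 1).trim()
  }
  return { metadata, content: match[2].trim() }
}
```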

As part of your deployment step, you would use Hunch to generate a pre-computed search index as a JSON file:

hunch --config hunch.config.js

A simple configuration file would specify the content folder (where the Markdown files are), the output filepath to write the JSON file, and other configuration details:

// hunch.config.js
export default {
  // Define the folder to scan.
  input: './site',
  // Define where to write the index file.
  output: './dist/hunch.json',
  // Property names of metadata to treat as "collections", like "tags" or "authors".
  facets: {
    // If it's just a flat string there's nothing to configure.
    series: true,
    // If it's more, like an array, you'll need to specify how Hunch
    // should treat the values. (See documentation for more details.)
    tags: {
      type: 'array',
    }
  },
  // All the facet fields are searchable by default, but you need
  // to specify additional searchable fields.
  searchableFields: [
    'title',
    'summary',
  ],
  // Fields that are not searchable that you want available for access
  // need to be specified. These fields are stored in the index JSON, but
  // not used by Hunch.
  storedFields: [
    'published',
  ],
}

To make a search using this index, you would create a Hunch instance with the index, and then query it:

// Load the generated JSON file in one way or another:
import { readFile } from 'node:fs/promises'
const index = JSON.parse(await readFile('./dist/hunch.json', 'utf8'))

// Create an instance of Hunch using that data:
import { hunch } from 'hunch'
const search = hunch({ index })

// Then query it:
const results = search({
  q: 'fancy words',
  facetMustMatch: { tags: [ 'cats' ] },
  facetMustNotMatch: { tags: [ 'rabbits' ] },
})
/*
results = {
  items: [
    {
      title: 'About Cats & Dogs',
      tags: [ 'cats', 'dogs' ],
      summary: 'Where I talk about pets.',
      published: '2022-12-29',
      series: 'Animals',
      _id: '2022-12-29/cats-and-dogs.md',
      _content: 'Fancy words about cats and dogs.',
    }
  ],
  page: {
    number: 0,
    size: 1,
    total: 1,
  },
  facets: {
    series: {
      Animals: {
        all: 3,
        search: 1,
      },
    },
    tags: {
      cats: {
        all: 5,
        search: 1
      },
      dogs: {
        all: 3,
        search: 1
      }
    },
  },
}
*/
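The `facets` object in the result above nests counts by facet name and then by value. Below is a small sketch that flattens it into rows for display. It assumes, from the example output only, that `all` is the count across the whole index and `search` is the count within the current results; `summarizeFacets` is a hypothetical helper, not part of Hunch.

```javascript
// Sketch: flatten `results.facets` into sortable rows.
function summarizeFacets(facets) {
  const rows = []
  for (const [facet, values] of Object.entries(facets)) {
    for (const [value, counts] of Object.entries(values)) {
      rows.push({ facet, value, all: counts.all, search: counts.search })
    }
  }
  // Highest count in the current results first; ties broken by overall count.
  return rows.sort((a, b) => b.search - a.search || b.all - a.all)
}

// Usage: summarizeFacets(results.facets)
```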

URL Query

If you expose Hunch behind a URL query parameter interface, for example in AWS Lambda, a Cloudflare Worker, or the browser, you can transform those query parameters into a Hunch query object:

// from the main export
import { fromQuery } from 'hunch'
// or from the named export:
// import { fromQuery } from 'hunch/from-query'
const query = fromQuery({
  q: 'fancy words',
  'facet[tags]': 'cats,-rabbits',
})
/*
query = {
  q: 'fancy words',
  facetMustMatch: { tags: [ 'cats' ] },
  facetMustNotMatch: { tags: [ 'rabbits' ] },
}
*/
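In a Lambda or Worker the parameters usually arrive as a URL string rather than as an object. Here is a sketch of the glue code, using only the standard `URL` API. `paramsFromUrl` is a hypothetical helper, the example URL is made up, and passing its result to `fromQuery` assumes the flat key/value shape shown above.

```javascript
// Sketch: turn a request URL into a flat object of query parameters.
function paramsFromUrl(url) {
  const { searchParams } = new URL(url)
  // Object.fromEntries keeps the last value when a key is repeated.
  return Object.fromEntries(searchParams)
}

const params = paramsFromUrl('https://example.com/search?q=fancy+words&facet[tags]=cats,-rabbits')
// params = { q: 'fancy words', 'facet[tags]': 'cats,-rabbits' }
// const query = fromQuery(params)
```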

Additional Notes

Behind the scenes this library uses MiniSearch for text searching, so look at that documentation if you need anything more esoteric.

āš ļø The output JSON file is an amalgamation of a MiniSearch index and other settings, optimized to save space. There is no guarantee as to the output structure or contents between Hunch versions: you must compile with the same version that you search with!

Some things left to do:

  • Stemming (undecided if I'll support this...)

License

Published and released under the Very Open License.
