search-challenge-cli v1.0.3
Zendesk (Melbourne) Coding Challenge
Search CLI
An efficient command line search tool for arrays of JSON objects.
Below you will find information on how to set up the tool and perform common tasks.
Installation
npm install -g search-challenge-cli
Alternatively, you can clone this repository and install from source.
git clone https://github.com/ChrisMcDonaldJ/search-challenge-cli.git
cd search-challenge-cli
npm install
CLI Usage
Usage: search-challenge-cli search <text> <file> <index>
Options:
--key <key> specify unique key of each object, by default this is '_id'.
--indexStrategy <indexStrategy> specify how you wish to index data, by default this is 'PrefixIndexStrategy'.
--caseSensitive specify if you want the searches to be case sensitive.
Features
Tokenization
Tokenization is the process of breaking text (e.g. sentences) into smaller, searchable tokens (e.g. words or parts of words).
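As a rough illustration of the idea, the sketch below splits text into word tokens. The `tokenize` function and its `caseSensitive` option are hypothetical names chosen for this example and are not taken from the package's source; the actual tokenizer may behave differently.

```javascript
// Minimal tokenization sketch (assumed behaviour, not the package's real code).
function tokenize(text, { caseSensitive = false } = {}) {
  // Lowercase unless a case-sensitive search was requested.
  const normalized = caseSensitive ? text : text.toLowerCase();
  // Split on any run of characters that are neither letters nor digits,
  // then drop the empty strings left at the edges.
  return normalized.split(/[^\p{L}\p{N}]+/u).filter(Boolean);
}

console.log(tokenize('Lee Davidson'));
// → ['lee', 'davidson']
console.log(tokenize('Lee Davidson', { caseSensitive: true }));
// → ['Lee', 'Davidson']
```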
Stemming
Stemming is the process of reducing search tokens to their root (or "stem") so that searches for different forms of a word still yield results. For example, "search", "searching", and "searched" can all be reduced to the stem "search".
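To make the example concrete, here is a deliberately naive suffix-stripping sketch. The `naiveStem` function is hypothetical and far simpler than an established algorithm such as the Porter stemmer; it is not a description of how this tool stems tokens.

```javascript
// Naive suffix-stripping stemmer (illustration only; real stemmers handle
// many more cases and exceptions).
function naiveStem(token) {
  // Try longer suffixes first so "ing" wins over "s", for example.
  for (const suffix of ['ing', 'ed', 'es', 's']) {
    // Require at least three characters to remain after stripping.
    if (token.length > suffix.length + 2 && token.endsWith(suffix)) {
      return token.slice(0, -suffix.length);
    }
  }
  return token;
}

console.log(['search', 'searching', 'searched'].map(naiveStem));
// → ['search', 'search', 'search']
```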
Stop Words
Stop words are very common words (e.g. a, an, and, the, of) that are often not semantically meaningful.
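Search tools commonly drop stop words before indexing or querying. The sketch below shows that general technique, assuming a small hard-coded list; `STOP_WORDS` and `removeStopWords` are hypothetical names, and the list this tool actually uses is not specified here.

```javascript
// Assumed stop-word list, taken from the examples above.
const STOP_WORDS = new Set(['a', 'an', 'and', 'the', 'of']);

// Keep only the tokens that are not stop words.
function removeStopWords(tokens) {
  return tokens.filter((token) => !STOP_WORDS.has(token));
}

console.log(removeStopWords(['the', 'history', 'of', 'melbourne']));
// → ['history', 'melbourne']
```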
Index strategy
There are three index strategies.
PrefixIndexStrategy
Indexes for prefix searches. For example, the term "cat" is indexed as "c", "ca", and "cat", allowing prefix lookups.
search-challenge-cli search 'Lee Davidson' ./src/data/users.json name --indexStrategy PrefixIndexStrategy
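The prefix expansion described above can be sketched in a few lines. The `prefixes` function is a hypothetical helper written for this illustration, not the package's implementation.

```javascript
// Expand a token into every prefix it should be indexed under.
function prefixes(token) {
  const result = [];
  for (let end = 1; end <= token.length; end++) {
    result.push(token.slice(0, end));
  }
  return result;
}

console.log(prefixes('cat'));
// → ['c', 'ca', 'cat']
```

A token of length n produces n index entries, so this strategy trades some index size for the ability to match partial input typed from the start of a word.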
AllSubstringsIndexStrategy
Indexes for all substrings. In other words, "c", "ca", "cat", "a", "at", and "t" all match "cat".
search-challenge-cli search 'Lee Davidson' ./src/data/users.json name --indexStrategy AllSubstringsIndexStrategy
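As a sketch of the substring expansion described above, the hypothetical `allSubstrings` helper below enumerates every contiguous substring of a token. It is an illustration under that assumption rather than the package's own code.

```javascript
// Expand a token into every distinct contiguous substring.
function allSubstrings(token) {
  const result = new Set(); // a Set removes duplicates such as the two "a"s in "aa"
  for (let start = 0; start < token.length; start++) {
    for (let end = start + 1; end <= token.length; end++) {
      result.add(token.slice(start, end));
    }
  }
  return [...result];
}

console.log(allSubstrings('cat'));
// → ['c', 'ca', 'cat', 'a', 'at', 't']
```

A token of length n yields up to n(n+1)/2 entries, so this strategy gives the most flexible matching at the cost of the largest index.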
ExactWordIndexStrategy
Indexes for exact word matches. For example, "bob" will match "bob jones" (but "bo" will not).
search-challenge-cli search 'Francisca Rasmussen' ./src/data/users.json name --indexStrategy ExactWordIndexStrategy
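To illustrate exact word matching, the sketch below builds a small word-to-key index and looks words up in it. The `buildExactWordIndex` and `lookup` names and the two sample records are made up for this example; they are not taken from the package or from `users.json`. The default key of `_id` mirrors the `--key` option described above.

```javascript
// Build a Map from each whole word in `field` to the set of matching keys.
function buildExactWordIndex(records, field, key = '_id') {
  const index = new Map();
  for (const record of records) {
    const words = String(record[field]).toLowerCase().split(/\s+/).filter(Boolean);
    for (const word of words) {
      if (!index.has(word)) index.set(word, new Set());
      index.get(word).add(record[key]);
    }
  }
  return index;
}

// Hypothetical sample data.
const sampleUsers = [
  { _id: 1, name: 'Bob Jones' },
  { _id: 2, name: 'Alice Smith' },
];

const exactIndex = buildExactWordIndex(sampleUsers, 'name');
const lookup = (word) => [...(exactIndex.get(word.toLowerCase()) ?? [])];

console.log(lookup('bob')); // → [1]
console.log(lookup('bo'));  // → []
```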