bibliothecary v1.1.0
Bibliothecary
A string search library implementing typical operators found in academic databases (boolean operators, NEAR, wildcards). This library lies at the heart of Snowball.
Installation
npm i bibliothecary
Usage
import { Query } from 'bibliothecary';
const intro = "A string search library implementing typical operators found in academic databases (boolean operators, NEAR, wildcards)."
const results = new Query('academic AND search AND lib*').search(intro);
/* => [
{ term: 'academic', text: 'academic', start: 64, length: 8 },
{ term: 'search', text: 'search', start: 9, length: 6 },
{ term: 'lib*', text: 'library', start: 16, length: 7 }
]*/
const shouldBeFalse = new Query('academic AND search AND NOT library').search(intro); // => false
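Since each match carries a `start` offset and a `length`, the result can be used to mark up the original string. The following is a minimal sketch, assuming the result shape shown in the example above; `highlight` is a hypothetical helper and not part of bibliothecary, and it does not handle overlapping matches.

```javascript
// Hypothetical helper (not part of bibliothecary): wraps each match in [ ].
// Assumes matches have the { start, length } shape shown in the example output.
function highlight(text, matches) {
  if (!matches) return text; // search() returned false: nothing to mark
  // Work from the end so earlier offsets stay valid as brackets are inserted.
  const ordered = [...matches].sort((a, b) => b.start - a.start);
  let out = text;
  for (const { start, length } of ordered) {
    out =
      out.slice(0, start) +
      '[' + out.slice(start, start + length) + ']' +
      out.slice(start + length);
  }
  return out;
}

const text = 'A string search library implementing typical operators';
const matches = [
  { term: 'search', text: 'search', start: 9, length: 6 },
  { term: 'lib*', text: 'library', start: 16, length: 7 },
];
console.log(highlight(text, matches));
// => A string [search] [library] implementing typical operators
```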
A quick primer on search operators
Strings
- `word1 word2` matches strings that mention either `word1` or `word2`.
- `"word1 word2"` matches strings that contain exactly the phrase `"word1 word2"`.
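The difference between bare words and a quoted phrase can be sketched with plain substring checks. This is only an illustration of the semantics described above, not how bibliothecary matches internally; the function names and the case-insensitive comparison are assumptions of the sketch.

```javascript
// Illustrative only: plain substring checks, not bibliothecary's matcher.
// Case-insensitive comparison is an assumption of this sketch.
function mentionsAny(text, words) {
  const haystack = text.toLowerCase();
  return words.some((w) => haystack.includes(w.toLowerCase()));
}

function containsPhrase(text, phrase) {
  return text.toLowerCase().includes(phrase.toLowerCase());
}

const sample = 'A string search library for academic databases';
console.log(mentionsAny(sample, ['search', 'engine'])); // => true
console.log(containsPhrase(sample, 'search library')); // => true
console.log(containsPhrase(sample, 'library search')); // => false
```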
Search operators
- `word1 AND word2` matches strings that mention both `word1` and `word2`.
- `word1 OR word2` matches strings that mention either `word1` or `word2`.
- `NOT word1` matches strings that do not mention `word1` anywhere.
- `word1 NEAR/n word2` requires that `word1` and `word2` are `n` words or less apart.
- `word1 ONEAR/n word2` does the same as `NEAR/n`, while also requiring `word1` to appear before `word2`.
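The proximity operators can be illustrated with a small word-position check. This is a sketch under stated assumptions, not bibliothecary's implementation: it assumes "n words apart" means the two word positions differ by at most n, and the helper names `wordPositions` and `near` are made up for the example.

```javascript
// Illustrative sketch, not bibliothecary's implementation.
// Assumption: "n words apart" means the word positions differ by at most n.
function wordPositions(text, word) {
  const words = text.toLowerCase().match(/[\p{L}\p{N}]+/gu) ?? [];
  const positions = [];
  words.forEach((w, i) => {
    if (w === word.toLowerCase()) positions.push(i);
  });
  return positions;
}

// ordered = false behaves like NEAR/n, ordered = true like ONEAR/n.
function near(text, a, b, n, ordered = false) {
  const as = wordPositions(text, a);
  const bs = wordPositions(text, b);
  return as.some((i) =>
    bs.some((j) => {
      const gap = j - i;
      return ordered ? gap > 0 && gap <= n : gap !== 0 && Math.abs(gap) <= n;
    })
  );
}

const sample = 'A string search library implementing typical operators';
console.log(near(sample, 'library', 'string', 2)); // => true
console.log(near(sample, 'library', 'string', 2, true)); // => false (wrong order)
console.log(near(sample, 'string', 'library', 2, true)); // => true
console.log(near(sample, 'string', 'operators', 2)); // => false (5 words apart)
```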
Wildcards
- `?` matches any one letter, e.g., `wor?` matches `word` and `work`.
- `*` matches any number of letters, e.g., `wor*` matches `word` and `worry`.
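One common way to implement such wildcards is to translate the pattern into a regular expression. The sketch below is an illustration, not bibliothecary's implementation. It assumes that "letter" means any Unicode letter, that `*` may match zero letters, that the pattern must cover the whole word, and that matching is case-insensitive; `wildcardToRegExp` is a hypothetical name.

```javascript
// Illustrative sketch, not bibliothecary's implementation.
// Assumptions: "letter" means a Unicode letter, `*` may match zero letters,
// the pattern must cover the whole word, and matching is case-insensitive.
function wildcardToRegExp(pattern) {
  const source = [...pattern]
    .map((ch) => {
      if (ch === '?') return '\\p{L}'; // exactly one letter
      if (ch === '*') return '\\p{L}*'; // any number of letters
      return ch.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'); // escape regex metacharacters
    })
    .join('');
  return new RegExp(`^${source}$`, 'iu');
}

console.log(wildcardToRegExp('wor?').test('word')); // => true
console.log(wildcardToRegExp('wor?').test('work')); // => true
console.log(wildcardToRegExp('wor?').test('worry')); // => false
console.log(wildcardToRegExp('wor*').test('worry')); // => true
```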
Combining operators
- Operators can be combined freely, e.g., `"word1 B" AND NOT "word1 word3" OR (word1 ONEAR/3 word4) AND word5`. `()` can be used to group parts of the query.
- Note that `NEAR` and `ONEAR` must have string literals on both sides and not other operators. That is, `word1 NEAR/3 word2` is OK, but `(word1 AND word2) NEAR/3 word3` is not.
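To make the structure of a combined query concrete, here is a rough tokenizer together with a check for the `NEAR`/`ONEAR` restriction described above. It is a sketch only, not bibliothecary's parser: the token type names and both function names are invented for the example.

```javascript
// Illustrative tokenizer, not bibliothecary's parser.
// Token type names are made up for this sketch.
function tokenize(query) {
  const raw = query.match(/"[^"]*"|[()]|[^\s()"]+/g) ?? [];
  return raw.map((piece) => {
    if (piece.startsWith('"')) return { type: 'phrase', value: piece.slice(1, -1) };
    if (piece === '(' || piece === ')') return { type: 'paren', value: piece };
    const proximity = /^(O?NEAR)\/(\d+)$/.exec(piece);
    if (proximity) {
      return { type: 'operator', value: proximity[1], distance: Number(proximity[2]) };
    }
    if (['AND', 'OR', 'NOT'].includes(piece)) return { type: 'operator', value: piece };
    return { type: 'term', value: piece };
  });
}

// True when every NEAR/ONEAR token sits directly between two string literals.
function proximityOperandsAreLiterals(tokens) {
  const isLiteral = (t) => t !== undefined && (t.type === 'term' || t.type === 'phrase');
  return tokens.every(
    (t, i) =>
      !(t.type === 'operator' && 'distance' in t) ||
      (isLiteral(tokens[i - 1]) && isLiteral(tokens[i + 1]))
  );
}

console.log(proximityOperandsAreLiterals(tokenize('word1 NEAR/3 word2'))); // => true
console.log(proximityOperandsAreLiterals(tokenize('(word1 AND word2) NEAR/3 word3'))); // => false
```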
Operator priorities
- Operators have different priorities. For example, `word1 OR word2 AND word3` will be interpreted as `word1 OR (word2 AND word3)`, because `AND` has higher priority than `OR`.
- Operator priorities are ordered as follows: `""?*` > `()` > `ONEAR` > `NEAR` > `NOT` > `AND` > `OR`.
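The effect of these priorities can be made visible with a small recursive-descent parser that adds the implied parentheses. This is a simplified sketch, not bibliothecary's implementation: it handles only bare terms, `()`, `NOT`, `AND`, and `OR`, and leaves out `NEAR`, `ONEAR`, quotes, wildcards, and implicit any-of matching between adjacent words. The function name `group` is hypothetical.

```javascript
// Illustrative recursive-descent parser, not bibliothecary's implementation.
// Handles only bare terms, (), NOT, AND, OR.
function group(query) {
  const tokens = query.match(/[()]|[^\s()]+/g) ?? [];
  let pos = 0;

  function parseOr() { // lowest priority, parsed first
    let left = parseAnd();
    while (tokens[pos] === 'OR') {
      pos += 1;
      left = `(${left} OR ${parseAnd()})`;
    }
    return left;
  }

  function parseAnd() {
    let left = parseNot();
    while (tokens[pos] === 'AND') {
      pos += 1;
      left = `(${left} AND ${parseNot()})`;
    }
    return left;
  }

  function parseNot() {
    if (tokens[pos] === 'NOT') {
      pos += 1;
      return `(NOT ${parseNot()})`;
    }
    return parseAtom();
  }

  function parseAtom() { // a bare term or a parenthesised sub-query
    const token = tokens[pos];
    pos += 1;
    if (token === '(') {
      const inner = parseOr();
      pos += 1; // skip the closing ')'
      return inner;
    }
    return token;
  }

  return parseOr();
}

console.log(group('word1 OR word2 AND word3'));
// => (word1 OR (word2 AND word3))
console.log(group('(word1 OR word2) AND word3'));
// => ((word1 OR word2) AND word3)
```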