@suptxt/quickset v0.4.6
QuickSet
A fast and performant Least Frequently Used (LFU) sorted set implementation for working with reasonably sized integers (unsigned). Trades memory for performance, optimised for frequently updating and counting a relatively small set of integers (integers ranging from 0 to 2^16) or extracting unique integers from a large pool of numbers in one go (integers ranging from 0 to 2^28). Sorted in natural ascending order of integers by default.
Use cases
- Unique integer extraction
- Duplicate integer counting
- Finding top-k most frequent items in one or many lists
- Nearest neighbour finding based on frequency of occurrence
- Sorting random integers for binary operations
- A lightweight key/value dictionary
Documentation
See the full API documentation for in-depth settings and examples.
See the ObservableHQ collection for hands-on examples.
How it works
Once initialised, QuickSet allocates a TypedArray based on the expected range of integers (numbers between 0 and 2^28) and frequency of occurrence (counts between 0 and 2^32).
Additionally, it keeps track of how often individual integers are added to the set, providing a top-k window of most frequent integers.
Two modes are provided for establishing top-k ranks, minsum and winsum.
Both eject the least frequent integer from their ranks upon integer insertion, yielding a ranked 'window' that guarantees the k-most occurring elements of the set to 'bubble up' (ejecting the Least Frequently Used or LFU).
Whereas minsum overwrites the least frequent integer (i.e. random access), winsum keeps integers sorted in decreasing order of occurrence (slightly more computationally expensive).
This makes QuickSet a fast alternative to counting and sorting all elements in a given set, preventing costly sorting operations and returning a ranked window of most frequent integers up till a point of your choosing.
This enables working with frequently occurring items 'earlier' compared to processing and sorting the input data in full, especially if the underlying integers follow a non-uniform distribution.
Quickstart
npm install @suptxt/quicksetAfter installing, create a QuickSet by calling:
let set = new QuickSet()This instantiates a new set with default parameters and a top-k window of 0-length, which may need additional configuring to suit your needs. As a rule of thumb:
- If you are interested in using unweighted set operations only, use
addorputfor single anduniquefor bulk insertions. - If you want to assign weights to integers, use
sumfor single andbatchfor bulk insertions. Updates to the top-k window are only made when theslotparameter is set.
Methods can be mixed and matched to your liking, but may yield unwanted results if used without caution:
add,putanduniqueoverwrite previous values and do not update the top-k window on integer insertionsumandbatchmaintain previous values and do update the top-k window on integer insertion
See the tombstoning example for why this is useful.
Tips
- Reuse a single instance
- Randomly switch between modes
- Use multiple QuickSets with a small integer span
- Maintain a
new Map()for reverse value lookups - Set
freqto a value higher than 1 for top-k window speed-ups - Subtract the minimum and adjust the
spanparameter to the new maximum expected integer to save on memory when working with a set of large integers - Use multiple QuickSets with custom offsets to increase the maximum integer range
Caveats
- Decreased performance on large sets (>2^24 uniformly distributed integers)
- Only limited top-k slots available (<64)
- No set size parameter yet
- No type checking of unsigned integers
- Reverse iteration not yet implemented
- Backing array resizing not yet implemented
Benchmarks
On a Intel(R) Core(TM) i7-8565U CPU @ 1.80GHz-1.99 GHz with 16GB RAM, the average time to extract unique keys from 5 runs of uniformly distributed random integers:
| Random keys | instance | ms | factor |
|---|---|---|---|
| 2^28 | native | size exceeded | - |
| = 268 435 456 | QuickSet | 4592 | - |
| 2^24 | native | 6095 | - |
| = 16 777 216 | QuickSet | 212 | 28x |
| 2^16 | native | 4.4 | - |
| = 65 536 | QuickSet | 1.3 | 3x |
See also
2 years ago
2 years ago
2 years ago
2 years ago
2 years ago
2 years ago
2 years ago
2 years ago
2 years ago
2 years ago
2 years ago
2 years ago
2 years ago
2 years ago
2 years ago
2 years ago
2 years ago
2 years ago
2 years ago
2 years ago
2 years ago
2 years ago
2 years ago
2 years ago
2 years ago
2 years ago
2 years ago
2 years ago
2 years ago
2 years ago
2 years ago
2 years ago
2 years ago
2 years ago
2 years ago
2 years ago
2 years ago
2 years ago
2 years ago
2 years ago