0.1.3 • Published 12 months ago

@n2flowjs/nbase v0.1.3

NBase - Neural Vector Database

 _   _       ____                 
| \ | |     | __ )  __ _ ___  ___ 
|  \| |_____|  _ \ / _` / __|/ _ \
| |\  |_____| |_) | (_| \__ \  __/
|_| \_|     |____/ \__,_|___/\___|

NBase is a high-performance vector database for efficient similarity search, designed for machine learning embeddings and neural search applications.

Features

  • Scalable Vector Storage: Store and manage millions of high-dimensional vectors
  • Optimized Search Algorithms: Fast approximate nearest neighbor search
    • HNSW (Hierarchical Navigable Small World) graphs for logarithmic search time
    • LSH (Locality-Sensitive Hashing) for fast similarity search
    • Partitioned search for large-scale databases
  • Multi-dimensional Support: Handles vectors of different dimensions
  • Vector Compression: Reduces memory usage while maintaining search quality
  • Rich Query Options: Filter, rerank, and customize search parameters
  • Persistence: Save and load your vector database to/from disk
  • REST API: Simple HTTP interface for adding vectors and searching
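
To make "similarity search" concrete, here is a minimal sketch of the exact brute-force search that approximate indexes such as HNSW and LSH are designed to avoid. It is not NBase's implementation: the helper names `cosineDistance` and `bruteForceSearch` are hypothetical and exist only for illustration.

```javascript
// Cosine distance: 1 - cos(angle between a and b). 0 means same direction.
function cosineDistance(a, b) {
  if (a.length !== b.length) throw new Error('dimension mismatch');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) throw new Error('zero vector has no direction');
  return 1 - dot / Math.sqrt(normA * normB);
}

// Exact k-nearest-neighbor search: score every stored vector, sort, keep k.
// Cost grows linearly with the number of vectors, which is why large
// collections use an approximate index instead.
function bruteForceSearch(items, query, k) {
  return items
    .map(({ id, vector }) => ({ id, dist: cosineDistance(query, vector) }))
    .sort((x, y) => x.dist - y.dist)
    .slice(0, k);
}

const items = [
  { id: 'doc1', vector: [1, 0] },
  { id: 'doc2', vector: [0, 1] },
  { id: 'doc3', vector: [1, 1] },
];
const nearest = bruteForceSearch(items, [1, 0.1], 2);
console.log(nearest.map((r) => r.id)); // [ 'doc1', 'doc3' ]
```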

Installation

npm i @n2flowjs/nbase

Quick Start

const { Database } = require('@n2flowjs/nbase');

// `await` is not allowed at the top level of a CommonJS module,
// so the example runs inside an async function.
async function main() {
  // Initialize the database
  const db = new Database({
    vectorSize: 1536,  // OpenAI's text-embedding-ada-002 size
    indexing: {
      buildOnStart: true
    }
  });

  // Add vectors (arrays are truncated here; supply all 1536 values)
  await db.addVector('doc1', [0.1, 0.2, /* ... */], { title: 'Document 1' });
  await db.addVector('doc2', [0.3, 0.4, /* ... */], { title: 'Document 2' });

  // Search for similar vectors
  const results = await db.search([0.15, 0.25, /* ... */], {
    k: 5,
    includeMetadata: true,
    useHNSW: true
  });

  console.log(results);
  // [
  //   { id: 'doc1', dist: 0.12, metadata: { title: 'Document 1' } },
  //   { id: 'doc2', dist: 0.45, metadata: { title: 'Document 2' } },
  //   ...
  // ]
}

main().catch(console.error);

API Documentation

Database

The main interface for interacting with NBase.

const db = new Database(options);

Options

  • vectorSize: Default size of vectors (default: 1536)
  • clustering: Options for vector clustering
  • partitioning: Options for database partitioning
  • indexing: Options for index creation (HNSW, LSH)
  • persistence: Options for saving/loading the database
  • monitoring: Options for performance monitoring

Methods

  • addVector(id, vector, metadata?): Add a vector to the database
  • bulkAdd(vectors): Add multiple vectors in one operation
  • findNearest(query, k, options): Find k nearest neighbors
  • search(query, options): Alias for findNearest
  • deleteVector(id): Delete a vector
  • getVector(id): Retrieve a vector
  • getMetadata(id): Retrieve metadata for a vector
  • updateMetadata(id, data): Update metadata for a vector
  • extractRelationships(threshold, options): Find relationships between vectors within partitions
  • buildIndexes(): Build search indexes
  • save(): Save the database to disk
  • close(): Close the database and release resources
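
As a rough illustration of how these methods fit together, the sketch below runs one add, update, read, delete, and save cycle. It uses only the method names listed above. Return values, whether `updateMetadata` merges or replaces, and error behavior are not documented here, so treat those as assumptions. `runLifecycle` is a hypothetical helper, not part of NBase.

```javascript
// Sketch only: `db` is assumed to expose the method names listed above.
// Return shapes and error behavior are assumptions, not documented facts.
async function runLifecycle(db, id, vector) {
  try {
    await db.addVector(id, vector, { title: 'Draft' });
    await db.updateMetadata(id, { title: 'Final' });
    const metadata = await db.getMetadata(id);
    await db.deleteVector(id);
    await db.save();
    return metadata;
  } finally {
    // Release resources even if one of the calls above throws.
    await db.close();
  }
}
```

Passing the database in as a parameter keeps the helper easy to try against a stub. With the real package you would presumably pass a `Database` instance whose vectorSize matches the length of `vector`.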

Search Options

const results = await db.search(queryVector, {
  k: 10,                   // Number of results to return
  filter: (id) => true,    // Function to filter results
  includeMetadata: true,   // Include metadata in results
  distanceMetric: 'cosine', // Distance metric to use
  useHNSW: true,           // Use HNSW index for search
  rerank: false,           // Rerank results for diversity
  rerankingMethod: 'diversity', // Method for reranking
  partitionIds: ['p1', 'p2'], // Specific partitions to search
  efSearch: 100,           // HNSW search parameter
});
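
The `filter` option above takes a function of the result id. A small sketch of building one from an allow-list follows. It assumes the function is called once per candidate id and that a truthy return keeps the result, which is how the `(id) => true` example reads. `makeIdFilter` is a hypothetical helper.

```javascript
// Build a filter predicate from an allow-list of ids.
// A Set gives constant-time lookups, which matters if the filter runs
// once per candidate during a search.
function makeIdFilter(allowedIds) {
  const allowed = new Set(allowedIds);
  return (id) => allowed.has(id);
}

// Option names are the ones listed above; the values are illustrative.
const searchOptions = {
  k: 10,
  includeMetadata: true,
  filter: makeIdFilter(['doc1', 'doc7']),
};

console.log(searchOptions.filter('doc1')); // true
console.log(searchOptions.filter('doc2')); // false
```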

Performance Optimization

For best performance:

  1. Choose the right index: HNSW provides the best search performance for most use cases
  2. Adjust efSearch: Higher values improve recall at the cost of speed
  3. Use partitioning: For large datasets, enable partitioning to reduce memory usage
  4. Filter wisely: Complex filters may slow down search
  5. Dimension reduction: Consider reducing vector dimensions if possible
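
Item 5 does not name a technique. One common option is a Gaussian random projection, sketched below under that assumption. This is not an NBase feature: `makeProjection` and `project` are hypothetical helpers, and the seeded generator is only there to make the matrix reproducible.

```javascript
// Small seeded PRNG (mulberry32-style) returning floats in [0, 1).
function seededRandom(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Build an outputDim x inputDim matrix of Gaussian entries scaled by
// 1/sqrt(outputDim), so projected lengths stay comparable on average.
function makeProjection(inputDim, outputDim, seed) {
  const rand = seededRandom(seed);
  const scale = 1 / Math.sqrt(outputDim);
  const gaussian = () => {
    const u1 = 1 - rand(); // in (0, 1], avoids log(0)
    const u2 = rand();
    return Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2); // Box-Muller
  };
  return Array.from({ length: outputDim }, () =>
    Array.from({ length: inputDim }, () => gaussian() * scale)
  );
}

// Multiply the projection matrix by a vector.
function project(matrix, vector) {
  return matrix.map((row) => row.reduce((sum, w, i) => sum + w * vector[i], 0));
}

const projection = makeProjection(8, 3, 42);
const reduced = project(projection, [1, 2, 3, 4, 5, 6, 7, 8]);
console.log(reduced.length); // 3
```

The same matrix has to be applied to every stored vector and to every query, and the database's vectorSize would then be the reduced dimension. Recall usually drops as the output dimension shrinks, so measure before adopting it.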

REST API

NBase includes a built-in HTTP server:

const { Server } = require('@n2flowjs/nbase');
const server = new Server({ port: 1307 });
server.start();

Endpoints

  • POST /vectors: Add a vector
  • GET /vectors/:id: Get a vector
  • DELETE /vectors/:id: Delete a vector
  • POST /search: Search for similar vectors
  • GET /health: Check server health
  • POST /search/metadata: Search with metadata filtering
  • POST /search/relationships: Extract relationships between vectors
  • POST /search/communities: Find communities (clusters) of vectors based on a distance threshold across loaded partitions
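
A client call to POST /search might look like the sketch below. The request body is not documented here, so the field names `query` and `k` are assumptions, as is the JSON response. Check the repository for the actual schema. `buildSearchRequest` and `searchOverHttp` are hypothetical helpers.

```javascript
// Assemble the URL and fetch options for POST /search.
// ASSUMPTION: the body fields `query` and `k` are illustrative guesses.
function buildSearchRequest(baseUrl, queryVector, k) {
  return {
    url: new URL('/search', baseUrl).toString(),
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ query: queryVector, k }),
    },
  };
}

// Send the request with the global fetch available in Node.js 18+.
async function searchOverHttp(baseUrl, queryVector, k) {
  const { url, init } = buildSearchRequest(baseUrl, queryVector, k);
  const response = await fetch(url, init);
  if (!response.ok) {
    throw new Error(`Search failed with HTTP ${response.status}`);
  }
  return response.json();
}

const request = buildSearchRequest('http://localhost:1307', [0.15, 0.25], 5);
console.log(request.url); // http://localhost:1307/search
```

Separating request construction from the network call makes the payload easy to inspect or adjust once the real schema is known.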

Advanced Usage

For more advanced usage examples, check the examples directory in the repository.

Performance Benchmarks

Benchmarks comparing NBase with other vector databases can be found in the test/benchmarks directory.

v0.1.3                        Time (ms)    Speedup Factor
Standard Search               37.01        1.00x
HNSW Search                   39.12        0.95x
HNSW Search (After Reload)    4.24         8.73x
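
The Speedup Factor column appears to be the Standard Search time divided by each row's time. That reading is inferred from the numbers rather than stated, and the sketch below reproduces the column under that assumption.

```javascript
// Timings in milliseconds, copied from the table above.
const timingsMs = {
  'Standard Search': 37.01,
  'HNSW Search': 39.12,
  'HNSW Search (After Reload)': 4.24,
};

// Speedup relative to a baseline: values above 1 are faster than baseline.
function speedupFactor(baselineMs, measuredMs) {
  return baselineMs / measuredMs;
}

const baselineMs = timingsMs['Standard Search'];
const speedups = Object.fromEntries(
  Object.entries(timingsMs).map(([name, ms]) => [
    name,
    `${speedupFactor(baselineMs, ms).toFixed(2)}x`,
  ])
);

console.log(speedups);
// {
//   'Standard Search': '1.00x',
//   'HNSW Search': '0.95x',
//   'HNSW Search (After Reload)': '8.73x'
// }
```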

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License - see the LICENSE file for details.
