
NBase - Neural Vector Database

 _   _       ____                 
| \ | |     | __ )  __ _ ___  ___ 
|  \| |_____|  _ \ / _` / __|/ _ \
| |\  |_____| |_) | (_| \__ \  __/
|_| \_|     |____/ \__,_|___/\___|

NBase is a high-performance vector database for efficient similarity search, designed for machine learning embeddings and neural search applications.

Features

Scalable Vector Storage: Store and manage millions of high-dimensional vectors
Optimized Search Algorithms: Fast approximate nearest neighbor search
- HNSW (Hierarchical Navigable Small World) graphs for approximately logarithmic search time
- LSH (Locality-Sensitive Hashing) for hash-based candidate lookup
- Partitioned search for large-scale databases
Multi-dimensional Support: Handles vectors of different dimensions
Vector Compression: Reduces memory usage while maintaining search quality
Rich Query Options: Filter, rerank, and customize search parameters
Persistence: Save and load your vector database to/from disk
REST API: Simple HTTP interface for adding vectors and searching
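
To give a feel for what locality-sensitive hashing does, here is a minimal sketch of random-hyperplane hashing. It is not NBase's implementation: the function names, the seed, and the use of uniform (rather than Gaussian) hyperplane components are all assumptions made for brevity. Each bit of the signature records which side of a random hyperplane the vector falls on, so vectors pointing in similar directions tend to share most bits.

```javascript
// Small seeded PRNG (mulberry32) so the hyperplanes are reproducible.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Hypothetical helper: numBits random hyperplanes of dimension dim,
// with components drawn uniformly from [-1, 1).
function makeHyperplanes(numBits, dim, seed = 42) {
  const rand = mulberry32(seed);
  return Array.from({ length: numBits }, () =>
    Array.from({ length: dim }, () => rand() * 2 - 1)
  );
}

// One bit per hyperplane: '1' if the vector lies on its non-negative side.
function signature(vector, hyperplanes) {
  return hyperplanes
    .map((plane) => {
      const dot = plane.reduce((sum, w, i) => sum + w * vector[i], 0);
      return dot >= 0 ? '1' : '0';
    })
    .join('');
}

const planes = makeHyperplanes(16, 4);
const a = [0.1, 0.2, 0.3, 0.4];
const scaled = a.map((x) => x * 2); // same direction, different length

// Direction decides the signature, so scaling a vector leaves it unchanged.
console.log(signature(a, planes) === signature(scaled, planes)); // true
```

Vectors with equal (or nearly equal) signatures land in the same bucket and become search candidates, which avoids comparing the query against every stored vector.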

Installation

npm i @n2flowjs/nbase

Quick Start

const { Database } = require('@n2flowjs/nbase');

// Initialize the database
const db = new Database({
  vectorSize: 1536,  // OpenAI's text-embedding-ada-002 size
  indexing: {
    buildOnStart: true
  }
});

// Top-level await is not available in CommonJS modules,
// so the async calls run inside an async function
async function main() {
  // Add vectors (remaining values elided)
  await db.addVector('doc1', [0.1, 0.2, /* ... */], { title: 'Document 1' });
  await db.addVector('doc2', [0.3, 0.4, /* ... */], { title: 'Document 2' });

  // Search for similar vectors
  const results = await db.search([0.15, 0.25, /* ... */], {
    k: 5,
    includeMetadata: true,
    useHNSW: true
  });

  console.log(results);
  // [
  //   { id: 'doc1', dist: 0.12, metadata: { title: 'Document 1' } },
  //   { id: 'doc2', dist: 0.45, metadata: { title: 'Document 2' } },
  //   ...
  // ]
}

main().catch(console.error);

API Documentation

Database

The main interface for interacting with NBase.

const db = new Database(options);

Options

vectorSize: Default vector dimension (default: 1536)
clustering: Options for vector clustering
partitioning: Options for database partitioning
indexing: Options for index creation (HNSW, LSH)
persistence: Options for saving/loading the database
monitoring: Options for performance monitoring

Methods

addVector(id, vector, metadata?): Add a vector to the database
bulkAdd(vectors): Add multiple vectors in one operation
findNearest(query, k, options): Find k nearest neighbors
search(query, options): Alias for findNearest, with k passed in options
deleteVector(id): Delete a vector
getVector(id): Retrieve a vector
getMetadata(id): Retrieve metadata for a vector
updateMetadata(id, data): Update metadata for a vector
extractRelationships(threshold, options): Find relationships between vectors within partitions
buildIndexes(): Build search indexes
save(): Save the database to disk
close(): Close the database and release resources
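
The lifecycle of a single entry (add, update metadata, read, delete) can be sketched with a tiny in-memory stand-in that reuses the method names listed above. InMemoryStore is hypothetical and is not NBase's implementation; the return values and the shallow-merge behaviour of updateMetadata are assumptions, so check the package itself for the real semantics.

```javascript
// Hypothetical in-memory stand-in that mirrors a few method names from the list above.
// It is NOT NBase's implementation; return values are assumptions.
class InMemoryStore {
  constructor() {
    this.items = new Map(); // id -> { vector, metadata }
  }

  async addVector(id, vector, metadata = {}) {
    this.items.set(id, { vector: [...vector], metadata: { ...metadata } });
  }

  async getVector(id) {
    const item = this.items.get(id);
    return item ? item.vector : null;
  }

  async getMetadata(id) {
    const item = this.items.get(id);
    return item ? item.metadata : null;
  }

  async updateMetadata(id, data) {
    const item = this.items.get(id);
    if (!item) return false;
    item.metadata = { ...item.metadata, ...data }; // shallow merge (an assumption)
    return true;
  }

  async deleteVector(id) {
    return this.items.delete(id);
  }
}

async function demo() {
  const store = new InMemoryStore();
  await store.addVector('doc1', [0.1, 0.2], { title: 'Document 1' });
  await store.updateMetadata('doc1', { tag: 'draft' });
  const metadata = await store.getMetadata('doc1');
  await store.deleteVector('doc1');
  const afterDelete = await store.getVector('doc1'); // null once deleted
  return { metadata, afterDelete };
}

demo().then((result) => console.log(result));
```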

Search Options

const results = await db.search(queryVector, {
  k: 10,                        // Number of results to return
  filter: (id) => true,         // Function to filter results
  includeMetadata: true,        // Include metadata in results
  distanceMetric: 'cosine',     // Distance metric to use
  useHNSW: true,                // Use HNSW index for search
  rerank: false,                // Rerank results for diversity
  rerankingMethod: 'diversity', // Method for reranking
  partitionIds: ['p1', 'p2'],   // Specific partitions to search
  efSearch: 100,                // HNSW search parameter (higher: better recall, slower)
});
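
How k, filter, includeMetadata, and distanceMetric interact is easiest to see in an exhaustive (non-indexed) search. The sketch below is a baseline written for illustration, not NBase's search code: bruteForceSearch, the 'euclidean' metric name, and the choice to apply the filter before scoring are all assumptions.

```javascript
// Distance functions; smaller means more similar.
const metrics = {
  cosine(a, b) {
    let dot = 0;
    let normA = 0;
    let normB = 0;
    for (let i = 0; i < a.length; i++) {
      dot += a[i] * b[i];
      normA += a[i] * a[i];
      normB += b[i] * b[i];
    }
    return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
  },
  euclidean(a, b) {
    let sum = 0;
    for (let i = 0; i < a.length; i++) sum += (a[i] - b[i]) ** 2;
    return Math.sqrt(sum);
  },
};

// Hypothetical exhaustive search over `items` (array of { id, vector, metadata }).
function bruteForceSearch(items, query, options = {}) {
  const {
    k = 10,
    filter = () => true,
    includeMetadata = false,
    distanceMetric = 'cosine',
  } = options;
  const distance = metrics[distanceMetric];
  if (!distance) throw new Error(`Unknown metric: ${distanceMetric}`);

  return items
    .filter((item) => filter(item.id)) // drop excluded ids before scoring
    .map((item) => {
      const hit = { id: item.id, dist: distance(query, item.vector) };
      if (includeMetadata) hit.metadata = item.metadata;
      return hit;
    })
    .sort((x, y) => x.dist - y.dist) // nearest first
    .slice(0, k);
}

const items = [
  { id: 'doc1', vector: [1, 0], metadata: { title: 'Document 1' } },
  { id: 'doc2', vector: [0, 1], metadata: { title: 'Document 2' } },
  { id: 'doc3', vector: [1, 1], metadata: { title: 'Document 3' } },
];

const hits = bruteForceSearch(items, [1, 0.1], {
  k: 2,
  filter: (id) => id !== 'doc2',
  includeMetadata: true,
});
console.log(hits.map((h) => h.id)); // [ 'doc1', 'doc3' ]
```

An index such as HNSW aims to return roughly the same top k while visiting far fewer vectors than this loop does.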

Performance Optimization

For best performance:

Choose the right index: HNSW provides the best search performance for most use cases
Adjust efSearch: Higher values improve recall at the cost of speed
Use partitioning: For large datasets, enable partitioning to reduce memory usage
Filter wisely: Complex filters may slow down search
Dimension reduction: Consider reducing vector dimensions if possible
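
When tuning efSearch, it helps to measure the recall you are trading for speed. A common way is recall@k: the fraction of the exact top-k results that the approximate search also returned. The helper below is a minimal sketch; recallAtK is a hypothetical name, and obtaining the exact ids (for example from an exhaustive search on a sample of queries) is left to you.

```javascript
// Fraction of the exact top-k ids that also appear in the approximate result.
function recallAtK(approxIds, exactIds) {
  if (exactIds.length === 0) return 1; // nothing to find
  const approx = new Set(approxIds);
  const found = exactIds.filter((id) => approx.has(id)).length;
  return found / exactIds.length;
}

const exactIds = ['a', 'b', 'c', 'd'];   // ground truth from an exhaustive search
const approxIds = ['a', 'c', 'x', 'd'];  // what the index returned
console.log(recallAtK(approxIds, exactIds)); // 0.75
```

Averaging this value over a set of representative queries at several efSearch settings shows where extra search effort stops paying off.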

REST API

NBase includes a built-in HTTP server:

const { Server } = require('@n2flowjs/nbase');
const server = new Server({ port: 1307 });
server.start();

Endpoints

POST /vectors: Add a vector
GET /vectors/:id: Get a vector
DELETE /vectors/:id: Delete a vector
POST /search: Search for similar vectors
GET /health: Check server health
POST /search/metadata: Search with metadata filtering
POST /search/relationships: Extract relationships between vectors
POST /search/communities: Find communities (clusters) of vectors across loaded partitions, based on a distance threshold
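
A client can call these endpoints with any HTTP library. The sketch below only builds the arguments for fetch() and sends nothing by itself. The paths and port come from this README, but buildRequest is a hypothetical helper and the JSON body field names (query, k) are assumptions, since the request schema is not documented here.

```javascript
// Hypothetical helper: builds { url, init } for fetch(); performs no network I/O.
function buildRequest(baseUrl, method, path, body) {
  const init = { method, headers: {} };
  if (body !== undefined) {
    init.headers['Content-Type'] = 'application/json';
    init.body = JSON.stringify(body);
  }
  return { url: new URL(path, baseUrl).toString(), init };
}

const baseUrl = 'http://localhost:1307'; // port from the Server example above

const health = buildRequest(baseUrl, 'GET', '/health');
const search = buildRequest(baseUrl, 'POST', '/search', {
  query: [0.15, 0.25], // assumed field name
  k: 5,                // assumed field name
});

console.log(health.url); // http://localhost:1307/health

// To send a request (Node 18+ or a browser), inside an async function:
//   const response = await fetch(search.url, search.init);
//   const data = await response.json();
```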