@equationalapplications/core-llm-wiki
Platform-agnostic TypeScript engine for hybrid LLM memory. Features episodic fact extraction, semantic vector search, and multi-agent architectures over SQLite. Bring your own adapter.
GitHub · ScopeLab · WikiDemo · Changelog · Issues
Inspired by Andrej Karpathy's LLM Wiki memory spec.
Features
- Platform-agnostic: Zero runtime dependencies; works with any SQLite driver via the SQLiteAdapter interface
- Semantic search: Vector embeddings via your LLM's embed function, ranked by cosine similarity
- Keyword fallback: MiniSearch in-memory index for offline/degraded scenarios when embeddings are unavailable
- Retrieval tuning: Per-call overrides for maxResults, preFilterLimit, hybridWeight, tierWeights, and includeZeroWeightEntities
- Multi-entity reads: Search across multiple entity_id namespaces in one pass with per-entity score multipliers (tierWeights); optional factScores and metadata for explainability
- Immutable vs mutable facts: Use WikiFact.source_type to distinguish document-sourced facts (immutable_document) from derived or user-provided facts (librarian_inferred, user_stated, user_confirmed). Immutable document facts are not rewritten by runLibrarian() or runHeal() and can only be removed by forget() or re-ingesting.
- Full-featured memory: Facts, tasks, events, maintenance jobs (librarian, heal, reembed, prune)
- Type-safe: Built with TypeScript, full type exports
- Interoperability: Supports Open Knowledge Format (OKF) v0.1 import and export.
- Per-entity seeded ontology: Optional Strict, Emergent, or Off modes govern LLM graph extraction; seed taxonomies per entity and persist typed facts with inline edges.
Installation
npm install @equationalapplications/core-llm-wiki
Semantic Search with Embeddings
Provide an embed function in llmProvider to enable vector-based retrieval:
import { WikiMemory } from '@equationalapplications/core-llm-wiki';
const wikiMemory = new WikiMemory(db, {
llmProvider: {
generateText: async ({ systemPrompt, userPrompt }) => {
// Your LLM call for extracting facts, tasks
return 'Model output';
},
embed: async (text: string) => {
// Your embedding service (e.g., OpenAI, Cohere, local)
const response = await fetch('https://your-app.example.com/api/embed', {
method: 'POST',
body: JSON.stringify({ text }),
});
const { embedding } = await response.json();
return embedding; // number[]
},
},
});
await wikiMemory.setup();
// Query with semantic matching
const memory = await wikiMemory.read('user-123', 'What should I do this weekend?');
// Returns facts that are semantically similar to the query, not just lexical matches.
// E.g., the fact "Saturday hiking trip" ranks high even though it shares no words with the query.
When embed is unavailable, read() silently falls back to MiniSearch keyword search. If an embedding attempt throws, read() falls back and calls onRetrievalFallback if provided:
const wikiMemory = new WikiMemory(db, {
llmProvider: {
generateText: async () => { /* ... */ },
embed: undefined, // or throws on network error
},
onRetrievalFallback: (error) => {
console.warn('Embedding retrieval unavailable, using keyword search:', error);
},
});
// Case 1 (embed is undefined): read() returns MiniSearch results; onRetrievalFallback is not called because a missing embed is expected.
// Case 2 (embed throws): read() returns MiniSearch results and onRetrievalFallback is called with the error.
Configuration
All WikiConfig fields are optional:
const wikiMemory = new WikiMemory(db, {
llmProvider: { /* ... */ },
config: {
tablePrefix: 'llm_wiki_', // default: 'llm_wiki_'
maxResults: 10, // default: 10
autoLibrarianThreshold: 20, // default: 20 — events before librarian auto-runs
autoHealThreshold: 100, // default: 100 — events before heal auto-runs
maxChunkLength: 12000, // default: 12000 (char count per ingestDocument chunk)
chunkOverlap: 400, // default: 400 (overlap between chunks in characters)
chunkConcurrency: 1, // default: 1 (parallel LLM calls per ingestDocument)
pruneRetainSoftDeletedFor: 7, // default: 7 (days before hard-deleting soft-deleted facts)
pruneEventsAfter: 30, // default: 30 (days before hard-deleting old events)
orphanAfterDays: 30, // default: 30 (days before runHeal flags sourceless facts; null to disable)
staleInferredAfterDays: 60, // default: 60 (days before runHeal downgrades inferred facts; null to disable)
preFilterLimit: 50, // default: undefined — MiniSearch pre-filter before cosine scan; recommended for >500 facts
hybridWeight: 0.7, // default: undefined — blend semantic (1.0) ↔ keyword (0.0); pure semantic when unset
enableOutbox: false, // default: false — when true, entry/task mutations write to an internal SQLite outbox table for external sync (e.g. via @equationalapplications/prisma-outbox)
// Global prompt overrides — librarianSystemPrompt and healSystemPrompt apply to write() auto-runs;
// ingestSystemPrompt applies only to explicit ingestDocument() calls.
// ⚠ Overrides replace the entire default prompt, including the JSON output contract.
// See "JSON Output Contracts" in the Prompt Management & Overrides section below.
prompts: {
ingestSystemPrompt: `Extract core facts from this document: {{documentChunk}}\n\nReturn ONLY valid JSON: { "facts": [{ "title": "string", "body": "string", "tags": ["string"], "confidence": "certain|inferred|tentative" }] }. No markdown.`,
librarianSystemPrompt: `You are an expert curator. Synthesize these thoughts:\n{{events}}\n\nCurrent Facts:\n{{currentFacts}}\n\nReturn ONLY valid JSON: { "facts": [{ "title": "string", "body": "string", "tags": ["string"], "confidence": "certain|inferred|tentative" }], "tasks": [{ "description": "string", "priority": 0 }] }. No markdown.`,
healSystemPrompt: `Fix the memory graph based on these candidates: {{healCandidates}}\n\nReturn ONLY valid JSON: { "downgraded": ["factId"], "deleted": ["factId"], "newFacts": [{ "title": "string", "body": "string", "tags": ["string"], "confidence": "certain|inferred|tentative" }] }. No markdown.`,
},
},
});
Prompt Management & Overrides
Core maintenance tasks (ingestDocument, runLibrarian, runHeal) use system prompts to instruct the LLM. You can customize these prompts using {{mustache}} style variables to inject context dynamically.
JSON Output Contracts: Prompt overrides replace the entire default system prompt, including the JSON response schema the parser depends on. Your override must instruct the LLM to return raw JSON — no markdown. Required shapes:
- ingestDocument: { "facts": [{ "title": "string", "body": "string", "tags": ["string"], "confidence": "certain|inferred|tentative" }] }
- runLibrarian: { "facts": [...], "tasks": [{ "description": "string", "priority": 5 }] }, where priority is an integer 0–10
- runHeal: { "downgraded": ["factId"], "deleted": ["factId"], "newFacts": [...] }
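Because an override replaces the schema instructions the parser relies on, it can help to check a sample model response against the required shape while developing a custom prompt. The following is a minimal sketch for the ingestDocument shape only. The names parseIngestResponse, isIngestFact, and IngestFact are hypothetical and are not exports of this package; the library's own parser may be stricter or more lenient.

```typescript
// Hypothetical helper for testing prompt overrides; not part of the package API.
type Confidence = 'certain' | 'inferred' | 'tentative';

interface IngestFact {
  title: string;
  body: string;
  tags: string[];
  confidence: Confidence;
}

const CONFIDENCE_VALUES: readonly string[] = ['certain', 'inferred', 'tentative'];

// Type guard for a single fact object in the ingestDocument contract.
function isIngestFact(value: unknown): value is IngestFact {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;
  const tags = v.tags;
  const confidence = v.confidence;
  return (
    typeof v.title === 'string' &&
    typeof v.body === 'string' &&
    Array.isArray(tags) &&
    tags.every((t) => typeof t === 'string') &&
    typeof confidence === 'string' &&
    CONFIDENCE_VALUES.indexOf(confidence) !== -1
  );
}

// Returns the facts array, or throws if the output breaks the contract.
function parseIngestResponse(raw: string): IngestFact[] {
  // JSON.parse throws on markdown-wrapped or otherwise non-JSON output.
  const parsed: unknown = JSON.parse(raw);
  if (typeof parsed !== 'object' || parsed === null) {
    throw new Error('Expected a JSON object');
  }
  const facts = (parsed as Record<string, unknown>).facts;
  if (!Array.isArray(facts) || !facts.every(isIngestFact)) {
    throw new Error('Expected { "facts": [...] } with valid fact objects');
  }
  return facts;
}
```

Running a few captured responses through a check like this before shipping an override makes contract regressions visible early.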
Global Overrides (Auto-Runs)
If your application relies on write() to automatically maintain the memory graph in the background (via autoLibrarianThreshold and autoHealThreshold), configure custom prompts globally at instantiation. This ensures the internal WriteService uses your domain-specific instructions when it triggers an auto-run.
const wikiMemory = new WikiMemory(db, {
llmProvider,
config: {
prompts: {
// Override must include the JSON output contract — it replaces the entire default prompt.
librarianSystemPrompt: `You are an expert curator. Synthesize these thoughts:\n{{events}}\n\nCurrent Facts:\n{{currentFacts}}\n\nReturn ONLY a valid JSON object: { "facts": [{ "title": "string", "body": "string", "tags": ["string"], "confidence": "certain|inferred|tentative" }], "tasks": [{ "description": "string", "priority": 0 }] }. No markdown.`,
},
},
});
// WriteService uses the global prompt whenever autoLibrarianThreshold is hit
await wikiMemory.write('user-123', { event_type: 'observation', summary: '...' });
Available {{variables}} per prompt type:
| Prompt | Variables |
|---|---|
| ingestSystemPrompt | {{documentChunk}} |
| librarianSystemPrompt | {{events}}, {{currentFacts}} |
| healSystemPrompt | {{healCandidates}}, {{documentAnchors}}, {{allTasks}}, {{recentEvents}} |
When a template contains {{variable}} tags, the matching data is hydrated directly into systemPrompt and a short fixed string is used as userPrompt. When a template has no {{}} tags, the raw data is appended as userPrompt — backward compatible with plain-string overrides.
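The hydration rule above can be sketched as a small pure function. This is an illustration under stated assumptions, not the library's implementation: hydratePrompt and FIXED_USER_PROMPT are hypothetical names, the actual fixed userPrompt text is not documented here, and leaving unknown tags untouched is a choice made for this sketch.

```typescript
// Hypothetical sketch of the two hydration paths; not the package's code.
interface HydratedPrompt {
  systemPrompt: string;
  userPrompt: string;
}

// Assumed placeholder for the "short fixed string"; the real text may differ.
const FIXED_USER_PROMPT = 'Proceed.';

function hydratePrompt(
  template: string,
  variables: Record<string, string>,
  rawData: string,
): HydratedPrompt {
  if (!/\{\{\w+\}\}/.test(template)) {
    // Plain-string override: keep the template and pass the raw data as userPrompt.
    return { systemPrompt: template, userPrompt: rawData };
  }
  // Substitute known tags; unknown tags are left as-is (an assumption of this sketch).
  const systemPrompt = template.replace(/\{\{(\w+)\}\}/g, (match: string, name: string) =>
    Object.prototype.hasOwnProperty.call(variables, name) ? variables[name] : match,
  );
  return { systemPrompt, userPrompt: FIXED_USER_PROMPT };
}

// Templated override: data lands in systemPrompt.
const hydrated = hydratePrompt(
  'Synthesize:\n{{events}}\n\nCurrent Facts:\n{{currentFacts}}',
  { events: '- user likes hiking', currentFacts: '(none)' },
  '- user likes hiking',
);
```

Here hydrated.systemPrompt contains the event text inline, while a template without tags would receive the same text through userPrompt instead.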
Runtime Overrides (Manual Execution)
Pass promptOverride per-call for one-off instructions. Runtime overrides apply only to that single call — they do not persist for future auto-runs triggered by write().
// Override the base default AND global config for this single execution.
// Each override must include the JSON output contract (replaces the entire default prompt).
await wikiMemory.runLibrarian('user-123', {
promptOverride: `One-off extraction task:\n{{events}}\n\nReturn ONLY valid JSON: { "facts": [{ "title": "string", "body": "string", "tags": ["string"], "confidence": "certain|inferred|tentative" }], "tasks": [{ "description": "string", "priority": 0 }] }. No markdown.`,
});
await wikiMemory.runHeal('user-123', {
promptOverride: `Domain-specific healing: {{healCandidates}}\n\nReturn ONLY valid JSON: { "downgraded": ["factId"], "deleted": ["factId"], "newFacts": [{ "title": "string", "body": "string", "tags": ["string"], "confidence": "certain|inferred|tentative" }] }. No markdown.`,
});
await wikiMemory.ingestDocument('user-123', {
sourceRef: 'doc-1',
sourceHash: sha256(content),
documentChunk: content,
promptOverride: `Strict technical extraction: {{documentChunk}}\n\nReturn ONLY valid JSON: { "facts": [{ "title": "string", "body": "string", "tags": ["string"], "confidence": "certain|inferred|tentative" }] }. No markdown.`,
});
Important: If your app relies on write() auto-runs and needs custom prompts for those runs, use config.prompts at construction time. Runtime promptOverride values are never forwarded to WriteService-triggered internal runs.
Retrieval Tuning
Optimize read() performance and blend retrieval strategies:
const config = {
// Limit cosine similarity scoring to top-K MiniSearch keyword candidates
preFilterLimit: 50,
// Blend semantic and keyword scores (0.0 = pure keyword, 1.0 = pure semantic)
hybridWeight: 0.7,
// Max results returned per read
maxResults: 10,
};
const wikiMemory = new WikiMemory(db, {
config,
llmProvider: { /* ... */ },
});
// Per-call overrides (runtime controls for search dashboards, etc.)
const memory = await wikiMemory.read('user-123', 'my preferences', {
maxResults: 5,
preFilterLimit: 20,
hybridWeight: 0.5,
});
// Multi-entity with tier weights
const multiMemory = await wikiMemory.read(['tier_wisdom', 'tier_fact', 'tier_working'], 'my preferences', {
maxResults: 8,
tierWeights: {
tier_wisdom: 2, // high-confidence curated notes boosted 2×
tier_fact: 1, // neutral baseline
tier_working: 0.25, // recent but unvetted context downranked
},
// includeZeroWeightEntities: true — include 0-weight entities as bottom-ranked filler
});
// multiMemory.factScores — optional Record<factId, weightedScore> for returned facts; may be absent/undefined
// multiMemory.metadata — optional { query, entityIds, tierWeights }; may be absent/undefined
Hybrid scoring blends:
- hybridWeight: 1.0 → all-semantic blend with semantic scores clamped to non-negative range (no keyword component)
- hybridWeight: 0.5 → balanced semantic + keyword (50/50 blend)
- hybridWeight: 0.0 → pure keyword ranking, skips embed() entirely (no LLM API cost)
True cosine-range pure semantic ranking (including negative cosine values) is used when hybridWeight is left undefined.
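To make the difference between an undefined hybridWeight and hybridWeight: 1 concrete, here is a minimal sketch. It assumes a linear interpolation between a clamped semantic score and a keyword score normalized to [0, 1]; the exact formula used by the library is not stated in this README, and blendScore is a hypothetical name.

```typescript
// Illustrative only: assumes a linear blend, which may differ from the library's formula.
function blendScore(
  semanticScore: number, // cosine similarity in [-1, 1]
  keywordScore: number, // keyword score, assumed normalized to [0, 1]
  hybridWeight?: number, // undefined = pure semantic, no clamping
): number {
  if (hybridWeight === undefined) {
    // True cosine-range ranking: negative similarities are preserved.
    return semanticScore;
  }
  const w = Math.min(1, Math.max(0, hybridWeight));
  // In blended mode the semantic component is clamped to be non-negative.
  const clampedSemantic = Math.max(0, semanticScore);
  return w * clampedSemantic + (1 - w) * keywordScore;
}

const pureSemantic = blendScore(-0.2, 0.5); // negative cosine kept
const allSemantic = blendScore(-0.2, 0.5, 1); // clamped, keyword ignored
const pureKeyword = blendScore(0.8, 0.4, 0); // semantic ignored
```

Under these assumptions a fact with a negative cosine score stays negative when hybridWeight is undefined but is floored at zero when hybridWeight is 1.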
Tier weights:
- tierWeights applies a per-entity multiplier after semantic/keyword scoring: finalScore = retrievalScore × weight
- Missing weights default to 1.0. Negative weights clamp to 0. Non-finite weights default to 1.0.
- tierWeights[entity] = 0 skips that entity's scored retrieval branch (no compute cost).
- includeZeroWeightEntities: true includes zero-weight entities as bottom-ranked filler instead of skipping them.
- factScores is present for array-shaped entityId calls only when the query is non-empty and at least one fact is scored; empty-query ("recent facts") reads leave it absent even when entityId is an array. Plain string calls never expose it.
- metadata is present for all array-shaped calls regardless of query.
- maxResults applies globally across all requested entities.
- Tasks are capped at min(20 × entityCount, 200); events at min(10 × entityCount, 100) for multi-entity reads.
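The weight normalization and cap rules in this list can be written down as three small functions. This is a sketch of the documented rules, not the library's source; the function names are hypothetical, and treating -Infinity as non-finite (so it defaults to 1.0 rather than clamping to 0) is an assumption.

```typescript
// Hypothetical helpers restating the documented tier-weight rules.
function normalizeTierWeight(weight: number | undefined): number {
  // Missing or non-finite (NaN, ±Infinity) weights default to 1.0.
  if (weight === undefined || !Number.isFinite(weight)) return 1;
  // Negative weights clamp to 0.
  return weight < 0 ? 0 : weight;
}

// finalScore = retrievalScore × weight
function applyTierWeight(
  retrievalScore: number,
  entityId: string,
  tierWeights: Record<string, number>,
): number {
  return retrievalScore * normalizeTierWeight(tierWeights[entityId]);
}

// Task and event caps for multi-entity reads.
function multiEntityCaps(entityCount: number): { tasks: number; events: number } {
  return {
    tasks: Math.min(20 * entityCount, 200),
    events: Math.min(10 * entityCount, 100),
  };
}

const boosted = applyTierWeight(0.5, 'tier_wisdom', { tier_wisdom: 2 });
const capsForThree = multiEntityCaps(3);
```

With three entities the caps work out to 60 tasks and 30 events; the upper bounds of 200 and 100 are reached at ten entities.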
Pre-filtering optimization:
When preFilterLimit: 50 is set with 1000 facts, cosine similarity is computed only for the top 50 MiniSearch keyword matches, cutting the scan from 1000 vector comparisons to 50.
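The two-stage idea (keyword shortlist first, cosine scoring second) can be sketched as follows. This is a simplified illustration, not the library's retrieval code: Candidate, cosineSimilarity, and rankWithPreFilter are hypothetical names, and keyword scores are passed in directly instead of coming from MiniSearch.

```typescript
// Plain cosine similarity; returns 0 when either vector has zero length.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

interface Candidate {
  id: string;
  keywordScore: number; // stand-in for a MiniSearch score
  embedding: number[];
}

// Stage 1: keep the top-K keyword matches. Stage 2: cosine-score only that shortlist.
function rankWithPreFilter(
  queryVec: number[],
  facts: Candidate[],
  preFilterLimit: number,
): { id: string; score: number }[] {
  const shortlist = facts
    .slice()
    .sort((x, y) => y.keywordScore - x.keywordScore)
    .slice(0, preFilterLimit);
  return shortlist
    .map((f) => ({ id: f.id, score: cosineSimilarity(queryVec, f.embedding) }))
    .sort((x, y) => y.score - x.score);
}
```

The trade-off is visible in the sketch: a fact with a strong embedding match but a weak keyword score never reaches the cosine stage, so a smaller preFilterLimit buys speed at the cost of recall.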
Pluggable Vector Retrieval
When your entity corpus grows, in-process cosine similarity scoring becomes a bottleneck. The optional VectorRanker interface lets you delegate semantic ranking to sqlite-vec, sqlite-vss, or an external vector database while WikiMemory handles embedding validation, hybrid scoring, and tier-2 row hydration.
VectorRanker purpose
VectorRanker provides an optional injection point for approximate nearest-neighbor (ANN) ranking:
export interface VectorRanker {
/**
* Return semantic scores for facts in scope, sorted by similarity.
* - `entityId`: restricts results to one entity
* - `queryVec`: the embedded query (Float32Array or number[])
* - `candidateIds` (optional): when set, rank only within this set (MiniSearch pre-filter mode)
* - `limit`: requested top-K count
*/
rankBySimilarity(args: VectorRankerRankArgs): Promise<VectorRankerSemanticResult[]>;
/**
* Optional hook called after embedding persistence (upsert, reembed, delete).
* Implementations use this to keep external indexes (sqlite-vec, remote ANN) in sync.
*/
onEmbeddingPersisted?(event: {
entityId: string;
factId: string;
vector: Float32Array | null; // null = embedding removed
}): void | Promise<void>;
}
When no ranker is configured, WikiMemory uses its built-in JS cosine similarity, which is the default behavior. When a ranker is supplied and the embedding preconditions are met (embed available, dimensions match, no mismatches), WikiMemory delegates scoring to the ranker and blends results with keyword scores.
Example: sqlite-vec adapter
import { WikiMemory } from '@equationalapplications/core-llm-wiki';
import type { VectorRanker, VectorRankerRankArgs, VectorRankerSemanticResult } from '@equationalapplications/core-llm-wiki';
// Minimal sqlite-vec adapter (pseudo-code)
const sqliteVecRanker: VectorRanker = {
async rankBySimilarity(args: VectorRankerRankArgs): Promise<VectorRankerSemanticResult[]> {
const { entityId, queryVec, candidateIds, limit } = args;
// Build KNN query using sqlite-vec's distance functions.
// sqlite-vec returns cosine distance (0 = identical, 2 = opposite) ascending.
// Invert to semanticScore: higher = more similar, matching VectorRanker contract.
let sql = `SELECT id, (1.0 - distance) AS semanticScore FROM vec_facts
WHERE entity_id = ? AND deleted_at IS NULL`;
const params: any[] = [entityId];
// Apply pre-filter if provided
if (candidateIds) {
sql += ` AND id IN (${candidateIds.map(() => '?').join(',')})`;
params.push(...candidateIds);
}
// KNN search (example syntax; adjust for your sqlite-vec version)
sql += ` ORDER BY vec MATCH vec_neighbor(?) LIMIT ?`;
params.push(queryVec, limit);
const rows = await db.getAllAsync<{ id: string; semanticScore: number }>(sql, params);
return rows; // sorted descending by semanticScore (closest distance → highest similarity)
},
async onEmbeddingPersisted(event) {
const { entityId, factId, vector } = event;
if (vector) {
// Upsert into sqlite-vec table
await db.runAsync(
`INSERT OR REPLACE INTO vec_facts (id, entity_id, vec) VALUES (?, ?, ?)`,
[factId, entityId, vector]
);
} else {
// Delete when embedding is removed
await db.runAsync(`DELETE FROM vec_facts WHERE id = ?`, [factId]);
}
},
};
const wikiMemory = new WikiMemory(db, {
llmProvider: { /* ... */ },
vectorRanker: sqliteVecRanker,
});
// read() now uses sqlite-vec for scoring instead of JS cosine
const memory = await wikiMemory.read('user-123', 'my preferences');
Fallback policies
When rankBySimilarity rejects (e.g., ANN service outage, misconfiguration), WikiMemory applies a recovery policy:
export type VectorRankerFallback =
| 'js-cosine' // (default) Score candidates in-process with JS cosine — same as no ranker
| 'keyword' // Skip semantic ranking; return keyword-only results
| 'empty' // Semantic facts list empty for this read; tasks/events still included
| 'throw'; // Reject read() with the ranker error
const wikiMemory = new WikiMemory(db, {
llmProvider: { /* ... */ },
vectorRanker: sqliteVecRanker,
vectorRankerFallback: 'js-cosine', // default
onVectorRankerFallback: (info) => {
console.warn(
`Ranker failed (policy: ${info.policy}); error:`,
info.error
);
},
});
- 'js-cosine' (default): Seamless degradation; same behavior as if no ranker was configured.
- 'keyword': Useful when semantic ranking is optional; keyword search proceeds normally.
- 'empty': Return no facts for this query (but tasks/events still load); useful for strict consistency.
- 'throw': Propagate the error and fail the read.
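The four policies amount to a simple dispatch on the configured value once the ranker has failed. The sketch below restates that dispatch synchronously for clarity. It is not the library's internal code: recoverFromRankerError, ScoredFact, and FallbackSources are hypothetical names, and the type alias is redeclared locally so the example stands alone.

```typescript
// Redeclared locally to keep the sketch self-contained; mirrors the documented union.
type VectorRankerFallback = 'js-cosine' | 'keyword' | 'empty' | 'throw';

interface ScoredFact {
  id: string;
  score: number;
}

// Hypothetical callbacks standing in for the two in-process scoring paths.
interface FallbackSources {
  jsCosine: () => ScoredFact[];
  keyword: () => ScoredFact[];
}

// Chooses what the facts list should contain after the ranker has rejected.
function recoverFromRankerError(
  policy: VectorRankerFallback,
  error: unknown,
  sources: FallbackSources,
): ScoredFact[] {
  switch (policy) {
    case 'js-cosine':
      return sources.jsCosine(); // same result as having no ranker
    case 'keyword':
      return sources.keyword(); // skip semantic ranking
    case 'empty':
      return []; // no facts; tasks/events are loaded elsewhere
    case 'throw':
      throw error; // fail the read
  }
}
```

Using lazy callbacks keeps the more expensive in-process cosine scan from running unless the chosen policy actually needs it.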
onEmbeddingPersisted eventual consistency
If vectorRanker.onEmbeddingPersisted returns a pending Promise, the hook may resolve asynchronously. This supports ANN indexes that rebuild on a schedule (e.g., sqlite-vec triggers on transaction commit) or external services with eventual consistency.
Best practice:
- If your adapter has synchronous guarantees (in-process sqlite-vec, same transaction), await the promise.
- If your adapter is eventually consistent (remote ANN, async rebuild), document the expected lag and note that queries may miss recently added facts until the index refreshes.
- The SQLite blob remains the source of truth; WikiMemory always writes embeddings to embedding_blob before calling the hook.
Hybrid scoring with ranker
When both vectorRanker and hybridWeight are configured, WikiMemory still applies hybrid blending after the ranker returns scores:
const wikiMemory = new WikiMemory(db, {
config: {
hybridWeight: 0.7, // 70% semantic, 30% keyword
},
vectorRanker: sqliteVecRanker,
});
// ranker returns semanticScore; WikiMemory blends with MiniSearch keyword score
const memory = await wikiMemory.read('user-123', 'my preferences', {
hybridWeight: 0.5, // per-call override to 50/50 blend
});
Note on semantics:
- Leave hybridWeight undefined for true pure-semantic cosine-range scoring.
- Set hybridWeight: 1 for an all-semantic variant that clamps negative semantic scores to 0.
For details on hybrid scoring formulas and trade-offs, see Retrieval Tuning above.
Spec and issue reference
- Full spec: docs/superpowers/specs/2026-05-07-pluggable-vector-retrieval.md
- GitHub issue: #15
Vector Cache
Parsed embedding vectors from full-scan read() calls are cached in memory, keyed by entity ID (max 16 entities, max 500 vectors per entity). This avoids redundant Float32Array parsing on repeated queries for the same entity. When the 16-entity limit is reached, the oldest-inserted entity is evicted to make room; if an entity exceeds 500 facts, its vectors are not cached at all for that read.
After heavy read workloads or on memory-constrained runtimes, you can release the entire cache explicitly:
// Release all cached embedding vectors
wikiMemory.clearVectorCache();
The cache is also automatically invalidated on any mutation (runLibrarian, runHeal, runPrune, runReembed, ingestDocument, importDump, forget).
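The eviction and size rules described in this section can be modeled with an insertion-ordered Map. The class below is a sketch of those rules only; VectorCacheSketch is a hypothetical name, and the library's real cache may track additional state or invalidate differently.

```typescript
// Hypothetical model of the documented cache limits; not the package's implementation.
class VectorCacheSketch {
  private readonly entries = new Map<string, Float32Array[]>();
  private readonly maxEntities: number;
  private readonly maxVectorsPerEntity: number;

  constructor(maxEntities = 16, maxVectorsPerEntity = 500) {
    this.maxEntities = maxEntities;
    this.maxVectorsPerEntity = maxVectorsPerEntity;
  }

  // Returns false when the entity is too large to cache for this read.
  set(entityId: string, vectors: Float32Array[]): boolean {
    if (vectors.length > this.maxVectorsPerEntity) return false;
    if (!this.entries.has(entityId) && this.entries.size >= this.maxEntities) {
      // Map iterates in insertion order, so the first key is the oldest-inserted entity.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(entityId, vectors);
    return true;
  }

  get(entityId: string): Float32Array[] | undefined {
    return this.entries.get(entityId);
  }

  // Analogous to clearVectorCache(): drop everything.
  clear(): void {
    this.entries.clear();
  }
}
```

Note that this is oldest-inserted eviction rather than least-recently-used: reading an entity does not refresh its position.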
Entity Status
WikiMemory exposes the in-flight job state for a single entity through two complementary APIs.
getEntityStatus(entityId)
Synchronous point-in-time snapshot:
const status = wikiMemory.getEntityStatus('user-42');
// { ingesting: boolean, librarian: boolean, heal: boolean }
Use this when you only need the current value (e.g. inside a request handler).
subscribeEntityStatus(entityId, callback)
Push-based change notification — the callback fires synchronously once with the current status, then again on every transition where any of the three booleans flips. There is no polling and no duplicate snapshots.
const unsubscribe = wikiMemory.subscribeEntityStatus('user-42', (status) => {
console.log(status); // { ingesting, librarian, heal }
});
// Later:
unsubscribe(); // idempotent — safe to call more than once
Notes:
- The first invocation happens before subscribeEntityStatus returns. Treat it as the initial render value.
- Each emission may be a fresh object literal. Do not rely on referential equality between callbacks; equality of the three booleans is the contract.
- A throwing callback is caught (logged via console.error) and does not block other subscribers or the underlying job.
- Subscriptions are scoped to a single entityId. There is no wildcard or "all entities" form.
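The subscription contract in these notes (synchronous first call, no duplicate snapshots, isolated callback failures, idempotent unsubscribe) can be modeled for a single entity as below. This is a sketch of the contract and not the library's internals. EntityStatusEmitterSketch is a hypothetical name, and it records caught errors in an array where the library reportedly logs them via console.error.

```typescript
interface EntityStatus {
  ingesting: boolean;
  librarian: boolean;
  heal: boolean;
}

type StatusListener = (status: EntityStatus) => void;

// Hypothetical single-entity model of the subscribeEntityStatus contract.
class EntityStatusEmitterSketch {
  private status: EntityStatus = { ingesting: false, librarian: false, heal: false };
  private readonly listeners = new Set<StatusListener>();
  readonly caughtErrors: unknown[] = [];

  subscribe(listener: StatusListener): () => void {
    this.listeners.add(listener);
    this.safeCall(listener); // initial snapshot fires before subscribe() returns
    return () => {
      this.listeners.delete(listener); // Set.delete is a no-op on repeat calls
    };
  }

  update(patch: Partial<EntityStatus>): void {
    const next: EntityStatus = { ...this.status, ...patch };
    const changed =
      next.ingesting !== this.status.ingesting ||
      next.librarian !== this.status.librarian ||
      next.heal !== this.status.heal;
    if (!changed) return; // no duplicate snapshots
    this.status = next;
    this.listeners.forEach((listener) => this.safeCall(listener));
  }

  private safeCall(listener: StatusListener): void {
    try {
      listener({ ...this.status }); // fresh object per emission
    } catch (err) {
      this.caughtErrors.push(err); // a throwing callback must not block others
    }
  }
}
```

Because each emission is a fresh object, consumers such as UI state hooks should compare the three booleans rather than object identity.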
Per-Entity Seeded Ontology
Control how librarian and ingest passes classify facts and extract graph relationships. The system defaults to off so existing deployments behave unchanged.
The Three Modes
| Mode | Behavior |
|---|---|
| off (default) | No ontology guidance. LLM output and persistence match pre-ontology behavior: okf_type stays null on LLM-created facts; maintenance passes do not create edges. OKF import still populates okf_type and edges independently. |
| strict | The LLM must use only node_types and edge_types from the entity manifest. Invalid okf_type falls back to an untyped fact with no edges; invalid individual edges are dropped while a valid okf_type and matching edges are kept. |
| emergent | Same validation as Strict, plus the LLM may return ontology_updates with new node/edge types. Updates are append-only (deduped by type string) and take effect before facts from the same response are validated. |
Mode resolution per entity: persisted DB row mode (when present) → seedManifests[entityId].mode (when no row but a seed exists) → WikiConfig.ontology.mode → 'off'.
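The resolution chain reads naturally as a nullish-coalescing expression. The sketch below assumes each source is simply absent (undefined) when it does not apply; resolveOntologyMode and ModeInputs are hypothetical names, not package exports.

```typescript
type OntologyMode = 'off' | 'strict' | 'emergent';

interface ModeInputs {
  persistedMode?: OntologyMode; // mode on the entity's DB row, when a row exists
  seedMode?: OntologyMode; // seedManifests[entityId].mode, when no row but a seed exists
  globalMode?: OntologyMode; // WikiConfig.ontology.mode
}

// First defined source wins; 'off' is the final default.
function resolveOntologyMode({ persistedMode, seedMode, globalMode }: ModeInputs): OntologyMode {
  return persistedMode ?? seedMode ?? globalMode ?? 'off';
}

const resolved = resolveOntologyMode({ seedMode: 'emergent', globalMode: 'strict' });
```

In this example the seed's 'emergent' wins over the global 'strict'; a persisted row, once written, would take precedence over both.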
WikiConfig
Set a global default mode and bootstrap manifests for known entities at construction time:
const wikiMemory = new WikiMemory(db, {
llmProvider,
config: {
ontology: {
mode: 'strict', // global default when an entity has no per-entity override
seedManifests: {
'team-alpha': {
mode: 'emergent', // optional per-entity override
manifest: {
node_types: [
{ type: 'person', description: 'An individual or user.' },
{ type: 'project', description: 'An ongoing initiative.' },
],
edge_types: [
{
type: 'contributes_to',
source_type: 'person',
target_type: 'project',
description: 'Person working on a project.',
},
],
},
},
},
},
},
});
seedManifests entries are written to SQLite on first access when no row exists for that entity.
Public API
Read or seed an entity's ontology at runtime:
// Read effective mode + manifest (DB row, then seedManifests fallback)
const ontology = await wikiMemory.getOntologyManifest('team-alpha');
// { mode: 'emergent', manifest: { node_types: [...], edge_types: [...] } }
// null when no row and no seed entry
// Seed or replace manifest; optional per-entity mode override
await wikiMemory.setOntologyManifest('team-alpha', {
node_types: [{ type: 'person', description: 'An individual.' }],
edge_types: [{
type: 'reports_to',
source_type: 'person',
target_type: 'person',
description: 'Reporting hierarchy.',
}],
}, { mode: 'strict' });
Fact Shape Extensions
In Strict and Emergent modes, librarian and ingest JSON may include typed facts with inline edges:
{
"facts": [{
"title": "Jane reports to Bob",
"body": "Jane reports to Bob Smith.",
"tags": [],
"confidence": "certain",
"okf_type": "person",
"edges": [{ "edge_type": "reports_to", "target_title": "Bob Smith" }]
}]
}
- okf_type maps to a node_types[].type entry (case-insensitive lookup; canonical manifest casing is persisted).
- edges are resolved by target_title within the same maintenance transaction and persisted via EdgeRepository.
- Invalid okf_type falls back to null with no edges for that fact. Invalid individual edges are dropped; valid okf_type and matching edges are still persisted.
See the design spec: docs/superpowers/specs/2026-06-23-per-entity-seeded-ontology-design.md.
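The validation rules for typed facts can be sketched as a pure function over a manifest. This is an interpretation, not the library's validator: validateTypedFact and the interfaces are hypothetical, and "matching edges" is assumed here to mean that the edge type exists in the manifest and its source_type equals the fact's resolved okf_type. Target resolution by target_title is out of scope for this sketch.

```typescript
interface NodeType { type: string; description: string }
interface EdgeType { type: string; source_type: string; target_type: string; description: string }
interface Manifest { node_types: NodeType[]; edge_types: EdgeType[] }
interface LlmEdge { edge_type: string; target_title: string }
interface LlmFact { title: string; okf_type?: string; edges?: LlmEdge[] }

// Hypothetical validator: returns the okf_type and edges that would be persisted.
function validateTypedFact(
  fact: LlmFact,
  manifest: Manifest,
): { okf_type: string | null; edges: LlmEdge[] } {
  const wanted = (fact.okf_type ?? '').toLowerCase();
  // Case-insensitive lookup; the manifest's canonical casing is what gets kept.
  const node = manifest.node_types.find((n) => n.type.toLowerCase() === wanted);
  if (!node) return { okf_type: null, edges: [] }; // invalid type: untyped fact, no edges
  const edges = (fact.edges ?? []).filter((e) =>
    manifest.edge_types.some(
      (t) =>
        t.type.toLowerCase() === e.edge_type.toLowerCase() &&
        t.source_type.toLowerCase() === node.type.toLowerCase(),
    ),
  );
  return { okf_type: node.type, edges };
}
```

A fact whose okf_type is valid keeps its type even when some of its edges are dropped, which matches the "invalid individual edges are dropped" rule above.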
OKF Import/Export
The core package integrates with @equationalapplications/core-okf to seamlessly adapt wiki data dumps to and from Open Knowledge Format (OKF) v0.1 bundles.
Exporting an OKF Bundle
Convert an existing wiki dump into a flat array of OKF files, ready to be written to disk or zipped:
import { formatOkfBundle } from '@equationalapplications/core-llm-wiki';
const dump = await wiki.exportDump(['entity-123']);
const { files } = formatOkfBundle(dump);
// files: Array<{ path: string; content: string }>
// e.g., [{ path: 'entities/entity-123/facts/fact_abc.md', content: '---\n...' }]
Importing an OKF Bundle
Parse raw OKF files back into a MemoryDump that the wiki can ingest:
import { parseOkfBundle } from '@equationalapplications/core-llm-wiki';
// Assuming you read OKF files for this entity (e.g. under `entities/entity-123/`) from disk/zip into OkfFile[] shape
const dump = parseOkfBundle('entity-123', files, {
defaultSchema: 'fact',
typeMapping: {
'custom_type': 'fact',
'archived': 'ignore', // Skips these concepts
},
});
await wiki.importDump(dump, { merge: true });
Routing Precedence: Concepts are routed into either the entries (facts) or tasks tables based on a three-step fallback:
1. OkfImportOptions.typeMapping explicitly mapping an OKF type to 'fact', 'task', or 'ignore'.
2. Directory convention (e.g., files in /facts/ become facts, /tasks/ become tasks).
3. The OkfImportOptions.defaultSchema (defaults to 'fact').
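The three-step precedence can be expressed as a short routing function. This is a sketch of the documented order and not the parser's actual code: routeConcept and OkfImportOptionsSketch are hypothetical names, and matching directories with a plain substring test is an assumption.

```typescript
type Route = 'fact' | 'task' | 'ignore';

// Local stand-in for the relevant OkfImportOptions fields.
interface OkfImportOptionsSketch {
  typeMapping?: Record<string, Route>;
  defaultSchema?: 'fact' | 'task';
}

function routeConcept(
  okfType: string | undefined,
  path: string,
  options: OkfImportOptionsSketch = {},
): Route {
  // 1. An explicit typeMapping entry wins.
  const mapped = okfType !== undefined ? options.typeMapping?.[okfType] : undefined;
  if (mapped !== undefined) return mapped;
  // 2. Directory convention.
  if (path.indexOf('/facts/') !== -1) return 'fact';
  if (path.indexOf('/tasks/') !== -1) return 'task';
  // 3. Configured default, falling back to 'fact'.
  return options.defaultSchema ?? 'fact';
}

const archivedRoute = routeConcept('archived', 'entities/entity-123/facts/fact_abc.md', {
  typeMapping: { archived: 'ignore' },
});
```

In the example, the explicit mapping to 'ignore' overrides the /facts/ directory convention, so the concept is skipped.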
WikiEdge and Markdown Links
A WikiEdge represents a markdown cross-link found inside a concept body, resolved to a source_id and target_id.
Edges automatically round-trip during OKF import and export. Because the markdown body is the source of truth for edges in the OKF spec, edges are extracted during parseOkfBundle() and persisted via the EdgeRepository on importDump() — there is no separate edge export step required. The edges array is included in bundles returned by getMemoryBundle() / exportDump() (not by read()).
The okf_type Field
Facts and tasks include a nullable okf_type column. This preserves the literal OKF type string from an imported bundle frontmatter, independent of whether the item was routed to the entries or tasks table. When formatOkfBundle runs, it restores this specific string, falling back to 'fact' or 'task' if the field is null (ensuring non-imported rows export cleanly).
Security
@equationalapplications/core-llm-wiki enforces multiple security layers:
VectorRanker Adapter Security
If implementing a custom VectorRanker:
- SQL Injection: ALWAYS use parameterized queries for entityId, factId, candidateIds. Never concatenate into SQL strings.
- Entity Isolation: Filter by entityId in all queries to prevent cross-tenant data leaks.
- Credential Scrubbing: Strip API keys, tokens, connection strings from thrown errors before surfacing to host.
- Resource Limits: Cap limit and candidateIds.length to prevent DoS. Do NOT retain vector references beyond the callback scope; doing so blocks garbage collection.
See SECURITY.md for complete adapter security guidance and code examples.
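As one way to combine the parameterization, entity-isolation, and resource-limit points, an adapter might build its query text and bind parameters in a single helper. The sketch below is illustrative only: buildScopedQuery, MAX_LIMIT, and MAX_CANDIDATES are hypothetical, the cap values are arbitrary, and the vec_facts table name is borrowed from the adapter example earlier in this README.

```typescript
// Assumed caps for this sketch; tune per deployment.
const MAX_LIMIT = 200;
const MAX_CANDIDATES = 1000;

type BindParam = string | number;

// Builds SQL with placeholders only; user-supplied values travel in params.
function buildScopedQuery(
  entityId: string,
  candidateIds: string[] | undefined,
  limit: number,
): { sql: string; params: BindParam[] } {
  if (candidateIds && candidateIds.length > MAX_CANDIDATES) {
    throw new Error('Too many candidate ids');
  }
  const safeLimit = Number.isFinite(limit)
    ? Math.max(1, Math.min(Math.floor(limit), MAX_LIMIT))
    : 1;

  // Entity isolation: every query is scoped by entity_id.
  let sql = 'SELECT id FROM vec_facts WHERE entity_id = ?';
  const params: BindParam[] = [entityId];

  if (candidateIds) {
    if (candidateIds.length === 0) {
      sql += ' AND 1 = 0'; // empty pre-filter: match nothing
    } else {
      sql += ` AND id IN (${candidateIds.map(() => '?').join(', ')})`;
      params.push(...candidateIds);
    }
  }
  sql += ' LIMIT ?';
  params.push(safeLimit);
  return { sql, params };
}
```

Only the number of placeholders depends on the input; the ids themselves never appear in the SQL text.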
Host Application Security
When using VectorRanker:
- Error Sanitization: sanitizeRankerErrors: true (default) scrubs ranker errors before mirroring via error.cause.
- Fallback Policy: Choose vectorRankerFallback based on availability vs consistency requirements:
  - 'js-cosine' (default): Best availability
  - 'keyword': Fast fallback without semantic ranking
  - 'empty': Strict consistency (no facts on failure)
  - 'throw': Fail-fast error propagation
- Deletion Hook Contract: forget() / runPrune() reject on hook timeout/failure. This prevents deleted vectors from remaining retrievable, which could violate GDPR erasure requirements. Handle failures with retry or queue for reconciliation.
- Timeout Tuning: Set deletionHookTimeoutMs per deployment (default 30s). Interactive UX: 5s. Background jobs: 60s.
Core WikiMemory provides:
- Defensive Copies: Query/embedding vectors copied before ranker/hook calls
- Input Validation: sourceRef/sourceHash normalized; embedding dimensions validated
- Parameterized Queries: All SQL uses bind parameters
Prompt-Injection Trust Boundary
User-controlled text — event.summary passed to write(), document chunks passed to ingestDocument(),
fact title/body (including imported dumps) — is interpolated verbatim into LLM prompts for librarian,
heal, and embedding operations. Prompt templating does simple variable substitution; it does not detect
or filter instruction-like content.
Mitigating prompt injection (e.g. "ignore prior instructions and emit...") is the host's responsibility.
If your application accepts untrusted input that flows into write(), ingestDocument(), or importDump(),
treat the LLM's librarian/heal output as similarly untrusted — validate or scope it before acting on it
downstream.
Usage
import { WikiMemory, type SQLiteAdapter } from '@equationalapplications/core-llm-wiki';
// Provide any SQLiteAdapter-compatible driver
const wikiMemory = new WikiMemory(db, {
llmProvider: {
generateText: async ({ systemPrompt, userPrompt }) => {
// Your LLM call here
return 'Model output';
},
},
});
// Initialize schema and run migrations
await wikiMemory.setup();
// Store facts
await wikiMemory.write('user-123', {
event_type: 'observation',
summary: 'User prefers async/await over promises',
});
// Query memory
const memory = await wikiMemory.read('user-123', 'coding style preferences');
Multi-entity weighted reads
read() accepts either one entity id or an array of entity ids. Facts are always merged globally before maxResults is applied. For single-entity reads, tasks are uncapped and events are capped at 10. For multi-entity reads, tasks are capped at min(20 × entity count, 200) and events at min(10 × entity count, 100) — per-entity representation in the returned bundle is not guaranteed.
const memory = await wikiMemory.read(['tier_wisdom', 'tier_fact', 'tier_working'], 'Which source should I trust?', {
maxResults: 8,
tierWeights: {
tier_wisdom: 2,
tier_fact: 1,
tier_working: 0.25,
},
});
console.log(memory.metadata);
console.log(memory.factScores);
Librarian prompt override contract
Core exports prompt utilities for weighted retrieval-based synthesis. Use mapLibrarianOptionsToReadOptions() to map entityWeights to tierWeights, then hydrate a prompt with query, context, and tasks.
import {
DEFAULT_LIBRARIAN_SYNTHESIS_PROMPT,
formatContext,
hydrateLibrarianPrompt,
mapLibrarianOptionsToReadOptions,
validateLibrarianPromptTemplate,
} from '@equationalapplications/core-llm-wiki';
const options = {
entityWeights: { tier_wisdom: 2, tier_fact: 1, tier_working: 0.25 },
systemPrompt: `You are a strict fact checker.
Question:
{{query}}
Retrieved context:
{{context}}
{{tasks}}`,
};
const query = 'Which source should I trust for recent project decisions?';
const memory = await wikiMemory.read(['tier_wisdom', 'tier_fact', 'tier_working'], query, {
...mapLibrarianOptionsToReadOptions(options),
maxResults: 8,
});
const template = options.systemPrompt ?? DEFAULT_LIBRARIAN_SYNTHESIS_PROMPT;
const warnings = validateLibrarianPromptTemplate(template, {
custom: options.systemPrompt != null,
taskCount: memory.tasks.length,
});
for (const warning of warnings) console.warn(warning);
const finalPrompt = hydrateLibrarianPrompt(template, {
query,
context: formatContext(memory, { includeEntityIds: true, includeFactScores: true }),
tasks: formatContext({ facts: [], tasks: memory.tasks, events: [] }, { format: 'plain' }),
});
Platform Random Source
Wiki record IDs must be cryptographically random. The core engine resolves a random source in this order:
1. crypto.randomUUID() (Web / Node 19+)
2. crypto.getRandomValues() (Web / Node / polyfilled global)
3. A source injected via configureRandomSource() (e.g. expo-crypto on Hermes/React Native)
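The resolution order can be sketched as a factory that picks the first available source. This is an illustration, not the engine's code: makeIdGenerator, formatUuidV4, and CryptoLike are hypothetical names, the global crypto object is passed in as a parameter so the example stays testable, and formatting the bytes as a version 4 UUID is an assumption about the ID format.

```typescript
type GetRandomValues = (array: Uint8Array) => Uint8Array;

// Minimal shape of the parts of the crypto global this sketch looks at.
interface CryptoLike {
  randomUUID?: () => string;
  getRandomValues?: GetRandomValues;
}

// Formats 16 random bytes as a version 4 UUID string.
function formatUuidV4(bytes: Uint8Array): string {
  const b = Uint8Array.from(bytes); // copy so the caller's buffer is untouched
  b[6] = (b[6] & 0x0f) | 0x40; // version 4
  b[8] = (b[8] & 0x3f) | 0x80; // RFC 4122 variant
  const hex = Array.from(b, (x) => (x < 16 ? '0' : '') + x.toString(16));
  return [hex.slice(0, 4), hex.slice(4, 6), hex.slice(6, 8), hex.slice(8, 10), hex.slice(10, 16)]
    .map((part) => part.join(''))
    .join('-');
}

function makeIdGenerator(
  globalCrypto: CryptoLike | undefined,
  injected?: GetRandomValues,
): () => string {
  // 1. Prefer crypto.randomUUID(), called as a method to keep its receiver.
  if (globalCrypto && typeof globalCrypto.randomUUID === 'function') {
    const randomUUID = globalCrypto.randomUUID;
    return () => randomUUID.call(globalCrypto);
  }
  // 2. Then crypto.getRandomValues(); 3. then the injected source.
  let source: GetRandomValues | undefined = injected;
  if (globalCrypto && typeof globalCrypto.getRandomValues === 'function') {
    const getRandomValues = globalCrypto.getRandomValues;
    source = (arr) => getRandomValues.call(globalCrypto, arr);
  }
  if (!source) throw new Error('No cryptographically secure random source available');
  const resolved = source;
  return () => formatUuidV4(resolved(new Uint8Array(16)));
}
```

Failing loudly when no source is available, instead of falling back to Math.random(), keeps the cryptographic-randomness requirement intact.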
Web and Node are unchanged — global crypto wins when present. React Native / Hermes typically has no crypto global; use a platform package or inject your own implementation:
import { configureRandomSource } from '@equationalapplications/core-llm-wiki';
import { getRandomValues } from 'expo-crypto';
// Call once at module load, before any wiki writes
configureRandomSource(getRandomValues);
@equationalapplications/expo-llm-wiki does this automatically on import (main entry and /factory subpath). If you use @equationalapplications/core-llm-wiki directly on React Native without the expo package, you must call configureRandomSource() yourself or polyfill globalThis.crypto.getRandomValues.
Adapter Interface
Implement SQLiteAdapter to use your platform's SQLite driver:
export interface SQLiteAdapter {
execAsync(sql: string): Promise<void>;
runAsync(sql: string, params?: unknown[]): Promise<{ changes: number; lastInsertRowId: number }>;
getAllAsync<T>(sql: string, params?: unknown[]): Promise<T[]>;
getFirstAsync<T>(sql: string, params?: unknown[]): Promise<T | null>;
withTransactionAsync<T>(fn: () => Promise<T>): Promise<T>;
closeAsync(): Promise<void>;
}
@equationalapplications/expo-llm-wiki provides a pre-built adapter for Expo/React Native. For web and Node.js, implement the interface yourself — examples below.
Browser (sql.js):
import initSqlJs from 'sql.js';
import type { SQLiteAdapter } from '@equationalapplications/core-llm-wiki';
const SQL = await initSqlJs({ locateFile: (f) => `/wasm/${f}` });
const sqlDb = new SQL.Database();
const adapter: SQLiteAdapter = {
async execAsync(sql) { sqlDb.run(sql); },
async runAsync(sql, params = []) {
sqlDb.run(sql, params as any[]);
// sql.js doesn't expose lastInsertRowId; hardcode 0 since WikiMemory uses internal ID generation
return { changes: sqlDb.getRowsModified(), lastInsertRowId: 0 };
},
async getAllAsync<T>(sql, params = []) {
const stmt = sqlDb.prepare(sql);
stmt.bind(params as any[]);
const rows: T[] = [];
while (stmt.step()) rows.push(stmt.getAsObject() as T);
stmt.free();
return rows;
},
async getFirstAsync<T>(sql, params = []) {
const stmt = sqlDb.prepare(sql);
stmt.bind(params as any[]);
const row = stmt.step() ? stmt.getAsObject() as T : null;
stmt.free();
return row;
},
async withTransactionAsync(fn) {
sqlDb.run('BEGIN');
try { const r = await fn(); sqlDb.run('COMMIT'); return r; }
catch (e) { sqlDb.run('ROLLBACK'); throw e; }
},
async closeAsync() { sqlDb.close(); },
};
Node.js (better-sqlite3):
import Database from 'better-sqlite3';
import type { SQLiteAdapter } from '@equationalapplications/core-llm-wiki';
const db = new Database('wiki.db');
const adapter: SQLiteAdapter = {
async execAsync(sql) { db.exec(sql); },
async runAsync(sql, params = []) {
const info = db.prepare(sql).run(...(params as any[]));
return { changes: info.changes, lastInsertRowId: Number(info.lastInsertRowid) };
},
async getAllAsync<T>(sql, params = []) {
return db.prepare(sql).all(...(params as any[])) as T[];
},
async getFirstAsync<T>(sql, params = []) {
return (db.prepare(sql).get(...(params as any[])) ?? null) as T | null;
},
async withTransactionAsync(fn) {
db.exec('BEGIN');
try { const r = await fn(); db.exec('COMMIT'); return r; }
catch (e) { db.exec('ROLLBACK'); throw e; }
},
async closeAsync() { db.close(); },
};
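Both adapters above open a transaction with a plain BEGIN. Because the callback passed to withTransactionAsync is async, two overlapping calls on the same connection could each try to issue BEGIN, which SQLite rejects. Whether WikiMemory ever issues overlapping transactions is not stated here, so treat the following as an optional, hedged sketch: `serializeTransactions` and `TransactionFn` are hypothetical helpers (not part of the package) that queue transactions one after another.

```typescript
// Hypothetical helper: not exported by @equationalapplications/core-llm-wiki.
// Matches the shape of SQLiteAdapter.withTransactionAsync.
type TransactionFn = <T>(fn: () => Promise<T>) => Promise<T>;

// Wraps a transaction runner so calls execute strictly one at a time.
function serializeTransactions(run: TransactionFn): TransactionFn {
  // Tail of the queue; each new transaction waits for the previous one.
  let tail: Promise<unknown> = Promise.resolve();
  return <T>(fn: () => Promise<T>): Promise<T> => {
    const result = tail.then(() => run(fn));
    // Swallow rejections on the queue itself so one failed transaction
    // does not block later ones; the caller still sees the rejection.
    tail = result.catch(() => undefined);
    return result;
  };
}
```

Under these assumptions, usage would look like `adapter.withTransactionAsync = serializeTransactions(adapter.withTransactionAsync)` after the adapter object is built. Nested transactions (a transaction started from inside another transaction's callback) would deadlock with this wrapper, so it only suits callers that never nest.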
How It Works
flowchart TD
A["read(entityId | entityId[], query, options?)"] --> B{hybridWeight = 0?}
B -->|Yes| C["MiniSearch only<br/>(skip embed)"]
B -->|No| D{embed available?}
D -->|No| C
D -->|Yes| F["Embed query"]
F -->|throws| E["onRetrievalFallback<br/>callback"]
E --> C
F -->|succeeds| G{preFilterLimit<br/>active?}
G -->|Yes| H["MiniSearch pre-filter<br/>top K candidates"]
H --> I["Phase 1: Cosine score<br/>top K candidates"]
G -->|No| J["Phase 1: Cosine score<br/>all facts"]
J --> K["Cache vectors<br/>in-memory<br/>(full scan only)"]
K --> L{hybridWeight = 1?}
I --> L
L -->|Yes| M["Pure semantic<br/>ranking"]
L -->|No| N["Hybrid blend:<br/>semantic + keyword<br/>via MiniSearch"]
M --> O["Phase 2: Fetch full rows<br/>top maxResults"]
N --> O
C --> P["MiniSearch ranking"]
P --> O
O --> R["Track access"]
R --> Q["Return MemoryBundle"]
The flowchart shows:
- Fast-path when hybridWeight = 0 (pure keyword, no embed cost)
- Fallback chain when embed unavailable (MiniSearch silently) or throws (onRetrievalFallback callback, then MiniSearch)
- Pre-filtering to limit cosine scoring to top-K keyword matches (O(N) → O(K))
- Two-phase SELECT: phase 1 scores all/filtered facts with minimal columns, phase 2 fetches full rows for winners
- Hybrid scoring to blend semantic and keyword rankings
- Vector caching on full scans only; reads with preFilterLimit active skip cache population
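The scoring steps in the flowchart can be illustrated with a short sketch. The cosine similarity function is standard; the blend formula is an assumption (a linear interpolation controlled by hybridWeight), since the exact formula the engine uses is not given here. `Candidate`, `blendScores`, and `rankCandidates` are hypothetical names, and keyword scores are assumed to be pre-normalized to the range 0 to 1.

```typescript
// Cosine similarity of two equal-length vectors; returns 0 for a zero vector.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('Vector length mismatch');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// ASSUMPTION: linear blend. hybridWeight = 1 is pure semantic ranking,
// hybridWeight = 0 is pure keyword ranking.
function blendScores(semantic: number, keyword: number, hybridWeight: number): number {
  return hybridWeight * semantic + (1 - hybridWeight) * keyword;
}

// Hypothetical candidate shape: an id, its embedding, and a keyword score in [0, 1].
interface Candidate {
  id: string;
  embedding: number[];
  keywordScore: number;
}

// Scores every candidate against the query and returns the top maxResults ids.
function rankCandidates(
  queryEmbedding: number[],
  candidates: Candidate[],
  hybridWeight: number,
  maxResults: number,
): string[] {
  return candidates
    .map((c) => ({
      id: c.id,
      score: blendScores(cosineSimilarity(queryEmbedding, c.embedding), c.keywordScore, hybridWeight),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, maxResults)
    .map((c) => c.id);
}
```

In this sketch, pre-filtering simply means passing a shorter candidates array (the top-K keyword matches) into rankCandidates, which is where the O(N) to O(K) saving in cosine scoring comes from.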
Monorepo Ecosystem
| Package | Purpose |
|---|---|
| @equationalapplications/core-llm-wiki | Persistent episodic memory |
| @equationalapplications/expo-llm-wiki | Persistent episodic memory for Expo/React Native |
| @equationalapplications/react-llm-wiki | Persistent episodic memory for Web |
| @equationalapplications/prisma-outbox | Sync SQLite outbox events to Prisma |
| @equationalapplications/core-llm-tools | Gemini tool schemas and capability injector |
| @equationalapplications/core-okf | Zero-dependency Open Knowledge Format (OKF) v0.1 primitives — parse and produce interoperable knowledge bundles. |
License
MIT
Made by Equational Applications LLC. https://equationalapplications.com/