a2a-memory

Licence: UNLICENSED
Version: 0.12.29
Dependencies: 8
Size: 19.5 MB
Vulnerabilities: 0
Weekly downloads: 0

@a2a/memory - Persistent AI Memory for Claude Code

v0.12.29: #43 search determinism fix + #57/#58 ContextRank Phase 1~3: restored int8 quantization in searchByVector (dequantizeEmbedding is now the single source; resolves ineffective, flaky vector search) + blocked 16 PostToolUse noise patterns (#45) + refine execution (reversible MERGE/ARCHIVE + activeFilter working-set exclusion + a2a-memory refine CLI) + prune_queue L1 approval gate (#58) + removed the SMOKE_MARKER content guard (#43c). (v0.12.27: S27 retrieval recovery + ContextRank Phase 0 shadow ADR-047/048 · v0.12.26: QA v8.0 timestamp normalization + vector_clocks orphan cleanup · v0.12.25: QA dead-weight cleanup + config show masking · v0.12.23: Per-Memory Usage Ledger #40/ADR-046 · earlier features)

Persistent AI memory for Claude Code. Automatically captures, stores, and retrieves knowledge across coding sessions with realtime prompt-level memory injection.

Features

Core Capabilities

  • Realtime Context Injection - Inject relevant memories on every prompt via UserPromptSubmit hook
  • Session Extraction - Extract memories from Claude Code sessions (JSONL parsing)
  • Local-First DB - SQLite with FTS5 full-text search + vector similarity
  • Auto-Capture Hooks - 5 Claude Code hooks for automatic memory lifecycle
  • Hybrid Search - FTS + Vector + Recency ranking (Reciprocal Rank Fusion)
  • Adaptive RAG Router - Query complexity classification (simple/semantic/complex)
  • Cross-Encoder Reranker - Precision re-ranking with ms-marco-MiniLM
  • Lifecycle Management - Quality scoring, TTL-based cleanup, memory tiering (Hot/Warm/Cold), CleanupScheduler
  • Auto Memory Sync - Automatically syncs Claude memory files from the PostToolUse hook
AI & Embeddings
  • E5 Embedding (384D) - Local semantic embedding via e5-small-v2 (ONNX, ~16ms/query)
  • Local TF-IDF (64D) - Lightweight hash-based embedding (fallback)
  • OpenAI Embedding (1536D) - Cloud embedding via OpenAI API
  • LLM Integration - AI-powered extraction and classification (OpenAI, Anthropic)
  • Vector Quantization - Float32 and Int8 scalar quantization for compression
Intelligence
  • 4-Way Dedup (Mem0 Pattern) - ADD/UPDATE/DELETE/NOOP for memory deduplication
  • Skill Crystallization - Repeated patterns auto-crystallize into reusable skills
  • Proficiency Tracking - ACT-R cognitive model for skill level tracking
  • PreCompact Checkpoint - Preserve critical context before context window compression
Team & Sync
  • Server Sync - Synchronize with A2A server via REST API
  • Team Collaboration - Share memories across team members (CRDT vector clocks)
  • E2E Encryption - AES-256-GCM encryption for sensitive content
  • Scheduled Sync - Automatic periodic sync with configurable interval
  • OS Keychain - Secure API key storage (macOS/Linux/Windows)
Developer Experience
  • CLAUDE.md Sync - Sync CLAUDE.md sections to memory DB
  • CLI (19 commands) - Full command-line interface for all operations
  • Logging - JSON Lines hook logs with rotation and level filtering
  • i18n - Internationalization support (Korean, English)
  • Sensitive Info Filter - Auto-redaction of API keys, passwords, tokens (21+ patterns)

Quick Start

# Install globally
npm install -g a2a-memory

# Initialize (creates DB + registers Claude Code hooks)
a2a-memory setup

# Extract memories from existing sessions
a2a-memory extract --limit 20

# Check status
a2a-memory status

# Search memories
a2a-memory search "authentication error"

# Health check
a2a-memory health

CLI Commands

| Command | Description | Options |
| --- | --- | --- |
| a2a-memory setup | Initialize plugin, create DB, register hooks | - |
| a2a-memory status | Show memory count, DB size, category breakdown | - |
| a2a-memory extract | Extract memories from session files | --limit <n>, --project <path> |
| a2a-memory search <query> | Hybrid search across all memories | --limit <n>, --category <cat> |
| a2a-memory list | List memories with category/tier filters | --category, --tier, --limit |
| a2a-memory add | Manually add a memory (interactive) | - |
| a2a-memory edit <id> | Edit an existing memory (interactive) | - |
| a2a-memory rm <id> | Delete a memory | - |
| a2a-memory config | View and modify configuration | get <key>, set <key> <value> |
| a2a-memory sync | Synchronize with A2A server | --push, --pull, --watch, --interval <ms> |
| a2a-memory team | Team memory management | join <teamId>, sync, members |
| a2a-memory embed | Manage embeddings | generate, stats |
| a2a-memory cleanup | Remove low-quality/expired memories | --dry-run |
| a2a-memory health | System health check (DB, config, logs) | --verbose |
| a2a-memory claude-sync | Sync CLAUDE.md to memory DB | --dry-run, --force |
| a2a-memory skill | Manage crystallized skills | list, inspect <id> |
| a2a-memory proficiency list | Show proficiency levels for all skills | - |
| a2a-memory proficiency inspect <id> | Detailed proficiency analysis | - |
| a2a-memory proficiency simulate | Simulate proficiency scenarios | - |

Architecture

1. Claude Code Hooks

After a2a-memory setup, five hooks are registered:

| Hook | Trigger | Action | Performance |
| --- | --- | --- | --- |
| SessionStart | Session begins | Injects relevant memories via hybrid search | ~200ms |
| PostToolUse | After Write/Edit/Bash/Read/Grep/Glob | Auto-captures significant actions + dedup check + Claude memory file sync | ~50ms |
| UserPromptSubmit | Every user prompt | Realtime FTS search + context injection | p50 < 30ms |
| PreCompact | Before context compression | Extracts key decisions/progress/TODOs as checkpoints | ~100ms |
| SessionEnd | Session ends | Extracts memories + team sync + skill evaluation | ~500ms |
2. Memory Extraction

Parses ~/.claude/projects/<project>/<session>.jsonl files and extracts:

| Category | Examples |
| --- | --- |
| error_solution | Error + resolution pairs |
| code_pattern | Repeated tool usage patterns |
| decision | Architectural decisions with reasoning |
| project_knowledge | Project context and rules |
| convention | Development conventions and preferences |
| learning | Learned techniques and insights |
| skill | Crystallized reusable skills |
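
The first step of extraction, reading a session file line by line, could look roughly like the sketch below. The helper name parseJsonl is hypothetical, and because the record schema of Claude Code session files is not described in this README, each record is left typed as unknown. It is a minimal illustration of tolerant JSONL parsing, not the package's extractor.

```typescript
// Hypothetical helper: parses JSONL text, skipping blank and malformed lines.
function parseJsonl(text: string): unknown[] {
  const records: unknown[] = [];
  for (const line of text.split(/\r?\n/)) {
    const trimmed = line.trim();
    if (trimmed === "") continue; // ignore blank lines
    try {
      records.push(JSON.parse(trimmed));
    } catch {
      // A session file may end with a partially written line; skip it.
    }
  }
  return records;
}

const sample = '{"type":"user"}\n\nnot json\n{"type":"assistant"}\n';
console.log(parseJsonl(sample).length);
```

Skipping unparseable lines rather than throwing keeps one damaged record from blocking extraction of the rest of the session.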
3. Hybrid Search with Adaptive RAG

Three-signal Reciprocal Rank Fusion (RRF):

Score = w1 * FTS_rank + w2 * Vector_similarity + w3 * Recency_score
  • FTS: SQLite FTS5 full-text search
  • Vector: E5 (384D), TF-IDF (64D), or OpenAI (1536D) embeddings
  • Recency: Time decay scoring
  • Skill Boost: 1.5x weight for crystallized skill memories
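
As a rough illustration of the fusion step, the sketch below uses the standard weighted RRF form, where each signal contributes weight / (k + rank) based on an item's position in that signal's ranking. The function name fuseRankings, the constant k = 60, the example weights, and applying the skill boost as a final multiplier are all assumptions; the package may combine its signals differently.

```typescript
// Hypothetical sketch of weighted Reciprocal Rank Fusion; not the package's actual code.
type RankedList = { weight: number; ids: string[] }; // ids ordered best-first

function fuseRankings(
  lists: RankedList[],
  skillIds: Set<string> = new Set<string>(),
  k = 60,          // assumed smoothing constant (a common RRF default)
  skillBoost = 1.5 // boost for crystallized skills, from the list above
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const { weight, ids } of lists) {
    ids.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + weight / (k + rank));
    });
  }
  return Array.from(scores, ([id, score]) => ({
    id,
    score: skillIds.has(id) ? score * skillBoost : score,
  })).sort((a, b) => b.score - a.score);
}

const fused = fuseRankings(
  [
    { weight: 0.5, ids: ["m1", "m2", "m3"] }, // FTS ranking
    { weight: 0.3, ids: ["m2", "m1"] },       // vector ranking
    { weight: 0.2, ids: ["m3", "m2"] },       // recency ranking
  ],
  new Set(["m3"]) // m3 is a crystallized skill
);
console.log(fused.map((r) => r.id));
```

Because RRF works on ranks rather than raw scores, the three signals do not need to be normalized to a common scale before they are combined.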

Adaptive Router classifies query complexity:

  • simple (1-2 keywords) → FTS only (< 5ms)
  • semantic (questions, descriptions) → FTS + Vector (< 30ms)
  • complex (code patterns, errors) → FTS + Vector + Reranker (< 100ms)
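
The routing rules above could be approximated with a small heuristic like the one below. The function name classifyQuery, the regular expression, and the strategy strings other than 'fts+vector' are assumptions made for illustration; the package's AdaptiveRouter may rely on different signals.

```typescript
// Hypothetical heuristic; the package's real classifier may use different signals.
type Complexity = "simple" | "semantic" | "complex";

interface Route {
  complexity: Complexity;
  strategy: "fts" | "fts+vector" | "fts+vector+rerank";
}

// Assumed markers of code patterns and errors: error vocabulary and code punctuation.
const CODE_OR_ERROR = /\b(error|exception|traceback|undefined)\b|[{}();]|=>|::/i;

function classifyQuery(query: string): Route {
  const words = query.trim().split(/\s+/).filter(Boolean);
  if (CODE_OR_ERROR.test(query)) {
    return { complexity: "complex", strategy: "fts+vector+rerank" };
  }
  if (words.length <= 2) {
    return { complexity: "simple", strategy: "fts" };
  }
  return { complexity: "semantic", strategy: "fts+vector" };
}

console.log(classifyQuery("how to fix JWT token expiration?"));
```

Checking for code and error markers first means a short query such as a bare exception name still gets the most thorough search path.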

Cross-Encoder Reranker (optional):

  • ms-marco-MiniLM-L-6-v2 for precision re-ranking
  • top-20 → top-5 refinement
4. Memory Deduplication (Mem0 Pattern)

4-Way decision for every new memory:

| Action | Condition | Result |
| --- | --- | --- |
| ADD | No similar memory exists | Create new |
| UPDATE | Similarity > 0.8 (up to 0.95) | Merge with existing |
| DELETE | New info invalidates old | Replace |
| NOOP | Similarity > 0.95 | Skip (near-exact duplicate) |
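
One way to read the table as a decision procedure is sketched below. The Candidate shape, the contradicts flag, and the order in which conditions are checked are assumptions; how the package detects that new information invalidates an old memory is not described here.

```typescript
// Hypothetical decision function based on the thresholds in the table above.
type DedupAction = "ADD" | "UPDATE" | "DELETE" | "NOOP";

interface Candidate {
  id: string;
  similarity: number;   // similarity in [0, 1] between the new content and this memory
  contradicts: boolean; // assumed flag: the new info invalidates this memory
}

function decideDedup(candidates: Candidate[]): { action: DedupAction; targetId?: string } {
  if (candidates.length === 0) return { action: "ADD" };
  const best = candidates.reduce((a, b) => (b.similarity > a.similarity ? b : a));
  // Check the stricter threshold first: anything above 0.95 is also above 0.8.
  if (best.similarity > 0.95) return { action: "NOOP", targetId: best.id };
  const invalidated = candidates.find((c) => c.contradicts);
  if (invalidated) return { action: "DELETE", targetId: invalidated.id };
  if (best.similarity > 0.8) return { action: "UPDATE", targetId: best.id };
  return { action: "ADD" };
}

console.log(decideDedup([{ id: "mem_123", similarity: 0.85, contradicts: false }]));
```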
5. Storage & Indexing
  • SQLite database at ~/.a2a/memory.db
    • FTS5 full-text search index
    • Vector embeddings table (with quantization support)
    • WAL mode for concurrent access
    • Sync status tracking with CRDT vector clocks
  • Memory Tiering: Hot (0-7 days), Warm (8-30 days), Cold (31+ days)
  • Vector Quantization: Float32 (50% reduction), Int8 (75% reduction)
  • E2E Encryption: AES-256-GCM with PBKDF2 key derivation
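
To make the Int8 figure concrete: a 384-dimensional Float32 embedding occupies 1536 bytes, while the same vector stored as Int8 occupies 384 bytes plus one scale factor, which is the 75% reduction listed above. The sketch below shows symmetric scalar quantization under that reading. The names quantizeInt8 and dequantizeInt8 are hypothetical and the package's own scheme may differ.

```typescript
// Hypothetical sketch of symmetric Int8 scalar quantization; not the package's implementation.
interface QuantizedVector {
  data: Int8Array;
  scale: number; // multiply int8 values by this to recover approximate floats
}

function quantizeInt8(vector: Float32Array): QuantizedVector {
  let maxAbs = 0;
  for (let i = 0; i < vector.length; i++) {
    maxAbs = Math.max(maxAbs, Math.abs(vector[i]));
  }
  const scale = maxAbs === 0 ? 1 : maxAbs / 127; // map [-maxAbs, maxAbs] onto [-127, 127]
  const data = new Int8Array(vector.length);
  for (let i = 0; i < vector.length; i++) {
    data[i] = Math.round(vector[i] / scale);
  }
  return { data, scale };
}

function dequantizeInt8(q: QuantizedVector): Float32Array {
  const out = new Float32Array(q.data.length);
  for (let i = 0; i < q.data.length; i++) {
    out[i] = q.data[i] * q.scale;
  }
  return out;
}

const original = Float32Array.from([0.5, -1, 0.25]);
const quantized = quantizeInt8(original);
const restored = dequantizeInt8(quantized);
console.log(original.byteLength, quantized.data.byteLength, restored);
```

Quantization is lossy, so vectors must be dequantized with the same scale they were quantized with before similarity is computed.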

Configuration

Config file: ~/.a2a/config.json

{
  "mode": "local",
  "autoCapture": {
    "enabled": false,
    "triggers": ["Write", "Edit", "Bash", "Read", "Grep", "Glob"],
    "significanceThreshold": 0.6
  },
  "autoInject": {
    "enabled": true,
    "maxMemories": 10,
    "maxTokens": 2000,
    "minQuality": 0.3
  },
  "realtimeInject": {
    "enabled": true,
    "maxMemories": 3,
    "maxTokens": 1000,
    "cacheSize": 20,
    "cacheTTLSeconds": 60,
    "timeoutMs": 100
  },
  "embedding": {
    "enabled": false,
    "provider": "local",
    "dimensions": 64
  },
  "lifecycle": {
    "ttlDays": 90,
    "maxMemories": 1000,
    "cleanupOnSessionEnd": false,
    "qualityThreshold": 0.3
  },
  "skillConversion": {
    "enabled": false,
    "evaluationInterval": 5,
    "minRepetitions": 3,
    "similarityThreshold": 0.85,
    "minConfidence": 0.7
  },
  "proficiency": {
    "enabled": true,
    "levelFormula": "activation",
    "maxLevel": 10
  },
  "autoSync": {
    "enabled": false,
    "pushOnSessionEnd": true,
    "pullOnSessionStart": true,
    "timeoutMs": 30000,
    "intervalMs": 1800000
  },
  "db": {
    "path": "~/.a2a/memory.db",
    "maxSizeMB": 100
  },
  "logging": {
    "enabled": false,
    "level": "info",
    "outputDir": "~/.a2a/logs",
    "maxFileSizeMB": 10,
    "maxFiles": 3
  }
}
Embedding Providers
| Provider | Dimensions | Latency | Install |
| --- | --- | --- | --- |
| local (TF-IDF) | 64 | < 1ms | Built-in |
| e5 (e5-small-v2) | 384 | ~16ms | npm install @huggingface/transformers |
| openai | 1536 | ~100ms | Requires API key |
# Enable E5 embeddings
a2a-memory config set embedding.enabled true
a2a-memory config set embedding.provider e5
a2a-memory config set embedding.dimensions 384

# Generate embeddings for existing memories
a2a-memory embed generate
Server Sync
# Configure server
a2a-memory config set server.url https://your-a2a-server.com
a2a-memory config set server.apiKey your-api-key

# One-time sync
a2a-memory sync --push    # local -> remote
a2a-memory sync --pull    # remote -> local

# Continuous sync (every 30 min)
a2a-memory sync --watch
a2a-memory sync --watch --interval 60000  # every 60s
Team Mode
# Join a team
a2a-memory team join my-team

# Sync team memories
a2a-memory team sync

# List team members
a2a-memory team members
CLAUDE.md Sync
# Sync CLAUDE.md sections to memory DB
a2a-memory claude-sync

# Preview without writing
a2a-memory claude-sync --dry-run

# Force re-sync
a2a-memory claude-sync --force

Programmatic API

import {
  // Core
  MemoryDatabase,
  ConfigManager,
  extractMemories,

  // Search
  HybridRanker,
  AdaptiveRouter,
  CrossEncoderReranker,
  createEmbeddingProvider,
  E5EmbeddingProvider,

  // Extraction
  DedupManager,

  // Sync
  A2AClient,
  MemorySynchronizer,
  TeamSynchronizer,
  SyncScheduler,

  // Lifecycle
  cleanupMemories,
  rebalanceTiers,

  // Encryption
  encryptContent,
  decryptContent,

  // Proficiency
  ACTREngine,
  ProficiencyTracker,

  // Claude.md
  syncClaudeMd,
} from 'a2a-memory';

// Database operations
const db = new MemoryDatabase('~/.a2a/memory.db');
db.initialize();

db.createMemory({
  content: 'JWT tokens expire after 30 minutes',
  category: 'convention',
  tier: 'semantic',
  tags: ['auth', 'jwt'],
});

// Hybrid search with E5 embeddings
const provider = createEmbeddingProvider({ provider: 'e5', dimensions: 384, enabled: true });
const ranker = new HybridRanker(db, provider);
const ranked = await ranker.search('auth error', { limit: 5 });

// Adaptive RAG routing
const router = new AdaptiveRouter();
const route = router.classify('how to fix JWT token expiration?');
// → { complexity: 'semantic', strategy: 'fts+vector' }

// Memory deduplication
// (newContent and existingMemories are placeholders supplied by the caller)
const dedup = new DedupManager(db);
const decision = await dedup.decide(newContent, existingMemories);
// → { action: 'UPDATE', targetId: 'mem_123', reason: '...' }

// Server sync
const client = new A2AClient({
  serverUrl: 'https://a2a-api-production-8d17.up.railway.app',
  apiKey: 'your-api-key',
});
const sync = new MemorySynchronizer(db, client);
await sync.push();
await sync.pull();

// Anthropic Memory API adapter
import { toAnthropicFormat, fromAnthropicFormat } from 'a2a-memory';

const memory = db.getMemory('mem_123');
const anthropicFormat = toAnthropicFormat(memory);

Security

Sensitive Information Protection

Automatically filters 21+ patterns including:

  • Cloud provider keys (AWS, Google, Azure)
  • API keys and secrets (OpenAI, Stripe, GitHub)
  • Database connection strings
  • Private keys and certificates (RSA, EC, DSA, OPENSSH)
  • JWT tokens and session cookies
  • Service tokens (Slack, Discord, npm, PyPI)
  • Personally identifiable information (Korean resident registration numbers, business registration numbers)
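
Pattern-based redaction of this kind can be sketched as follows. The four patterns are illustrative examples based on well-known public formats. They are not the package's actual pattern set, which this README does not list, and the name redactSecrets is hypothetical.

```typescript
// Illustrative patterns only; the package's actual 21+ patterns are not shown in this README.
const REDACTION_PATTERNS: Array<{ label: string; pattern: RegExp }> = [
  { label: "AWS_ACCESS_KEY_ID", pattern: /\bAKIA[0-9A-Z]{16}\b/g },
  { label: "GITHUB_TOKEN", pattern: /\bghp_[A-Za-z0-9]{36}\b/g },
  {
    label: "PRIVATE_KEY",
    pattern:
      /-----BEGIN (?:RSA |EC |DSA |OPENSSH )?PRIVATE KEY-----[\s\S]*?-----END (?:RSA |EC |DSA |OPENSSH )?PRIVATE KEY-----/g,
  },
  {
    label: "DB_CONNECTION_STRING",
    pattern: /\b(?:postgres(?:ql)?|mysql|mongodb(?:\+srv)?):\/\/[^\s"']+/g,
  },
];

// Replaces every match of every pattern with a labeled placeholder.
function redactSecrets(text: string): string {
  return REDACTION_PATTERNS.reduce(
    (acc, { label, pattern }) => acc.replace(pattern, `[REDACTED:${label}]`),
    text
  );
}

console.log(redactSecrets("url: postgres://user:pw@localhost/app end"));
```

Labeled placeholders keep a stored memory readable, since it still records that a credential was present, without retaining the secret itself.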
Encryption

E2E Encryption (opt-in via config):

a2a-memory config set autoSync.encryption.enabled true
  • Algorithm: AES-256-GCM
  • Key derivation: PBKDF2-HMAC-SHA256 (600,000 iterations)
  • Storage: OS keychain integration (macOS, Linux, Windows)
  • Scope: Content + embeddings encrypted before transmission
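
For readers unfamiliar with this combination, the sketch below shows AES-256-GCM with a PBKDF2-HMAC-SHA256 derived key using Node's built-in crypto module. The function names, the salt and nonce lengths, and the salt | iv | tag | ciphertext payload layout are assumptions for illustration. The package's own encryptContent and decryptContent may use a different format.

```typescript
import { createCipheriv, createDecipheriv, pbkdf2Sync, randomBytes } from "node:crypto";

const ITERATIONS = 600000; // iteration count from the list above
const KEY_BYTES = 32;      // AES-256 key size
const SALT_BYTES = 16;     // assumed salt length
const IV_BYTES = 12;       // 96-bit nonce, the usual choice for GCM
const TAG_BYTES = 16;      // GCM authentication tag length

// Hypothetical helper: returns salt | iv | tag | ciphertext as one buffer.
function encryptSketch(plaintext: string, passphrase: string): Buffer {
  const salt = randomBytes(SALT_BYTES);
  const iv = randomBytes(IV_BYTES);
  const key = pbkdf2Sync(passphrase, salt, ITERATIONS, KEY_BYTES, "sha256");
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return Buffer.concat([salt, iv, cipher.getAuthTag(), ciphertext]);
}

// Hypothetical helper: throws if the passphrase is wrong or the payload was modified.
function decryptSketch(payload: Buffer, passphrase: string): string {
  const salt = payload.subarray(0, SALT_BYTES);
  const iv = payload.subarray(SALT_BYTES, SALT_BYTES + IV_BYTES);
  const tag = payload.subarray(SALT_BYTES + IV_BYTES, SALT_BYTES + IV_BYTES + TAG_BYTES);
  const ciphertext = payload.subarray(SALT_BYTES + IV_BYTES + TAG_BYTES);
  const key = pbkdf2Sync(passphrase, salt, ITERATIONS, KEY_BYTES, "sha256");
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}

const sealed = encryptSketch("JWT tokens expire after 30 minutes", "correct horse");
console.log(decryptSketch(sealed, "correct horse"));
```

GCM is an authenticated mode, so decryption fails outright on a wrong key or a modified payload instead of returning corrupted plaintext.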
API Key Storage

Uses OS-native secure storage:

  • macOS: Keychain (security command)
  • Linux: Secret Service API (secret-tool)
  • Windows: Credential Manager (via node-keytar)

Fallback: Encrypted file storage at ~/.a2a/credentials.enc

Requirements

  • Node.js >= 18.0.0
  • Claude Code (for hooks integration)
  • @huggingface/transformers (optional, for E5 embeddings)

License

MIT
