knowledge-base-indexer v1.1.0
Knowledge Base Indexer
A tool for indexing, searching, and extracting content from local knowledge bases, designed for use alongside LLMs.
Features
- File Structure Indexing: Recursively scan directories and build a searchable index of files
- Fuzzy Search: Find relevant files using fuzzy search algorithms (similar to fzf)
- Content Extraction: Extract and process content from various file formats
- LLM Integration: Designed to work with LLMs for knowledge extraction and question answering
- CLI Tools: Command-line tools for indexing, searching, and content extraction
- Performance Options: In-memory file caching for faster processing of large knowledge bases
- API: Programmatic access for integration with other tools
Installation
# Install globally
npm install -g knowledge-base-indexer
# Or install locally
npm install knowledge-base-indexer
Usage
Command Line Interface
Indexing a Knowledge Base
# Index a directory
kb-index --dir /path/to/knowledge/base
# Index with custom options
kb-index --dir /path/to/knowledge/base --exclude "*.tmp,*.log" --watch
# Index with in-memory caching for faster processing
kb-index --dir /path/to/knowledge/base --in-memory
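To make the idea of recursive indexing with exclude patterns concrete, here is a minimal sketch of what such a scan could look like. It is not the package's actual implementation: `globToRegExp` and `scanDirectory` are hypothetical helper names, only `*` wildcards are handled, and the scan is synchronous for brevity.

```javascript
const fs = require('fs');
const path = require('path');

// Convert a simple glob such as "*.tmp" into a RegExp (only "*" is supported).
function globToRegExp(glob) {
  const escaped = glob.replace(/[.+^${}()|[\]\\?]/g, '\\$&');
  return new RegExp('^' + escaped.replace(/\*/g, '.*') + '$');
}

// Hypothetical helper: recursively collect file paths under `dir`,
// skipping files whose names match any of the exclude globs.
function scanDirectory(dir, excludeGlobs = []) {
  const excludes = excludeGlobs.map(globToRegExp);
  const files = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const fullPath = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      files.push(...scanDirectory(fullPath, excludeGlobs));
    } else if (entry.isFile() && !excludes.some((re) => re.test(entry.name))) {
      files.push(fullPath);
    }
  }
  return files.sort();
}
```

Calling `scanDirectory('/path/to/knowledge/base', ['*.tmp', '*.log'])` would return a sorted list of file paths, roughly mirroring the `--exclude "*.tmp,*.log"` option shown above.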
Searching the Knowledge Base
# Search for files
kb-search --query "machine learning"
# Search with fuzzy matching
kb-search --query "machine learning" --fuzzy --limit 10
# Search with in-memory caching for faster processing
kb-search --query "machine learning" --in-memory
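For readers unfamiliar with fzf-style matching, the sketch below illustrates the general idea: a candidate matches when the query's characters appear in it in order, and adjacent matches score higher. This is an illustrative subsequence scorer only. The package's real ranking algorithm is not documented here, and `fuzzyScore` and `fuzzySearch` are hypothetical names.

```javascript
// Score `candidate` against `query`: every query character must appear in order.
// Returns -1 when there is no match; higher scores mean tighter matches.
function fuzzyScore(query, candidate) {
  const q = query.toLowerCase();
  const c = candidate.toLowerCase();
  let score = 0;
  let lastIndex = -1;
  for (const ch of q) {
    if (ch === ' ') continue; // ignore spaces in the query
    const index = c.indexOf(ch, lastIndex + 1);
    if (index === -1) return -1;
    // A match directly after the previous one (or at the very start) earns a bonus
    score += index === lastIndex + 1 ? 2 : 1;
    lastIndex = index;
  }
  return score;
}

// Rank candidates by score, drop non-matches, and keep the top `limit`.
function fuzzySearch(query, candidates, limit = 10) {
  return candidates
    .map((candidate) => ({ candidate, score: fuzzyScore(query, candidate) }))
    .filter((r) => r.score >= 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit)
    .map((r) => r.candidate);
}
```

The `limit` parameter plays the same role as the `--limit` flag in the CLI example; a production matcher would typically also weigh word boundaries and path separators.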
Extracting Content
# Extract content from a file
kb-extract --file /path/to/file.md
# Extract content and answer a question
kb-extract --query "What is machine learning?" --context-files "ml-intro.md,deep-learning.md"
# Extract with in-memory caching for faster processing
kb-extract --file /path/to/file.md --in-memory
Programmatic Usage
const { indexKnowledgeBase, searchKnowledgeBase, extractContent, extractKnowledge } = require('knowledge-base-indexer');

// CommonJS modules do not support top-level await, so wrap the calls in an async function
async function main() {
  // Index a knowledge base
  const index = await indexKnowledgeBase('/path/to/knowledge/base', {
    inMemory: true // Enable in-memory caching for faster processing
  });
  // Search for files
  const results = await searchKnowledgeBase(index, 'machine learning', { inMemory: true });
  // Extract content
  const content = await extractContent('/path/to/file.md', { inMemory: true });
  // Answer a question using content from relevant files
  // (placeholder list; in practice, take the paths from the search results)
  const relevantFiles = ['ml-intro.md', 'deep-learning.md'];
  const answer = await extractKnowledge('What is machine learning?', { files: relevantFiles, inMemory: true });
}

main().catch(console.error);
Supported File Formats
- Markdown (.md)
- Text (.txt)
- PDF (.pdf)
- HTML (.html, .htm)
- JSON (.json)
- YAML (.yml, .yaml)
- And more...
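As a rough illustration of how the formats listed above might be told apart, the following sketch maps file extensions to format names. The `FORMATS` table and `detectFormat` function are hypothetical and cover only the extensions named in this list; the package may detect formats differently.

```javascript
const path = require('path');

// Hypothetical lookup table built from the extensions listed above.
const FORMATS = new Map([
  ['.md', 'markdown'],
  ['.txt', 'text'],
  ['.pdf', 'pdf'],
  ['.html', 'html'],
  ['.htm', 'html'],
  ['.json', 'json'],
  ['.yml', 'yaml'],
  ['.yaml', 'yaml'],
]);

// Return the format name for a file path, or null if the extension is unknown.
function detectFormat(filePath) {
  const ext = path.extname(filePath).toLowerCase();
  return FORMATS.get(ext) ?? null;
}
```

Lowercasing the extension means `README.MD` and `readme.md` are treated the same way.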
LLM Integration
Knowledge Base Indexer is designed to work with Large Language Models:
- File Discovery: LLMs can use the indexer to find relevant files based on user queries
- Content Extraction: Extract content from files to provide context to LLMs
- Knowledge Synthesis: LLMs can synthesize information from multiple sources
- Question Answering: Answer questions using knowledge extracted from files
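The content extraction and question answering steps above usually come down to packing extracted text into a prompt. Here is a minimal sketch of that step, assuming each document is an object with `path` and `content` fields. `buildPrompt`, the document shape, and the prompt wording are illustrative assumptions and not part of the package's API.

```javascript
// Hypothetical helper: build one prompt string from a question and extracted documents.
// `documents` is an array of { path, content }; `maxChars` caps each excerpt
// so that a single large file cannot crowd out the others.
function buildPrompt(question, documents, maxChars = 2000) {
  const context = documents
    .map((doc) => `--- ${doc.path} ---\n${doc.content.slice(0, maxChars)}`)
    .join('\n\n');
  return `Answer the question using only the context below.\n\n${context}\n\nQuestion: ${question}`;
}
```

The resulting string can be sent to whichever LLM client you use; labeling each excerpt with its file path lets the model attribute an answer to its source.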
License
MIT