1.0.0 • Published 4 months ago
codebase-indexer v1.0.0
Codebase Indexer
A tool for indexing, searching, and analyzing large codebases for LLM integration.
Features
- Codebase Structure Indexing: Recursively scan directories and build a searchable index of files
- Code Analysis: Parse source code to identify functions, classes, methods, and other structures
- Semantic Search: Find relevant code using fuzzy search algorithms
- Context Extraction: Extract code with surrounding context for better understanding
- LLM Integration: Format code and context for optimal LLM consumption
- Performance Optimization: In-memory caching for faster processing
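To make the Code Analysis feature more concrete, here is a minimal sketch of how function and class declarations could be detected in JavaScript source with regular expressions. It is illustrative only: `findDeclarations` is a hypothetical helper, not part of this package's API, and the tool itself may rely on a real parser rather than line-based patterns.

```javascript
// Hypothetical helper: not part of codebase-indexer's API.
// Scans JavaScript source line by line for simple top-level declarations.
function findDeclarations(source) {
  const patterns = [
    { kind: 'function', regex: /^\s*(?:export\s+)?(?:async\s+)?function\s+([A-Za-z_$][\w$]*)/ },
    { kind: 'class', regex: /^\s*(?:export\s+)?class\s+([A-Za-z_$][\w$]*)/ },
  ];
  const found = [];
  source.split('\n').forEach((text, i) => {
    for (const { kind, regex } of patterns) {
      const match = regex.exec(text);
      if (match) {
        // Line numbers are 1-based, as editors display them.
        found.push({ kind, name: match[1], line: i + 1 });
        break;
      }
    }
  });
  return found;
}

const sample = [
  'class UserManager {',
  '  load() {}',
  '}',
  'async function getUserData(id) {',
  '  return id;',
  '}',
].join('\n');

console.log(findDeclarations(sample));
```

A line-based approach like this misses methods, arrow functions, and declarations split across lines, which is one reason a parser-based analysis is usually preferred for anything beyond a quick scan.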
Installation
# Clone the repository
git clone https://github.com/yourusername/codebase-indexer.git
cd codebase-indexer
# Install dependencies
npm install
# Install globally (optional)
npm install -g .
Usage
Command Line Interface
Indexing a Codebase
# Index a codebase
codebase-index --dir /path/to/your/codebase
# Index with custom options
codebase-index --dir /path/to/your/codebase --exclude "node_modules/**,dist/**,*.test.js" --watch
# Index with in-memory caching for faster processing
codebase-index --dir /path/to/your/codebase --in-memory
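The `--exclude` option above takes a comma-separated list of glob patterns. The sketch below shows one plausible way such patterns could be applied to relative file paths. The helpers `globToRegExp` and `buildExcluder` are hypothetical, and the matching rules are assumptions: `**` crosses directory boundaries, `*` stays within one path segment, and a pattern without a `/` is compared against the file's base name only. The tool's actual matcher may behave differently.

```javascript
// Hypothetical helpers: a sketch of glob-style exclusion, not the tool's real matcher.
function globToRegExp(glob) {
  let pattern = '';
  for (let i = 0; i < glob.length; i++) {
    const ch = glob[i];
    if (ch === '*') {
      if (glob[i + 1] === '*') {
        pattern += '.*'; // "**" may span several directories
        i++;
      } else {
        pattern += '[^/]*'; // "*" stays within one path segment
      }
    } else {
      // Escape characters that have a special meaning in regular expressions.
      pattern += ch.replace(/[.+?^${}()|[\]\\]/g, '\\$&');
    }
  }
  return new RegExp(`^${pattern}$`);
}

function buildExcluder(excludeOption) {
  const matchers = excludeOption.split(',').map((glob) => {
    const trimmed = glob.trim();
    // Assumption: patterns without "/" apply to the base name only.
    return { regex: globToRegExp(trimmed), basenameOnly: !trimmed.includes('/') };
  });
  return (relativePath) => {
    const baseName = relativePath.split('/').pop();
    return matchers.some(({ regex, basenameOnly }) =>
      regex.test(basenameOnly ? baseName : relativePath));
  };
}

const isExcluded = buildExcluder('node_modules/**,dist/**,*.test.js');
const files = [
  'src/user.js',
  'src/user.test.js',
  'node_modules/lodash/index.js',
  'dist/bundle.js',
];

console.log(files.filter((file) => !isExcluded(file))); // prints [ 'src/user.js' ]
```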
Searching a Codebase
# Search for files
codebase-search --dir /path/to/your/codebase --query "user authentication"
# Find a function definition
codebase-search --dir /path/to/your/codebase --function "getUserData"
# Find a class definition
codebase-search --dir /path/to/your/codebase --class "UserManager"
# Search with fuzzy matching and limit results
codebase-search --dir /path/to/your/codebase --query "auth" --fuzzy --limit 5
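As a rough illustration of what `--fuzzy` and `--limit` imply, the sketch below ranks file paths by a simple subsequence match and keeps only the top results. The functions `fuzzyScore` and `fuzzySearch` are hypothetical and the scoring scheme is an assumption chosen for brevity; the tool's real fuzzy algorithm is not documented here and is likely more sophisticated.

```javascript
// Hypothetical scorer: greedy subsequence matching, not the tool's actual algorithm.
function fuzzyScore(query, candidate) {
  const q = query.toLowerCase();
  const c = candidate.toLowerCase();
  let qi = 0;
  let score = 0;
  let streak = 0;
  for (let ci = 0; ci < c.length && qi < q.length; ci++) {
    if (c[ci] === q[qi]) {
      streak++;
      score += streak; // consecutive matches earn progressively more
      qi++;
    } else {
      streak = 0;
    }
  }
  // A score of 0 means the query characters were not all found in order.
  return qi === q.length ? score : 0;
}

function fuzzySearch(query, candidates, limit = 10) {
  return candidates
    .map((candidate) => ({ candidate, score: fuzzyScore(query, candidate) }))
    .filter((entry) => entry.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit)
    .map((entry) => entry.candidate);
}

const paths = ['src/auth/login.js', 'src/utils/math.js', 'src/author.js', 'README.md'];
console.log(fuzzySearch('auth', paths, 5));
```

Because the matcher is greedy, it takes the first occurrence of each query character and can underrate a candidate whose best match appears later in the string. That trade-off keeps the sketch short.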
Extracting Code
# Extract code from a file
codebase-extract --file /path/to/your/codebase/src/user.js
# Extract a function with context
codebase-extract --dir /path/to/your/codebase --function "getUserData"
# Extract a class with context
codebase-extract --dir /path/to/your/codebase --class "UserManager"
# Answer a query using codebase knowledge
codebase-extract --dir /path/to/your/codebase --query "How does user authentication work?"
Programmatic Usage
// findFunctionDefinition and extractKnowledge are used below, so they are imported as well
const {
  indexCodebase,
  searchCodebase,
  findFunctionDefinition,
  extractFunction,
  extractKnowledge
} = require('codebase-indexer');

// await is not allowed at the top level of a CommonJS module, so wrap the calls
async function main() {
  // Index a codebase
  const index = await indexCodebase('/path/to/your/codebase', {
    inMemory: true // Enable in-memory caching for faster processing
  });
  // Search for files
  const results = await searchCodebase(index, 'user authentication', { fuzzy: true });
  // Find a function definition
  const functions = await findFunctionDefinition(index, 'getUserData');
  // Extract a function with context
  const extractedFunctions = await extractFunction(index, 'getUserData', {
    contextLines: 3
  });
  // Extract knowledge to answer a query
  const knowledge = await extractKnowledge(index, 'How does user authentication work?', {
    maxFiles: 3,
    maxFunctions: 5
  });
}

main().catch(console.error);
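The `contextLines` option suggests that extraction returns the matched code plus a few lines on either side. The sketch below shows that idea in isolation, assuming 1-based line numbers. `extractWithContext` is a hypothetical helper and its return shape is an assumption, not the format that `extractFunction` actually returns.

```javascript
// Hypothetical helper illustrating the idea behind a `contextLines` option.
// startLine and endLine are 1-based and inclusive.
function extractWithContext(source, startLine, endLine, contextLines = 3) {
  const lines = source.split('\n');
  // Clamp the widened range to the bounds of the file.
  const from = Math.max(1, startLine - contextLines);
  const to = Math.min(lines.length, endLine + contextLines);
  return {
    startLine: from,
    endLine: to,
    code: lines.slice(from - 1, to).join('\n'),
  };
}

const source = [
  "const db = require('./db');",
  '',
  '// Loads a user record by id.',
  'async function getUserData(id) {',
  '  return db.find(id);',
  '}',
  '',
  'module.exports = { getUserData };',
].join('\n');

// The function spans lines 4-6; one line of context widens that to lines 3-7.
const snippet = extractWithContext(source, 4, 6, 1);
console.log(snippet.code);
```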
MCP Server for LLM Integration
The codebase indexer includes an MCP (Model Context Protocol) server for direct integration with LLMs.
# Start the MCP server
node mcp/server.js
This exposes the following tools to LLMs:
- index_codebase: Index a codebase directory
- search_codebase: Search an indexed codebase
- find_function: Find a function definition
- find_class: Find a class definition
- extract_code: Extract code from a file
- extract_knowledge: Extract knowledge from the codebase to answer a query
- list_folder_structure: List the folder structure of a codebase
- list_file_functions: List all functions in a file
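For orientation, an MCP client invokes one of these tools by sending a JSON-RPC 2.0 request with the method `tools/call`. The sketch below builds such a request for `search_codebase`. The helper `buildToolCall` is hypothetical, and the argument names `directory` and `query` are assumptions, since the tools' input schemas are not listed here. In practice an MCP client library would also handle the initialization handshake and transport.

```javascript
// Builds a JSON-RPC 2.0 "tools/call" request in the shape MCP defines.
// buildToolCall is a hypothetical helper, not part of this package.
let nextId = 1;

function buildToolCall(toolName, args) {
  return {
    jsonrpc: '2.0',
    id: nextId++,
    method: 'tools/call',
    params: { name: toolName, arguments: args },
  };
}

// Assumption: "directory" and "query" are illustrative argument names only.
const request = buildToolCall('search_codebase', {
  directory: '/path/to/your/codebase',
  query: 'user authentication',
});

console.log(JSON.stringify(request, null, 2));
```

The argument names each tool actually accepts can be discovered at runtime by sending a `tools/list` request, which returns every tool along with its input schema.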
Supported Languages
The codebase indexer currently supports the following languages:
- JavaScript/TypeScript (full support)
- Python (basic support)
- Java (basic support)
- C# (basic support)
- Go (basic support)
- Ruby (basic support)
- PHP (basic support)
License
MIT