1.0.0 • Published 9 months ago
codebase-indexer v1.0.0
Codebase Indexer
A tool for indexing, searching, and analyzing large codebases for LLM integration.
Features
- Codebase Structure Indexing: Recursively scan directories and build a searchable index of files
- Code Analysis: Parse source code to identify functions, classes, methods, and other structures
- Semantic Search: Find relevant code using fuzzy search algorithms
- Context Extraction: Extract code with surrounding context for better understanding
- LLM Integration: Format code and context for optimal LLM consumption
- Performance Optimization: In-memory caching for faster processing
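To make the Context Extraction feature concrete, here is a minimal sketch of pulling a line range out of a source string together with a few surrounding lines. It is an illustration only, not the package's implementation: `extractWithContext` and its parameters are hypothetical names, and `contextLines` simply mirrors the option of the same name shown under Programmatic Usage.

```javascript
// Hypothetical helper: not part of codebase-indexer's documented API.
// Returns the 1-based line range [startLine, endLine] plus up to
// `contextLines` lines before and after it, clamped to the file bounds.
function extractWithContext(source, startLine, endLine, contextLines = 3) {
  const lines = source.split('\n');
  const from = Math.max(1, startLine - contextLines);
  const to = Math.min(lines.length, endLine + contextLines);
  return {
    startLine: from,
    endLine: to,
    code: lines.slice(from - 1, to).join('\n'),
  };
}

const sample = [
  "const db = require('./db');",
  '',
  'function getUserData(id) {',
  '  return db.find(id);',
  '}',
  '',
  'module.exports = { getUserData };',
].join('\n');

// Extract lines 3-5 (the function) with one line of context on each side.
const snippet = extractWithContext(sample, 3, 5, 1);
console.log(`lines ${snippet.startLine}-${snippet.endLine}`);
console.log(snippet.code);
```

Clamping the range keeps the helper safe when the requested context would run past the start or end of the file.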
Installation
# Clone the repository
git clone https://github.com/yourusername/codebase-indexer.git
cd codebase-indexer
# Install dependencies
npm install
# Install globally (optional)
npm install -g .
Usage
Command Line Interface
Indexing a Codebase
# Index a codebase
codebase-index --dir /path/to/your/codebase
# Index with custom options
codebase-index --dir /path/to/your/codebase --exclude "node_modules/**,dist/**,*.test.js" --watch
# Index with in-memory caching for faster processing
codebase-index --dir /path/to/your/codebase --in-memory
Searching a Codebase
# Search for files
codebase-search --dir /path/to/your/codebase --query "user authentication"
# Find a function definition
codebase-search --dir /path/to/your/codebase --function "getUserData"
# Find a class definition
codebase-search --dir /path/to/your/codebase --class "UserManager"
# Search with fuzzy matching and limit results
codebase-search --dir /path/to/your/codebase --query "auth" --fuzzy --limit 5
Extracting Code
# Extract code from a file
codebase-extract --file /path/to/your/codebase/src/user.js
# Extract a function with context
codebase-extract --dir /path/to/your/codebase --function "getUserData"
# Extract a class with context
codebase-extract --dir /path/to/your/codebase --class "UserManager"
# Answer a query using codebase knowledge
codebase-extract --dir /path/to/your/codebase --query "How does user authentication work?"
Programmatic Usage
const {
  indexCodebase,
  searchCodebase,
  findFunctionDefinition,
  extractFunction,
  extractKnowledge
} = require('codebase-indexer');
// CommonJS modules do not allow top-level await, so the calls run inside an async function
async function main() {
  // Index a codebase
  const index = await indexCodebase('/path/to/your/codebase', {
    inMemory: true // Enable in-memory caching for faster processing
  });
  // Search for files
  const results = await searchCodebase(index, 'user authentication', { fuzzy: true });
  // Find a function definition
  const functions = await findFunctionDefinition(index, 'getUserData');
  // Extract a function with context
  const extractedFunctions = await extractFunction(index, 'getUserData', {
    contextLines: 3
  });
  // Extract knowledge to answer a query
  const knowledge = await extractKnowledge(index, 'How does user authentication work?', {
    maxFiles: 3,
    maxFunctions: 5
  });
}
main().catch(console.error);
MCP Server for LLM Integration
The codebase indexer includes an MCP (Model Context Protocol) server for direct integration with LLMs.
# Start the MCP server
node mcp/server.js
This exposes the following tools to LLMs:
- index_codebase: Index a codebase directory
- search_codebase: Search an indexed codebase
- find_function: Find a function definition
- find_class: Find a class definition
- extract_code: Extract code from a file
- extract_knowledge: Extract knowledge from codebase to answer a query
- list_folder_structure: List the folder structure of a codebase
- list_file_functions: List all functions in a file
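MCP clients invoke tools through JSON-RPC 2.0 messages, using the `tools/call` method with a tool name and an arguments object. The sketch below builds such a request for `index_codebase`. The `buildToolCall` helper and the `directory` argument name are assumptions for illustration; this README does not document the tools' input schemas, so check what the server reports through `tools/list` before relying on any argument name.

```javascript
// Build a JSON-RPC 2.0 request that asks an MCP server to run a tool.
// `tools/call` with { name, arguments } is the request shape defined by MCP.
function buildToolCall(id, toolName, args) {
  return {
    jsonrpc: '2.0',
    id,
    method: 'tools/call',
    params: { name: toolName, arguments: args },
  };
}

// ASSUMPTION: `directory` is a hypothetical argument name, not taken from
// the server's actual input schema.
const request = buildToolCall(1, 'index_codebase', {
  directory: '/path/to/your/codebase',
});

// If the server uses the stdio transport, each message is written as a
// single line of JSON, which JSON.stringify produces by default.
const wireMessage = JSON.stringify(request);
console.log(wireMessage);
```

In practice an MCP-aware client or SDK constructs these messages for you; the sketch only shows what travels between the client and `mcp/server.js`.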
Supported Languages
The codebase indexer currently supports the following languages:
- JavaScript/TypeScript (full support)
- Python (basic support)
- Java (basic support)
- C# (basic support)
- Go (basic support)
- Ruby (basic support)
- PHP (basic support)
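As a rough illustration of how a multi-language indexer can decide which parser applies to a file, the sketch below maps common file extensions to the languages listed above. The lookup table and the `detectLanguage` name are assumptions for illustration; the README does not say how codebase-indexer detects languages or which extensions it recognizes. Only the full/basic support labels come from the list above.

```javascript
const path = require('node:path');

// Hypothetical lookup table: extension -> language and support level.
// The support levels mirror the list above; the extension set is assumed.
const LANGUAGES_BY_EXTENSION = new Map([
  ['.js', { language: 'JavaScript', support: 'full' }],
  ['.ts', { language: 'TypeScript', support: 'full' }],
  ['.py', { language: 'Python', support: 'basic' }],
  ['.java', { language: 'Java', support: 'basic' }],
  ['.cs', { language: 'C#', support: 'basic' }],
  ['.go', { language: 'Go', support: 'basic' }],
  ['.rb', { language: 'Ruby', support: 'basic' }],
  ['.php', { language: 'PHP', support: 'basic' }],
]);

// Returns the matching entry, or null for unrecognized or missing extensions.
function detectLanguage(filePath) {
  const ext = path.extname(filePath).toLowerCase();
  return LANGUAGES_BY_EXTENSION.get(ext) || null;
}

console.log(detectLanguage('src/user.js'));
console.log(detectLanguage('README'));
```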
License
MIT