1.0.0 • Published 7 months ago

@docshield/modern-migration v1.0.0

Modern Migration

Modern Migration is a Node.js package that uses LLMs to transform unstructured data into structured data, with a human review step before results are returned. It is designed for data migrations where values must be standardized against a fixed set of options under human oversight.

🌟 Features

  • LLM-powered transformations with Claude 3.5 Sonnet
  • Interactive review interface for human verification
  • Batch processing with configurable sizes
  • Rate limiting to respect API constraints
  • Type safety with comprehensive TypeScript types
  • ID tracking to maintain relationships with source records
  • Real-time progress tracking
  • Audit trail of all transformations

🚀 Installation

npm install modern-migration

Prerequisites

  • Node.js >= 18.0.0
  • An Anthropic API key for Claude access

🎯 Quick Start

import { ModernMigration } from 'modern-migration'

// 1. Create a migration instance
const migration = new ModernMigration({
  migrationName: 'state-codes',
  options: ['AL', 'AK', 'AZ', 'AR', 'CA' /* ... other codes */],
  prompt: 'Convert state name to two-letter code',
  confidenceThreshold: 0.95, // Optional, defaults to 0.95
  apiKey: process.env.ANTHROPIC_API_KEY,
})

// 2. Prepare your data with IDs
const items = [
  { id: '1', value: 'California' },
  { id: '2', value: 'New York' },
  { id: '3', value: 'Texas' },
]

// 3. Transform the values
const results = await migration.transform(items)
// Results:
// [
//   { id: '1', original: 'California', transformed: 'CA' },
//   { id: '2', original: 'New York', transformed: 'NY' },
//   { id: '3', original: 'Texas', transformed: 'TX' }
// ]

// 4. Use the results to update your database
//    (assumes `db` is an already-connected MongoDB Db instance)
await db.collection('states').bulkWrite(
  results.map(({ id, transformed }) => ({
    updateOne: {
      filter: { _id: id },
      update: { $set: { stateCode: transformed } },
    },
  })),
)
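
Before writing to the database, it can be worth checking that every transformed value is one of the allowed options and that every input ID came back. The helper below is a minimal sketch and not part of the package: `checkResults` and the `TransformResult` type are hypothetical names, and the result shape is assumed from the Quick Start output above.

```typescript
// Hypothetical result shape, assumed from the Quick Start output.
interface TransformResult {
  id: string
  original: string
  transformed: string
}

// Returns a list of human-readable problems.
// An empty list means the results are consistent with the inputs and options.
function checkResults(
  inputIds: string[],
  results: TransformResult[],
  options: string[],
): string[] {
  const problems: string[] = []
  const allowed = new Set(options)
  const seen = new Set<string>()

  for (const r of results) {
    if (!allowed.has(r.transformed)) {
      problems.push(`id ${r.id}: "${r.transformed}" is not a valid option`)
    }
    seen.add(r.id)
  }

  for (const id of inputIds) {
    if (!seen.has(id)) {
      problems.push(`id ${id}: missing from results`)
    }
  }

  return problems
}
```

Calling `checkResults(items.map((i) => i.id), results, options)` before `bulkWrite` and aborting on a non-empty list keeps partial or out-of-vocabulary values out of the database.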

🧩 Core Concepts

Transform Pipeline

  1. Input with IDs: Provide an array of objects with id and value properties
  2. Batch Processing: Values are processed in configurable batches
  3. LLM Transformation: Claude processes each value with the provided prompt
  4. ID Association: Results are associated with original input IDs
  5. Review Interface: A web interface opens for human review of transformations
  6. Output: Returns results with original IDs, input values, and transformed values
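
The batching and ID association steps above can be sketched in a few lines. This is an illustration of the idea only, not the package's internal code: `chunk` and `associate` are hypothetical helpers, and the LLM call itself is left out.

```typescript
interface Item {
  id: string
  value: string
}

// Split items into consecutive batches of at most `size` elements.
function chunk<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer')
  }
  const batches: T[][] = []
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size))
  }
  return batches
}

// Re-attach source IDs to transformed values.
// Assumes the transformed values come back in the same order as the batch.
function associate(batch: Item[], transformed: string[]) {
  if (batch.length !== transformed.length) {
    throw new Error('batch and transformed values differ in length')
  }
  return batch.map((item, i) => ({
    id: item.id,
    original: item.value,
    transformed: transformed[i],
  }))
}
```

Carrying the ID through each batch, rather than matching on the original string afterwards, keeps duplicate input values (two records that both say "California") tied to their own source records.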

Review Process

  1. The review interface automatically opens in your browser (default port: 3001)
  2. Review each batch of transformations
  3. Approve the batch when satisfied
  4. After all batches are approved, the transform method resolves with final results
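
One way to picture step 4 is a promise that resolves only once every batch has been approved. The sketch below is hypothetical (the name `createReviewGate` and its shape are assumptions, and the package's real review server is not shown); it only illustrates why `transform` stays pending until review is finished.

```typescript
// Tracks batch approvals and exposes a promise that settles when all are approved.
function createReviewGate(totalBatches: number) {
  const approved = new Set<number>()

  // The Promise executor runs synchronously, so resolveDone is set before use.
  let resolveDone: () => void = () => {}
  const done = new Promise<void>((resolve) => {
    resolveDone = resolve
  })

  // Nothing to review: resolve immediately.
  if (totalBatches === 0) resolveDone()

  return {
    done,
    // Mark one batch as approved; approving the same batch twice is harmless.
    approve(batchIndex: number): void {
      if (!Number.isInteger(batchIndex) || batchIndex < 0 || batchIndex >= totalBatches) {
        throw new RangeError(`unknown batch index: ${batchIndex}`)
      }
      approved.add(batchIndex)
      if (approved.size === totalBatches) resolveDone()
    },
    // Number of batches still waiting for approval.
    remaining(): number {
      return totalBatches - approved.size
    },
  }
}
```

A caller would `await gate.done` while the review UI calls `gate.approve(i)` for each batch a reviewer signs off on.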

⚙️ Configuration

Basic Configuration

interface TransformConfig {
  migrationName: string // Unique identifier for this migration
  options: string[] // Valid output options
  prompt: string // LLM prompt for transformation
  confidenceThreshold?: number // Default: 0.95
  apiKey?: string // Anthropic API key
}
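
Since a bad configuration only surfaces after API calls have started, a quick up-front check can save time. The function below is a sketch under stated assumptions: `validateConfig` is a hypothetical helper, not an export of the package, and the rules it enforces (non-empty fields, unique options, threshold between 0 and 1) are reasonable guesses rather than documented constraints.

```typescript
// Mirrors the interface shown above.
interface TransformConfig {
  migrationName: string
  options: string[]
  prompt: string
  confidenceThreshold?: number
  apiKey?: string
}

// Returns a list of configuration errors; an empty list means the config passed.
function validateConfig(config: TransformConfig): string[] {
  const errors: string[] = []

  if (config.migrationName.trim() === '') {
    errors.push('migrationName must not be empty')
  }
  if (config.options.length === 0) {
    errors.push('options must contain at least one value')
  }
  if (new Set(config.options).size !== config.options.length) {
    errors.push('options must not contain duplicates')
  }
  if (config.prompt.trim() === '') {
    errors.push('prompt must not be empty')
  }

  // Fall back to the documented default of 0.95 when no threshold is given.
  const threshold = config.confidenceThreshold ?? 0.95
  if (!(threshold >= 0 && threshold <= 1)) {
    errors.push('confidenceThreshold must be between 0 and 1')
  }

  return errors
}
```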

Advanced Configuration

Create a modernmigration.config.js:

module.exports = {
  outputDir: './migration-output',
  rateLimit: {
    requestsPerMinute: 50, // API rate limit
    retryAttempts: 3, // Retry attempts on failure
  },
  review: {
    port: 3001, // Web interface port
    reviewBatchSize: 10, // Number of items to show in review UI
    transformBatchSize: 10, // Number of items in each LLM request
  },
}
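
To make the `rateLimit` settings concrete: 50 requests per minute works out to one request every 1200 ms. The sketch below shows one common way to combine request spacing with retries. It is not the package's implementation; `minIntervalMs`, `backoffDelayMs`, and `withRetry` are hypothetical helpers, exponential backoff is a swapped-in technique, and it assumes `retryAttempts` counts retries after the first failed try.

```typescript
// Minimum spacing between requests for a given per-minute limit.
function minIntervalMs(requestsPerMinute: number): number {
  if (requestsPerMinute <= 0) throw new RangeError('requestsPerMinute must be positive')
  return 60_000 / requestsPerMinute
}

// Exponential backoff: base, 2x base, 4x base, ...
function backoffDelayMs(attempt: number, baseDelayMs = 500): number {
  return baseDelayMs * 2 ** attempt
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms))

// Runs fn once, then retries up to retryAttempts more times on failure.
// Rethrows the last error if every try fails.
async function withRetry<T>(
  fn: () => Promise<T>,
  retryAttempts: number,
  baseDelayMs = 500,
): Promise<T> {
  let lastError: unknown
  for (let attempt = 0; attempt <= retryAttempts; attempt++) {
    try {
      return await fn()
    } catch (err) {
      lastError = err
      // Wait before the next try, but not after the final one.
      if (attempt < retryAttempts) {
        await sleep(backoffDelayMs(attempt, baseDelayMs))
      }
    }
  }
  throw lastError
}
```

With `transformBatchSize: 10`, each request carries ten values, so a 50 requests-per-minute limit corresponds to at most 500 values per minute.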

📚 Examples

The repository includes several runnable examples that demonstrate common use cases.

Running Examples

  1. Clone the repository:
git clone https://github.com/yourusername/modern-migration.git
cd modern-migration
  2. Install dependencies:
npm install
  3. Create a .env file:
ANTHROPIC_API_KEY=your-api-key-here
  4. Run examples:
# Run the basic state codes example
npm run example:state-codes

# Run other specialized examples
npm run example:company-names
npm run example:product-categorization
npm run example:address-parser
npm run example:transaction-categorizer

# Or run the default example with instructions
npm run examples

Available Examples

Modern Migration includes examples for various real-world use cases:

  • State Code Transformation: Convert state names to two-letter codes
  • Company Name Standardization: Normalize company names to official versions
  • Product Categorization: Assign taxonomy codes to product descriptions
  • Address Parsing: Parse and classify address formats
  • Transaction Categorization: Categorize financial transactions

Example: State Code Transformation

import { ModernMigration } from 'modern-migration'

const migration = new ModernMigration({
  migrationName: 'state-codes',
  options: ['AL', 'AK', 'AZ' /* ... */],
  prompt: 'Convert state name to two-letter code',
  confidenceThreshold: 0.95,
  apiKey: process.env.ANTHROPIC_API_KEY,
})

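// transform() expects objects with id and value properties
const stateNames = [
  { id: '1', value: 'New York' },
  { id: '2', value: 'California' },
  { id: '3', value: 'Mass.' },
  { id: '4', value: 'Fla.' },
  { id: '5', value: 'Washington, D.C.' }, // Will be flagged for review
]

const results = await migration.transform(stateNames)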
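
This README does not spell out how confidenceThreshold is applied. One plausible reading is that each candidate transformation carries a confidence score and anything below the threshold is highlighted for the reviewer. The sketch below illustrates that reading only: `ScoredResult`, its `confidence` field, and `partitionByConfidence` are hypothetical and not taken from the package.

```typescript
// Hypothetical shape: the package's real candidate type is not documented here.
interface ScoredResult {
  id: string
  original: string
  transformed: string
  confidence: number // assumed to be in the range 0..1
}

// Split results into those at or above the threshold and those that need a closer look.
function partitionByConfidence(results: ScoredResult[], threshold = 0.95) {
  const confident: ScoredResult[] = []
  const needsReview: ScoredResult[] = []
  for (const r of results) {
    if (r.confidence >= threshold) {
      confident.push(r)
    } else {
      needsReview.push(r)
    }
  }
  return { confident, needsReview }
}
```

Under this reading, an ambiguous input such as 'Washington, D.C.' (which may have no matching entry in options) would land in `needsReview`.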

Example: Company Name Standardization

import { ModernMigration } from 'modern-migration'

const migration = new ModernMigration({
  migrationName: 'company-names',
  options: ['Apple', 'Microsoft', 'Google', 'IBM', 'Other' /* ... */],
  prompt: 'Standardize company names to their official names',
  confidenceThreshold: 0.92,
  apiKey: process.env.ANTHROPIC_API_KEY,
})

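// transform() expects objects with id and value properties
const companyNames = [
  { id: '1', value: 'Apple Inc.' },
  { id: '2', value: 'MSFT' },
  { id: '3', value: 'International Business Machines' },
  { id: '4', value: 'The Adobe Company' },
]

const results = await migration.transform(companyNames)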

More examples can be found in the examples directory.

🔧 Development

# Install dependencies
npm install

# Build the project
npm run build

# Run tests
npm test

# Start development server
npm run dev

# Format code
npm run format

# Run linter
npm run lint

Project Structure

modern-migration/
├── src/
│   ├── config/      # Configuration management
│   ├── transform/   # Core transformation logic
│   ├── web/         # Web review interface
│   └── index.ts     # Main entry point
├── examples/        # Example implementations
├── tests/           # Test suite
└── dist/            # Compiled output

🤝 Contributing

Contributions are welcome! Please read our Contributing Guide for details on our code of conduct and the process for submitting pull requests.

Development Process

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

📄 License

This project is licensed under the ISC License - see the LICENSE file for details.

🙏 Acknowledgments

  • Anthropic for Claude API
  • LangChain for LLM tooling
  • All contributors who have helped with code, bug reports, and suggestions

Built with ❤️ using Claude