
JSTextFromImage


A powerful TypeScript/JavaScript library for obtaining detailed descriptions of images using various AI models, including OpenAI's GPT-4 Vision (gpt-4o), Azure OpenAI, and Anthropic Claude. Supports image URLs and local file paths, with batch processing capabilities.

🌟 Key Features

  • 🤖 Multiple AI Providers: Support for OpenAI, Azure OpenAI, and Anthropic Claude
  • 🌐 URL & Local File Support: Process images from URLs or local file paths
  • 📦 Batch Processing: Process multiple images concurrently
  • 📝 TypeScript First: Built with TypeScript for excellent type safety
  • 🔄 Async/Await: Modern Promise-based API
  • 🔑 Flexible Auth: Multiple authentication methods including environment variables
  • 🛡️ Error Handling: Comprehensive error handling

📦 Installation

npm install jstextfromimage

🚀 Quick Start

You can configure the services either through environment variables or by initializing them directly.

Using Environment Variables

import { openai, claude, azureOpenai } from 'jstextfromimage';

// Services will automatically use environment variables
const description = await openai.getDescription('https://example.com/image.jpg');

Direct Initialization

import { OpenAIService, ClaudeService, AzureOpenAIService } from 'jstextfromimage';

// OpenAI custom instance
const customOpenAI = new OpenAIService('your-openai-api-key');

// Claude custom instance
const customClaude = new ClaudeService('your-claude-api-key');

// Azure OpenAI custom instance
const customAzure = new AzureOpenAIService({
  apiKey: 'your-azure-api-key',
  endpoint: 'your-azure-endpoint',
  deploymentName: 'your-deployment-name'
});
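
Once created, a custom instance is used in the same way as the default exports shown below; this short sketch assumes the instances expose the same getDescription method:

// Using a custom instance (assumed to mirror the default export's API)
const customDescription = await customOpenAI.getDescription('https://example.com/photo.jpg', {
  prompt: 'Describe this photo',
  maxTokens: 300
});
console.log(customDescription);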

OpenAI Service

import { openai } from 'jstextfromimage';

// Single image analysis
const description = await openai.getDescription('https://example.com/image.jpg', {
  prompt: "Describe the main elements of this image",
  maxTokens: 500,
  model: 'gpt-4o'
});

// Batch processing
const imageUrls = [
  'https://example.com/image1.jpg',
  'https://example.com/image2.jpg',
  'https://example.com/image3.jpg'
];

const results = await openai.getDescriptionBatch(imageUrls, {
  prompt: "Analyze this image in detail",
  maxTokens: 300,
  concurrency: 2,
  model: 'gpt-4o'
});

// Process results
results.forEach(result => {
  if (result.error) {
    console.error(`Error processing ${result.imageUrl}: ${result.error}`);
  } else {
    console.log(`Description for ${result.imageUrl}: ${result.description}`);
  }
});

Claude Service

import { claude } from 'jstextfromimage';

// Single image analysis
const description = await claude.getDescription('https://example.com/artwork.jpg', {
  prompt: "Analyze this artwork, including style and composition",
  maxTokens: 1000,
  model: 'claude-3-sonnet-20240229'
});

// Batch processing
const artworkUrls = [
  'https://example.com/artwork1.jpg',
  'https://example.com/artwork2.jpg'
];

const analyses = await claude.getDescriptionBatch(artworkUrls, {
  prompt: "Provide a detailed art analysis",
  maxTokens: 800,
  concurrency: 2,
  model: 'claude-3-sonnet-20240229'
});

Azure OpenAI Service

import { azureOpenai } from 'jstextfromimage';

// Single image analysis
const description = await azureOpenai.getDescription('https://example.com/scene.jpg', {
  prompt: "Describe this scene in detail",
  maxTokens: 400,
  systemPrompt: "You are an expert in visual analysis."
});

// Batch processing
const sceneUrls = [
  'https://example.com/scene1.jpg',
  'https://example.com/scene2.jpg'
];

const analyses = await azureOpenai.getDescriptionBatch(sceneUrls, {
  prompt: "Analyze the composition and mood",
  maxTokens: 500,
  concurrency: 3,
  systemPrompt: "You are an expert cinematographer."
});

💡 Configuration

Default Values

// OpenAI defaults
{
  model: 'gpt-4o',
  maxTokens: 300,
  prompt: "What's in this image?",
  concurrency: 3  // for batch processing
}

// Claude defaults
{
  model: 'claude-3-sonnet-20240229',
  maxTokens: 300,
  prompt: "What's in this image?",
  concurrency: 3
}

// Azure OpenAI defaults
{
  maxTokens: 300,
  prompt: "What's in this image?",
  systemPrompt: "You are a helpful assistant.",
  concurrency: 3
}

Local File Support

import { openai } from 'jstextfromimage';

// Single local file
const description = await openai.getDescription('/path/to/local/image.jpg', {
  prompt: "Describe this image",
  maxTokens: 300,
  model: 'gpt-4o'
});

// Mix of local files and URLs in batch processing
const images = [
  '/path/to/local/image1.jpg',
  'https://example.com/image2.jpg',
  '/path/to/local/image3.png'
];

const results = await openai.getDescriptionBatch(images, {
  prompt: "Analyze each image",
  maxTokens: 300,
  concurrency: 2
});

Environment Variables

# OpenAI
OPENAI_API_KEY=your-openai-api-key

# Claude
ANTHROPIC_API_KEY=your-claude-api-key

# Azure OpenAI
AZURE_OPENAI_API_KEY=your-azure-api-key
AZURE_OPENAI_ENDPOINT=your-azure-endpoint
AZURE_OPENAI_DEPLOYMENT=your-deployment-name
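
If you keep these values in a local .env file, they need to be loaded into process.env before the services are used. A minimal sketch, assuming the dotenv package (not a dependency of jstextfromimage):

import 'dotenv/config'; // assumed extra dependency; loads .env into process.env
import { openai } from 'jstextfromimage';

// The service reads OPENAI_API_KEY from the environment
const description = await openai.getDescription('https://example.com/image.jpg');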

Options Interfaces

// Base options for all services
interface BaseOptions {
  prompt?: string;
  maxTokens?: number;
  concurrency?: number; // For batch processing
}

// OpenAI specific options
interface OpenAIOptions extends BaseOptions {
  model?: string;
}

// Claude specific options
interface ClaudeOptions extends BaseOptions {
  model?: string;
}

// Azure OpenAI specific options
interface AzureOpenAIOptions extends BaseOptions {
  systemPrompt?: string;
}

// Azure OpenAI configuration
interface AzureOpenAIConfig {
  apiKey?: string;
  endpoint?: string;
  deploymentName?: string;
  apiVersion?: string;
}

// Batch processing results
interface BatchResult {
  imageUrl: string;
  description: string;
  error?: string;
}
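
These interfaces can also be used for explicit typing in your own code. A minimal sketch, assuming the interfaces are exported from the package root:

import { openai } from 'jstextfromimage';
import type { OpenAIOptions, BatchResult } from 'jstextfromimage'; // assumed type exports

const options: OpenAIOptions = {
  prompt: 'List the objects visible in this image',
  maxTokens: 300,
  model: 'gpt-4o',
  concurrency: 2
};

const results: BatchResult[] = await openai.getDescriptionBatch(
  ['https://example.com/a.jpg', 'https://example.com/b.jpg'],
  options
);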

🔍 Error Handling Examples

// Single image with error handling
const imageUrl = 'https://example.com/image.jpg';

try {
  const description = await openai.getDescription(imageUrl, {
    maxTokens: 300
  });
  console.log(description);
} catch (error) {
  console.error('Failed to process image:', error);
}

// Batch processing with retry
async function processWithRetry(imageUrls: string[], maxRetries = 3) {
  const results = await openai.getDescriptionBatch(imageUrls, {
    maxTokens: 300,
    concurrency: 2
  });
  
  // Handle failed items with retry
  let failedItems = results.filter(r => r.error);
  let retryCount = 0;
  
  while (failedItems.length > 0 && retryCount < maxRetries) {
    const retryUrls = failedItems.map(item => item.imageUrl);
    const retryResults = await openai.getDescriptionBatch(retryUrls, {
      maxTokens: 300,
      concurrency: 1 // Lower concurrency for retries
    });
    
    // Update results with successful retries
    retryResults.forEach(result => {
      if (!result.error) {
        const index = results.findIndex(r => r.imageUrl === result.imageUrl);
        if (index !== -1) {
          results[index] = result;
        }
      }
    });
    
    // Recompute the failed set so the loop only retries what is still failing
    failedItems = results.filter(r => r.error);
    retryCount++;
  }
  
  return results;
}
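
A possible call to the helper above, reusing the BatchResult fields described earlier:

const finalResults = await processWithRetry([
  'https://example.com/image1.jpg',
  'https://example.com/image2.jpg'
], 2); // retry failed images up to 2 times

finalResults.forEach(result => {
  if (result.error) {
    console.error(`Still failing after retries: ${result.imageUrl}`);
  } else {
    console.log(`${result.imageUrl}: ${result.description}`);
  }
});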

🛠️ Development

# Install dependencies
npm install

# Run tests
npm test

# Build the project
npm run build

# Run linting
npm run lint

🤝 Contributing

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -am 'feat: add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

📝 License

This project is licensed under the MIT License - see the LICENSE file for details.

💬 Support

For support, please open an issue on GitHub.
