@purposeinplay/payload-image-translate

Licence: MIT · Version: 0.1.1 · Deps: 3 · Size: 327 kB · Vulns: 1 · Weekly downloads: 0

Localize text-bearing image assets (banners, promotions, tournament art) for Payload CMS 3 — regenerate an image per locale with the baked-in text translated and everything else preserved. The visual analog of payload-ai-translate.

Generation never touches the live field directly: every render becomes a review candidate that an editor approves before it goes live.
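
The approval gate can be pictured as a small state machine. This is only a sketch under assumptions: the status names (`pending`, `approved`, `rejected`) and the `transition` and `canWriteLiveField` helpers are hypothetical, and the plugin's real review-candidate states are described in its architecture doc.

```typescript
// Hypothetical status names; the plugin's real state machine may have more states.
type CandidateStatus = 'pending' | 'approved' | 'rejected';

// Allowed moves: a candidate starts as pending and is resolved exactly once.
const allowed: Record<CandidateStatus, CandidateStatus[]> = {
  pending: ['approved', 'rejected'],
  approved: [],
  rejected: [],
};

/** Return the next status, or throw if the move is not allowed. */
function transition(current: CandidateStatus, next: CandidateStatus): CandidateStatus {
  if (!allowed[current].includes(next)) {
    throw new Error(`Illegal transition: ${current} -> ${next}`);
  }
  return next;
}

/** Only an approved candidate may be written to the localized field. */
function canWriteLiveField(status: CandidateStatus): boolean {
  return status === 'approved';
}
```

The point of the design is that generation only ever produces `pending` candidates, so nothing reaches the live field without an explicit editor action.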

How it works

A generative image-edit engine with an OCR safety net:

source media (en)
  ├─ extract    Claude vision reads the text runs + per-run style + bounding box
  ├─ translate  exact target strings (built-in Claude translator, or reuse ai-translate)
  ├─ frame      REGION: regenerate only a window around the text, composite back
  │             (text position locked; everything else stays byte-identical) —
  │             or full-frame pad/crop-back when no usable text box
  ├─ render     gpt-image-1.5 images.edit (input_fidelity: high), exact-string prompt
  ├─ verify     Claude vision checks spelling + edge-clipping → retry (≤3)
  └─ candidate  staged for editor review; on approve, written to the localized field
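
The REGION framing step can be illustrated as computing a padded window around the extracted text box and clamping it to the image bounds. This is a minimal sketch, not the plugin's code: the `Box` type, the `regionWindow` function, and the 25% padding ratio are all assumptions made for illustration.

```typescript
// Hypothetical types and names; the plugin's real region logic may differ.
interface Box {
  x: number; // left edge, in pixels
  y: number; // top edge, in pixels
  width: number;
  height: number;
}

/**
 * Expand a text bounding box by `padRatio` of its own size on every side,
 * then clamp the result so it stays inside the image.
 */
function regionWindow(
  text: Box,
  image: { width: number; height: number },
  padRatio = 0.25, // assumed padding, chosen for illustration only
): Box {
  const padX = Math.round(text.width * padRatio);
  const padY = Math.round(text.height * padRatio);

  const left = Math.max(0, text.x - padX);
  const top = Math.max(0, text.y - padY);
  const right = Math.min(image.width, text.x + text.width + padX);
  const bottom = Math.min(image.height, text.y + text.height + padY);

  return { x: left, y: top, width: right - left, height: bottom - top };
}
```

Under this model, only the returned window would be sent to the image editor and composited back at `(x, y)`, which is why pixels outside the window can stay untouched.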

The model is never asked to translate — it renders the exact pre-translated string, and the result is OCR-verified, so the misspellings that generative image models routinely produce are caught and retried. Ultra-wide banners (4:1+) go through the region-edit path, which anchors the text at its original position and never regenerates the rest of the artwork.
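
The render, verify, and retry behaviour described above can be sketched as a bounded loop. The `Render` and `Verify` function shapes and the `renderWithVerify` name are hypothetical, and the sketch assumes the "≤3" bound counts total attempts; the plugin's actual provider interfaces are documented in its engine doc.

```typescript
// Hypothetical shapes; the plugin's real provider interfaces may differ.
type Render<T> = (text: string, attempt: number) => Promise<T>;
type Verify<T> = (image: T, expected: string) => Promise<{ ok: boolean; reason?: string }>;

interface RetryResult<T> {
  image: T | null; // null when every attempt failed verification
  attempts: number;
  failures: string[];
}

/** Render the exact string, verify it, and retry up to `maxAttempts` times. */
async function renderWithVerify<T>(
  text: string,
  render: Render<T>,
  verify: Verify<T>,
  maxAttempts = 3, // assumption: the "≤3" bound counts total attempts
): Promise<RetryResult<T>> {
  const failures: string[] = [];
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const image = await render(text, attempt);
    const verdict = await verify(image, text);
    if (verdict.ok) return { image, attempts: attempt, failures };
    failures.push(verdict.reason ?? 'verification failed');
  }
  return { image: null, attempts: maxAttempts, failures };
}
```

Because the renderer receives the already-translated string and the verifier compares against that same string, a misspelled render fails verification and triggers another attempt instead of reaching review.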

Quick start

import { buildConfig } from 'payload';
import { imageTranslatePlugin } from '@purposeinplay/payload-image-translate';
import {
  createAnthropicVisionProviders,
  createOpenAIImageProvider,
} from '@purposeinplay/payload-image-translate/providers';

const vision = createAnthropicVisionProviders({ apiKey: process.env.ANTHROPIC_API_KEY! });

export default buildConfig({
  plugins: [
    imageTranslatePlugin({
      collections: {
        'promotions-v2': { fields: ['featured_image'] }, // must be `localized: true`
        'banners-v2': { fields: ['image'] },
      },
      sourceLocale: 'en',
      targetLocales: ['de', 'es', 'fr', 'it', 'pt', 'ru', 'tr', 'ja', 'ko', 'zh'],
      mediaCollectionSlug: 'media',
      costPerRenderUsd: 0.1, // budget meter; baseline against your real bill
      engine: {
        extractor: vision.extractor,
        translator: vision.translator,
        verifier: vision.verifier,
        imageEditor: createOpenAIImageProvider({ apiKey: process.env.OPENAI_API_KEY! }),
      },
    }),
  ],
});

Then, in the consuming project: set localized: true on each target upload field, run payload generate:importmap (this registers the editor UI), and create and run a DB migration (the plugin adds a reviews collection). The full walkthrough is in docs/getting-started.md.
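
For the first of those steps, a localized upload field might look like the sketch below. The local `UploadField` and `CollectionSketch` interfaces are stand-ins so the snippet is self-contained; in a real project you would type the object with Payload's own `CollectionConfig`, and your collection will have more fields than shown here.

```typescript
// Minimal local stand-ins so the snippet is self-contained; in a real project
// you would use Payload's `CollectionConfig` type instead.
interface UploadField {
  name: string;
  type: 'upload';
  relationTo: string;
  localized?: boolean;
}

interface CollectionSketch {
  slug: string;
  fields: UploadField[];
}

const promotions: CollectionSketch = {
  slug: 'promotions-v2',
  fields: [
    {
      name: 'featured_image',
      type: 'upload',
      relationTo: 'media', // same slug as mediaCollectionSlug in the plugin config
      localized: true, // lets each locale hold its own image
    },
  ],
};
```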

Editors localize from a "Localize image" button next to the image field: pick languages → watch a progress board → review and approve per locale.

There is also a freestanding Image Translation Studio admin view for ad-hoc assets that don't live on a document: upload or pick any Media image, generate per-locale variants, compare against the source with a wipe slider, download, and optionally save to the Media library (nothing is persisted until you save). See docs/studio.md.

Documentation

Doc | What's in it
Getting started | Install, env, providers, the consumer setup steps
Configuration | Complete ImageTranslatePluginConfig reference
Editor guide | How editors use the button → progress → review/approve flow
Studio | The freestanding Image Translation Studio view (upload → generate → compare → save)
Engine | Pipeline internals, region-edit vs full-frame, providers, OCR verify-retry, headless usage
API endpoints | The REST endpoints under /api/image-translate/*
Architecture | Review-candidate state machine + document lifecycle
Security & cost | SSRF, access/IDOR, input caps, cost guards
Roadmap: rich-text image localization | Spec for the deferred rich-text version

Requirements

Peer dependencies: payload ^3, @payloadcms/ui ^3, react 18/19, zod. Runtime: @anthropic-ai/sdk, openai, sharp. Needs ANTHROPIC_API_KEY and OPENAI_API_KEY (or your own providers — see engine).

Publish gotcha: main/exports point at src/*.ts for in-repo dev; the real dist/*.js paths live in publishConfig. Publish with pnpm publish, not npm publish — npm does not apply publishConfig field-overrides.

Headless (no Payload)

import { translateImage } from '@purposeinplay/payload-image-translate';
const { renders } = await translateImage(sourceImage, targetLocales, engine);

Status & limitations

Validated on real assets across all 10 target locales (including CJK, Cyrillic, and Turkish). Works on localized upload fields. Images embedded inside rich text are not yet supported; that is a deferred, separately architected version (see the roadmap spec). Other known limitations: scene drift on brand-critical elements (use the approval gate) and content-policy refusals on some imagery. costPerRenderUsd is an operator estimate, so baseline it against your real OpenAI bill. See security & cost.
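
As a rough way to reason about the budget meter, the arithmetic below estimates spend for one source image. It is a sketch under assumptions: it treats every render attempt as costing `costPerRenderUsd`, assumes up to 3 attempts per locale, and ignores the Claude extraction, translation, and verification calls. The `estimateCostUsd` helper is hypothetical, not part of the plugin.

```typescript
/**
 * Estimate render spend for one source image.
 * Assumes each attempt is billed at `costPerRenderUsd` (an operator estimate).
 */
function estimateCostUsd(
  locales: number,
  costPerRenderUsd: number,
  attemptsPerLocale = 3, // assumed worst case: every locale uses all retries
): number {
  const raw = locales * attemptsPerLocale * costPerRenderUsd;
  return Math.round(raw * 100) / 100; // round to cents to hide float noise
}

// 10 locales at $0.10 per render:
const bestCase = estimateCostUsd(10, 0.1, 1); // every render passes first time: 1
const worstCase = estimateCostUsd(10, 0.1); // every locale needs 3 attempts: 3
```

The spread between the two figures is one reason to baseline the estimate against a real bill rather than rely on the configured value alone.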
