@purposeinplay/payload-image-translate
Localize text-bearing image assets (banners, promotions, tournament art) for
Payload CMS 3 — regenerate an image per locale with the baked-in text translated and
everything else preserved. The visual analog of
payload-ai-translate.
Generation never touches the live field directly: every render becomes a review candidate that an editor approves before it goes live.
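To make the approval gate concrete, here is a minimal sketch of the idea. The `Candidate` shape, the status names, and `applyApproved` are hypothetical illustrations, not the plugin's actual types or API; the only point is that a render reaches the localized value after an explicit approval.

```typescript
// Hypothetical candidate shape and status names, for illustration only.
type CandidateStatus = 'pending' | 'approved' | 'rejected';

interface Candidate {
  locale: string;
  mediaId: string;
  status: CandidateStatus;
}

/** Return the per-locale field values with only approved candidates applied. */
function applyApproved(
  live: Record<string, string>,
  candidates: Candidate[],
): Record<string, string> {
  const next = { ...live }; // never mutate the live values in place
  for (const c of candidates) {
    if (c.status === 'approved') next[c.locale] = c.mediaId;
  }
  return next;
}

const live = { en: 'media-1' };
const result = applyApproved(live, [
  { locale: 'de', mediaId: 'media-2', status: 'approved' },
  { locale: 'fr', mediaId: 'media-3', status: 'pending' },
]);
console.log(result); // { en: 'media-1', de: 'media-2' }
```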
How it works
A generative image-edit engine with an OCR safety net:
source media (en)
 ├─ extract    Claude vision reads the text runs + per-run style + bounding box
 ├─ translate  exact target strings (built-in Claude translator, or reuse ai-translate)
 ├─ frame      REGION: regenerate only a window around the text, composite back
 │             (text position locked; everything else stays byte-identical),
 │             or full-frame pad/crop-back when no usable text box
 ├─ render     gpt-image-1.5 images.edit (input_fidelity: high), exact-string prompt
 ├─ verify     Claude vision checks spelling + edge-clipping → retry (≤3)
 └─ candidate  staged for editor review; on approve, written to the localized field
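The REGION frame step above can be pictured as a window computation: pad the extracted text box, then clamp it to the image bounds. The sketch below is an illustration under assumptions; `Box`, `regionWindow`, and the 25% padding are hypothetical, and the plugin's real windowing logic may differ.

```typescript
// Hypothetical types and names for illustration; not the plugin's actual API.
interface Box {
  x: number;
  y: number;
  width: number;
  height: number;
}

/**
 * Expand a text bounding box by a padding fraction and clamp it to the image.
 * The result is the window that would be regenerated and composited back.
 */
function regionWindow(
  text: Box,
  imageWidth: number,
  imageHeight: number,
  padRatio = 0.25, // assumed padding, chosen only for this example
): Box {
  const padX = Math.round(text.width * padRatio);
  const padY = Math.round(text.height * padRatio);
  const left = Math.max(0, text.x - padX);
  const top = Math.max(0, text.y - padY);
  const right = Math.min(imageWidth, text.x + text.width + padX);
  const bottom = Math.min(imageHeight, text.y + text.height + padY);
  return { x: left, y: top, width: right - left, height: bottom - top };
}

// A 4:1 banner (2000x500) with a headline near the left edge.
const win = regionWindow({ x: 40, y: 180, width: 600, height: 120 }, 2000, 500);
console.log(win); // { x: 0, y: 150, width: 790, height: 180 }
```

With sharp, a window like this would map naturally onto `extract({ left, top, width, height })` for the crop and `composite([{ input, left, top }])` for pasting the edited region back, leaving pixels outside the window untouched.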
The model is never asked to translate — it renders the exact pre-translated string, and the result is OCR-verified, so the misspellings that generative image models routinely produce are caught and retried. Ultra-wide banners (4:1+) go through the region-edit path, which anchors the text at its original position and never regenerates the rest of the artwork.
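The render, verify, retry cycle described above could look roughly like the following. This is a sketch, not the plugin's implementation: the `Render` and `Verify` signatures are hypothetical, the stub providers stand in for the real image and vision models, and "≤3" is read here as at most three attempts in total.

```typescript
// Hypothetical shapes for illustration; the plugin's real provider interfaces may differ.
interface VerifyResult {
  ok: boolean;
  reason?: string;
}
type Render = (text: string, attempt: number) => Promise<Uint8Array>;
type Verify = (image: Uint8Array, expected: string) => Promise<VerifyResult>;

/** Render the exact string, verify it, and try again up to maxAttempts times. */
async function renderWithVerify(
  text: string,
  render: Render,
  verify: Verify,
  maxAttempts = 3, // assumption: "≤3" means three attempts in total
): Promise<{ image: Uint8Array; attempts: number } | null> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const image = await render(text, attempt);
    const result = await verify(image, text);
    if (result.ok) return { image, attempts: attempt };
  }
  return null; // every attempt failed verification; the caller decides what to do
}

// Stub providers: the fake renderer drops the last character on attempt 1 only.
const fakeRender: Render = async (text, attempt) =>
  new TextEncoder().encode(attempt === 1 ? text.slice(0, -1) : text);
const fakeVerify: Verify = async (image, expected) => {
  const read = new TextDecoder().decode(image);
  return read === expected ? { ok: true } : { ok: false, reason: `read "${read}"` };
};

renderWithVerify('JETZT SPIELEN', fakeRender, fakeVerify).then((r) => {
  console.log(r?.attempts); // 2
});
```

Passing the exact pre-translated string to both the renderer and the verifier is what lets the check be a plain string comparison rather than a judgment about translation quality.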
Quick start
import { buildConfig } from 'payload';
import { imageTranslatePlugin } from '@purposeinplay/payload-image-translate';
import {
  createAnthropicVisionProviders,
  createOpenAIImageProvider,
} from '@purposeinplay/payload-image-translate/providers';

const vision = createAnthropicVisionProviders({ apiKey: process.env.ANTHROPIC_API_KEY! });

export default buildConfig({
  plugins: [
    imageTranslatePlugin({
      collections: {
        'promotions-v2': { fields: ['featured_image'] }, // must be `localized: true`
        'banners-v2': { fields: ['image'] },
      },
      sourceLocale: 'en',
      targetLocales: ['de', 'es', 'fr', 'it', 'pt', 'ru', 'tr', 'ja', 'ko', 'zh'],
      mediaCollectionSlug: 'media',
      costPerRenderUsd: 0.1, // budget meter; baseline against your real bill
      engine: {
        extractor: vision.extractor,
        translator: vision.translator,
        verifier: vision.verifier,
        imageEditor: createOpenAIImageProvider({ apiKey: process.env.OPENAI_API_KEY! }),
      },
    }),
  ],
});
Then, in the consumer: make each target upload field localized: true, run
payload generate:importmap (registers the editor UI), and create + run a DB
migration (the plugin adds a reviews collection). Full walkthrough in
docs/getting-started.md.
Editors localize from a "Localize image" button next to the image field: pick languages → watch a progress board → review and approve per locale.
There is also a freestanding Image Translation Studio admin view for ad-hoc assets that don't live on a document: upload or pick any Media image, generate per-locale variants, compare against the source with a wipe slider, download, and optionally save to the Media library (nothing is persisted until you save). See docs/studio.md.
Documentation
| Doc | What's in it |
|---|---|
| Getting started | Install, env, providers, the consumer setup steps |
| Configuration | Complete ImageTranslatePluginConfig reference |
| Editor guide | How editors use the button → progress → review/approve flow |
| Studio | The freestanding Image Translation Studio view (upload → generate → compare → save) |
| Engine | Pipeline internals, region-edit vs full-frame, providers, OCR verify-retry, headless usage |
| API endpoints | The REST endpoints under /api/image-translate/* |
| Architecture | Review-candidate state machine + document lifecycle |
| Security & cost | SSRF, access/IDOR, input caps, cost guards |
| Roadmap: rich-text image localization | Spec for the deferred rich-text version |
Requirements
Peer dependencies: payload ^3, @payloadcms/ui ^3, react 18/19, zod.
Runtime: @anthropic-ai/sdk, openai, sharp. Needs ANTHROPIC_API_KEY and
OPENAI_API_KEY (or your own providers — see engine).
Publish gotcha:
main/exports point at src/*.ts for in-repo dev; the real dist/*.js paths live in publishConfig. Publish with pnpm publish, not npm publish: npm does not apply publishConfig field overrides.
Headless (no Payload)
import { translateImage } from '@purposeinplay/payload-image-translate';
const { renders } = await translateImage(sourceImage, targetLocales, engine);
Status & limitations
Validated on real assets across all 10 locales (incl. CJK/Cyrillic/Turkish). Works
on localized upload fields. Images embedded inside rich text are not yet
supported; that is a deferred, separately architected version (see the
roadmap spec). Other known limitations: scene drift on brand-critical elements
(use the approval gate) and content-policy refusals on some imagery. Note that
costPerRenderUsd is an operator estimate, so baseline it against your real
OpenAI bill. See security & cost.
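As a rough aid for that baselining, the arithmetic below estimates a per-image spend range from the Quick start values. It is only a sketch: `worstCaseCostUsd` is a hypothetical helper, and it assumes each retry is billed and metered as a separate render, which the plugin docs do not confirm here.

```typescript
// Illustrative budget arithmetic only. Assumption: every verify-retry attempt
// counts as one render at costPerRenderUsd.
function worstCaseCostUsd(
  localeCount: number,
  costPerRenderUsd: number,
  maxAttempts = 3,
): number {
  return localeCount * maxAttempts * costPerRenderUsd;
}

// 10 target locales at costPerRenderUsd = 0.1:
console.log(worstCaseCostUsd(10, 0.1, 1).toFixed(2)); // "1.00" if every render passes first time
console.log(worstCaseCostUsd(10, 0.1).toFixed(2)); // "3.00" if every locale needs all 3 attempts
```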