1.0.9 • Published 4 weeks ago • CLI

@optima-chat/video-translate-tools

Licence

MIT

Size

789 kB

@optima-chat/video-translate-tools

Path-B styled subtitle rendering + BGM ducking for HeyGen-translated videos. Pairs with the gen video-translate CLI from @optima-chat/optima-gen-cli.

gen video-translate --video-url $URL --lang $LANG -o ./out
curl <caption_url> -o ./out/caption.srt
video-translate render-ass --srt ./out/caption.srt --lang $TAG --out ./out/subs.ass
video-translate mux        --raw ./out/translate_*.mp4 --ass ./out/subs.ass [--bgm ./bgm.wav] -o ./final.mp4

Install (in optima-ai-shell Dockerfile)

RUN apt-get install -y fonts-noto-core && \
    npm install -g @optima-chat/video-translate-tools@latest && \
    cp -r $(video-translate fonts-dir)/. /usr/share/fonts/video-translate/ && \
    fc-cache -f

CLI Reference

`render-ass`

Parse an SRT file and emit path-B styled ASS subtitles.

| Flag | Required | Description |
| --- | --- | --- |
| `--srt <path>` |  | HeyGen SRT file (BOM auto-stripped) |
| `--lang <en\|ms\|vi\|th>` |  | Target language. Determines the font and the max chars per line |
| `--out <path>` |  | Output ASS path |
| `--translations <path>` |  | Save / read the intermediate JSON. If the file exists it is used as-is; you can hand-edit it to add `word` for keyword (KW) highlighting |
| `--style <name>` |  | Subtitle style preset (default `classic`). See Styles. An unknown value prints a warning and falls back to `classic` |
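To make the `--srt` input handling concrete, here is a minimal sketch of BOM stripping and SRT cue parsing. It is not the package's parser: the `Cue` shape and the `stripBom` / `parseSrt` names are hypothetical, and the real `--translations` JSON format is not documented in this README.

```typescript
// Hypothetical cue shape for illustration only; not the package's JSON schema.
interface Cue {
  index: number;
  start: string; // "HH:MM:SS,mmm" exactly as written in the SRT
  end: string;
  text: string;
}

// Remove a leading U+FEFF byte-order mark if one is present.
function stripBom(input: string): string {
  return input.charCodeAt(0) === 0xfeff ? input.slice(1) : input;
}

// Parse SRT text into cues. Blocks that do not look like a cue are skipped.
function parseSrt(input: string): Cue[] {
  const normalized = stripBom(input).replace(/\r\n?/g, "\n").trim();
  if (normalized === "") return [];

  const timingPattern =
    /^(\d{2}:\d{2}:\d{2},\d{3})\s*-->\s*(\d{2}:\d{2}:\d{2},\d{3})/;
  const cues: Cue[] = [];

  for (const block of normalized.split(/\n{2,}/)) {
    const [indexLine = "", timingLine = "", ...textLines] = block.split("\n");
    const index = Number.parseInt(indexLine, 10);
    const timing = timingPattern.exec(timingLine);
    if (Number.isNaN(index) || timing === null) continue;
    const [, start = "", end = ""] = timing;
    cues.push({ index, start, end, text: textLines.join("\n") });
  }
  return cues;
}
```

Multi-line cue text is kept joined with newlines, so any per-language line splitting can be applied as a later step.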

`mux`

Burn the ASS subtitles onto the HeyGen mp4, with optional BGM ducking.

| Flag | Required | Description |
| --- | --- | --- |
| `--raw <path>` |  | HeyGen-output mp4 (has lip-synced voice) |
| `--ass <path>` |  | ASS subtitle file |
| `--out <path>` |  | Final mp4 |
| `--bgm <path>` |  | If set, mix BGM under voice via asplit sidechain ducking |
| `--fonts-dir <path>` |  | Override fonts dir (defaults to bundled `packages/video-translate-tools/fonts/`) |
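As an illustration of what "asplit sidechain ducking" can look like in ffmpeg terms, the sketch below builds an argument list using the `ass`, `asplit`, `sidechaincompress`, and `amix` filters. It is an assumption about the general technique, not the package's actual filter graph: the `MuxPlan` type, the function names, and the threshold/ratio/attack/release values are all illustrative.

```typescript
// Hypothetical input shape mirroring the mux flags above.
interface MuxPlan {
  raw: string;
  ass: string;
  out: string;
  bgm?: string;
  fontsDir?: string;
}

// Characters that would need filtergraph escaping; this sketch rejects them
// rather than attempting ffmpeg's multi-level escaping rules.
const UNSAFE_IN_FILTER = /[\\:',;\[\]]/;

function filterSafe(path: string): string {
  if (UNSAFE_IN_FILTER.test(path)) {
    throw new Error(`path needs filtergraph escaping: ${path}`);
  }
  return path;
}

function buildFfmpegArgs(plan: MuxPlan): string[] {
  // Burn subtitles with the libass-backed `ass` filter.
  let subs = `ass=filename=${filterSafe(plan.ass)}`;
  if (plan.fontsDir !== undefined) {
    subs += `:fontsdir=${filterSafe(plan.fontsDir)}`;
  }

  // No BGM: re-encode video with subtitles, copy the voice track untouched.
  if (plan.bgm === undefined) {
    return ["-i", plan.raw, "-vf", subs, "-c:a", "copy", plan.out];
  }

  const graph = [
    `[0:v]${subs}[v]`,
    // Split the voice: one copy is heard, one drives the compressor.
    "[0:a]asplit=2[voice][key]",
    // Compress the BGM whenever the voice (sidechain) is loud.
    // Parameter values are illustrative, not the package's settings.
    "[1:a][key]sidechaincompress=threshold=0.05:ratio=8:attack=20:release=300[ducked]",
    // Mix voice and ducked BGM; stop when the voice track ends.
    "[voice][ducked]amix=inputs=2:duration=first[a]",
  ].join(";");

  return [
    "-i", plan.raw,
    "-i", plan.bgm,
    "-filter_complex", graph,
    "-map", "[v]",
    "-map", "[a]",
    plan.out,
  ];
}
```

The returned array is meant to be passed as the argument list of an `ffmpeg` process, for example via `spawn("ffmpeg", args)` from `node:child_process`, which avoids shell quoting of the filter graph.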

`fonts-dir`

Print the absolute path of the bundled fonts directory. The Dockerfile uses it to register the fonts.

Languages

| Tag | Font | Max chars/line | Source |
| --- | --- | --- | --- |
| `en` | Bangers | 32 | npm bundle (this pkg) |
| `ms` | Bangers | 32 | npm bundle (this pkg) |
| `vi` | Noto Sans | 30 | `apt-get install fonts-noto-core` |
| `th` | Sarabun | 42 | npm bundle (this pkg) |

Thai gets a higher limit of 42 chars/line because Thai script has no spaces between words, so smartSplit would otherwise cut mid-word.
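The effect of the per-language limit can be sketched with a simple greedy wrapper. This is not the package's smartSplit, whose algorithm is not described here; `splitLine`, `LangTag`, and `MAX_CHARS` are hypothetical names, and the sketch counts UTF-16 code units rather than rendered glyphs.

```typescript
type LangTag = "en" | "ms" | "vi" | "th";

// Limits copied from the Languages table above.
const MAX_CHARS: Record<LangTag, number> = { en: 32, ms: 32, vi: 30, th: 42 };

// Greedy wrap at spaces; a run longer than the limit is hard-cut.
function splitLine(text: string, lang: LangTag): string[] {
  const max = MAX_CHARS[lang];
  const lines: string[] = [];
  let current = "";

  for (const word of text.trim().split(/\s+/)) {
    if (word === "") continue;

    // The word fits on the current line (plus one separating space).
    if (current !== "" && current.length + 1 + word.length <= max) {
      current += " " + word;
      continue;
    }
    if (current !== "") lines.push(current);

    // A single run with no spaces (typical for Thai) can only be hard-cut,
    // which is where a mid-word break can happen.
    let rest = word;
    while (rest.length > max) {
      lines.push(rest.slice(0, max));
      rest = rest.slice(max);
    }
    current = rest;
  }

  if (current !== "") lines.push(current);
  return lines;
}
```

With space-delimited text the wrapper always breaks between words. With unspaced Thai text every break is a hard cut, so a larger limit means fewer cuts and fewer chances to land inside a word.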

Styles

--style <name> picks a preset. The default, classic, is the original look, so callers that don't pass --style see no change. A style only changes colour / outline / shadow / keyword treatment and the Latin (en/ms) font; the per-language font fallback above always applies, so th/vi never render as tofu (missing-glyph boxes).

| Name | en/ms font | vi font | Look |
| --- | --- | --- | --- |
| `classic` (default) | Bangers | Noto Sans | White + black outline, pink-outline keyword |
| `pop-soft` | Bangers | Noto Sans | classic + soft drop shadow (depth) |
| `pop-3d` | Bangers | Noto Sans | Magenta hard 3D offset shadow + yellow keyword |
| `pop-hl` | Bangers | Noto Sans | classic + bright-yellow filled keyword |
| `anton` | Anton | Anton | Tall condensed bold + soft shadow + yellow keyword |
| `luckyguy` | Luckiest Guy | Noto Sans | Rounded comic + magenta 3D shadow + yellow keyword |

Bundled display fonts (fonts/): Bangers, Anton, Luckiest Guy. Anton is the only display font that covers Vietnamese diacritics cleanly; the all-caps fonts (Bangers, Luckiest Guy) render vi as ugly mixed-case, so vi falls back to Noto Sans for those styles.
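The style and language tables combine into a single lookup, sketched below. The font names come from the tables above; `resolveFont`, `STYLE_FONTS`, and the returned shape are hypothetical, and using Sarabun for `th` under every style is an assumption drawn from the Languages table, since the styles table has no Thai column.

```typescript
type Lang = "en" | "ms" | "vi" | "th";

interface StyleFonts {
  latin: string; // font used for en/ms
  vi: string;    // font used for vi
}

const CLASSIC: StyleFonts = { latin: "Bangers", vi: "Noto Sans" };

// Font columns of the styles table.
const STYLE_FONTS = new Map<string, StyleFonts>([
  ["classic", CLASSIC],
  ["pop-soft", { latin: "Bangers", vi: "Noto Sans" }],
  ["pop-3d", { latin: "Bangers", vi: "Noto Sans" }],
  ["pop-hl", { latin: "Bangers", vi: "Noto Sans" }],
  ["anton", { latin: "Anton", vi: "Anton" }],
  ["luckyguy", { latin: "Luckiest Guy", vi: "Noto Sans" }],
]);

interface ResolvedFont {
  style: string;     // the style actually applied
  font: string;
  fellBack: boolean; // true when the requested style was unknown
}

function resolveFont(requestedStyle: string, lang: Lang): ResolvedFont {
  const preset = STYLE_FONTS.get(requestedStyle);
  const fonts = preset ?? CLASSIC; // unknown style falls back to classic

  // Assumption: Thai always uses Sarabun, whatever the style.
  const font =
    lang === "th" ? "Sarabun" : lang === "vi" ? fonts.vi : fonts.latin;

  return {
    style: preset === undefined ? "classic" : requestedStyle,
    font,
    fellBack: preset === undefined,
  };
}
```

A caller could use `fellBack` to decide whether to print the warning that the CLI reference describes for unknown style names.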

System deps

- Node ≥ 20
- ffmpeg with libass (`apt-get install ffmpeg` on Ubuntu)
- For Vietnamese subs: `fonts-noto-core` apt pkg (standalone Noto Sans family)
- For Thai subs: bundled in this npm package (Sarabun-Regular.ttf)

Note: fonts-noto-cjk does NOT provide standalone Noto Sans — it ships only Noto Sans CJK * families. Use fonts-noto-core.

Scope

v1 ships subtitle rendering + BGM-ducked muxing only. Voice override (custom voice selection from HeyGen library) is deferred to v2 — default flow uses HeyGen auto-clone of original speaker. See SPEC v3.1 §Scope.

License

MIT