
ai-translate

Translate text documents with Ollama (or other model providers).

Kudos 👏 to https://github.com/wolfreka/ollama-translator

CLI

NOTE: Select a model which is suitable for translations.

Getting started

  1. Install the package globally

    npm i -g @commenthol/ai-translate
  2. If running the translator locally, install Ollama. See Ollama Download.

    Run Ollama as a system service or with ollama serve.

    Pull the default model.

    ollama pull qwen2.5:7b
  3. Translate your first document:

    ai-translate --from English --input input.md --to French --output french.md

    or in short form:

    ai-translate -f en -i input.md -t fr -o french.md

Using a different model

  1. Let's use Mistral AI as an example. (You can also use Anthropic, OpenAI or DeepSeek.)

    # see help
    ai-translate set --help
    
    ai-translate set provider mistral
    ai-translate set model ministral-8b-2410
    ai-translate set apiKey YOUR_API_KEY_HERE
    ai-translate set chunkSize 3000
    
    # check your config with
    ai-translate set

    For a "local" config, use the -c DIR option. This creates a .ai-translate.json file in that folder (a complete per-project command example is shown after this list):

    {
      "provider": "mistral",
      "model": "ministral-8b-2410",
      "apiKey": "YOUR_API_KEY_HERE",
      "chunkSize": 3000
    }
  2. Then translate your document:

    ai-translate -f en -i input.md -t fr -o french.md
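
A complete per-project setup with a local config could then look like this (a sketch that only combines the set command and flags shown above; YOUR_API_KEY_HERE is a placeholder for your real key):

# write .ai-translate.json into the current folder
ai-translate -c . set provider mistral
ai-translate -c . set model ministral-8b-2410
ai-translate -c . set apiKey YOUR_API_KEY_HERE
ai-translate -c . set chunkSize 3000

# translate using the local config
ai-translate -c . -f en -i input.md -t fr -o french.md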

CLI Help

$ ai-translate --help

AI Translator

Usage:
  ai-translate [flags]
  ai-translate [command] [flags]

Commands:
  set                 set config value

Flags:
  -h, --help            Help for ai-translate
  -v, --version         Show version information
  -c, --config DIR      Use config file .ai-translate.json in DIR
  -f, --from LANG       Source language
  -t, --to LANG         Target language; LANG is English language name or
                        supported BCP47 codes (ar, de, en, es, fr, ja, pt, ru,
                        vi, zh-CN, zh-TW)
  -i, --input FILE      Input file
  -o, --output FILE     Output file
      --format FORMAT   specify input format (cpp, go, java, js, php, proto,
                        python, rst, ruby, rust, scala, swift, markdown, latex,
                        html, sol)

Examples:
  Translate input.md from Spanish to output.md in English
    ai-translate -f Spanish -t English -i input.md -o output.md

  Pipe from stdin to stdout using the config in the local folder
    echo "translate" | ai-translate -f en -t en -c .

Use "ai-translate [command] --help" for more information about a command.
$ ai-translate set --help

Set ai-translate configuration

Writes config to `.ai-translate.json`
If --config flag is omitted then global config is used.

Usage:
  ai-translate [flags] set KEY VALUE

Flags:
  -c, --config DIR    Use config file .ai-translate.json in DIR

Available KEYs:
  provider      set provider (ollama, mistral, anthropic, openai, deepseek);
                default="ollama"
  model         set model from provider; default="qwen2.5:7b"
  apiKey        set api key
  baseUrl       baseUrl for model
  temperature   model temperature; default=0.1
  maxRetries    max. number of retries; default=10
  chunkSize     maximum chunk size (characters) used in text-splitter; default=1000
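
For example, tuning the remaining keys could look like this (a sketch that only uses the KEYs listed above; the baseUrl value is merely an illustration of a self-hosted endpoint):

# point the ollama provider at a specific host (example value)
ai-translate set baseUrl http://localhost:11434

# adjust model temperature and the number of retries
ai-translate set temperature 0.2
ai-translate set maxRetries 5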

API

Node streaming API using langchainJS.

import { createReadStream, createWriteStream } from 'node:fs'
import { pipeline } from 'node:stream/promises'
import {
  modelFactory,
  AiTranslateTransform,
  recursiveChunkTextSplitter,
  TextSplitterStream
} from '@commenthol/ai-translate'

const model = modelFactory({
  provider: 'ollama',
  model: 'qwen2.5:7b',
  temperature: 0.1,
  maxRetries: 10
})

// create fs stream reader and writer
const reader = createReadStream(new URL('./input.md', import.meta.url))
const writer = createWriteStream(new URL('./output.md', import.meta.url))

// define the input format - alternatively use `getFormatByExtension('.md')`
// to get format from input file extension
const format = 'markdown'
// define a textSplitter - you can use any langchainJS compatible TextSplitter 
const textSplitter = recursiveChunkTextSplitter({ format, chunkSize: 2000 })

// instantiate the splitter and translate streams
const splitter = new TextSplitterStream({ textSplitter })
const translator = new AiTranslateTransform({
  model,
  format,
  sourceLanguage: 'en',
  targetLanguage: 'es'
})

// run...
await pipeline(reader, splitter, translator, writer)
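
The same pipeline also works from stdin to stdout, e.g. when piping between processes. A minimal sketch, assuming getFormatByExtension (mentioned in the comment above) is exported by the package and maps a file extension to a format name:

import { pipeline } from 'node:stream/promises'
import {
  modelFactory,
  AiTranslateTransform,
  recursiveChunkTextSplitter,
  TextSplitterStream,
  getFormatByExtension
} from '@commenthol/ai-translate'

const model = modelFactory({ provider: 'ollama', model: 'qwen2.5:7b' })

// assumption: getFormatByExtension maps a file extension to a format name,
// as mentioned in the comment of the example above
const format = getFormatByExtension('.md')
const textSplitter = recursiveChunkTextSplitter({ format, chunkSize: 2000 })

// stdin -> splitter -> translator -> stdout
await pipeline(
  process.stdin,
  new TextSplitterStream({ textSplitter }),
  new AiTranslateTransform({
    model,
    format,
    sourceLanguage: 'en',
    targetLanguage: 'es'
  }),
  process.stdout
)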

License

MIT licensed