1.2.0 • Published 2 months ago

kokoro-js v1.2.0

License
Apache-2.0

Kokoro TTS

Kokoro is a frontier text-to-speech (TTS) model for its size: 82 million parameters, text in and audio out. This JavaScript library runs the model 100% locally in the browser using 🤗 Transformers.js. Try it out using our online demo!

Usage

First, install the kokoro-js library from NPM using:

npm i kokoro-js

You can then generate speech as follows:

import { KokoroTTS } from "kokoro-js";

const model_id = "onnx-community/Kokoro-82M-v1.0-ONNX";
const tts = await KokoroTTS.from_pretrained(model_id, {
  dtype: "q8", // Options: "fp32", "fp16", "q8", "q4", "q4f16"
  device: "wasm", // Options: "wasm", "webgpu" (web) or "cpu" (node). If using "webgpu", we recommend using dtype="fp32".
});

const text = "Life is like a box of chocolates. You never know what you're gonna get.";
const audio = await tts.generate(text, {
  // Use `tts.list_voices()` to list all available voices
  voice: "af_heart",
});
audio.save("audio.wav");
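
The `device` and `dtype` comments above suggest different settings per runtime. As a minimal sketch, the hypothetical helper `pickLoadOptions` below (not part of kokoro-js) encodes those hints: "webgpu" with "fp32" when WebGPU is available in the browser, "wasm" with "q8" otherwise, and "cpu" in Node. The "q8" choice for Node is an assumption, not a recommendation from the library.

```javascript
// Hypothetical helper (not part of kokoro-js): pick options for
// KokoroTTS.from_pretrained based on the runtime.
function pickLoadOptions({ isNode, hasWebGPU }) {
  // Node: the README lists "cpu" as the device; "q8" here is an assumption.
  if (isNode) return { device: "cpu", dtype: "q8" };
  // Browser with WebGPU: the README recommends dtype "fp32".
  if (hasWebGPU) return { device: "webgpu", dtype: "fp32" };
  // Browser without WebGPU: fall back to WASM with a quantized model.
  return { device: "wasm", dtype: "q8" };
}

// Simple runtime detection; adjust to your own environment checks.
const env = {
  isNode: typeof process !== "undefined" && Boolean(process.versions?.node),
  hasWebGPU: typeof navigator !== "undefined" && "gpu" in navigator,
};

console.log(pickLoadOptions(env));
```

The returned object can be passed as the second argument to `KokoroTTS.from_pretrained(model_id, options)`.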

Or if you'd prefer to stream the output, you can do that with:

import { KokoroTTS, TextSplitterStream } from "kokoro-js";

const model_id = "onnx-community/Kokoro-82M-v1.0-ONNX";
const tts = await KokoroTTS.from_pretrained(model_id, {
  dtype: "fp32", // Options: "fp32", "fp16", "q8", "q4", "q4f16"
  // device: "webgpu", // Options: "wasm", "webgpu" (web) or "cpu" (node).
});

// First, set up the stream
const splitter = new TextSplitterStream();
const stream = tts.stream(splitter);
(async () => {
  let i = 0;
  for await (const { text, phonemes, audio } of stream) {
    console.log({ text, phonemes });
    audio.save(`audio-${i++}.wav`);
  }
})();

// Next, add text to the stream. Note that the text can be added at different times.
// For this example, let's pretend we're consuming text from an LLM, one word at a time.
const text = "Kokoro is an open-weight TTS model with 82 million parameters. Despite its lightweight architecture, it delivers comparable quality to larger models while being significantly faster and more cost-efficient. With Apache-licensed weights, Kokoro can be deployed anywhere from production environments to personal projects. It can even run 100% locally in your browser, powered by Transformers.js!";
const tokens = text.match(/\s*\S+/g);
for (const token of tokens) {
  splitter.push(token);
  await new Promise((resolve) => setTimeout(resolve, 10));
}

// Finally, close the stream to signal that no more text will be added.
splitter.close();

// Alternatively, if you'd like to keep the stream open, but flush any remaining text, you can use the `flush` method.
// splitter.flush();
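
To make the push/flush/close pattern above more concrete, here is a rough sketch of how incrementally arriving text might be buffered into whole sentences before synthesis. The `SentenceBuffer` class is hypothetical and purely illustrative; the actual `TextSplitterStream` in kokoro-js may split text differently.

```javascript
// Hypothetical illustration only; not the kokoro-js implementation.
class SentenceBuffer {
  constructor() {
    this.pending = ""; // text received so far that is not yet a full sentence
  }

  // Append a chunk and return any sentences that are now complete.
  push(chunk) {
    this.pending += chunk;
    const sentences = [];
    // Treat ., !, or ? followed by whitespace as a sentence boundary.
    const boundary = /[.!?]+\s+/g;
    let start = 0;
    let match;
    while ((match = boundary.exec(this.pending)) !== null) {
      const end = match.index + match[0].length;
      sentences.push(this.pending.slice(start, end).trim());
      start = end;
    }
    this.pending = this.pending.slice(start);
    return sentences;
  }

  // Return whatever text remains, even if it is not a complete sentence.
  flush() {
    const rest = this.pending.trim();
    this.pending = "";
    return rest ? [rest] : [];
  }
}

// Feed text one word at a time, as in the streaming example.
const buffer = new SentenceBuffer();
const words = "Hello there. How are you? Fine".match(/\s*\S+/g);
const sentences = [];
for (const word of words) sentences.push(...buffer.push(word));
sentences.push(...buffer.flush());
console.log(sentences); // [ 'Hello there.', 'How are you?', 'Fine' ]
```

A sentence is only emitted once the text after its punctuation arrives, which is why a final flush (or close) is needed to release the trailing text. This simple rule would also split after abbreviations such as "e.g. ", so treat it as a sketch rather than a robust splitter.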

Voices/Samples

Tip: You can find samples for each of the voices in the model card on Hugging Face.

American English

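| Name | Traits | Target Quality | Training Duration | Overall Grade |
| --- | --- | --- | --- | --- |
| af_heart | 🚺❤️ | | | A |
| af_alloy | 🚺 | B | MM minutes | C |
| af_aoede | 🚺 | B | H hours | C+ |
| af_bella | 🚺🔥 | A | HH hours | A- |
| af_jessica | 🚺 | C | MM minutes | D |
| af_kore | 🚺 | B | H hours | C+ |
| af_nicole | 🚺🎧 | B | HH hours | B- |
| af_nova | 🚺 | B | MM minutes | C |
| af_river | 🚺 | C | MM minutes | D |
| af_sarah | 🚺 | B | H hours | C+ |
| af_sky | 🚺 | B | M minutes 🤏 | C- |
| am_adam | 🚹 | D | H hours | F+ |
| am_echo | 🚹 | C | MM minutes | D |
| am_eric | 🚹 | C | MM minutes | D |
| am_fenrir | 🚹 | B | H hours | C+ |
| am_liam | 🚹 | C | MM minutes | D |
| am_michael | 🚹 | B | H hours | C+ |
| am_onyx | 🚹 | C | MM minutes | D |
| am_puck | 🚹 | B | H hours | C+ |
| am_santa | 🚹 | C | M minutes 🤏 | D- |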

British English

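| Name | Traits | Target Quality | Training Duration | Overall Grade |
| --- | --- | --- | --- | --- |
| bf_alice | 🚺 | C | MM minutes | D |
| bf_emma | 🚺 | B | HH hours | B- |
| bf_isabella | 🚺 | B | MM minutes | C |
| bf_lily | 🚺 | C | MM minutes | D |
| bm_daniel | 🚹 | C | MM minutes | D |
| bm_fable | 🚹 | B | MM minutes | C |
| bm_george | 🚹 | B | MM minutes | C |
| bm_lewis | 🚹 | C | H hours | D+ |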
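
The voice names in these tables appear to follow a pattern: the first letter matches the accent section (a for American English, b for British English) and the second matches the 🚺/🚹 trait (f or m). As a small sketch built on that inferred convention, the hypothetical `describeVoice` helper below (not a kokoro-js API) splits a voice ID into its parts, which can help when filtering voices.

```javascript
// Mapping inferred from the tables above; not an official kokoro-js API.
const ACCENTS = { a: "American English", b: "British English" };
const GENDERS = { f: "female", m: "male" };

// Hypothetical helper: parse an ID such as "af_heart" into its parts.
// Returns null when the ID does not follow the inferred pattern.
function describeVoice(id) {
  const match = /^([a-z])([a-z])_([a-z]+)$/.exec(id);
  if (!match) return null;
  const [, accent, gender, name] = match;
  if (!Object.hasOwn(ACCENTS, accent) || !Object.hasOwn(GENDERS, gender)) {
    return null;
  }
  return { id, name, accent: ACCENTS[accent], gender: GENDERS[gender] };
}

console.log(describeVoice("af_heart"));
console.log(describeVoice("bm_george"));
```

For example, mapping `describeVoice` over a list of voice IDs and filtering on `accent` would give only the British English voices.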