0.0.1 • Published 10 months ago
@kajidog/voicevox-client
VOICEVOX client library for text-to-speech synthesis with advanced queue management and cross-platform support.
Features
- Cross-platform Support - Works in both Node.js and browser environments
- Advanced Queue Management - Efficient processing of multiple synthesis requests
- Smart Prefetching - Pre-generates upcoming audio for smooth playback
- Automatic Text Segmentation - Handles long texts by splitting them into manageable segments
- Speaker Management - Supports per-segment speaker assignment
- File Generation - Generate audio files with automatic downloads in browsers
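Segmentation and prefetching can be pictured with a small standalone sketch. It does not show the library's internals: the function names `splitIntoSegments` and `playWithPrefetch` are hypothetical, the splitting rule (break after sentence-ending punctuation) is an assumption, and `synthesize` and `play` stand in for whatever actually produces and plays audio.

```typescript
// Hypothetical helpers for illustration; neither is exported by the library.

// Split text into sentence-like segments, keeping the closing punctuation.
function splitIntoSegments(text: string): string[] {
  const parts: string[] = text.match(/[^。！？!?]+[。！？!?]*/g) || [];
  return parts.map((s) => s.trim()).filter((s) => s.length > 0);
}

// Play segments in order while the next one is synthesized in the background.
async function playWithPrefetch<T>(
  segments: string[],
  synthesize: (segment: string) => Promise<T>,
  play: (audio: T) => Promise<void>,
): Promise<void> {
  if (segments.length === 0) return;
  let next: Promise<T> = synthesize(segments[0]);
  for (let i = 0; i < segments.length; i++) {
    const current = await next;
    if (i + 1 < segments.length) {
      // Start synthesizing the following segment before playback begins.
      next = synthesize(segments[i + 1]);
    }
    await play(current);
  }
}
```

With this shape, synthesis of segment N+1 overlaps playback of segment N, which is the effect the prefetching feature describes.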
Installation
npm install @kajidog/voicevox-client
Requirements
- Node.js 18.0.0 or higher (for Node.js environments)
- Modern browser with Web Audio API support (for browser environments)
- A running VOICEVOX Engine or compatible engine
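To confirm the engine is reachable before creating a client, one option is to query the engine's `/version` endpoint directly. This is a sketch that bypasses the library: `getEngineVersion` and `FetchLike` are hypothetical names, and it assumes the engine answers `GET /version` with a JSON string, as the VOICEVOX Engine does.

```typescript
// Minimal shape of the fetch API used here, so the sketch type-checks without DOM typings.
type FetchLike = (url: string) => Promise<{ ok: boolean; json(): Promise<unknown> }>;

// Resolves to the engine's version string, or null if the engine cannot be reached.
async function getEngineVersion(baseUrl: string, fetchImpl: FetchLike): Promise<string | null> {
  try {
    // Drop trailing slashes so the path is not doubled up.
    const res = await fetchImpl(`${baseUrl.replace(/\/+$/, "")}/version`);
    if (!res.ok) return null;
    const body = await res.json();
    return typeof body === "string" ? body : null;
  } catch {
    return null; // connection refused, DNS failure, invalid JSON, and so on
  }
}
```

In Node.js 18+ or a browser, the global `fetch` can be passed as `fetchImpl`, for example `await getEngineVersion('http://localhost:50021', fetch)`.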
Quick Start
Node.js Environment
import { VoicevoxClient } from '@kajidog/voicevox-client';
// Initialize client
const client = new VoicevoxClient({
  url: 'http://localhost:50021', // VOICEVOX engine URL
  defaultSpeaker: 1, // Default speaker ID (optional)
  defaultSpeedScale: 1.0, // Default speed (optional)
});
// Simple text-to-speech
await client.speak('こんにちは');
// Multiple texts
await client.speak(['こんにちは', '今日はいい天気ですね']);
// Per-segment speaker control
await client.speak([
  { text: 'こんにちは', speaker: 1 },
  { text: 'お元気ですか?', speaker: 3 },
]);
// Generate audio file
const filePath = await client.generateAudioFile('こんにちは', './output.wav');
Browser Environment
import { VoicevoxClient } from '@kajidog/voicevox-client';
const client = new VoicevoxClient({
  url: 'http://localhost:50021',
  defaultSpeaker: 1,
});
// Play audio in browser
await client.speak('こんにちは');
// Generate and download audio file
const filename = await client.generateAudioFile('こんにちは', 'voice.wav');
API Reference
VoicevoxClient
Constructor
new VoicevoxClient(config: VoicevoxConfig)
Parameters:
- config.url: VOICEVOX engine URL
- config.defaultSpeaker (optional): Default speaker ID
- config.defaultSpeedScale (optional): Default playback speed
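The optional defaults presumably act as fallbacks when a call or an individual segment does not name a speaker. The exact precedence is not stated here, so the sketch below assumes one plausible order: per-segment speaker first, then the speaker passed to the call, then the configured default. `resolveSegments` and `ResolvedSegment` are hypothetical names, not part of the library.

```typescript
interface SpeechSegment {
  text: string;
  speaker?: number;
}

interface ResolvedSegment {
  text: string;
  speaker: number;
}

// Normalize the three accepted input shapes into segments with a concrete speaker.
// Assumed precedence: segment.speaker, then callSpeaker, then defaultSpeaker.
function resolveSegments(
  input: string | string[] | SpeechSegment[],
  callSpeaker: number | undefined,
  defaultSpeaker: number,
): ResolvedSegment[] {
  const items: Array<string | SpeechSegment> = Array.isArray(input) ? input : [input];
  return items.map((item) => {
    const segment: SpeechSegment = typeof item === "string" ? { text: item } : item;
    return {
      text: segment.text,
      speaker: segment.speaker ?? callSpeaker ?? defaultSpeaker,
    };
  });
}
```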
Methods
speak()
Convert text to speech and play it.
speak(text: string | string[] | SpeechSegment[], speaker?: number, speedScale?: number): Promise<string>
generateQuery()
Generate a voice synthesis query.
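For context, the VOICEVOX Engine creates such a query in response to a `POST` to `/audio_query` with `text` and `speaker` as URL parameters, and the returned object carries a `speedScale` field. Whether this method wraps exactly that request is an assumption; the helpers below (`audioQueryUrl`, `withSpeedScale`, `AudioQueryLike`) are hypothetical and only illustrate the engine-side shape.

```typescript
// Subset of the engine's AudioQuery fields relevant here; the real object has more.
interface AudioQueryLike {
  speedScale: number;
  [key: string]: unknown;
}

// Build the engine URL for creating a query (the request itself is a POST).
function audioQueryUrl(baseUrl: string, text: string, speaker: number): string {
  return `${baseUrl}/audio_query?text=${encodeURIComponent(text)}&speaker=${speaker}`;
}

// Return a copy of the query with a different playback speed, leaving the input untouched.
function withSpeedScale<Q extends AudioQueryLike>(query: Q, speedScale: number): Q {
  return { ...query, speedScale };
}
```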
generateQuery(text: string, speaker?: number, speedScale?: number): Promise<AudioQuery>
generateAudioFile()
Generate an audio file and return its path. In browsers, the file is downloaded instead.
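Because the first argument may be either text or a previously generated query, a query can be built once and reused. The sketch below relies only on the documented signatures of `generateQuery()` and `generateAudioFile()`; `QueryClient` and `renderCopies` are hypothetical names, and the query type is left generic because its fields are not described here.

```typescript
// Structural type covering only the two documented methods used below.
interface QueryClient<Q> {
  generateQuery(text: string, speaker?: number, speedScale?: number): Promise<Q>;
  generateAudioFile(
    textOrQuery: string | Q,
    outputPath?: string,
    speaker?: number,
    speedScale?: number,
  ): Promise<string>;
}

// Generate the query once, then write it to each output path in turn.
async function renderCopies<Q>(
  client: QueryClient<Q>,
  text: string,
  outputPaths: string[],
  speaker?: number,
): Promise<string[]> {
  const query = await client.generateQuery(text, speaker);
  const written: string[] = [];
  for (const path of outputPaths) {
    written.push(await client.generateAudioFile(query, path));
  }
  return written;
}
```

A `VoicevoxClient` instance should satisfy `QueryClient<AudioQuery>` if its methods match the signatures listed in this reference.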
generateAudioFile(textOrQuery: string | AudioQuery, outputPath?: string, speaker?: number, speedScale?: number): Promise<string>
getSpeakers()
Get available speakers list.
getSpeakers(): Promise<Speaker[]>
getSpeakerInfo()
Get detailed speaker information.
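This method takes a UUID, whereas speak() takes a numeric speaker ID, so both usually have to be looked up from the speaker list first. The `Speaker` type is not documented here; the sketch assumes entries mirror the VOICEVOX Engine's `/speakers` response (`name`, `speaker_uuid`, and `styles` with `id` and `name`). `SpeakerLike`, `findStyleId`, and `findSpeakerUuid` are hypothetical names.

```typescript
// Assumed shape, modeled on the engine's /speakers response.
interface SpeakerStyleLike {
  id: number;
  name: string;
}

interface SpeakerLike {
  name: string;
  speaker_uuid: string;
  styles: SpeakerStyleLike[];
}

// Numeric ID to pass as `speaker`, or undefined if the speaker or style is unknown.
function findStyleId(
  speakers: SpeakerLike[],
  speakerName: string,
  styleName: string,
): number | undefined {
  const speaker = speakers.find((s) => s.name === speakerName);
  return speaker?.styles.find((style) => style.name === styleName)?.id;
}

// UUID to pass to getSpeakerInfo(), or undefined if the speaker is unknown.
function findSpeakerUuid(speakers: SpeakerLike[], speakerName: string): string | undefined {
  return speakers.find((s) => s.name === speakerName)?.speaker_uuid;
}
```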
getSpeakerInfo(uuid: string): Promise<SpeakerInfo>
clearQueue()
Clear the current audio queue.
clearQueue(): Promise<void>
Types
SpeechSegment
interface SpeechSegment {
  text: string;
  speaker?: number;
}
VoicevoxConfig
interface VoicevoxConfig {
  url: string;
  defaultSpeaker?: number;
  defaultSpeedScale?: number;
}
License
ISC