Voxy SDK Integration

Overview

This project is a real-time transcription system built with Socket.IO and plain JavaScript. It captures speech from a microphone, transcribes it in real time, and exports the transcript as a text file.

Features

Real-time speech-to-text transcription
Recording mode detection
Export transcription as a .txt file

Prerequisites

A modern web browser (Chrome, Firefox, Edge, or Safari)
A working microphone
A valid API_URL for WebSocket-based speech recognition
A valid email and password for authentication
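
Before initializing the SDK, it can help to confirm that the browser exposes microphone capture at all. The sketch below uses the standard `navigator.mediaDevices.getUserMedia` API, which browsers only expose in secure contexts (HTTPS or localhost). Whether mtl-voxy uses this API internally is an assumption, and `supportsMicrophoneCapture` is a hypothetical helper, not part of the SDK.

```javascript
// Hypothetical helper, not part of mtl-voxy.
// Returns true when the given navigator object exposes getUserMedia.
function supportsMicrophoneCapture(nav = globalThis.navigator) {
  return Boolean(
    nav &&
    nav.mediaDevices &&
    typeof nav.mediaDevices.getUserMedia === "function"
  );
}

if (!supportsMicrophoneCapture()) {
  console.warn("Microphone capture is not available in this environment.");
}
```

On a page served over plain HTTP, `navigator.mediaDevices` is typically undefined, so this check fails even when a microphone is connected.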

Installation

```
npm install mtl-voxy
```

Explanation

Import Required Module:
```
import useVoxy from 'mtl-voxy';
```
Initializing the SDK Connection:
```
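// Top-level `await` works in ES modules; otherwise wrap this call in an async function.
const voxyInstance = await useVoxy({
    apiUrl: '<API_URL>',       // WebSocket server URL
    email: '<EMAIL>',          // account email used for authentication
    password: '<PASSWORD>',    // account password used for authentication
    sampleRate: 16000,         // audio samples per second
    chunkDuration: 0.1,        // seconds of audio per chunk
    decibelThreshold: -30      // minimum level treated as speech
});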
```
- sampleRate: The number of audio samples captured per second (Hz).
- chunkDuration: The duration, in seconds, of each audio segment sent to the server.
- decibelThreshold: The minimum audio level, in decibels, that is treated as speech.
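
To make these three settings concrete, the sketch below works out how many samples one chunk holds at the default values and shows one common way to measure a chunk's level in decibels relative to full scale. How the SDK measures loudness internally is not documented here, so `rmsDecibels` and `isSpeech` are hypothetical helpers for illustration only and are not part of mtl-voxy.

```javascript
// Illustration only: these helpers are NOT part of mtl-voxy.
const sampleRate = 16000;      // samples per second
const chunkDuration = 0.1;     // seconds per chunk
const decibelThreshold = -30;  // minimum level treated as speech

// Number of samples in one chunk: 16000 * 0.1 = 1600.
const samplesPerChunk = Math.round(sampleRate * chunkDuration);

// Root-mean-square level of a chunk in decibels, assuming samples in [-1, 1].
function rmsDecibels(samples) {
  if (samples.length === 0) return -Infinity;
  let sumOfSquares = 0;
  for (const s of samples) sumOfSquares += s * s;
  const rms = Math.sqrt(sumOfSquares / samples.length);
  return rms === 0 ? -Infinity : 20 * Math.log10(rms);
}

// Assumed rule: a chunk counts as speech when its level reaches the threshold.
function isSpeech(samples, threshold = decibelThreshold) {
  return rmsDecibels(samples) >= threshold;
}

const quiet = new Float32Array(samplesPerChunk).fill(0.001); // about -60 dB
const loud = new Float32Array(samplesPerChunk).fill(0.5);    // about -6 dB
console.log(samplesPerChunk, isSpeech(quiet), isSpeech(loud)); // 1600 false true
```

Under this assumed rule, raising `decibelThreshold` toward 0 makes detection stricter, while lowering it lets quieter audio through.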

Handling Real-Time Transcription:
```
voxyInstance.getTranscription((text) => {
    document.getElementById("transcript").value += text;
});
```
- Receives transcription updates and appends them to the value of the #transcript element (for example, a textarea).
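
The callback can also be built by a small factory, which keeps the appending logic testable outside the browser. This is a minimal sketch that assumes each update arrives as a string, as the concatenation above suggests. `createTranscriptAppender` is a hypothetical helper, not part of mtl-voxy.

```javascript
// Hypothetical helper, not part of mtl-voxy: builds a callback suitable for
// voxyInstance.getTranscription(...) that appends text to a target element.
function createTranscriptAppender(target) {
  return (text) => {
    // Skip empty or non-string updates instead of appending "undefined".
    if (typeof text !== "string" || text.length === 0) return;
    target.value += text;
  };
}

// A plain object stands in for the #transcript textarea here.
const fakeTextarea = { value: "" };
const appendTranscript = createTranscriptAppender(fakeTextarea);
appendTranscript("hello ");
appendTranscript(undefined);
appendTranscript("world");
console.log(fakeTextarea.value); // "hello world"
```

In the browser, the same factory would be passed the real element, for example `voxyInstance.getTranscription(createTranscriptAppender(document.getElementById("transcript")))`.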

Updating Recording Status:

```
voxyInstance.getRecordingStatus((isRecording) => {
    document.getElementById("record-btn").textContent = isRecording ? "Stop Recording" : "Start Recording";
});
```
- Changes the text of the record button based on the recording state.

Updating Mode Status:

```
voxyInstance.getModeStatus((mode) => {
    document.getElementById("mode").textContent = mode;
});
```
- Displays the current mode in the UI.

Adding Event Listeners:

```
document.getElementById('record-btn').addEventListener('click', () => {
    voxyInstance.toggleRecording();
});

document.getElementById('export-btn').addEventListener('click', () => {
    voxyInstance.exportTranscriptionAsTxt();
});
```
- The first event listener starts or stops recording when record-btn is clicked.
- The second event listener exports the transcript when export-btn is clicked.
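
The wiring steps above can be collected into one function, which also makes clear that the page needs elements with the IDs `transcript`, `record-btn`, `export-btn`, and `mode`. The sketch below uses only the SDK methods shown in this document. The fake document and fake SDK objects are stand-ins invented for illustration so the example runs without a browser or a server.

```javascript
// `voxy` is the object returned by useVoxy(...); `doc` is the page's document.
function wireVoxyUi(voxy, doc) {
  const transcript = doc.getElementById("transcript");
  const recordBtn = doc.getElementById("record-btn");
  const exportBtn = doc.getElementById("export-btn");
  const mode = doc.getElementById("mode");

  voxy.getTranscription((text) => { transcript.value += text; });
  voxy.getRecordingStatus((isRecording) => {
    recordBtn.textContent = isRecording ? "Stop Recording" : "Start Recording";
  });
  voxy.getModeStatus((current) => { mode.textContent = current; });

  recordBtn.addEventListener("click", () => voxy.toggleRecording());
  exportBtn.addEventListener("click", () => voxy.exportTranscriptionAsTxt());
}

// --- Stand-ins below are hypothetical and exist only to run the sketch ---
function makeFakeElement() {
  const listeners = {};
  return {
    value: "",
    textContent: "",
    addEventListener(type, fn) { listeners[type] = fn; },
    click() { if (listeners.click) listeners.click(); },
  };
}

const fakeElements = {
  "transcript": makeFakeElement(),
  "record-btn": makeFakeElement(),
  "export-btn": makeFakeElement(),
  "mode": makeFakeElement(),
};
const fakeDoc = { getElementById: (id) => fakeElements[id] };

// Fake SDK: stores the callbacks and flips a flag on toggleRecording().
const fakeVoxy = {
  recording: false,
  exported: false,
  getTranscription(cb) { this.onText = cb; },
  getRecordingStatus(cb) { this.onStatus = cb; },
  getModeStatus(cb) { this.onMode = cb; },
  toggleRecording() {
    this.recording = !this.recording;
    this.onStatus(this.recording);
  },
  exportTranscriptionAsTxt() { this.exported = true; },
};

wireVoxyUi(fakeVoxy, fakeDoc);
fakeElements["record-btn"].click();
console.log(fakeElements["record-btn"].textContent); // "Stop Recording"
```

In a real page, the call would be `wireVoxyUi(voxyInstance, document)` after `useVoxy` resolves.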

`useVoxy` Parameters

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| `apiUrl` | String | Yes | None | The WebSocket server URL for speech recognition. |
| `email` | String | Yes | None | The user's email for authentication. |
| `password` | String | Yes | None | The user's password for authentication. |
| `sampleRate` | Number | No | 16000 | The number of samples per second in the audio stream. |
| `chunkDuration` | Number | No | 0.1 | The duration (in seconds) of each audio chunk sent for transcription. |
| `decibelThreshold` | Number | No | -30 | The minimum audio level required to be considered speech. |
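
The required and optional parameters in the table can be checked before calling `useVoxy`. The sketch below is one possible approach: it rejects missing required fields and fills in the documented defaults for omitted keys. `buildVoxyOptions` is a hypothetical helper, not part of mtl-voxy, and the SDK may perform its own validation.

```javascript
// Hypothetical helper, not part of mtl-voxy.
const VOXY_DEFAULTS = { sampleRate: 16000, chunkDuration: 0.1, decibelThreshold: -30 };
const VOXY_REQUIRED = ["apiUrl", "email", "password"];

function buildVoxyOptions(options) {
  // Required parameters must be non-empty strings.
  for (const key of VOXY_REQUIRED) {
    if (typeof options[key] !== "string" || options[key] === "") {
      throw new TypeError(`useVoxy option "${key}" is required and must be a non-empty string`);
    }
  }
  // Keys present in `options` override the documented defaults.
  return { ...VOXY_DEFAULTS, ...options };
}

const exampleOptions = buildVoxyOptions({
  apiUrl: "<API_URL>",
  email: "<EMAIL>",
  password: "<PASSWORD>",
  chunkDuration: 0.2,
});
console.log(exampleOptions.sampleRate, exampleOptions.chunkDuration); // 16000 0.2
```

The result can then be passed straight to `useVoxy(exampleOptions)`.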