Audio Segments Detector

A Node.js module for detecting speech segments in audio buffers. This module analyzes audio data and identifies segments based on energy levels, making it useful for speech detection, silence removal, and audio segmentation tasks.

Installation

npm install audio-segments-detector

Features

Detect speech segments in audio buffers
Support for various audio formats (automatically converted to WAV)
Configurable detection parameters
TypeScript support
Promise-based API
Memory-efficient, buffer-based processing

Usage

const fs = require('fs');
const { AudioSegmentsDetector } = require('audio-segments-detector');

// Basic usage with buffer
async function example() {
  try {
    const detector = new AudioSegmentsDetector();
    // Placeholder path: load your own audio file into a Buffer
    const audioBuffer = fs.readFileSync('path/to/audio.mp3');
    // The second argument names the format of the input buffer
    const segments = await detector.processBuffer(audioBuffer, 'mp3');
    console.log(segments);
    // Output: [{ start: 0, end: 1.5 }, { start: 2.1, end: 3.8 }, ...]
  } catch (error) {
    console.error('Error:', error);
  }
}

// Advanced usage with custom options
async function advancedExample() {
  try {
    const detector = new AudioSegmentsDetector({
      threshold: 0.02,          // energy threshold (default: 0.01)
      minSilenceDuration: 700,  // milliseconds (default: 500)
      samplingRate: 44100       // Hz (default: 16000)
    });

    // Placeholder path: load your own WAV file into a Buffer
    const wavBuffer = fs.readFileSync('path/to/audio.wav');
    const segments = await detector.processBuffer(wavBuffer, 'wav');
    console.log(segments);
  } catch (error) {
    console.error('Error:', error);
  }
}

// Direct WAV buffer processing
async function processWavExample() {
  try {
    const detector = new AudioSegmentsDetector();
    // Placeholder path: load your own WAV file into a Buffer
    const wavBuffer = fs.readFileSync('path/to/audio.wav');
    const segments = await detector.processWavBuffer(wavBuffer);
    console.log(segments);
  } catch (error) {
    console.error('Error:', error);
  }
}
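
For silence removal it is often handy to turn the detected segments into the gaps between them. The sketch below is a hypothetical post-processing helper, not part of the module. It assumes the segments are sorted, do not overlap, and use the same time unit as the `totalDuration` you pass in (the example output above looks like seconds, but that is an assumption).

```javascript
// Hypothetical helpers (not part of audio-segments-detector).
// Assumption: segments are sorted, non-overlapping { start, end } objects
// expressed in the same time unit as totalDuration.

// Returns the silent gaps between segments, including leading and trailing gaps.
function silenceGaps(segments, totalDuration) {
  const gaps = [];
  let cursor = 0; // end of the last segment seen so far
  for (const { start, end } of segments) {
    if (start > cursor) gaps.push({ start: cursor, end: start });
    cursor = Math.max(cursor, end);
  }
  if (totalDuration > cursor) gaps.push({ start: cursor, end: totalDuration });
  return gaps;
}

// Returns the summed length of all segments.
function totalSpeech(segments) {
  return segments.reduce((sum, { start, end }) => sum + (end - start), 0);
}

const detected = [{ start: 0, end: 1.5 }, { start: 2.1, end: 3.8 }];
console.log(silenceGaps(detected, 5));
console.log(totalSpeech(detected));
```

The total duration has to come from elsewhere (for example, the decoded audio length), since the segment list alone does not reveal how much silence follows the last segment.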

API

Class: AudioSegmentsDetector

Constructor

new AudioSegmentsDetector([options])

Options

threshold (number, default: 0.01): Energy threshold for detection
minSilenceDuration (number, default: 500): Minimum silence duration in milliseconds
samplingRate (number, default: 16000): Sampling rate in Hz
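
To give a feel for how these three options could interact, here is a minimal sketch of energy-based segmentation. It is not the module's source: the 20 ms frame size, the use of RMS as the energy measure, the float sample format, and the function name `detectSegments` are all assumptions made for illustration.

```javascript
// Hypothetical sketch, NOT the module's implementation.
// Assumptions: mono PCM samples as floats in [-1, 1], 20 ms frames,
// RMS as the energy measure, segment times reported in seconds.
const FRAME_MS = 20;

function detectSegments(samples, options = {}) {
  const { threshold = 0.01, minSilenceDuration = 500, samplingRate = 16000 } = options;
  const frameSize = Math.max(1, Math.round((samplingRate * FRAME_MS) / 1000));
  const minSilenceFrames = Math.ceil(minSilenceDuration / FRAME_MS);
  const frameCount = Math.floor(samples.length / frameSize);
  const toSeconds = (frameIndex) => (frameIndex * frameSize) / samplingRate;

  const segments = [];
  let segStart = null; // frame index where the open segment began
  let lastLoud = -1;   // most recent frame at or above the threshold

  for (let f = 0; f < frameCount; f++) {
    let sumSquares = 0;
    for (let i = f * frameSize; i < (f + 1) * frameSize; i++) {
      sumSquares += samples[i] * samples[i];
    }
    const rms = Math.sqrt(sumSquares / frameSize);

    if (rms >= threshold) {
      if (segStart === null) segStart = f;
      lastLoud = f;
    } else if (segStart !== null && f - lastLoud >= minSilenceFrames) {
      // Enough consecutive quiet frames: close the segment at the last loud frame.
      segments.push({ start: toSeconds(segStart), end: toSeconds(lastLoud + 1) });
      segStart = null;
    }
  }
  // Close a segment that is still open when the audio ends.
  if (segStart !== null) {
    segments.push({ start: toSeconds(segStart), end: toSeconds(lastLoud + 1) });
  }
  return segments;
}

// One second of silence, one second of a 440 Hz tone, one second of silence.
const rate = 16000;
const signal = new Float32Array(rate * 3);
for (let i = rate; i < rate * 2; i++) {
  signal[i] = 0.5 * Math.sin((2 * Math.PI * 440 * i) / rate);
}
console.log(detectSegments(signal, { samplingRate: rate }));
// → [ { start: 1, end: 2 } ]
```

In this sketch a higher threshold ignores quieter audio, while a longer minSilenceDuration merges bursts separated by short pauses into a single segment.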

Methods

processBuffer(buffer, inputFormat)

Processes any supported audio format buffer.

Parameters:
- buffer (Buffer): Audio buffer to process
- inputFormat (string): Format of the input buffer (e.g., 'mp3', 'wav', 'ogg')
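Returns: Promise<Array> resolving to the detected segments, each an object with start and end fields (see the example output under Usage)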
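
Since processBuffer needs the input format as a string, a small wrapper can derive it from the file name. The helpers below are hypothetical and not part of the module. They assume the file extension matches a format string that processBuffer accepts, and they take the detector as a parameter, so any object with a compatible processBuffer method works.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: 'clip.MP3' -> 'mp3'. Throws when there is no extension.
function formatFromPath(filePath) {
  const ext = path.extname(filePath).slice(1).toLowerCase();
  if (!ext) throw new Error(`Cannot infer audio format from "${filePath}"`);
  return ext;
}

// Hypothetical helper: reads the file into a Buffer and forwards it to
// detector.processBuffer(buffer, inputFormat) as documented above.
async function detectFromFile(detector, filePath) {
  const inputFormat = formatFromPath(filePath);
  const buffer = await fs.promises.readFile(filePath);
  return detector.processBuffer(buffer, inputFormat);
}

// Usage (placeholder path):
// const { AudioSegmentsDetector } = require('audio-segments-detector');
// detectFromFile(new AudioSegmentsDetector(), 'path/to/audio.mp3')
//   .then(console.log)
//   .catch(console.error);
```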