1.0.5 • Published 8 months ago

audio-sentence-detector v1.0.5

Audio Sentence Detector

A library that segments audio into sentences using voice activity detection, silence analysis, and acoustic features.

Installation

npm install audio-sentence-detector

Usage

const AudioSentenceDetector = require('audio-sentence-detector');

// Create detector with custom options
const detector = new AudioSentenceDetector({
    minSilenceDuration: 0.5,
    silenceThreshold: 0.01
});

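// Process an audio buffer. detect() is awaited, so call it inside an async
// function (top-level await is not available in CommonJS modules).
async function detectSentences(audioBuffer) {
    return detector.detect(audioBuffer);
}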

Configuration Options

The AudioSentenceDetector constructor accepts an options object with the following parameters:

Basic Sentence Detection Options

Option                  Default  Description
minSilenceDuration      0.5      Minimum duration of silence (in seconds) to be considered a sentence boundary
silenceThreshold        0.01     RMS threshold below which audio is considered silence
minSentenceLength       1        Minimum length of a sentence in seconds
maxSentenceLength       15       Maximum length of a sentence in seconds
windowSize              2048     Size of the analysis window in samples
idealSentenceLength     5        Ideal length of a sentence in seconds (used for probability calculations)
idealSilenceDuration    0.8      Ideal duration of silence between sentences
allowGaps               true     Whether to allow gaps between sentences
minSegmentLength        0        Minimum length for merged segments
alignToAudioBoundaries  false    Whether to align sentences with audio file boundaries
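
To make the roles of windowSize, silenceThreshold, and minSilenceDuration more concrete, here is a minimal sketch of RMS-based silence detection. It is not the library's implementation, which is not shown here. The findSilences function, its sampleRate argument, and the assumption that samples are mono floats in the range -1 to 1 are all hypothetical.

```javascript
// Hypothetical helper, not part of audio-sentence-detector.
// samples: array or Float32Array of mono PCM values in [-1, 1]; sampleRate in Hz.
function findSilences(samples, sampleRate, options = {}) {
    const {
        windowSize = 2048,        // analysis window, in samples
        silenceThreshold = 0.01,  // RMS level below which a window counts as silent
        minSilenceDuration = 0.5  // shortest silent run (seconds) worth reporting
    } = options;

    const silences = [];
    let silenceStart = null; // sample index where the current quiet run began

    // Record a quiet run only if it lasts at least minSilenceDuration seconds
    function pushIfLongEnough(startSample, endSample) {
        const start = startSample / sampleRate;
        const end = endSample / sampleRate;
        if (end - start >= minSilenceDuration) {
            silences.push({ start, end });
        }
    }

    for (let offset = 0; offset < samples.length; offset += windowSize) {
        const windowEnd = Math.min(offset + windowSize, samples.length);

        // Root-mean-square level of this window
        let sumSquares = 0;
        for (let i = offset; i < windowEnd; i++) {
            sumSquares += samples[i] * samples[i];
        }
        const rms = Math.sqrt(sumSquares / (windowEnd - offset));

        if (rms < silenceThreshold) {
            if (silenceStart === null) silenceStart = offset;
        } else if (silenceStart !== null) {
            pushIfLongEnough(silenceStart, offset);
            silenceStart = null;
        }
    }

    // Close a quiet run that lasts until the end of the audio
    if (silenceStart !== null) pushIfLongEnough(silenceStart, samples.length);

    return silences;
}
```

In this sketch, raising silenceThreshold makes more windows count as silent, while raising minSilenceDuration discards short pauses, so fewer boundaries are reported.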

Voice Detection Options

Option                     Default  Description
fundamentalFreqMin         85       Minimum fundamental frequency for voice detection (Hz)
fundamentalFreqMax         255      Maximum fundamental frequency for voice detection (Hz)
voiceActivityThreshold     0.4      Threshold for voice activity detection
minVoiceActivityDuration   0.1      Minimum duration of voice activity (seconds)
energySmoothing            0.95     Smoothing factor for energy calculations
formantEmphasis            0.7      Emphasis factor for formant detection
zeroCrossingRateThreshold  0.3      Threshold for zero-crossing rate in voice detection
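
Two of these options refer to standard signal measures, and a short sketch may help when tuning them. The functions below are hypothetical and are not taken from the library's source: zeroCrossingRate computes the usual zero-crossing rate of a frame, and smoothEnergies applies exponential smoothing, which is one common way to interpret a smoothing factor such as energySmoothing. How the library actually combines these values is an open assumption.

```javascript
// Hypothetical helpers, not taken from the library's source.

// Fraction of adjacent sample pairs whose signs differ
// (0 = no sign changes, 1 = a sign change between every pair).
function zeroCrossingRate(frame) {
    if (frame.length < 2) return 0;
    let crossings = 0;
    for (let i = 1; i < frame.length; i++) {
        if ((frame[i - 1] >= 0) !== (frame[i] >= 0)) crossings++;
    }
    return crossings / (frame.length - 1);
}

// Exponential smoothing: each output keeps `smoothing` of the previous
// smoothed value and takes (1 - smoothing) of the new energy value.
function smoothEnergies(energies, smoothing = 0.95) {
    const smoothed = [];
    let previous = energies.length > 0 ? energies[0] : 0;
    for (const energy of energies) {
        previous = smoothing * previous + (1 - smoothing) * energy;
        smoothed.push(previous);
    }
    return smoothed;
}
```

Voiced speech tends to have a lower zero-crossing rate than noise or unvoiced consonants, so a detector could plausibly treat frames above a threshold as non-voice. A smoothing factor close to 1 makes the energy curve react slowly to sudden changes.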

Debug Option

Option  Default  Description
debug   false    Enable debug logging

Return Value

The detect() method returns an array of sentence objects, each containing:

{
    index: number,          // Index of the sentence
    start: number,          // Start time in seconds
    end: number,           // End time in seconds
    duration: number,      // Duration in seconds
    probability: number    // Confidence score (0-1)
}
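
As an illustration of working with this shape, the sketch below filters sentences by confidence and prints their time ranges. It relies only on the fields listed above. The summarize and formatTime helpers and the 0.5 cutoff are hypothetical choices, not part of the library.

```javascript
// Hypothetical helpers for consuming the array returned by detect().

// Format a time in seconds as m:ss.s (rounded to tenths of a second).
function formatTime(seconds) {
    const tenths = Math.round(seconds * 10);
    const minutes = Math.floor(tenths / 600);
    const rest = ((tenths % 600) / 10).toFixed(1).padStart(4, '0');
    return `${minutes}:${rest}`;
}

// Keep sentences at or above a confidence cutoff and describe each one.
// minProbability = 0.5 is an arbitrary example value.
function summarize(sentences, minProbability = 0.5) {
    return sentences
        .filter((s) => s.probability >= minProbability)
        .map((s) =>
            `#${s.index} ${formatTime(s.start)}-${formatTime(s.end)} ` +
            `(${s.duration.toFixed(1)}s)`
        );
}
```

Calling summarize(sentences).forEach((line) => console.log(line)) after detect() resolves would print one line per retained sentence.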

Example

const AudioSentenceDetector = require('audio-sentence-detector');

// Create detector with custom settings
const detector = new AudioSentenceDetector({
    minSilenceDuration: 0.3,
    silenceThreshold: 0.02,
    minSentenceLength: 1.5,
    maxSentenceLength: 10,
    debug: true
});

// Process an audio file
const fs = require('fs');
const audioBuffer = fs.readFileSync('speech.wav');

// Wrap the call in an async function: top-level await is not available in CommonJS
(async () => {
    try {
        const sentences = await detector.detect(audioBuffer);
        console.log('Detected sentences:', sentences);
    } catch (error) {
        console.error('Error processing audio:', error);
    }
})();

License

MIT

Versions

1.0.5  8 months ago
1.0.4  8 months ago
1.0.3  8 months ago
1.0.2  8 months ago
1.0.1  8 months ago
1.0.0  8 months ago