audio-works v0.6.8

License: ISC · Repository: GitHub · Last release: 11 months ago

WIP

A library meant for Electron (ver. 11.x - 13.x) that determines the note based on the signal received from a microphone or a file.

Currently, detection has been tested down to C1 (~32.7 Hz, where the difference between C1 and B0 is less than 2 Hz), so at the moment it is accurate enough to resolve differences of as little as 2 Hz.
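For reference, the note spacing quoted above can be checked with the standard equal-temperament formula; a quick stand-alone sketch (not part of the library):

```javascript
// Equal-temperament frequency of a MIDI note number n (A4 = 69 = 440 Hz).
function noteFrequency(n) {
  return 440 * Math.pow(2, (n - 69) / 12);
}

const b0 = noteFrequency(23); // B0
const c1 = noteFrequency(24); // C1
console.log(c1.toFixed(2));        // 32.70 Hz
console.log((c1 - b0).toFixed(2)); // ~1.84 Hz gap between B0 and C1
```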

  • Added volume measuring
  • Still ~~a lot~~ a little of garbage left in methods waiting for removal
  • Added possibility to change input audio device (+ automatically changes when current device gets disconnected)
  • Still ~~a lot~~ ~~a little bit~~ almost none of the garbage left in methods waiting for removal
  • Added possibility to change input audio device (+ automatically updates list of devices on change / when current device gets disconnected)
  • Separated most micSetup and Renderer methods into modules
  • Changed objects into classes
  • Fixed bug with enabling mic after disabling it
  • Fixed bug with repeatedly changing input device while the mic is enabled resulting in problems with audioContext
  • Untangled logic, so it's a bit more simple and less convoluted now ~~imo~~ I was so wrong
  • Added A-, B- and C-weighting classes
  • Changed audio volume measurement using weighting classes
  • Added methods returning nyquist frequency and band range of current audioHandler setup
  • Added possibility to change output device

TODO right now:

  • Add methods to frequencyMath
  • Untangle deviceHandler and other redundant methods etc.
  • Add possibility to automatically switch to the default available device if the currently used one gets disconnected
  • General code refactor
  • Output audio (the latency is/will be +- 1 second so not great, but it's Node + Chromium ¯\_(ツ)_/¯)
  • Changes in soundStorage module for storing and determining frequencies (in progress)
  • Anything else that will pop up later

ChangeLog:

Classes:

Test coverage

Setup, sample initialization

Electron's browser window should have contextIsolation set to false as well as nodeIntegration set to true.

const { BrowserWindow } = require("electron");

const window = new BrowserWindow({
  webPreferences: {
    contextIsolation: false,
    nodeIntegration: true,
  },
});

Then, in the renderer process, a sample initialization logging the correlated value of a buffer from the default device could look like this:

const { AudioHandler, AudioEvents } = require("audio-works");

let mic = new AudioHandler();

mic.on(AudioEvents.audioProcessUpdate, (evt) => {
  console.log(evt.correlate());
}); // Event called from ScriptProcessor on new buffer chunk

await mic.setupStream(); // Start the mediaStreamSource

setTimeout(() => {
  mic.end();
}, 1000); // close the stream after 1 second

It is also possible to output the received signal by creating an HTML Audio object and setting its srcObject property to the stream held by the AudioHandler.

const { AudioHandler, AudioEvents } = require("audio-works");

let mic = new AudioHandler();
let audio = new Audio();

mic.on(AudioEvents.setupDone, (evt) => {
  audio.srcObject = evt.stream;
}); // Event emitted after setupStream()

await mic.setupStream(); // setupStream() is asynchronous, but all the following
// actions can be performed on emission of the "SetupDone" event,
// so in such a case the await could be omitted

To change the device it's enough to pass the id of said device to the changeInput method of the AudioHandler. The list of devices can be accessed through the DeviceHandler held by the AudioHandler as its deviceHandler property.

const { AudioHandler, Device } = require("audio-works");

let mic = new AudioHandler();

// if for some reason the device list is empty, which shouldn't really happen,
// then it can be easily fixed by calling `await mic.deviceHandler.updateDeviceList();`

// Retrieves a list of available devices
let inputs = mic.getDeviceList(Device.direction.input);
// Change default ('first available') input to the third one
mic.changeInput(inputs[2].id);

await mic.setupStream();

Classes

AudioSetup

File in GitHub
Main class responsible for setting up AudioHandler and AudioFileHandler, holding the two main obligatory nodes used by the AudioContext: the AnalyserNode and the GainNode. This class extends EventEmitter, as the instance dispatches events related to the various setup steps.

| Method | Arguments | Return value | Description |
| --- | --- | --- | --- |
| constructor | gain: Gain, analyser: Analyser | AudioSetup | Receives Gain and Analyser instances, saves them to class members stored as this.gain and this.analyser, then immediately calls the startAudioContext method. |
| startAudioContext | N/A | void | Creates a new AudioContext instance stored as this.audioContext, then creates the Analyser and Gain nodes passed to the instance in the constructor. With the AnalyserNode set up, the sample rate and bin count are stored as this.sampleRate and this.binCount. Finally, the "AudioContextStarted" event is dispatched, notifying about the finished initial setup. |
| streamSetup | MediaStreamSource: IAudioNode, ScriptProcessor: IAudioNode | void | Connects the Analyser to the MediaStreamSource, then connects the ScriptProcessor to the Analyser. AudioContext.destination is connected to both the GainNode and the ScriptProcessor. Finally, the ScriptProcessor.onaudioprocess callback is defined, which dispatches the "AudioProcessUpdate" event holding the AudioSetup instance that owns the ScriptProcessor. |
| async streamClose | N/A | void | Disconnects the GainNode and AnalyserNode, then closes the AudioContext. |
| async streamPause | N/A | void | Suspends the AudioContext. With Chromium backward compatibility it is heavily unreliable. |
| async streamResume | N/A | void | Basically just a shorthand for await this.audioContext.resume(). |
| BFD | dataContainer: Uint8Array | void | Shorthand for this.analyser.node.getByteFrequencyData(dataContainer). |
| BFDUint8 | binCount: uint = this.binCount | Uint8Array | Shorthand call to this.BFD(...) that automatically creates a Uint8Array of the size passed as binCount (defaults to this.binCount), fills it via getByteFrequencyData and returns it. |
| FTD | buffer: Float32Array | void | Shorthand for this.analyser.node.getFloatTimeDomainData(buffer). |
| FTDFloat32 | buflen: uint = this.buflen | Float32Array | Shorthand call to this.FTD(...) that automatically creates a Float32Array of the size passed as buflen (defaults to this.buflen), fills it via getFloatTimeDomainData and returns it. |
| selfCheckAudioContext | N/A | bool | Checks the state of the AudioContext instance and starts it up again if it is currently in a closed state. |
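BFDUint8 and FTDFloat32 follow a simple allocate-fill-return pattern around the analyser shorthands; a minimal stand-alone sketch of that pattern (the fakeAnalyser stub is a hypothetical stand-in, not part of the library):

```javascript
// Hypothetical stand-in for the analyser node: fills a container in place,
// the way getByteFrequencyData does.
const fakeAnalyser = {
  getByteFrequencyData(container) {
    for (let i = 0; i < container.length; i++) container[i] = i % 256;
  },
};

// Allocate-fill-return, as BFDUint8 does with this.binCount as default size.
function bfdUint8(binCount) {
  const data = new Uint8Array(binCount);
  fakeAnalyser.getByteFrequencyData(data);
  return data;
}

const bins = bfdUint8(8);
console.log(bins.length); // 8
```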

AudioHandler

File in GitHub
Extends AudioSetup, as the AudioContext is crucial for all the functionality provided by this class. Handles live audio inputs such as microphones or instruments connected to audio interfaces, as well as output to any available device.

| Method | Arguments | Return value | Description |
| --- | --- | --- | --- |
| constructor | `{ general: { buflen: int, curveAlgorithm: string }, gainNode: GainNode, analyserNode: AnalyserNode, correlationSettings: { rmsThreshold: double <0, 1), correlationThreshold: double <0, 1), correlationDegree: double <0, 1) }, navigator: Optional<object> }` | AudioHandler | The constructor receives an object containing: general (the buffer length used in correlation and the curveAlgorithm to which the audio spectrum can be cast), the GainNode and AnalyserNode passed to the base class constructor, and correlationSettings (values for Correlation class initialization). After the base class constructor call and setting the buffer length and sound curve algorithm, a new DeviceHandler is initialized and stored in the deviceHandler member, which is used to access audio IO devices. |
| async getMediaStream | deviceId: `Optional<string>` | MediaStream | Returns the output of navigator.mediaDevices.getUserMedia() called with a constraint that, by default, requests no video and only the first default audio device. Sets the stream for the provided device ID or, if none is specified, the first available input audio device. |
| async setupStream | deviceId: `Optional<string>` | void | Throws 'No input audio input devices available' if no input device is available. If the audioContext is closed, a new one is started automatically. Creates new MediaStreamSource and ScriptProcessor instances that are passed on to the base class streamSetup method. The stream from the getMediaStream method is stored in the class member this.stream. After that, a Correlation instance is created and stored in the this.correlation member. Sets the stream for the provided device ID or, if none is specified, the first available input audio device. |
| nyquistFrequency | N/A | double | Returns the AudioContext sample rate divided by two, which is... the Nyquist frequency. |
| getVolume | accuracy: int | double | An average of the values stored in the analyser's byte frequency data. The accuracy passed to the method represents the number of decimal places of the returned value. |
| getWeightedVolume | accuracy: int | double | A purely empirical and subjective method that aggregates all the bands of the byte frequency data cast onto a weighting curve, passes the result through a base-10 logarithm and finally multiplies it by ten... The accuracy passed to the method represents the number of decimal places of the returned value. It seems to work better (from a human-ear perspective) than the getVolume method, especially with a few extra operations to limit the output value (see the sample usage of getWeightedVolume below), but then again, it is not a concrete measure, as the value is subjective. |
| correlate | N/A | double | Returns the output of the correlation (frequency in Hz) performed on the float time domain data of the currently stored buffer. |
| async getDeviceList | direction: `Optional<Device.direction>` | `Array<Device>` | Returns an array of Device instances for the available audio IO devices. If no direction is specified all devices are returned, otherwise only the devices in the specified direction. |
| async pause | N/A | void | Calls the base class method streamPause(), sets the running member of the class to false and emits the "StreamPause" event at the end. |
| async resume | N/A | void | Calls the base class method streamResume(), sets the running member of the class to true and emits the "StreamResume" event at the end. |
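The Nyquist frequency and the per-bin band range mentioned above follow directly from the sample rate and FFT size; a quick stand-alone calculation (the sample rate here is an assumed example value, the real one comes from the AudioContext):

```javascript
const sampleRate = 48000; // assumed; the real value comes from the AudioContext
const fftSize = 32768;    // matches the sample initialization below

const nyquist = sampleRate / 2;       // highest representable frequency
const binCount = fftSize / 2;         // the AnalyserNode's frequencyBinCount
const bandRange = nyquist / binCount; // Hz covered by each analyser bin

console.log(nyquist);   // 24000
console.log(bandRange); // 1.46484375
```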

Sample object initialization

const { AudioHandler, Gain, Analyser } = require("audio-works");

let mic = new AudioHandler({ // All the values are optional.
    general: {               // Omitting some values in objects
        buflen: 8192,        // containing more properties will result
        curveAlgorithm: 'A'  // in assignment of default values
    },                       // only to the missing properties
    gainNode: new Gain({value: 1.5}),
    analyserNode: new Analyser({
        smoothingTimeConstant: 0.9,
        fftSize: 32768,
        minDecibels: -90,
        maxDecibels: -10
    }),
    correlation: {
        rmsThreshold: 0.01,
        correlationThreshold: 0.01,
        correlationDegree: 0.98
    }
});

Sample usage of getWeightedVolume

const vol = mic.getWeightedVolume(2); 
let volume = (vol / 200) * (vol / 2); // Let's take everything over 200dB as maximally "loud"
                                      // (alternatively can be written as vol^2 / 400)
volume = volume < 100 ? volume : 100; 

AudioFileHandler

File in GitHub
Extends the AudioHandler class, therefore it retains the ability to handle live audio input, but adds methods meant for audio file decoding, creating standard BufferSources with the primary goal of audio output, or obtaining pulse-code modulation data.

| Method | Arguments | Return value | Description |
| --- | --- | --- | --- |
| constructor | initData: Object (same as for AudioHandler), filePath: string, maxSmallContainerSize: uint = 35000 | AudioFileHandler | Given that this class extends AudioHandler, the initData argument is the object passed to the base class. Additionally, it accepts a filePath argument which, as the name suggests, should be the path to the file that will be processed. Because the files have to be decoded and their content converted to an ArrayBuffer, maxSmallContainerSize chooses the appropriate conversion method, as different solutions work faster for different container sizes: new Uint8Array(data).buffer is faster (by ~30%) than a standard for loop with value reassignments as long as the data container has fewer than 35 000 elements; with more elements the situation is reversed and the standard loop becomes faster for large containers. |
| async decode | callback: function | AudioBuffer | Reads the whole file, casts it to an ArrayBuffer which is then passed, along with the callback, to the AudioContext method decodeAudioData, whose result is returned. |
| async getPCMData | data: AudioBuffer, channel: uint | Object { data: AudioBuffer, pcm: `Array<int>` } | data is supposed to be the output of the AudioContext.decodeAudioData method, which is in fact the default value in case no parameter is passed. The channel argument specifies which channel to read from the data. Returns an object { data, pcm } where data is the original decoded file data and pcm is the pulse-code modulation data from the specified channel. |
| async initCorrelation | buflen: uint = this.buflen | void | The Correlation object is normally created during audio stream setup in the base class. That does not apply to the file handler variant, so this method must be called to create the Correlation instance inside the AudioFileHandler. |
| process | pcm: `Array<int>`, action: function | void | The action argument is supposed to be a callback handling chunks of data. This method loops through the pcm data, performing the specified action on each chunk. |
| async processEvent | decoded: AudioBuffer, channel: uint | void | The decoded and channel arguments are the same ones used in getPCMData, to which they are passed to retrieve the pcm data. The pcm data is then passed to the process method with a default callback that simply emits the "ProcessedFileChunk" event containing each chunk. |
| async processCallback | callback: function, decoded: AudioBuffer, channel: uint | void | Same as processEvent, with the only difference being the obligatory callback passed as the first argument, which is forwarded to the process method to handle the pcm data chunks. |
| async createSource | callback: function | AudioBufferSourceNode | Creates a BufferSource node from the AudioContext, then calls this.decode(action) where, if a callback was defined, the action is exactly that callback; in case of an undefined callback it sets the BufferSource buffer to the soon-to-be decoded file while also connecting it to AudioContext.destination. Finally, the method returns the BufferSource instance created at the beginning. |
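The container-size trade-off behind maxSmallContainerSize can be sketched as follows; the ~35 000 element threshold and the ~30% figure are the README's claims, and the helper names here are illustrative, not the library's API:

```javascript
// Two ways to turn a plain byte array into an ArrayBuffer, as discussed above.
function toArrayBufferFast(data) {
  // Reportedly faster for containers below ~35 000 elements.
  return new Uint8Array(data).buffer;
}

function toArrayBufferLoop(data) {
  // Plain loop with value reassignment; reportedly faster for larger containers.
  const view = new Uint8Array(data.length);
  for (let i = 0; i < data.length; i++) view[i] = data[i];
  return view.buffer;
}

// Hypothetical dispatcher mirroring the maxSmallContainerSize idea.
function toArrayBuffer(data, maxSmallContainerSize = 35000) {
  return data.length < maxSmallContainerSize
    ? toArrayBufferFast(data)
    : toArrayBufferLoop(data);
}

const buf = toArrayBuffer([1, 2, 3, 255]);
console.log(buf.byteLength); // 4
```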

Example of logging correlated data and playing the audio from a file:

const { AudioFileHandler, AudioEvents } = require("audio-works");

const fileHandler = new AudioFileHandler({}, "./audioFiles/sample.wav");
await fileHandler.initCorrelation(); // this call is needed as we don't
// call setupStream() method

// -- Event driven approach --
fileHandler.on(AudioEvents.processedFileChunk, (evt) => {
  // perform() is called directly on the correlation object stored
  // in fileHandler (unlike "correlate()" in AudioHandler), because
  // there's no mediaStream stored in the "stream" property;
  // the data chunk passed to the listener in evt therefore has
  // to be pushed to the correlation manually.
  console.log(fileHandler.correlation.perform(evt));
});

const audioSource = await fileHandler.createSource();
audioSource.start(0);

fileHandler.processEvent(); // start processing

// -- Callback approach --
const audioSource = await fileHandler.createSource();
audioSource.start(0);

fileHandler.processCallback((data) => {
  console.log(fileHandler.correlation.perform(data));
});
// While using callback processing starts immediately so there's
// no call like "processEvent()" in this case

Correlation

File in GitHub
The sole purpose of this class is performing autocorrelation on an audio buffer, allowing custom thresholds to be set up. The output of the perform method is supposed to be the frequency of the sound (the fundamental frequency), which means it processes the signal in a monophonic context.

| Method | Arguments | Return value | Description |
| --- | --- | --- | --- |
| constructor | Object { sampleRate: uint, rmsThreshold: double <0, 1), correlationThreshold: double <0, 1), correlationDegree: double <0, 1), buflen: uint, returnOnThreshold: bool } | Correlation | Creates a Correlation instance, setting up the rms and correlation thresholds. The sample rate is required for the last step of the autocorrelation, as the frequency is calculated from it. It is possible, and encouraged, to pass only the buflen and sampleRate values, as the remaining values can be set to defaults automatically. The value of the defaultCorrelationSampleStep property is determined from the buffer length (buflen): below 8192 it defaults to 1, otherwise to 2. The reasoning is that with large buffers the accuracy is good enough while looping over every second element/pair during the autocorrelation. This behaviour can be changed to standard looping over every element/pair by simply passing the value 1 to the perform method. It should be noted that with larger buffers not skipping any elements results in higher latency: skipping every second pair improves execution time by ~60-70% for buffers over 8192 samples compared to a standard loop over every element/pair, while the difference in results is around the 4th decimal place; therefore, for larger buffers, the algorithm sets defaultCorrelationSampleStep to 2 by default. |
| perform | buf: Float32Array, defaultCorrelationSampleStep: uint = 1 or 2 depending on buffer size | double | This method receives a buffer with data that will be processed up to the length specified in the this.buflen member. If the RMS is too low, meaning the signal is too weak, -1 is returned. If the autocorrelation result is higher than this.correlationThreshold, the output is the fundamental frequency of the passed buffer; otherwise -1 is returned. As mentioned before, defaultCorrelationSampleStep determines the loop increment used when going through the buffer: the higher the value, the more values/pairs are skipped. It shouldn't be set higher than 2. For smaller buffers (< 8192) it is set to 1, for larger ones to 2 to minimize latency. |
| _checkRms | buf: Float32Array, defaultCorrelationSampleStep: uint = 1 or 2 depending on buffer size | bool | Calculates the sum of squares of all the values in the buffer and returns true if that sum divided by the number of elements is higher than the value specified in the constructor: this.rmsThreshold. |
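The general technique behind perform can be illustrated with a bare-bones autocorrelation on a synthetic sine wave; this is a simplified sketch of autocorrelation pitch detection, not the library's exact algorithm (which adds RMS checks, configurable thresholds and the sample-step optimization):

```javascript
// Generate one buffer of a 440 Hz sine at a 44100 Hz sample rate.
const sampleRate = 44100;
const buflen = 2048;
const buf = new Float32Array(buflen);
for (let i = 0; i < buflen; i++) {
  buf[i] = Math.sin((2 * Math.PI * 440 * i) / sampleRate);
}

// Naive autocorrelation: walk the lags and return the first strong peak.
function detectPitch(buf, sampleRate) {
  const n = Math.floor(buf.length / 2);
  const corrAt = (lag) => {
    let c = 0;
    for (let i = 0; i < n; i++) c += buf[i] * buf[i + lag];
    return c;
  };
  const corr0 = corrAt(0); // signal energy, used for normalization
  let prev = -Infinity;
  for (let lag = 20; lag < n; lag++) {
    const c = corrAt(lag);
    // First local maximum with a strong normalized correlation wins.
    if (c > 0.9 * corr0 && c < prev) return sampleRate / (lag - 1);
    prev = c;
  }
  return -1; // signal too weak or no periodicity found
}

console.log(detectPitch(buf, sampleRate)); // close to 440 Hz
```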

DeviceHandler

File in GitHub
The main purpose of this class is interaction with navigator.mediaDevices, and for that reason it uses a private helper class, Device. The Device instances returned from methods of this class are only copies of the actually stored objects, to keep the data stored by the instance consistent regardless of what the user does with the obtained device data.

| Method | Arguments | Return value | Description |
| --- | --- | --- | --- |
| constructor | callback: function, navigator: `Optional<object>` | DeviceHandler | The callback passed to the constructor will be called on every ondevicechange event triggered by navigator.mediaDevices. The optional navigator field should be initialized with the window.navigator object or an object with the same interface. |
| async deviceChangeEvent | N/A | void | This method is called on every device change and is responsible for invoking the user callback previously passed to the constructor. It is called right after the invocation of the updateDeviceList method. |
| async updateDeviceList | N/A | void | Updates the list of cached audio devices. It is called on every "ondevicechange" event generated by navigator.mediaDevices. The previous list is completely cleared before the new one is created. |
| getFullDeviceList | N/A | `Array<Device>` | Returns an array of devices (Device class instances) available through the navigator, each containing the MediaDeviceInfo as well as its direction, input or output. |
| getDeviceList | requestedDirection: Device.direction | `Array<Device>` | Returns an array of devices in the requested direction (Device class instances) available through the navigator, each containing the MediaDeviceInfo as well as its direction. Should be used with Device.direction.(input or output) to avoid raw strings. |
| getCurrentOrFirst | N/A | Object { in: Device, out: Device } | Returns an object containing a pair of devices: in (input) and out (output). If this.currentInput and this.currentOutput are set, those devices are used in the object. If a current device is not set, the first available one in the respective direction is used in place of the one supposed to be held by the instance. |
| changeDevice | direction: Device.direction, deviceId: `Optional<string>` | void | The direction argument is a string stating the direction of the device that is going to be changed. If a deviceId is present, the current device in that direction is set to the device found in the device list with the requested id, or undefined if the id was not found. If no id is passed, the first available device in the requested direction is chosen. Lastly, the user-defined device change callback is called, receiving the current list of all available devices along with the current input and output devices held by the instance itself. Should be used with Device.direction.(input or output) to avoid raw strings. |
| changeInput | deviceId: string | void | Shorthand for await deviceHandlerInstance.changeDevice('input', deviceId). |
| changeOutput | deviceId: string | void | Shorthand for await deviceHandlerInstance.changeDevice('output', deviceId). |
| checkForInput | N/A | bool | Returns true if there is at least one available input device and false if there is none. |
| navigatorInput | N/A | Union: Object { exact: string }, undefined | Returns a constraint for the navigator used in audio stream setup, stating the exact input device. The device will be this.currentInput if set, or the first available one. If no input devices are accessible, undefined is returned. |
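The constraint navigatorInput builds can be sketched in isolation; the device list and function below are hypothetical stand-ins mirroring the described behaviour, not the library's code:

```javascript
// Hypothetical device list shaped like the Device instances described below.
const devices = [
  { id: "mic-1", label: "Built-in Microphone", dir: "input", isInput: true, isOutput: false },
  { id: "spk-1", label: "Speakers", dir: "output", isInput: false, isOutput: true },
];

// Mirrors navigatorInput: exact current input if set, else the first
// available input device, else undefined.
function navigatorInput(currentInput) {
  const input = currentInput ?? devices.find((d) => d.isInput);
  return input ? { exact: input.id } : undefined;
}

console.log(navigatorInput()); // { exact: "mic-1" }
```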

Device

File in GitHub
A class representing the navigator's mediaDevices entries. Apart from a copy method for convenient deep copies, it has no methods, holding only values: id (device id), label (device label) and dir (device direction).
An array of instances of this class is returned from the getDeviceList method of DeviceHandler.
Alongside the device direction there are also two related boolean flags, isOutput and isInput, for more convenient array checks and filtering.
Instead of raw strings for the direction description, the class contains a static object serving as an enum, which can be accessed as Device.direction.(input|output).
Likewise, for the device type description the class contains a static enum accessible as Device.type.(audio|video), although only the "audio" option is used/checked in the whole implementation.
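A rough stand-alone approximation of what such a Device value with its static enums might look like (illustrative only; the enum string values and constructor shape are assumptions, the real class lives in the linked file):

```javascript
// Illustrative approximation of the Device value object described above.
class Device {
  // Static "enums" so callers can avoid raw direction/type strings.
  static direction = { input: "input", output: "output" };
  static type = { audio: "audio", video: "video" };

  constructor(id, label, dir) {
    this.id = id;       // device id
    this.label = label; // device label
    this.dir = dir;     // device direction
    this.isInput = dir === Device.direction.input;
    this.isOutput = dir === Device.direction.output;
  }

  // Deep copy so callers cannot mutate the stored instance.
  copy() {
    return new Device(this.id, this.label, this.dir);
  }
}

const mic = new Device("mic-1", "Built-in Microphone", Device.direction.input);
console.log(mic.copy().isInput); // true
```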

SoundStorage

File in GitHub
A class meant to serve as storage for the outputs of the Correlation class, holding methods that help correct sound frequency estimations over short periods of time.

| Method | Arguments | Return value | Description |
| --- | --- | --- | --- |
| constructor | bias: double <0, 1) = 0.03 | SoundStorage | The only parameter for the constructor is bias, which is assigned to the this.biasThreshold member, whose purpose is removing outlier values during sound estimation. By default it is set to 0.03. The lower the value, the more similar sound values will have to be to the most frequent value in this.freqArr to be taken into account during estimation. |
| add | fx: double | self | Adds a single sound value from the Correlation to the this.freqArr member with 2 decimal places of accuracy. |
| average | N/A | double | Returns the rounded average of all the values in this.freqArr. |
| most | Array | double | Returns the most frequent value in the given array. |
| determine | N/A | double | Returns the determined sound frequency based on the samples held in this.freqArr. A bias is calculated as the most frequent value * this.biasThreshold; from there, an average is calculated from all the values within that bias of the most frequent value. |
| selfCheck | N/A | int | Returns the current length of the this.freqArr array holding the samples. |
| emptyData | N/A | self | Empties this.freqArr and returns the SoundStorage instance. |
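The determine logic (most frequent value, bias window, average within the window) can be sketched in isolation; a simplified version of the idea, not the library's exact implementation:

```javascript
// Most frequent value in an array (ties resolved by first occurrence).
function most(arr) {
  const counts = new Map();
  let best = arr[0];
  for (const v of arr) {
    const c = (counts.get(v) ?? 0) + 1;
    counts.set(v, c);
    if (c > counts.get(best)) best = v;
  }
  return best;
}

// Average of the samples within most * biasThreshold of the most frequent value.
function determine(freqArr, biasThreshold = 0.03) {
  const m = most(freqArr);
  const bias = m * biasThreshold;
  const close = freqArr.filter((f) => Math.abs(f - m) <= bias);
  return close.reduce((a, b) => a + b, 0) / close.length;
}

const samples = [440.0, 440.1, 439.9, 880.2, 440.05, 440.0];
console.log(determine(samples)); // the 880.2 outlier is ignored
```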

SoundStorageEvent

File in GitHub
This class has the same purpose as SoundStorage, extending it by utilizing EventEmitter, which allows more diverse interactions with the storage.

| Method | Arguments | Return value | Description |
| --- | --- | --- | --- |
| constructor | sampleTarget: uint = 20, sampleLimit: uint = 40, bias: double <0, 1) = 0.03 | SoundStorageEvent | The bias has the same purpose as in SoundStorage. The sampleTarget introduced here is the this.freqArr length at which the "SampleTarget" event will be triggered. sampleLimit works the same way, dispatching the "SampleLimit" event upon reaching the defined this.freqArr length. |
| add | frequency: double | void | Checks whether the current this.freqArr requires an event emission. After that, the base class add(fx) method is called. |
| getCurrentBias | N/A | Object { most: double, bias: double } | Returns the current bias value based on the user-defined bias and the most frequent sample value. |
| getOutliers | N/A | | |