als-statistics v1.2.0
als-statistics is a lightweight JavaScript library for statistical analysis, outlier detection, moving averages, and data manipulation. Version 1.0.0 is designed for both browser and Node.js environments, offering an intuitive API for quick data exploration and advanced statistical operations.
Note: This is a major release (1.0.0), replacing earlier 0.6.x versions. Key improvements include:
- Simplified column types: Column for strings and RatioColumn for numbers.
- Enhanced row filtering with filterRowsBy at both column and table levels.
- Automatic column type detection in tables.
- Static utilities for standalone use (range, newTable, newColumn).
📢 What's New in Version 1.1.0
Version 1.1.0 introduces key improvements and new statistical functions:
🔹 Improvements:
- filterColumns now supports filtering by numeric values.
- table.clone no longer adds non-existent columns when cloning.
🔹 New Features:
- Cronbach's Alpha: Measures internal consistency and reliability of a scale.
- Pearson Correlation (Pearson's r): Computes the correlation between two columns (pearsonPopulation and pearsonSample).
These additions make als-statistics a powerful tool for reliability analysis and correlation studies.
Installation
Via npm
npm install als-statistics
Browser Usage
Include the script in your HTML:
<script src="node_modules/als-statistics/statistics.js"></script>
<script>
const { newColumn } = Statistics;
const col = newColumn([1, 2, 3]);
console.log(col.mean); // 2
</script>
Quick Start: Plug and Play
als-statistics lets you analyze data in two ways: standalone columns/tables or organized within a Statistics instance. Here's how to get started.
Standalone Columns
Create columns independently for instant analysis:
const Statistics = require("als-statistics");
// Numeric column (RatioColumn)
const numbers = Statistics.newColumn([1, 2, 3, 4, 5]);
console.log(numbers.mean); // 3
console.log(numbers.median); // 3
// Categorical column (Column)
const categories = Statistics.newColumn(["A", "B", "A", "C"]);
console.log(categories.frequencies); // { A: 2, B: 1, C: 1 }
- How It Works: Statistics.newColumn(values) automatically selects:
  - Numbers → RatioColumn (full statistical toolkit).
  - Strings → Column (frequency analysis).
- Ideal for quick, one-off calculations.
Standalone Tables
Manage multiple columns with a table:
const table = Statistics.newTable();
table.addColumn("Numbers", [10, 20, 30, 40]);
table.addColumn("Labels", ["X", "Y", "Z", "W"]);
// Access column data
console.log(table.columns["Numbers"].mean); // 25
console.log(table.columns["Labels"].frequencies); // { X: 1, Y: 1, Z: 1, W: 1 }
- How It Works: addColumn(name, values) detects the data type and creates the appropriate column.
- Tables support filtering, cloning, and advanced operations (such as descriptive statistics, transposing rows to columns, and more).
Filtering Rows
Filter rows easily at the column or table level:
// Filter via a specific column
table.filterRowsBy("Numbers", v => v > 20); // Excludes rows where "Numbers" > 20
console.log(table.columns["Numbers"].values); // [10, 20]
console.log(table.columns["Labels"].values); // ["X", "Y"]
// Or filter directly on a column
const col = table.columns["Numbers"];
col.filterRowsBy(v => v === 10); // Excludes rows where "Numbers" === 10
console.log(col.values); // [20]
console.log(table.columns["Labels"].values); // ["Y"]
table.clearAllRowsFilters(); // Resets all filters
- Note: filterRowsBy excludes rows matching the condition. Use v => !condition to keep matches.
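The exclusion semantics are easy to get backwards, so here is a minimal sketch that mimics them on a plain array without using the library. Both excludeBy and not are hypothetical helpers introduced only for illustration; they are not part of the als-statistics API.

```javascript
// Hypothetical helper: drops every value for which fn returns true,
// mirroring the "exclude on match" behavior described in the note above.
function excludeBy(values, fn) {
  return values.filter((value, index) => !fn(value, index));
}

// Hypothetical helper: inverts a predicate so that matches are kept instead.
function not(fn) {
  return (value, index) => !fn(value, index);
}

const numbers = [10, 20, 30, 40];
const isLarge = v => v > 20;

const withoutLarge = excludeBy(numbers, isLarge);   // [10, 20]
const onlyLarge = excludeBy(numbers, not(isLarge)); // [30, 40]
console.log(withoutLarge, onlyLarge);
```

With the library itself, the same inversion would presumably be written inline, for example table.filterRowsBy("Numbers", v => !(v > 20)).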
Organizing with Statistics
Group multiple tables:
const stats = new Statistics();
const t1 = stats.addTable("Table1");
t1.addColumn("Data", [1, 2, 3]);
const t2 = stats.addTable("Table2");
t2.addColumn("Data", [4, 5, 6]);
// Aggregate metrics across tables
const means = stats.descriptive("mean");
console.log(means.columns["Table1"].values[0]); // 2
console.log(means.columns["Table2"].values[0]); // 5
Column Types: What You Get
Column (Categorical Data)
- Use For: Strings (e.g., categories, labels).
- Created By: Statistics.newColumn(["A", "B", "C"]) or table.addColumn("Name", ["A", "B", "C"]).
- Key Features:
  - values: Current data after filters.
  - n: Number of unfiltered rows.
  - frequencies: Counts of unique values (e.g., { A: 2, B: 1 }).
  - relativeFrequencies: Proportions (e.g., { A: 0.67, B: 0.33 }).
  - sorted: Alphabetically sorted values.
  - percentile(p), median, q1, q3: Percentile-based metrics (treats strings as ordered).
  - filterRows(indexes): Excludes rows by index.
  - filterRowsBy(fn): Excludes rows where fn(value, index) is true.
  - clearRowsFilters(indexes), clearAllRowsFilters(): Clears filters.
  - clone(filtered = true): Creates a copy with filtered or original data.
Example:
const col = Statistics.newColumn(["A", "B", "A", "C"]);
col.filterRowsBy(v => v === "B"); // Excludes "B"
console.log(col.values); // ["A", "A", "C"]
console.log(col.frequencies); // { A: 2, C: 1 }
RatioColumn (Numeric Data)
- Use For: Numbers (e.g., measurements, scores).
- Created By: Statistics.newColumn([1, 2, 3]) or table.addColumn("Name", [1, 2, 3]).
- Key Features (extends Column):
  - Basics: sum, mean, min, max, range
  - Variability:
    - Population: variancePopulation, stdDevPopulation, skewnessPopulation, kurtosisPopulation
    - Sample: varianceSample, stdDevSample, skewnessSample, kurtosisSample
  - Dispersion: cv, relativeDispersion, iqr
  - Advanced: geometricMean, harmonicMean, flatness, sumOfSquares, normalizedValues, zScores, confidenceInterval95, spectralPowerDensityArray, spectralPowerDensityMetric, noiseStability
  - Outliers: outliersIQR, outliersZScore(threshold = 3, twoFactors = true)
  - Methods: weightedMean(weights), ma(windowSize), noice()
Example:
const col = Statistics.newColumn([1, 2, 3, 100]);
col.filterRowsBy(v => v > 10); // Excludes 100
console.log(col.mean); // 2
console.log(col.outliersIQR); // []
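The Population and Sample variants listed above conventionally differ only in their divisor. The sketch below uses the textbook formulas (divide by n for a population, by n - 1 for a sample) on a plain array; the variance function is a hypothetical helper for illustration and is not taken from the library's source.

```javascript
// Textbook variance: population divides by n, sample divides by n - 1.
// Illustrative sketch only, not the library's implementation.
function variance(values, sample = false) {
  const n = values.length;
  const mean = values.reduce((sum, v) => sum + v, 0) / n;
  const sumOfSquares = values.reduce((sum, v) => sum + (v - mean) ** 2, 0);
  return sumOfSquares / (sample ? n - 1 : n);
}

const data = [1, 2, 3, 4, 5];
const populationVariance = variance(data);   // 10 / 5 = 2
const sampleVariance = variance(data, true); // 10 / 4 = 2.5
console.log(populationVariance, Math.sqrt(populationVariance));
console.log(sampleVariance, Math.sqrt(sampleVariance));
```

The sample form is the usual choice when the data is a subset drawn from a larger population, since dividing by n - 1 corrects the downward bias of the estimate.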
Table Features
Tables organize columns and provide powerful tools:
Creating & Adding Columns
const table = Statistics.newTable();
table.addColumn("Scores", [10, 20, 30, 40]);
table.addColumn("Grades", ["A", "B", "C", "D"]);
- Note: Column type (Column or RatioColumn) is determined automatically from the data.
Filtering Rows
table.filterRows([1]); // Excludes row index 1
console.log(table.columns["Scores"].values); // [10, 30, 40]
// Filter by a specific column
table.clearAllRowsFilters();
table.filterRowsBy("Scores", v => v > 20); // Excludes rows where "Scores" > 20
console.log(table.columns["Scores"].values); // [10, 20]
console.log(table.columns["Grades"].values); // ["A", "B"]
table.clearAllRowsFilters(); // Resets all filters
Computing New Columns
table.compute(({ Scores }) => Scores * 2, "Doubled");
console.log(table.columns["Doubled"].values); // [20, 40, 60, 80]
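Conceptually, compute evaluates the callback once per row, with the row exposed as an object keyed by column name. The following sketch reproduces that idea on plain arrays; computeColumn and the columns object are hypothetical stand-ins, and how the library builds rows internally is an assumption.

```javascript
// Hypothetical stand-in for a table: a plain object of equal-length arrays.
const columns = {
  Scores: [10, 20, 30, 40],
  Grades: ["A", "B", "C", "D"],
};

// Builds one row object per index and passes it to fn, collecting the results.
function computeColumn(cols, fn) {
  const names = Object.keys(cols);
  const rowCount = cols[names[0]].length;
  const result = [];
  for (let i = 0; i < rowCount; i++) {
    const row = {};
    for (const name of names) row[name] = cols[name][i];
    result.push(fn(row));
  }
  return result;
}

columns.Doubled = computeColumn(columns, ({ Scores }) => Scores * 2);
console.log(columns.Doubled); // [20, 40, 60, 80]
```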
Transposing
const transposed = table.transpose();
console.log(transposed.columns["0"].values); // [10, "A"]
Comparing Columns
const comp = table.compare("Scores", "Doubled");
console.log(comp.correlationSample); // 1.0
📊 DBSCAN Clustering
The dbscan(eps, minPts) method performs Density-Based Spatial Clustering of Applications with Noise (DBSCAN), a popular clustering algorithm.
🔹 Parameters:
- eps (default: 0.4): Maximum distance between points to be considered neighbors.
- minPts (default: 3): Minimum number of neighbors required to form a cluster.
🔹 Usage:
const table = Statistics.newTable();
table.addColumn('A', [1, 2, 3, 4, 5]);
table.addColumn('B', [10, 20, 30, 40, 50]);
table.addColumn('C', [100, 200, 300, 400, 500]);
const dbscan = table.dbscan(0.5, 2);
console.log(dbscan.labels); // Example output: [1, 1, 2, 2, -1] (-1 means noise)
console.log(dbscan.clusters.length); // Number of clusters formed
🔹 How It Works:
- Computes pairwise distances between all columns based on correlation.
- Expands clusters using density connectivity.
- Assigns labels:
  - -1: Noise
  - 1, 2, ...: Cluster IDs
🔹 Use Cases:
- Grouping correlated features in datasets.
- Detecting noise/outliers in data.
- Automatic feature selection by finding similar attributes.
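To make the cluster expansion step concrete, here is a generic DBSCAN sketch over a precomputed distance matrix. It is not the library's code: the dbscanLabels function is hypothetical, the distance matrix is hand-written rather than derived from correlations, and counting a point as its own neighbor is an assumption (it is the classic convention, but the library may differ).

```javascript
// Generic DBSCAN over a precomputed, symmetric distance matrix.
// Labels: -1 = noise, 1, 2, ... = cluster IDs (0 marks "unvisited" internally).
function dbscanLabels(distances, eps = 0.4, minPts = 3) {
  const n = distances.length;
  const labels = new Array(n).fill(0);

  // Assumption: a point's neighborhood includes the point itself (distance 0).
  const neighborsOf = i => {
    const result = [];
    for (let j = 0; j < n; j++) if (distances[i][j] <= eps) result.push(j);
    return result;
  };

  let clusterId = 0;
  for (let i = 0; i < n; i++) {
    if (labels[i] !== 0) continue;            // already processed
    const neighbors = neighborsOf(i);
    if (neighbors.length < minPts) {          // not a core point
      labels[i] = -1;
      continue;
    }
    clusterId += 1;
    labels[i] = clusterId;
    const queue = neighbors.filter(j => j !== i);
    while (queue.length > 0) {
      const j = queue.shift();
      if (labels[j] === -1) labels[j] = clusterId; // noise becomes a border point
      if (labels[j] !== 0) continue;                // already assigned
      labels[j] = clusterId;
      const next = neighborsOf(j);
      if (next.length >= minPts) queue.push(...next); // expand only from core points
    }
  }
  return labels;
}

// Hand-written distances: items 0-1 are close, items 2-3 are close, item 4 is isolated.
const distances = [
  [0.0, 0.1, 0.8, 0.8, 0.9],
  [0.1, 0.0, 0.8, 0.8, 0.9],
  [0.8, 0.8, 0.0, 0.2, 0.9],
  [0.8, 0.8, 0.2, 0.0, 0.9],
  [0.9, 0.9, 0.9, 0.9, 0.0],
];
const clusterLabels = dbscanLabels(distances, 0.5, 2);
console.log(clusterLabels); // [1, 1, 2, 2, -1]
```

Lowering eps or raising minPts makes clusters harder to form, so more items end up labeled as noise.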
Aggregating Metrics
const means = table.descriptive("mean");
console.log(means.columns["Scores"].values[0]); // 25
Cloning and Selecting Columns
Clone a table to work with a subset of data or columns:
// Clone with all columns and current row filters
const fullClone = table.clone(true);
console.log(fullClone.columns["Scores"].values); // [10, 20, 30, 40]
// Clone with selected columns
const selectedClone = table.clone(true, ["Scores"]); // Only "Scores"
console.log(selectedClone.columns["Scores"].values); // [10, 20, 30, 40]
console.log(selectedClone.columns["Grades"]); // undefined
// Clone excluding columns
const excludedClone = table.clone(true, ["-Grades"]); // All except "Grades"
console.log(excludedClone.columns["Scores"].values); // [10, 20, 30, 40]
console.log(excludedClone.columns["Grades"]); // undefined
// Clone with regex filter
table.addColumn("Scores2", [1, 2, 3, 4]);
const regexClone = table.clone(true, [/^Scores/]); // Columns starting with "Scores"
console.log(regexClone.columns["Scores"].values); // [10, 20, 30, 40]
console.log(regexClone.columns["Scores2"].values); // [1, 2, 3, 4]
- Options for clone(filtered, columnFilter):
  - filtered: true (keep row filters) or false (use original data).
  - columnFilter: Array of filters:
    - "Name": Include this column.
    - "-Name": Exclude this column (if no includes specified).
    - /pattern/: Include columns matching the regex.
- Note: If explicit includes (e.g., "A") are used, exclusions (e.g., "-B") are ignored.
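The precedence between includes and exclusions can be sketched as a small name-selection function. selectColumns is a hypothetical helper written from the rules above, not extracted from the library; in particular, treating a regex as an include that also disables exclusions is an assumption.

```javascript
// Hypothetical helper illustrating the columnFilter rules described above.
function selectColumns(names, columnFilter = []) {
  const includes = [];
  const excludes = [];
  for (const filter of columnFilter) {
    if (filter instanceof RegExp) includes.push(name => filter.test(name));
    else if (filter.startsWith("-")) excludes.push(filter.slice(1));
    else includes.push(name => name === filter);
  }
  // Includes take precedence: exclusions are ignored when any include is present.
  if (includes.length > 0) {
    return names.filter(name => includes.some(matches => matches(name)));
  }
  return names.filter(name => !excludes.includes(name));
}

const names = ["Scores", "Grades", "Scores2"];
console.log(selectColumns(names, ["Scores"]));             // ["Scores"]
console.log(selectColumns(names, ["-Grades"]));            // ["Scores", "Scores2"]
console.log(selectColumns(names, [/^Scores/]));            // ["Scores", "Scores2"]
console.log(selectColumns(names, ["Scores", "-Scores2"])); // ["Scores"]
```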
Removing Columns
Delete unwanted columns:
table.deleteColumn("Grades");
console.log(table.columns["Grades"]); // undefined
Managing Multiple Tables with Statistics
Use Statistics to group tables:
const stats = new Statistics();
stats.addTable("Sales").addColumn("Revenue", [100, 200, 300]);
stats.addTable("Costs").addColumn("Expenses", [50, 100, 150]);
stats.filterRows([1]); // Filters all tables
const statsMeans = stats.descriptive("mean");
console.log(statsMeans.columns["Sales"].values[0]); // 200
- Methods:
  - filterRows(indexes): Applies to all tables.
  - clearRowsFilters(indexes), clearAllRowsFilters(): Clears filters across tables.
Noise Analysis
Analyze noise in numeric data:
const col = Statistics.newColumn([1, 2, 3, 100]);
const noise = col.noice();
console.log(noise.noiseByZ(0.1)); // false (outlier present)
📏 Cronbach's Alpha
Cronbach's Alpha is a measure of internal consistency, commonly used to evaluate the reliability of a questionnaire or test.
🔹 Usage:
const table = Statistics.newTable();
table.addColumn('Q1', [4, 5, 3, 4, 5]);
table.addColumn('Q2', [3, 4, 2, 3, 4]);
table.addColumn('Q3', [5, 3, 4, 5, 2]);
const alpha = table.cronbachAlpha;
console.log(alpha.alpha); // Example: 0.74
🔹 Compute Alpha if an Item is Removed:
console.log(alpha.perColumn);
// { Q1: 0.65, Q2: 0.68, Q3: 0.71 } (example values)
If alpha = 0, it means that all responses are identical (zero variance).
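For reference, the standard formula is alpha = k / (k - 1) * (1 - sum of item variances / variance of row totals), where k is the number of items. The sketch below implements that textbook formula on plain arrays. It is not the library's implementation: sampleVariance and cronbachAlphaOf are hypothetical helpers, the use of sample variance is an assumption, and returning 0 for zero total variance simply mirrors the note above.

```javascript
// Sample variance (divides by n - 1).
function sampleVariance(values) {
  const n = values.length;
  const mean = values.reduce((sum, v) => sum + v, 0) / n;
  return values.reduce((sum, v) => sum + (v - mean) ** 2, 0) / (n - 1);
}

// items: array of k columns, each an array of responses of equal length.
function cronbachAlphaOf(items) {
  const k = items.length;
  // Total score per respondent (row) across all items.
  const totals = items[0].map((_, row) =>
    items.reduce((sum, item) => sum + item[row], 0)
  );
  const totalVariance = sampleVariance(totals);
  if (totalVariance === 0) return 0; // assumption: mirrors the zero-variance note
  const itemVarianceSum = items.reduce((sum, item) => sum + sampleVariance(item), 0);
  return (k / (k - 1)) * (1 - itemVarianceSum / totalVariance);
}

// Three perfectly consistent items: alpha comes out at approximately 1.
const consistent = cronbachAlphaOf([
  [1, 2, 3, 4],
  [1, 2, 3, 4],
  [1, 2, 3, 4],
]);
console.log(consistent);
```

Items that move together inflate the variance of the totals relative to the sum of item variances, which pushes alpha toward 1; items that move against each other can drive it below 0.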
📊 Pearson Correlation (Pearson's r)
Pearson's correlation coefficient (r) measures the strength of a linear relationship between two numeric columns.
🔹 Usage:
const table = Statistics.newTable();
table.addColumn('X', [10, 20, 30, 40, 50]);
table.addColumn('Y', [15, 25, 35, 45, 55]);
const correlation = table.compare('X', 'Y').pearsonSample; // or pearsonPopulation
console.log(correlation.r); // 1 (perfect positive correlation)
console.log(correlation.p); // 0 (statistical significance)
Note: The p-value determines statistical significance:
- If p < 0.05, the correlation is significant.
- If p > 0.05, the correlation may be due to chance.
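As a sanity check on the r value, the coefficient can be computed by hand as the covariance divided by the product of the standard deviations. The sketch below does this on plain arrays; pearsonR is a hypothetical helper, it does not compute a p-value, and because the n or n - 1 divisor cancels out of r itself, it says nothing about how pearsonPopulation and pearsonSample differ inside the library.

```javascript
// Pearson's r from raw sums; the variance divisor cancels, so none is applied.
function pearsonR(x, y) {
  const n = x.length;
  const meanX = x.reduce((sum, v) => sum + v, 0) / n;
  const meanY = y.reduce((sum, v) => sum + v, 0) / n;
  let covariance = 0;
  let varX = 0;
  let varY = 0;
  for (let i = 0; i < n; i++) {
    const dx = x[i] - meanX;
    const dy = y[i] - meanY;
    covariance += dx * dy;
    varX += dx * dx;
    varY += dy * dy;
  }
  return covariance / Math.sqrt(varX * varY);
}

const xs = [10, 20, 30, 40, 50];
const ys = [15, 25, 35, 45, 55];
const rPositive = pearsonR(xs, ys);                    // 1: Y rises exactly with X
const rNegative = pearsonR(xs, [...ys].reverse());     // -1: Y falls exactly as X rises
console.log(rPositive, rNegative);
```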
Utilities
Statistics.range(start, end, step): Generates an array of numbers.
const r = Statistics.range(1, 5, 2); // [1, 3]
Notes
- Filtering: filterRowsBy(colName, fn) or column.filterRowsBy(fn) excludes rows where fn returns true. Negate the condition (v => !fn(v)) to keep matches. When used via a table, the filter applies to all columns.
- Caching: Metrics are cached and refresh automatically when filters change.
- Version 1.0.0: Unified Column/RatioColumn system replaces older types.