@customerglu/api-retry-monitor

API Retry Monitor

A retry and monitoring library that adds randomized-delay retries and Slack reporting to axios-based API calls. It provides automatic retries, failure tracking, and real-time alerts.

Installation

The package is available on GitHub Packages. Add it to your project with npm:

# Install:
npm install @customerglu/api-retry-monitor

Quick Start

Basic Integration

Replace your direct axios calls with the retry-enabled version:

// Before
const axios = require("axios");
const response = await axios.get("https://api.example.com/data");

// After
const { createApiRetryMonitor } = require("@customerglu/api-retry-monitor");
const apiMonitor = createApiRetryMonitor({
  slackWebhookUrl: process.env.SLACK_WEBHOOK_URL,
});
const axiosInstance = apiMonitor.createAxiosInstance();
const response = await axiosInstance.get("https://api.example.com/data");

Adding Context for Better Failure Tracking

// Add context to get better reporting
await axiosInstance.post("https://api.example.com/events", data, {
  context: {
    eventId: "unique-request-id",
    client: "clientName",
    eventType: "dataSync",
  },
});
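To illustrate why context fields matter, the sketch below groups failure records by client and eventType, the kind of breakdown a report can only produce when those fields are supplied. It is a hypothetical illustration, not the library's code: the summarizeFailures name and the record shape ({ context: {...} }) are assumptions.

```javascript
// Hypothetical sketch, not the library's internals: count failures per
// client and per event type from the context attached to each request.
// Assumed record shape: { context: { client, eventType, ... } }.
function summarizeFailures(failures) {
  const summary = { total: failures.length, byClient: {}, byEventType: {} };
  for (const { context = {} } of failures) {
    // Records without context collapse into "unknown", which is why
    // requests sent without context are hard to attribute later.
    const client = context.client ?? "unknown";
    const eventType = context.eventType ?? "unknown";
    summary.byClient[client] = (summary.byClient[client] ?? 0) + 1;
    summary.byEventType[eventType] = (summary.byEventType[eventType] ?? 0) + 1;
  }
  return summary;
}
```

Requests sent without context all land in the "unknown" bucket in this sketch, which makes a failure spike difficult to trace back to a client or event type.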

Integration Examples

Express Application

const express = require("express");
const { createApiRetryMonitor } = require("@customerglu/api-retry-monitor");

const app = express();
const apiMonitor = createApiRetryMonitor({
  slackWebhookUrl: process.env.SLACK_WEBHOOK_URL,
});

// Create a reusable axios instance
const api = apiMonitor.createAxiosInstance();

app.get("/data", async (req, res) => {
  try {
    // Use with automatic retry handling
    // (req.id is not set by Express itself; this assumes request-ID middleware)
    const response = await api.get("https://api.example.com/data", {
      context: { requestId: req.id, client: req.query.client },
    });
    res.json(response.data);
  } catch (error) {
    res.status(500).json({ error: error.message });
  }
});

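// Start the server and keep a handle so it can be closed on shutdown
const server = app.listen(process.env.PORT || 3000);

// Clean shutdown
process.on("SIGTERM", () => {
  apiMonitor.shutdown();
  server.close();
});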

Serverless Function

const { createApiRetryMonitor } = require("@customerglu/api-retry-monitor");

// Create outside handler for reuse between invocations
const apiMonitor = createApiRetryMonitor({
  slackWebhookUrl: process.env.SLACK_WEBHOOK_URL,
  alertWindowHours: 1,
});
const api = apiMonitor.createAxiosInstance();

exports.handler = async (event) => {
  try {
    const result = await api.post(
      "https://api.example.com/process",
      event.body,
      { context: { eventId: event.requestId } }
    );
    return { statusCode: 200, body: JSON.stringify(result.data) };
  } catch (error) {
    return { statusCode: 500, body: JSON.stringify({ error: error.message }) };
  }
};

Using with Any Promise-Based API Call

const { createApiRetryMonitor } = require("@customerglu/api-retry-monitor");

const apiMonitor = createApiRetryMonitor({
  slackWebhookUrl: process.env.SLACK_WEBHOOK_URL, // listed as required under Configuration Options
  maxRetries: 5,
  minDelayMs: 2000,
  maxDelayMs: 8000,
});

// Works with any promise-returning function, not just axios
const result = await apiMonitor.executeWithRetry(
  // Database query, API call, or any async operation
  // (someAsyncOperation is a placeholder for your own function)
  async () => someAsyncOperation(),
  { eventId: "operation-id" }
);

Configuration Options

Pass any of the following options to createApiRetryMonitor:

const apiMonitor = createApiRetryMonitor({
  // Required
  slackWebhookUrl: process.env.SLACK_WEBHOOK_URL, // Where to send alerts

  // Optional with defaults
  alertWindowHours: 1, // How often to send summary reports (in hours)
  burstThreshold: 5, // How many failures trigger an immediate alert
  burstWindowSeconds: 60, // Time window for burst detection
  flushThreshold: 20, // Max records before flushing to prevent memory leaks
  maxRetries: 10, // Maximum retry attempts before giving up
  minDelayMs: 1000, // Minimum retry delay (1 second)
  maxDelayMs: 10000, // Maximum retry delay (10 seconds)
});
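To make the interplay of maxRetries, minDelayMs, and maxDelayMs concrete, here is a minimal sketch of a random-delay retry loop. It is not the library's implementation: retryWithRandomDelay and randomDelayMs are hypothetical names, and the sketch assumes that maxRetries counts retries after the initial attempt.

```javascript
// Hypothetical sketch, not the library's source: a retry loop driven by
// options named like the configuration above.

// Pick a uniformly random whole-millisecond delay in [minDelayMs, maxDelayMs].
function randomDelayMs(minDelayMs, maxDelayMs) {
  return minDelayMs + Math.floor(Math.random() * (maxDelayMs - minDelayMs + 1));
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Assumption: maxRetries counts retries made *after* the first attempt,
// so the operation runs at most maxRetries + 1 times.
async function retryWithRandomDelay(operation, options = {}) {
  const { maxRetries = 10, minDelayMs = 1000, maxDelayMs = 10000 } = options;
  let lastError;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await operation(attempt);
    } catch (error) {
      lastError = error;
      // Wait only if another attempt will follow.
      if (attempt < maxRetries) {
        await sleep(randomDelayMs(minDelayMs, maxDelayMs));
      }
    }
  }
  throw lastError; // every attempt failed
}
```

Under these assumptions, the default settings would give a worst case of ten waits of up to 10 seconds each before the final error is thrown, which is worth weighing against any upstream request timeout.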

Alert Types

The library provides two types of Slack alerts:

Burst Failure Alerts - Sent immediately when multiple failures occur in a short time
Scheduled Reports - Sent at regular intervals with failure statistics

Both include:

Total failure count
Successful retries count
Permanent failures count
Client summaries
Status code breakdowns
Event type statistics
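As an illustration of how a burst alert could be triggered, the sketch below keeps a sliding window of failure timestamps and signals once burstThreshold failures fall within burstWindowSeconds. It is a hypothetical sketch rather than the library's code: createBurstDetector is an invented name, and clearing the window after an alert is an assumption made to avoid repeated alerts for the same burst.

```javascript
// Hypothetical sketch of burst detection; not the library's internals.
function createBurstDetector({ burstThreshold = 5, burstWindowSeconds = 60 } = {}) {
  const windowMs = burstWindowSeconds * 1000;
  let timestamps = [];

  // Record one failure. Returns true when the window holds enough failures
  // to justify an immediate alert.
  return function recordFailure(nowMs = Date.now()) {
    timestamps.push(nowMs);
    // Drop failures that have aged out of the window.
    timestamps = timestamps.filter((t) => nowMs - t < windowMs);
    if (timestamps.length >= burstThreshold) {
      timestamps = []; // assumption: reset so one burst yields one alert
      return true;
    }
    return false;
  };
}
```

In this sketch, raising burstThreshold or shortening burstWindowSeconds makes an alert less likely, because more failures must land closer together.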

Tips for Effective Implementation

Add Meaningful Context: Always include identifiers in the context object
Handle Graceful Shutdowns: Call apiMonitor.shutdown() when your application is shutting down
Configure Alert Windows: Set appropriate alertWindowHours based on your traffic and operation patterns
Set Practical Retry Limits: Balance between persistence and failing fast with maxRetries

Troubleshooting

No Slack Alerts: Verify your Slack webhook URL is correct and the channel exists
Memory Usage: If you notice high memory usage, lower the flushThreshold value
Excess Alerts: Increase burstThreshold or burstWindowSeconds to reduce alert frequency
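When no alerts arrive, it can help to test the webhook on its own, outside the library. Slack incoming webhooks accept a JSON POST with a text field, so a small check along these lines may narrow the problem down. The checkSlackWebhook helper is hypothetical, and the sketch assumes Node 18 or later for the global fetch.

```javascript
// Sketch for checking a Slack incoming webhook independently of the library.
// Assumes Node 18+ (global fetch); checkSlackWebhook is a hypothetical helper.
// fetchImpl is injectable so the check can be exercised without network access.
async function checkSlackWebhook(webhookUrl, fetchImpl = globalThis.fetch) {
  if (!webhookUrl) {
    return { ok: false, reason: "SLACK_WEBHOOK_URL is not set" };
  }
  const response = await fetchImpl(webhookUrl, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ text: "api-retry-monitor webhook test" }),
  });
  const body = await response.text();
  return response.ok
    ? { ok: true }
    : { ok: false, reason: `HTTP ${response.status}: ${body}` };
}

// Usage (posts a real test message to the channel):
// checkSlackWebhook(process.env.SLACK_WEBHOOK_URL).then(console.log);
```

If the check succeeds but the library still sends nothing, the webhook itself is probably fine and the configuration passed to createApiRetryMonitor is the next thing to inspect.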

Features

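Random delay retry mechanism (1-10 seconds by default)
Up to 10 retry attempts by default (configurable with maxRetries)
Failure tracking with memory management
Hourly reporting via Slack by default (configurable with alertWindowHours)
Burst failure detection (by default, 5+ failures in 1 minute)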
Support for Axios HTTP client
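The memory management mentioned above can be pictured as a bounded buffer that hands off its records once flushThreshold is reached. The sketch below shows one way this could work. createFailureBuffer and the onFlush callback are hypothetical names, and what the library actually does with flushed records is not specified here.

```javascript
// Hypothetical sketch of threshold-based flushing; not the library's code.
// Records accumulate in memory and are handed to onFlush once the buffer
// reaches flushThreshold, so the buffer never grows without bound.
function createFailureBuffer({ flushThreshold = 20, onFlush = () => {} } = {}) {
  let records = [];

  function flush() {
    if (records.length === 0) return;
    const batch = records;
    records = []; // release the old array before handing it off
    onFlush(batch);
  }

  function add(record) {
    records.push(record);
    if (records.length >= flushThreshold) flush();
  }

  return { add, flush, size: () => records.length };
}
```

In this model a lower flushThreshold trades more frequent flushes for a smaller in-memory footprint.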
