universal-llm-completion v1.1.9

Universal LLM Completion Module (TypeScript)

Overview

The Universal LLM Completion Module is a flexible and robust TypeScript solution for integrating Large Language Models (LLMs) into your applications. It provides a unified interface for both OpenAI's GPT models and Anthropic's Claude models, supporting both streaming and non-streaming responses, as well as handling tool calls.

Key features:

  • Unified TypeScript API for GPT and Claude models
  • Support for streaming and non-streaming responses
  • Handling of tool calls with type safety
  • Built-in cancellation mechanism
  • Compatible with both Server-Sent Events (SSE) and WebSocket connections
  • Full TypeScript support with robust type definitions

Installation

npm install universal-llm-completion

Usage

Basic Example

import express from 'express';
import { Server } from 'socket.io';
import { createServer } from 'http';
import { llmCompletion, cancelConnection, LLMCompletionOptions, LLMResponse } from 'universal-llm-completion';

const app = express();
const httpServer = createServer(app);
const io = new Server(httpServer);

// For Express (SSE)
app.post('/llm-stream', async (req: express.Request, res: express.Response) => {
  const { payload, uuid } = req.body;
  const options: LLMCompletionOptions = {
    payload,
    clientObj: res,
    uuid,
    isSocket: false,
    apiKey: process.env.OPENAI_API_KEY // or ANTHROPIC_API_KEY
  };
  
  try {
    // The module streams chunks to `res` internally; the resolved value
    // also carries the full completion, tool calls, and usage.
    const response: LLMResponse = await llmCompletion(options);
  } catch (error) {
    console.error('Error in LLM completion:', error);
    res.status(500).json({ error: 'An error occurred during LLM processing' });
  }
});

// For WebSocket
io.on('connection', (socket) => {
  socket.on('llm-request', async ({ payload, uuid }) => {
    const options: LLMCompletionOptions = {
      payload,
      clientObj: socket,
      uuid,
      isSocket: true,
      apiKey: process.env.OPENAI_API_KEY // or ANTHROPIC_API_KEY
    };
    
    try {
      // Chunks are emitted to the client as 'llm-response' events.
      const response: LLMResponse = await llmCompletion(options);
    } catch (error) {
      console.error('Error in LLM completion:', error);
      socket.emit('error', { message: 'An error occurred during LLM processing' });
    }
  });
});

// To cancel a stream
app.post('/cancel-stream', (req: express.Request, res: express.Response) => {
  const { uuid } = req.body;
  const cancelled = cancelConnection(uuid);
  res.json({ success: cancelled });
});

httpServer.listen(3000, () => {
  console.log('Server is running on port 3000');
});

API Reference

llmCompletion(options): Promise&lt;LLMResponse&gt;

Main function to interact with LLMs.

Parameters:

  • options (LLMCompletionOptions):
    • payload (LLMPayload): The request payload in OpenAI format.
    • clientObj (Express.Response | Socket): The client object (Express response object or WebSocket).
    • uuid (string): A unique identifier for the request.
    • isSocket (boolean): Whether the client is a WebSocket connection.
    • apiKey (string): The API key for the LLM service.

Returns:

A Promise that resolves to an LLMResponse object containing:

  • completion (string): The generated text.
  • toolCalls (ToolCall[]): Any tool calls made during the completion.
  • usage (Usage): Token usage information.
  • cancelled (boolean): Whether the request was cancelled.
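
For reference, here is a minimal sketch of a non-streaming call that consumes these fields (the /llm-once route is illustrative, not part of the module):

// Minimal sketch: set stream: false and read the resolved LLMResponse.
app.post('/llm-once', async (req: express.Request, res: express.Response) => {
  const { payload, uuid } = req.body;
  const result: LLMResponse = await llmCompletion({
    payload: { ...payload, stream: false },
    clientObj: res,
    uuid,
    isSocket: false,
    apiKey: process.env.OPENAI_API_KEY
  });

  if (result.cancelled) return res.json({ cancelled: true });
  res.json({
    text: result.completion,              // the generated text
    toolCalls: result.toolCalls,          // tool calls made during completion
    tokensUsed: result.usage.total_tokens // token usage information
  });
});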

Types:

interface LLMPayload {
  model: string;
  messages: Message[];
  temperature?: number;
  max_tokens?: number;
  stream: boolean;
  tools?: FunctionTool[];
  tool_choice?: 'auto' | { type: 'function'; function: { name: string } };
  // ... other OpenAI parameters
}

interface Message {
  role: 'system' | 'user' | 'assistant' | 'function';
  content: string;
  name?: string;
  function_call?: {
    name: string;
    arguments: string;
  };
}

interface FunctionTool {
  type: 'function';
  function: {
    name: string;
    description: string;
    parameters: Record<string, unknown>;
  };
}

interface ToolCall {
  id: string;
  type: string;
  function: {
    name: string;
    arguments: string;
  };
}

interface Usage {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
}

cancelConnection(uuid: string): boolean

Function to cancel an ongoing stream.

Parameters:

  • uuid (string): The unique identifier of the request to cancel.

Returns:

Boolean indicating whether the cancellation was successful.
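
A common pattern is to cancel when the client goes away. A minimal sketch, reusing the SSE route from the basic example:

// Illustrative: cancel the in-flight request if the SSE client disconnects.
app.post('/llm-stream', async (req: express.Request, res: express.Response) => {
  const { payload, uuid } = req.body;

  // cancelConnection returns false if the uuid is unknown or already finished.
  req.on('close', () => cancelConnection(uuid));

  await llmCompletion({
    payload, clientObj: res, uuid, isSocket: false,
    apiKey: process.env.OPENAI_API_KEY
  });
});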

Detailed Usage

Streaming Responses

For streaming responses, set stream: true in the payload. The module will handle the streaming process internally:

  • For SSE (Server-Sent Events), it will send chunks of data to the client as they are received.
  • For WebSocket connections, it will emit 'llm-response' events with the chunks of data.

Example of handling the streaming response on the client side. Because the native EventSource API only issues GET requests, the SSE example reads the POST endpoint with fetch; it assumes the module writes standard "data: {...}" SSE lines:

// For SSE: EventSource only issues GET requests, so read the POST
// /llm-stream endpoint with fetch and a stream reader instead.
const response = await fetch('/llm-stream', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ payload, uuid })
});
const reader = response.body.getReader();
const decoder = new TextDecoder();
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  // Assumes "data: {...}" SSE lines (partial lines across chunks not handled).
  for (const line of decoder.decode(value, { stream: true }).split('\n')) {
    if (!line.startsWith('data: ')) continue;
    const data = JSON.parse(line.slice(6));
    if (data.type === 'text') console.log('Received text:', data.content);
    else if (data.type === 'llm-tool-call-status') console.log('Tool call status:', data);
  }
}

// For WebSocket
socket.on('llm-response', (data) => {
  const parsedData = JSON.parse(data);
  if (parsedData.type === 'text') {
    console.log('Received text:', parsedData.content);
  }
});

socket.on('llm-tool-call-status', (data) => {
  console.log('Tool call status:', data);
});

Handling Tool Calls

The module supports tool calls (function calling) for both OpenAI and Anthropic models. Here's an example of how to set up a payload with tool calls:

const payload: LLMPayload = {
  model: "gpt-4o",
  messages: [
    { role: "system", content: "You are a helpful assistant with access to tools." },
    { role: "user", content: "What's the weather like in New York?" }
  ],
  stream: true,
  tools: [
    {
      type: "function",
      function: {
        name: "get_current_weather",
        description: "Get the current weather in a given location",
        parameters: {
          type: "object",
          properties: {
            location: {
              type: "string",
              description: "The city and state, e.g. San Francisco, CA"
            },
            unit: {
              type: "string",
              enum: ["celsius", "fahrenheit"]
            }
          },
          required: ["location"]
        }
      }
    }
  ],
  tool_choice: "auto"
};

The module will handle the tool calls internally and include them in the response. You can implement the actual tool functionality in your application and respond to the LLM with the tool's output.
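
As a sketch of that round trip (getCurrentWeather is a hypothetical local function; the tool-result message uses the Message shape defined above):

// Illustrative: run the requested tool, then send its output back to the model.
const response = await llmCompletion(options);

for (const call of response.toolCalls) {
  if (call.function.name === 'get_current_weather') {
    const args = JSON.parse(call.function.arguments);
    const weather = await getCurrentWeather(args.location, args.unit);

    // Append the tool output and ask the model to continue.
    payload.messages.push({
      role: 'function',
      name: call.function.name,
      content: JSON.stringify(weather)
    });
    await llmCompletion({ ...options, payload });
  }
}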

Error Handling

The module uses a try-catch block internally. If an error occurs, it will be logged and re-thrown. Make sure to implement proper error handling in your application, as shown in the examples above.

Configuration

To switch between different API keys (e.g., for rate limit handling), you can pass the apiKey in the LLMCompletionOptions. If a rate limit is reached (status 429), you can retry the request with a different API key.
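
A sketch of that retry pattern follows. It assumes an axios-style error exposing the HTTP status on error.response.status (inspect the actual error shape in your application), and OPENAI_API_KEY_BACKUP is an illustrative second key:

// Illustrative: retry once with a fallback key on rate limiting (HTTP 429).
async function completeWithFallback(options: LLMCompletionOptions): Promise<LLMResponse> {
  try {
    return await llmCompletion(options);
  } catch (error: any) {
    if (error?.response?.status === 429 && process.env.OPENAI_API_KEY_BACKUP) {
      return llmCompletion({ ...options, apiKey: process.env.OPENAI_API_KEY_BACKUP });
    }
    throw error;
  }
}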

Limitations

  • The module currently supports GPT and Claude models. Other LLMs may require additional implementation.
  • Image inputs are not supported in the current version.

Dependencies

This module uses axios for HTTP requests and eventsource-parser for handling server-sent events. These dependencies are automatically installed when you install this package.

API Keys

To use this module, you will need to provide your own API keys:

  • For OpenAI models: An OpenAI API key
  • For Anthropic models: An Anthropic API key

These should be set as environment variables in your application or passed directly to the llmCompletion function.
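
One way to pick the key per request is to branch on the model name; the claude prefix check below is an assumption about your model ids, not module behavior:

// Illustrative: select the API key by model family before calling llmCompletion.
function apiKeyFor(model: string): string {
  return model.startsWith('claude')
    ? process.env.ANTHROPIC_API_KEY!
    : process.env.OPENAI_API_KEY!;
}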

Optional Integrations

While this module doesn't require the official OpenAI or Anthropic SDKs, it is compatible with projects that use them. If you're already using these SDKs in your project, this module will not conflict with them.

License

This project is licensed under the MIT License.
