llm-polyglot v2.6.0

llm-polyglot extends the OpenAI SDK to provide a consistent interface across different LLM providers. Use the same familiar OpenAI-style API with Anthropic, Google, and others.

Provider Support

Native API Support Status:

| Provider API | Status | Chat | Basic Stream | Functions/Tool calling | Function streaming | Notes |
| ------------ | ------ | ---- | ------------ | ---------------------- | ------------------ | ----- |
| OpenAI | ✅ | ✅ | ✅ | ✅ | ✅ | Direct SDK proxy |
| Anthropic | ✅ | ✅ | ✅ | ✅ | ✅ | Claude models |
| Google | ✅ | ✅ | ✅ | ✅ | ✅ | Gemini models + context caching |
| Azure | 🚧 | | | | | OpenAI model hosting |
| Cohere | ❌ | - | - | - | - | Not supported |
| AI21 | ❌ | - | - | - | - | Not supported |

Stream Types:

  • Basic Stream: Simple text streaming
  • Partial JSON Stream: Progressive JSON object construction during streaming
  • Function Stream: Streaming function/tool calls and their results

OpenAI-Compatible Hosting Providers:

These providers use the OpenAI SDK format, so they work directly with the OpenAI client configuration:

| Provider | How to Use | Available Models |
| -------- | ---------- | ---------------- |
| Together | Use OpenAI client with Together base URL | Mixtral, Llama, OpenChat, Yi, others |
| Anyscale | Use OpenAI client with Anyscale base URL | Mistral, Llama, others |
| Perplexity | Use OpenAI client with Perplexity base URL | pplx-* models |
| Replicate | Use OpenAI client with Replicate base URL | Various open models |
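
Since the openai provider is a direct SDK proxy, it should accept the standard OpenAI client options. Below is a minimal sketch, assuming createLLMClient forwards baseURL and apiKey to the underlying OpenAI constructor; the Together base URL and model name are illustrative, so check each provider's docs:

import { createLLMClient } from "llm-polyglot";

// Hypothetical example: point the OpenAI-compatible client at Together
const together = createLLMClient({
  provider: "openai",
  baseURL: "https://api.together.xyz/v1", // provider-specific base URL
  apiKey: process.env.TOGETHER_API_KEY
});

const completion = await together.chat.completions.create({
  model: "mistralai/Mixtral-8x7B-Instruct-v0.1", // illustrative model name
  messages: [{ role: "user", content: "Hello!" }]
});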

Installation

# Base installation
npm install llm-polyglot openai

# Provider-specific SDKs (as needed)
npm install @anthropic-ai/sdk    # For Anthropic
npm install @google/generative-ai # For Google/Gemini

Basic Usage

import { createLLMClient } from "llm-polyglot";

// Initialize provider-specific client
const client = createLLMClient({
  provider: "anthropic" // or "google", "openai", etc.
});

// Use consistent OpenAI-style interface
const completion = await client.chat.completions.create({
  model: "claude-3-opus-20240229",
  messages: [{ role: "user", content: "Hello!" }],
  max_tokens: 1000
});
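
Responses come back in the OpenAI shape regardless of provider, so the reply text is read the same way everywhere:

// Response objects follow the OpenAI SDK format
console.log(completion.choices[0]?.message?.content);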

Provider-Specific Features

Anthropic

The llm-polyglot library provides support for Anthropic's API, including standard chat completions, streaming chat completions, and function calling. Both input parameters and responses match those of the OpenAI SDK exactly; for more detailed documentation, see the OpenAI API reference: https://platform.openai.com/docs/api-reference

The Anthropic SDK is required when using the anthropic provider; llm-polyglot uses only the types it provides.

bun add @anthropic-ai/sdk

const client = createLLMClient({ provider: "anthropic" });

// Standard completion
const response = await client.chat.completions.create({
  model: "claude-3-opus-20240229",
  messages: [{ role: "user", content: "Hello!" }]
});

// Streaming
const stream = await client.chat.completions.create({
  model: "claude-3-opus-20240229",
  messages: [{ role: "user", content: "Hello!" }],
  stream: true
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}

// Tool/Function calling
const result = await client.chat.completions.create({
  model: "claude-3-opus-20240229",
  messages: [{ role: "user", content: "Analyze this data" }],
  tools: [{
    type: "function",
    function: {
      name: "analyze",
      parameters: {
        type: "object",
        properties: {
          sentiment: { type: "string" }
        }
      }
    }
  }]
});
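
Because the response mirrors the OpenAI SDK, tool calls come back on the message in the usual OpenAI format; parsing the arguments works as it would with the OpenAI client:

// Tool calls are returned in OpenAI format
const toolCall = result.choices[0]?.message?.tool_calls?.[0];
if (toolCall?.type === "function") {
  console.log(toolCall.function.name);                  // "analyze"
  console.log(JSON.parse(toolCall.function.arguments)); // e.g. { sentiment: "..." }
}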

Google (Gemini)

The llm-polyglot library provides support for Google's Gemini API including:

  • Standard chat completions with OpenAI-compatible interface
  • Streaming chat completions with delta updates
  • Function/tool calling with automatic schema conversion
  • Context caching for token optimization (requires paid API key)
  • Grounding support with Google Search integration
  • Safety settings and model generation config
  • Session management for stateful conversations
  • Automatic response transformation with source attribution

The Google generative-ai SDK is required when using the google provider:

  bun add @google/generative-ai

For all of the above, request schemas follow OpenAI's format; llm-polyglot translates the OpenAI parameter spec into Gemini's model spec under the hood.

Basic Usage

const client = createLLMClient({ provider: "google" });

// Standard completion
const completion = await client.chat.completions.create({
  model: "gemini-1.5-flash-latest",
  messages: [{ role: "user", content: "Hello!" }],
  max_tokens: 1000
});

// With grounding (Google Search)
const groundedCompletion = await client.chat.completions.create({
  model: "gemini-1.5-flash-latest",
  messages: [{ role: "user", content: "What are the latest AI developments?" }],
  groundingThreshold: 0.7,
  max_tokens: 1000
});

// With safety settings
const safeCompletion = await client.chat.completions.create({
  model: "gemini-1.5-flash-latest",
  messages: [{ role: "user", content: "Tell me a story" }],
  additionalProperties: {
    safetySettings: [{
      category: "HARM_CATEGORY_HARASSMENT",
      threshold: "BLOCK_MEDIUM_AND_ABOVE"
    }]
  }
});

// With session management
const sessionCompletion = await client.chat.completions.create({
  model: "gemini-1.5-flash-latest",
  messages: [{ role: "user", content: "Remember this: I'm Alice" }],
  additionalProperties: {
    sessionId: "user-123"
  }
});
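
A follow-up request that reuses the same sessionId continues the stateful conversation; a small sketch building on the hypothetical "user-123" session above:

// Reusing the sessionId carries the earlier turns into this request
const followUp = await client.chat.completions.create({
  model: "gemini-1.5-flash-latest",
  messages: [{ role: "user", content: "What is my name?" }],
  additionalProperties: {
    sessionId: "user-123"
  }
});

console.log(followUp.choices[0]?.message?.content); // should recall "Alice"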

Context Caching

Context Caching is a feature specific to Gemini that helps cut down on duplicate token usage by allowing you to create a cache with a TTL:

// Create a cache
const cache = await client.cacheManager.create({
  model: "gemini-1.5-flash-8b",
  messages: [{ role: "user", content: "Context to cache" }],
  ttlSeconds: 3600 // Cache for 1 hour
});

// Use the cached context
const completion = await client.chat.completions.create({
  model: "gemini-1.5-flash-8b",
  messages: [{ role: "user", content: "Follow-up question" }],
  additionalProperties: {
    cacheName: cache.name
  }
});
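
The cacheManager wraps Google's cache manager. Assuming it passes through the rest of the GoogleAICacheManager interface (not shown in this README, so verify against the source), caches can likely be listed and deleted once they are no longer needed:

// Assumption: list() and delete() are proxied from GoogleAICacheManager
const caches = await client.cacheManager.list();
console.log(caches);

// Remove the cache when the conversation is finished
await client.cacheManager.delete(cache.name);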

Function/Tool Calling

const completion = await client.chat.completions.create({
  model: "gemini-1.5-flash-latest",
  messages: [{ role: "user", content: "Analyze this data" }],
  tools: [{
    type: "function",
    function: {
      name: "analyze",
      parameters: {
        type: "object",
        properties: {
          sentiment: { type: "string" }
        }
      }
    }
  }],
  tool_choice: {
    type: "function",
    function: { name: "analyze" }
  }
});

Error Handling
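
A minimal sketch, assuming failures surface as thrown exceptions in the OpenAI SDK style (the library proxies the OpenAI client, but the exact error classes are not documented here):

try {
  const completion = await client.chat.completions.create({
    model: "claude-3-opus-20240229",
    messages: [{ role: "user", content: "Hello!" }]
  });
  console.log(completion.choices[0]?.message?.content);
} catch (error) {
  // Provider/API errors (rate limits, auth, bad requests) land here
  console.error("Completion failed:", error);
}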
