0.1.0-alpha.26 • Published 19h ago

@llm-ports/adapter-ollama

llm-ports: TypeScript LLM Abstraction Layer for Multi-Provider AI Systems

Provider-agnostic LLM architecture for TypeScript.

Switch providers without changing code.
Avoid vendor lock-in.
Control cost.
Reuse prompts as capabilities.

Multi-provider routing • fallback chains • USD cost gating • capability factories • tool-use security • observability

Current release: 0.1.0-alpha.26 — canonical messages: LLMMessage[] input on all four generation methods. The { instructions, prompt } shape is @deprecated in alpha.26 and will be removed in alpha.27 (~2 weeks). Migrate mechanically with toMessages(instructions, prompt) (~30 min for a 20-site consumer) or idiomatically with sys(text) + usr(content) helpers. Multi-turn workloads (chat, interview agents, coaching workflows) are now first-class. See the alpha.25 → alpha.26 migration guide.

Previous release: 0.1.0-alpha.25 — three additive features under an "Observability surface + reliability hardening" theme: refs?: Record<string, ArtifactRef> on every call (#53), runtimeFallback: "aggressive" preset (#54), streamed cost surfacing via onCost / onTokenUsage at stream completion (#55). See the alpha.24 → alpha.25 migration guide.


The Problem

Most LLM applications break in predictable ways:

  • SDK upgrades touch too many files
  • Switching providers requires refactoring
  • Prompt logic is duplicated across features
  • Cost and routing logic are scattered
  • Business logic becomes coupled to provider-specific SDKs

This is not just an SDK problem.

It is an architecture problem.


The Solution

llm-ports applies the ports-and-adapters pattern to LLM systems.

Only two files in your codebase should know the LLM SDK exists.

Everything else talks to a typed interface.
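
To make the pattern concrete, here is a minimal sketch of a port and one adapter. The names `TextPort`, `GenerateRequest`, `createEchoAdapter`, and `triageEmail` are hypothetical and exist only for illustration; they are not the llm-ports API. A stand-in adapter replaces the real SDK call so the snippet runs offline.

```typescript
// Hypothetical port: the only LLM-facing type that application code sees.
interface GenerateRequest {
  taskType: string;
  prompt: string;
}

interface TextPort {
  generateText(req: GenerateRequest): Promise<{ text: string }>;
}

// Hypothetical adapter. In a real codebase this would be one of the few
// files that imports a provider SDK; here it only echoes, to stay offline.
function createEchoAdapter(label: string): TextPort {
  return {
    async generateText(req) {
      return { text: `[${label}:${req.taskType}] ${req.prompt}` };
    },
  };
}

// Application code depends on the interface only, so swapping the adapter
// does not require touching this function.
async function triageEmail(port: TextPort, body: string): Promise<string> {
  const result = await port.generateText({ taskType: "triage", prompt: body });
  return result.text;
}
```

Because `triageEmail` accepts any `TextPort`, tests can pass a fake adapter and production code can pass a real one without changing the call site.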

Instead of calling models directly, your application uses reusable capabilities:

  • classify
  • draft
  • score
  • summarize
  • extract
  • plan
  • analyze

The LLM stops being a dependency you manage.
It becomes infrastructure you configure.


What You Get

  • Multi-provider LLM routing across OpenAI, Anthropic, Ollama, Vercel AI SDK, and compatible providers
  • Fallback chains when a provider fails or exceeds budget
  • USD-based cost gating with hourly, daily, and monthly limits
  • Reusable prompt capabilities so prompts are defined once and reused everywhere
  • Validation recovery for structured output failures
  • Tool-use safety primitives for destructive or confirmation-required actions
  • Observability hooks for cost, latency, quality, and outcomes
  • TypeScript-first API with full type support
  • No runtime dependency on LangChain, LlamaIndex, or heavy frameworks
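
As a rough illustration of the USD-based cost gating listed above, the sketch below tracks spend in a fixed period and refuses a call that would exceed the limit. `BudgetGate`, its method names, and the fixed-period reset strategy are assumptions made for this example; the library's actual accounting may differ.

```typescript
type BudgetPeriod = "hour" | "day" | "month";

const PERIOD_MS: Record<BudgetPeriod, number> = {
  hour: 60 * 60 * 1000,
  day: 24 * 60 * 60 * 1000,
  month: 30 * 24 * 60 * 60 * 1000, // simplification: fixed 30-day month
};

// Hypothetical fixed-period budget gate for a single provider.
class BudgetGate {
  private readonly limitUsd: number;
  private readonly period: BudgetPeriod;
  private periodStart: number;
  private spentUsd = 0;

  constructor(limitUsd: number, period: BudgetPeriod, now: number = Date.now()) {
    this.limitUsd = limitUsd;
    this.period = period;
    this.periodStart = now;
  }

  // Reset the counter once the current period has elapsed.
  private roll(now: number): void {
    if (now - this.periodStart >= PERIOD_MS[this.period]) {
      this.periodStart = now;
      this.spentUsd = 0;
    }
  }

  // True if a call with this estimated cost still fits in the budget.
  canSpend(estimatedUsd: number, now: number = Date.now()): boolean {
    this.roll(now);
    return this.spentUsd + estimatedUsd <= this.limitUsd;
  }

  // Record the actual cost after the call completes.
  record(actualUsd: number, now: number = Date.now()): void {
    this.roll(now);
    this.spentUsd += actualUsd;
  }
}
```

A router would call `canSpend` before dispatching to a provider and `record` once the real cost is known; a provider whose gate returns false is skipped in favor of the next one in the chain.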

60 Second Setup

1. Configure providers in .env

LLM_PROVIDER_FAST=anthropic|<model>|cost:50/day
LLM_PROVIDER_SMART=anthropic|<model>|cost:200/day
LLM_TASK_ROUTE_TRIAGE=fast,smart

2. Create the port once

import { createRegistryFromEnv } from "@llm-ports/core";
import { createAnthropicAdapter } from "@llm-ports/adapter-anthropic";

export const llm = createRegistryFromEnv({
  adapters: {
    anthropic: createAnthropicAdapter({
      apiKey: process.env.ANTHROPIC_API_KEY!,
    }),
  },
}).getPort();

3. Use it anywhere, with no SDK imports

const result = await llm.generateText({
  taskType: "triage",
  prompt: "Classify this email...",
});
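
The provider strings in step 1 pack several fields into one value. The sketch below shows one way such values could be parsed. The grammar it assumes (adapter, model, then an optional `cost:<amount>/<period>` option, separated by `|`) is inferred from the example values only, and `parseProviderSpec` and `parseEnv` are hypothetical helpers, not library functions.

```typescript
interface ProviderSpec {
  alias: string; // e.g. "fast", taken from LLM_PROVIDER_FAST
  adapter: string; // e.g. "anthropic"
  model: string;
  costLimit?: { amount: number; period: string };
}

// Assumed grammar: <adapter>|<model>|cost:<amount>/<period>
function parseProviderSpec(alias: string, raw: string): ProviderSpec {
  const [adapter = "", model = "", ...options] = raw
    .split("|")
    .map((part) => part.trim());
  if (adapter === "" || model === "") {
    throw new Error(`Invalid provider spec for "${alias}": ${raw}`);
  }
  const spec: ProviderSpec = { alias, adapter, model };
  for (const option of options) {
    if (!option.startsWith("cost:")) continue; // skip options this sketch does not know
    const [amountText = "", period = ""] = option.slice("cost:".length).split("/");
    const amount = Number(amountText);
    if (amountText === "" || !Number.isFinite(amount) || period === "") {
      throw new Error(`Invalid cost limit for "${alias}": ${option}`);
    }
    spec.costLimit = { amount, period };
  }
  return spec;
}

// Collect provider specs and task routes from an env-like object.
function parseEnv(env: Record<string, string | undefined>) {
  const providers = new Map<string, ProviderSpec>();
  const routes = new Map<string, string[]>();
  for (const [key, value] of Object.entries(env)) {
    if (value === undefined) continue;
    if (key.startsWith("LLM_PROVIDER_")) {
      const alias = key.slice("LLM_PROVIDER_".length).toLowerCase();
      providers.set(alias, parseProviderSpec(alias, value));
    } else if (key.startsWith("LLM_TASK_ROUTE_")) {
      const task = key.slice("LLM_TASK_ROUTE_".length).toLowerCase();
      routes.set(task, value.split(",").map((part) => part.trim()));
    }
  }
  return { providers, routes };
}
```

Under these assumptions, `LLM_TASK_ROUTE_TRIAGE=fast,smart` becomes an ordered route for the "triage" task: try the provider aliased `fast` first, then `smart`.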

The registry:

  • selects the right model for the task
  • enforces cost limits
  • falls back through the provider chain on failure
  • records usage, cost, and latency
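
The registry behaviors listed above can be sketched as a loop over an ordered provider chain. `ChainProvider`, `ChainResult`, and `runWithFallback` are hypothetical names used for illustration. For simplicity this version falls back on any thrown error, whereas the Known Limitations section describes a configurable predicate.

```typescript
interface ChainProvider {
  alias: string;
  withinBudget(): boolean;
  call(prompt: string): Promise<string>;
}

interface ChainResult {
  alias: string;
  text: string;
  latencyMs: number;
}

// Walk the route in order: skip providers that are over budget and move to
// the next one when a call throws. Fail only if no provider succeeds.
async function runWithFallback(
  route: ChainProvider[],
  prompt: string,
): Promise<ChainResult> {
  const failures: string[] = [];
  for (const provider of route) {
    if (!provider.withinBudget()) {
      failures.push(`${provider.alias}: over budget`);
      continue;
    }
    const started = Date.now();
    try {
      const text = await provider.call(prompt);
      return { alias: provider.alias, text, latencyMs: Date.now() - started };
    } catch (error) {
      const reason = error instanceof Error ? error.message : String(error);
      failures.push(`${provider.alias}: ${reason}`);
    }
  }
  throw new Error(`All providers failed (${failures.join("; ")})`);
}
```

The returned `alias` and `latencyMs` are the kind of per-call data an observability hook would record alongside cost.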

Capabilities: Reusable LLM Operations

Instead of duplicating prompt logic across files, define a capability once and reuse it.

import { createClassifier } from "@llm-ports/capabilities";
import { z } from "zod";

const IntentSchema = z.object({
  intent: z.enum(["question", "request", "complaint", "feedback", "other"]),
  urgency: z.enum(["low", "normal", "high"]),
  reasoning: z.string(),
});

export const classifyIntent = createClassifier({
  port: llm,
  schema: IntentSchema,
  schemaName: "user-intent",
  rubric: `
    question: asking for information
    request: wants something done
    complaint: reports a problem
    feedback: opinion only
    other: anything else
  `,
});

Now call it anywhere:

const result = await classifyIntent({ content: userMessage });

Example output:

{
  intent: "request",
  urgency: "high",
  reasoning: "The user is asking for a concrete action."
}

Why this matters:

  • Improve a prompt once, and every call site benefits
  • Keep behavior consistent across the system
  • Make debugging and evaluation easier
  • Keep business logic free from provider-specific SDK details
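
To show why a factory gives these benefits, here is a dependency-free sketch of how a capability factory could be structured. It is not the implementation of `createClassifier`: `JsonPort`, `makeClassifier`, and `parseIntent` are hypothetical, and a hand-written validator stands in for the Zod schema.

```typescript
// Minimal stand-in for the port; not the library's interface.
interface JsonPort {
  generateText(req: { instructions: string; input: string }): Promise<string>;
}

interface ClassifierOptions<T> {
  port: JsonPort;
  rubric: string;
  // Hand-written validator standing in for a Zod schema.
  parse: (value: unknown) => T;
}

// Build the instructions once and return a function every call site shares.
function makeClassifier<T>(options: ClassifierOptions<T>) {
  const instructions =
    "Classify the input using this rubric. Reply with JSON only.\n" +
    options.rubric.trim();
  return async (input: { content: string }): Promise<T> => {
    const raw = await options.port.generateText({
      instructions,
      input: input.content,
    });
    return options.parse(JSON.parse(raw));
  };
}

type Intent = { intent: "question" | "request" | "other" };

// Reject anything that is not one of the expected intent values.
function parseIntent(value: unknown): Intent {
  const intent = (value as { intent?: unknown } | null)?.intent;
  if (intent === "question" || intent === "request" || intent === "other") {
    return { intent };
  }
  throw new Error("Model output did not match the expected shape");
}
```

Since the rubric lives inside the factory call, editing it in one place changes the behavior of every caller of the returned function.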

Architecture Overview

Before:

Application code
  ├─ direct SDK call
  ├─ direct SDK call
  ├─ direct SDK call
  └─ model router leaking SDK types

After:

Application code
  ↓
Capabilities
  ↓
LLM Port
  ↓
Adapters and Provider Registry
  ↓
LLM providers

The key shift:

Application code stops calling models directly. It calls capabilities.


Packages

| Package | Purpose |
| --- | --- |
| @llm-ports/core | Port interfaces, registry, routing, cost gating, validation strategies, content blocks |
| @llm-ports/capabilities | Reusable LLM operation factories |
| @llm-ports/adapter-openai | OpenAI SDK adapter with baseURL support for compatible providers |
| @llm-ports/adapter-anthropic | Anthropic SDK adapter |
| @llm-ports/adapter-google | Google Gemini native adapter (@google/genai SDK) with full multimodal support and bundled pricing |
| @llm-ports/adapter-ollama | Ollama native adapter with model management |
| @llm-ports/adapter-vercel | Vercel AI SDK adapter for migration and compatibility |

@llm-ports/observability (quality tracking hooks, sinks, deterministic edit-diff helpers) is planned for v0.2.


Examples

Seven runnable examples in examples/, each its own pnpm workspace package with a README walking through the code:

| Example | What it shows |
| --- | --- |
| basic | The smallest possible end-to-end. One adapter, one task type, one generateText call. The 60-second-setup demo. |
| multi-provider | Fallback chain (Anthropic primary → OpenAI backup), USD cost gating per provider, capability factory. |
| email-triage | The most common production use case, condensed into ~150 lines. Inbound email → classify (intent + urgency + sentiment) → policy gate → draft brand-voiced reply → queue for human review. Shows how capabilities compose. |
| streaming-chat | Express server with three routes: POST /chat (one-shot), POST /chat/stream (Server-Sent Events), POST /chat/agent (tool-augmented). The most common LLM UX patterns in ~30 lines of glue. |
| extract-from-pdf | Document extraction: raw OCR'd invoice text → fully-typed structured object via Zod. Demonstrates generateStructured, validation-retry-with-feedback, and the createExtractor factory. |
| agent-with-approval | Tool-use agent with first-class security primitives: destructive, requiresConfirmation, and maxOutputBytes flags plus an approval-gate wrapper. The example that best shows what sets llm-ports apart. |
| migrate-from-vercel-ai | Two migration paths for users on Vercel AI SDK: (a) wrap your existing model factories with @llm-ports/adapter-vercel, (b) replace @ai-sdk/* with native llm-ports adapters. Side-by-side before/after diffs. |

Each example is runnable from the monorepo root:

pnpm --filter @llm-ports/example-<name> start

Set the relevant API key (ANTHROPIC_API_KEY, OPENAI_API_KEY) before running. Each example's README documents which keys it needs.


Supported Use Cases

Use llm-ports when you need:

  • multi-provider LLM routing
  • LLM fallback chains
  • TypeScript LLM abstraction
  • OpenAI and Anthropic provider switching
  • cost control for production LLM applications
  • reusable prompt capabilities
  • structured output validation and recovery
  • tool-use security in agent workflows
  • observability for LLM cost, latency, and quality
  • vendor-neutral AI architecture

When to Use This

Use llm-ports if:

  • you use 2 or more LLM providers
  • you may switch providers later
  • SDK upgrades have caused multi-file changes
  • prompt logic is duplicated
  • cost control matters
  • you want business logic decoupled from provider SDKs

Skip it if:

  • you have 1 or 2 LLM calls
  • you are only prototyping
  • you are intentionally building around one provider-specific feature
  • you want a full agent framework, memory layer, RAG framework, or hosted gateway

| Tool | How llm-ports relates |
| --- | --- |
| Vercel AI SDK | Vercel unifies provider calls. llm-ports adds registry, fallback chains, USD cost gating, validation recovery, and capability factories on top. |
| LiteLLM | LiteLLM is a Python-first HTTP proxy. llm-ports is TypeScript and runs in-process with no extra network hop. |
| Portkey | Portkey is a commercial hosted gateway. llm-ports is MIT, in-process, and has no hosted dependency. |
| LangChain.js | LangChain is a framework. llm-ports is a lightweight architecture and control layer. |
| LlamaIndex.TS | LlamaIndex is retrieval-first. llm-ports handles LLM invocation, routing, fallback, and cost control. |
| Mastra | Mastra is agent-first with built-in memory and workflow primitives. llm-ports provides lower-level LLM primitives beneath that layer. |

Known Limitations in Alpha

llm-ports is pre-release. The core architecture is stable and the offline regression suite is comprehensive (250+ tests, latency p99 under 1 ms, no doc-rot detected across 110+ snippets). Some adapter and agent paths are still being hardened.

Fourteen medium-impact alpha-bake issues (#1, #3, #4, #5, #6, #9, #12, #14, #16, #19, #20, #21, #24, #32) shipped in 0.1.0-alpha.1 through 0.1.0-alpha.13 and are now closed. The alpha line completes the v0.1 surface:

  • Gemini multi-turn runAgent and native responseSchema
  • runtime model discovery (LLMPort.listModels() across 4 adapters, plus Registry.checkPricingFreshness())
  • useStrictResponseFormat on adapter-openai for Cerebras strict-JSON
  • dangerouslyAllowBrowser opt-in on openai and anthropic
  • reasoningEffort parameter for o-series / gpt-5-nano / Groq gpt-oss-120b reasoning depth control
  • capability factories that propagate reasoningEffort, signal, and forceProviderAlias to the underlying port call
  • an expanded attemptValidationRepair pass that catches markdown-wrapped enums, trailing punctuation, stringified-JSON-as-object, and array-with-single-object misreads

The full per-surface inventory lives at the v0.1 status page.
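
For readers unfamiliar with the four misreads named in the attemptValidationRepair description above, the sketch below shows what repairing them can look like. It is an illustration written for this page, not the library's attemptValidationRepair; `repairEnum` and `repairObject` are hypothetical helpers.

```typescript
// Strip markdown wrappers, quotes, and trailing punctuation from both ends,
// then match case-insensitively against the allowed values.
function repairEnum<T extends string>(
  raw: string,
  allowed: readonly T[],
): T | undefined {
  const cleaned = raw
    .replace(/^[\s*_`"']+|[\s*_`"'.,;:!?]+$/g, "")
    .toLowerCase();
  return allowed.find((value) => value.toLowerCase() === cleaned);
}

// Undo two common shape mistakes before the value reaches the validator.
function repairObject(raw: unknown): unknown {
  let value = raw;
  // Case 1: stringified JSON where an object was expected.
  if (typeof value === "string") {
    try {
      value = JSON.parse(value);
    } catch {
      return raw; // not JSON; leave it for the validator to reject
    }
  }
  // Case 2: an array holding exactly one object where an object was expected.
  if (
    Array.isArray(value) &&
    value.length === 1 &&
    typeof value[0] === "object" &&
    value[0] !== null
  ) {
    value = value[0];
  }
  return value;
}
```

A repair pass like this would typically run only after schema validation has already failed, so well-formed output is never altered.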

What's still open:

  • Some compat-provider models (Groq, Together AI, Fireworks, Clarifai, SambaNova) may require a pricingOverrides entry to satisfy the registry's pricing-validation step. Bundled pricing tables cover OpenAI, Anthropic, Google, and Ollama by default. Worked examples for Clarifai's Qwen3.6 35B A3B FP8 and SambaNova's MiniMax-M2.7 are in the openai adapter docs.
  • Vercel adapter runAgent is single-turn only (multi-turn lands in v0.2).
  • Registry walks the chain on budget gating and on runtime errors (alpha.7+, default predicate: ProviderUnavailableError). Configurable via runtimeFallback: "none" | "default" | { shouldFallback }; alpha.25 also added an "aggressive" preset. Streaming methods walk the chain only when stream creation fails, not on errors mid-iteration.
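
The runtimeFallback option above accepts either a preset or a custom predicate. The sketch below shows how such an option could resolve to a single predicate. The `(error: unknown) => boolean` signature is an assumption, `isProviderUnavailable` is a local stand-in for checking the library's ProviderUnavailableError, and the "aggressive" preset is left out because its behavior is not specified here.

```typescript
type ShouldFallback = (error: unknown) => boolean;
type RuntimeFallback = "none" | "default" | { shouldFallback: ShouldFallback };

// Local stand-in: identifies the error by name instead of importing the
// library's error class.
function isProviderUnavailable(error: unknown): boolean {
  return error instanceof Error && error.name === "ProviderUnavailableError";
}

// Turn the configuration value into one predicate the router can call.
function resolveFallbackPredicate(option: RuntimeFallback): ShouldFallback {
  if (option === "none") return () => false;
  if (option === "default") return isProviderUnavailable;
  return option.shouldFallback;
}

// Example custom predicate: also fall back on rate-limit style errors.
const alsoOnRateLimit: RuntimeFallback = {
  shouldFallback: (error) =>
    isProviderUnavailable(error) ||
    (error instanceof Error && /rate limit/i.test(error.message)),
};
```

A narrow predicate keeps genuine bugs, such as a malformed request, from being masked by a silent retry on another provider.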

If you hit something not listed here, please open an issue — the bug-report template captures the version + repro shape we need.


Installation

llm-ports is in alpha. All 7 packages plus the new @llm-ports/migrate codemod ship at v0.1.0-alpha.20.1. Stable v0.1 lands after a short alpha bake — see the v0.1 status page for what's stable today vs still being hardened.

npm install @llm-ports/core@0.1.0-alpha.20.1

Install adapters as needed:

npm install @llm-ports/adapter-anthropic@0.1.0-alpha.20.1
npm install @llm-ports/adapter-openai@0.1.0-alpha.20.1
npm install @llm-ports/adapter-google@0.1.0-alpha.20.1
npm install @llm-ports/adapter-ollama@0.1.0-alpha.20.1
npm install @llm-ports/adapter-vercel@0.1.0-alpha.20.1
npm install @llm-ports/capabilities@0.1.0-alpha.20.1

(Scoped under @llm-ports. Versioned together via changesets.)

Peer dependency: zod >=3.24.0 <5. Bring your own SDKs (@anthropic-ai/sdk, openai, ollama, ai).

Pinning during the alpha series

Recommended: pin to the exact alpha version, not the @alpha dist-tag, while we're still shape-locking. The @alpha tag tracks the latest published prerelease; a pnpm install or npm update can therefore jump you across breaking changes silently. Exact pins lock the version until you deliberately bump it, at which point you read MIGRATION.md and apply the per-release migration.

// package.json — recommended during alphas
{
  "dependencies": {
    "@llm-ports/core": "0.1.0-alpha.20.1",
    "@llm-ports/adapter-anthropic": "0.1.0-alpha.20.1"
  }
}

The @alpha tag is fine for experimentation:

npm install @llm-ports/core@alpha

When you bump, the @llm-ports/core postinstall emits a one-line banner pointing at the migration page. To upgrade across multiple alphas mechanically, use the bundled codemod:

npx @llm-ports/migrate@alpha alpha-19-to-alpha-20 --dry-run    # preview
npx @llm-ports/migrate@alpha alpha-19-to-alpha-20 --write      # apply

Documentation

Documentation site (auto-deployed from docs/ on every push to main):

https://baabakk.github.io/llm-ports/

Pages:

  • Getting Started
  • Concepts: ports, adapters, task routing, cost gating, content blocks, validation strategies
  • Guides: multi-provider routing, local-to-cloud, cost control, custom adapters, observability, security
  • Capabilities: one page per capability
  • Adapters: one page per adapter and feature matrix
  • Migration: from Vercel AI SDK, LangChain.js, and direct provider SDKs

Security

Tool use without a threat model is dangerous.

llm-ports treats security as a first-class part of the API:

  • destructive tool markers
  • confirmation-required actions
  • max output byte limits
  • redaction capability
  • explicit guidance for prompt injection and tool abuse
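
As an illustration of how the first three primitives can combine, the sketch below wraps a tool in an approval gate. The flag names destructive, requiresConfirmation, and maxOutputBytes come from the agent-with-approval example description; `GuardedTool`, `Approver`, and `withApprovalGate` are hypothetical shapes, not the library's types.

```typescript
interface GuardedTool {
  name: string;
  destructive?: boolean;
  requiresConfirmation?: boolean;
  maxOutputBytes?: number;
  execute(args: Record<string, unknown>): Promise<string>;
}

type Approver = (
  toolName: string,
  args: Record<string, unknown>,
) => Promise<boolean>;

// Wrap a tool so that flagged calls need approval first, and oversized
// output is truncated before it goes back to the model.
function withApprovalGate(tool: GuardedTool, approve: Approver): GuardedTool {
  return {
    ...tool,
    async execute(args) {
      if (tool.destructive || tool.requiresConfirmation) {
        const approved = await approve(tool.name, args);
        if (!approved) {
          return `Tool "${tool.name}" was not approved; no action taken.`;
        }
      }
      const output = await tool.execute(args);
      if (tool.maxOutputBytes === undefined) return output;
      const bytes = new TextEncoder().encode(output);
      if (bytes.length <= tool.maxOutputBytes) return output;
      // Cutting mid-character yields a replacement character at the end.
      return new TextDecoder().decode(bytes.slice(0, tool.maxOutputBytes));
    },
  };
}
```

Returning a refusal message instead of throwing lets the agent loop continue and explain to the user that the action was declined; whether that is the right choice depends on the threat model.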

See SECURITY.md.


Contributing

Contributions are welcome after the initial v0.1 scaffolding lands.

See CONTRIBUTING.md.


License

MIT. See LICENSE.


Status

Pre-release.

Current target:

  • v0.1: core, adapters, cost gating, 7 capability factories
  • v0.2: expanded capabilities and observability package
  • v0.3: additional adapters and markdown skill format evaluation

Follow Releases

llm-ports is pre-release. To get notified when v0.1 lands on the latest tag (and for every minor release after):

  1. Click the Watch button at the top of the GitHub repo
  2. Choose Custom
  3. Enable Releases

You'll get an email or notification only when a real version ships. No PR or commit noise.
