0.1.0-alpha.26 • Published 19h ago

@llm-ports/adapter-ollama

Licence

MIT

llm-ports: TypeScript LLM Abstraction Layer for Multi-Provider AI Systems

Provider-agnostic LLM architecture for TypeScript.

Switch providers without changing code.
Avoid vendor lock-in.
Control cost.
Reuse prompts as capabilities.

Multi-provider routing • fallback chains • USD cost gating • capability factories • tool-use security • observability

Current release: 0.1.0-alpha.26 — canonical messages: LLMMessage[] input on all four generation methods. The { instructions, prompt } shape is @deprecated in alpha.26 and will be removed in alpha.27 (~2 weeks). Migrate mechanically with toMessages(instructions, prompt) (~30 min for a 20-site consumer) or idiomatically with sys(text) + usr(content) helpers. Multi-turn workloads (chat, interview agents, coaching workflows) are now first-class. See the alpha.25 → alpha.26 migration guide.

Previous release: 0.1.0-alpha.25 — three additive features under an "Observability surface + reliability hardening" theme: refs?: Record<string, ArtifactRef> on every call (#53), runtimeFallback: "aggressive" preset (#54), streamed cost surfacing via onCost / onTokenUsage at stream completion (#55). See the alpha.24 → alpha.25 migration guide.
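The alpha.26 move to messages: LLMMessage[] can be pictured with a small sketch. Only the names toMessages, sys, usr, and LLMMessage come from the release note; the message shape (a role plus string content) and the helper bodies below are assumptions for illustration, and the real exports may differ (for example, content may also accept content blocks).

```typescript
// Hypothetical shape: the real LLMMessage type may carry more fields.
type Role = "system" | "user" | "assistant";

interface LLMMessage {
  role: Role;
  content: string;
}

// Idiomatic helpers, named after the ones in the release note.
const sys = (text: string): LLMMessage => ({ role: "system", content: text });
const usr = (content: string): LLMMessage => ({ role: "user", content });

// Mechanical migration: map the deprecated { instructions, prompt } pair
// onto a message array. Instructions are skipped when absent.
function toMessages(instructions: string | undefined, prompt: string): LLMMessage[] {
  return instructions ? [sys(instructions), usr(prompt)] : [usr(prompt)];
}

// Before (deprecated): { instructions: "Be terse.", prompt: "Classify this email..." }
const migrated = toMessages("Be terse.", "Classify this email...");

// A multi-turn conversation is simply a longer array.
const conversation: LLMMessage[] = [
  sys("You are an interview coach."),
  usr("Ask me a question about system design."),
  { role: "assistant", content: "How would you shard a write-heavy table?" },
  usr("I would start with a hash of the tenant id."),
];
```

Under these assumptions, a one-shot call site changes only in how its input is built, while chat-style workloads append to the same array on each turn.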

The Problem

Most LLM applications break in predictable ways:

SDK upgrades touch too many files
Switching providers requires refactoring
Prompt logic is duplicated across features
Cost and routing logic are scattered
Business logic becomes coupled to provider-specific SDKs

This is not just an SDK problem.

It is an architecture problem.

The Solution

llm-ports applies the ports-and-adapters pattern to LLM systems.

Only two files in your codebase should know the LLM SDK exists.

Everything else talks to a typed interface.

Instead of calling models directly, your application uses reusable capabilities:

classify
draft
score
summarize
extract
plan
analyze

The LLM stops being a dependency you manage.
It becomes infrastructure you configure.
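A minimal sketch of the pattern, independent of the real llm-ports types. CompletionPort, the two stub adapters, and triageEmail are hypothetical names chosen for illustration; the stubs return canned text so the example runs offline.

```typescript
// The port: the only LLM-facing type that application code imports.
interface CompletionPort {
  complete(taskType: string, input: string): Promise<string>;
}

// Adapters: in a real codebase each would wrap one vendor SDK.
// These stubs return canned text instead of calling a provider.
function createCloudStub(): CompletionPort {
  return {
    async complete(taskType, input) {
      return `[cloud:${taskType}] ${input.length} chars`;
    },
  };
}

function createLocalStub(): CompletionPort {
  return {
    async complete(taskType, input) {
      return `[local:${taskType}] ${input.length} chars`;
    },
  };
}

// Application code depends on the port, never on an SDK.
async function triageEmail(llm: CompletionPort, email: string): Promise<string> {
  return llm.complete("triage", email);
}

// Swapping providers is a change at the composition root only.
const useLocal = false;
const activePort: CompletionPort = useLocal ? createLocalStub() : createCloudStub();
```

Flipping useLocal changes which adapter is built, and triageEmail does not change at all. That is the property the "two files" rule is protecting.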

What You Get

Multi-provider LLM routing across OpenAI, Anthropic, Google Gemini, Ollama, Vercel AI SDK, and OpenAI-compatible providers
Fallback chains when a provider fails or exceeds budget
USD-based cost gating with hourly, daily, and monthly limits
Reusable prompt capabilities so prompts are defined once and reused everywhere
Validation recovery for structured output failures
Tool-use safety primitives for destructive or confirmation-required actions
Observability hooks for cost, latency, quality, and outcomes
TypeScript-first API with full type support
No runtime dependency on LangChain, LlamaIndex, or heavy frameworks
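To make the cost-gating item concrete, here is a rough sketch of a USD budget gate with hourly, daily, and monthly limits. It is not the library's implementation: BudgetGate and its methods are hypothetical, the windows are rolling rather than calendar-aligned, and a month is simplified to 30 days. How llm-ports actually defines its windows is not stated here.

```typescript
type BudgetWindow = "hour" | "day" | "month";

const WINDOW_MS: Record<BudgetWindow, number> = {
  hour: 60 * 60 * 1000,
  day: 24 * 60 * 60 * 1000,
  month: 30 * 24 * 60 * 60 * 1000, // simplification: a 30-day month
};

interface SpendEntry {
  atMs: number;
  usd: number;
}

class BudgetGate {
  private readonly limits: Partial<Record<BudgetWindow, number>>;
  private readonly entries: SpendEntry[] = [];

  constructor(limits: Partial<Record<BudgetWindow, number>>) {
    this.limits = limits;
  }

  // Record the actual cost of a completed call.
  record(usd: number, nowMs: number): void {
    this.entries.push({ atMs: nowMs, usd });
  }

  // Total spend inside the rolling window that ends at nowMs.
  spentIn(window: BudgetWindow, nowMs: number): number {
    const cutoff = nowMs - WINDOW_MS[window];
    return this.entries
      .filter((entry) => entry.atMs > cutoff)
      .reduce((sum, entry) => sum + entry.usd, 0);
  }

  // True when the estimated cost fits under every configured limit.
  allows(estimatedUsd: number, nowMs: number): boolean {
    return (Object.keys(this.limits) as BudgetWindow[]).every((window) => {
      const limit = this.limits[window];
      return limit === undefined || this.spentIn(window, nowMs) + estimatedUsd <= limit;
    });
  }
}

// Example: at most 1 USD per hour and 2 USD per day.
const gate = new BudgetGate({ hour: 1, day: 2 });
gate.record(0.75, 0);
```

A provider that fails the allows check would be skipped in favor of the next entry in its fallback chain.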

60 Second Setup

1. Configure providers in `.env`

LLM_PROVIDER_FAST=anthropic|<model>|cost:50/day
LLM_PROVIDER_SMART=anthropic|<model>|cost:200/day
LLM_TASK_ROUTE_TRIAGE=fast,smart
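To show how these values decompose, here is a hypothetical parser for the two formats above. It is inferred only from the three example lines: the field order (adapter, model, options), the reading of cost:50/day as a USD limit, and the /hour and /month variants are assumptions, and the real grammar may accept more options. The model name in the usage line is a placeholder.

```typescript
interface ProviderSpec {
  alias: string; // "fast" from LLM_PROVIDER_FAST
  adapter: string; // "anthropic"
  model: string;
  costLimit?: { usd: number; per: "hour" | "day" | "month" };
}

// Parses a value such as "anthropic|<model>|cost:50/day".
function parseProviderSpec(envName: string, value: string): ProviderSpec {
  const alias = envName.replace(/^LLM_PROVIDER_/, "").toLowerCase();
  const [adapter, model, ...options] = value.split("|");
  if (!adapter || !model) {
    throw new Error(`Malformed provider spec for ${envName}`);
  }
  const spec: ProviderSpec = { alias, adapter, model };
  for (const option of options) {
    const match = /^cost:(\d+(?:\.\d+)?)\/(hour|day|month)$/.exec(option);
    if (match) {
      spec.costLimit = {
        usd: Number(match[1]),
        per: match[2] as "hour" | "day" | "month",
      };
    }
  }
  return spec;
}

// Parses a value such as "fast,smart" into an ordered fallback chain.
function parseTaskRoute(envName: string, value: string): { taskType: string; chain: string[] } {
  const taskType = envName.replace(/^LLM_TASK_ROUTE_/, "").toLowerCase();
  const chain = value
    .split(",")
    .map((alias) => alias.trim())
    .filter((alias) => alias.length > 0);
  return { taskType, chain };
}

// "example-model" is a placeholder, not a real model identifier.
const fastSpec = parseProviderSpec("LLM_PROVIDER_FAST", "anthropic|example-model|cost:50/day");
const triageRoute = parseTaskRoute("LLM_TASK_ROUTE_TRIAGE", "fast,smart");
```

Read this way, the triage task tries the fast alias first and falls back to smart.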

2. Create the port once

import { createRegistryFromEnv } from "@llm-ports/core";
import { createAnthropicAdapter } from "@llm-ports/adapter-anthropic";

export const llm = createRegistryFromEnv({
  adapters: {
    anthropic: createAnthropicAdapter({
      apiKey: process.env.ANTHROPIC_API_KEY!,
    }),
  },
}).getPort();

3. Use it anywhere, with no SDK imports

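// Note: per the alpha.26 release note, the { instructions, prompt } input shape
// is deprecated; new code should pass messages: LLMMessage[] instead.
const result = await llm.generateText({
  taskType: "triage",
  prompt: "Classify this email...",
});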

The registry:

selects the right model for the task
enforces cost limits
falls back through the provider chain on failure
records usage, cost, and latency
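The four responsibilities listed above can be sketched in one function. This is an illustration under stated assumptions, not the registry's code: ChainProvider, AttemptRecord, and runTask are hypothetical, and the error class is a local stand-in that only shares its name with the library's ProviderUnavailableError.

```typescript
// Hypothetical provider handle; the real registry internals are not shown in this README.
interface ChainProvider {
  alias: string;
  withinBudget(): boolean;
  call(input: string): Promise<string>;
}

interface AttemptRecord {
  alias: string;
  outcome: "ok" | "over-budget" | "failed";
  latencyMs: number;
}

// Local stand-in, matched by name further down.
class ProviderUnavailableError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "ProviderUnavailableError";
  }
}

async function runTask(
  routes: Record<string, string[]>,
  providers: Record<string, ChainProvider>,
  taskType: string,
  input: string,
): Promise<{ text: string; attempts: AttemptRecord[] }> {
  const attempts: AttemptRecord[] = [];
  // 1. Select the chain for this task type.
  for (const alias of routes[taskType] ?? []) {
    const provider = providers[alias];
    if (!provider) continue; // alias in the route but not configured
    // 2. Enforce cost limits before spending anything.
    if (!provider.withinBudget()) {
      attempts.push({ alias, outcome: "over-budget", latencyMs: 0 });
      continue;
    }
    const startedAt = Date.now();
    try {
      const text = await provider.call(input);
      // 4. Record the outcome and latency of the attempt.
      attempts.push({ alias, outcome: "ok", latencyMs: Date.now() - startedAt });
      return { text, attempts };
    } catch (error) {
      // 3. Fall back only on "provider unavailable"; rethrow everything else.
      const unavailable = error instanceof Error && error.name === "ProviderUnavailableError";
      if (!unavailable) throw error;
      attempts.push({ alias, outcome: "failed", latencyMs: Date.now() - startedAt });
    }
  }
  throw new Error(`No provider could serve task "${taskType}"`);
}
```

The attempts array stands in for the usage and latency records; a real registry would also attach token counts and USD cost to each entry.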

Capabilities: Reusable LLM Operations

Instead of duplicating prompt logic across files, define a capability once and reuse it.

import { createClassifier } from "@llm-ports/capabilities";
import { z } from "zod";

const IntentSchema = z.object({
  intent: z.enum(["question", "request", "complaint", "feedback", "other"]),
  urgency: z.enum(["low", "normal", "high"]),
  reasoning: z.string(),
});

export const classifyIntent = createClassifier({
  port: llm,
  schema: IntentSchema,
  schemaName: "user-intent",
  rubric: `
    question: asking for information
    request: wants something done
    complaint: reports a problem
    feedback: opinion only
    other: anything else
  `,
});

Now call it anywhere:

const result = await classifyIntent({ content: userMessage });

Example output:

{
  intent: "request",
  urgency: "high",
  reasoning: "The user is asking for a concrete action."
}

Why this matters:

Improve a prompt once, and every call site benefits
Keep behavior consistent across the system
Make debugging and evaluation easier
Keep business logic free from provider-specific SDK details
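The factory idea itself fits in a few lines. The sketch below is not createClassifier: JsonPort, generateJson, and createSketchClassifier are hypothetical, and a hand-written parse function stands in for the Zod schema so the example has no dependencies.

```typescript
// Hypothetical minimal port that returns already-parsed JSON of unknown shape.
interface JsonPort {
  generateJson(instructions: string, content: string): Promise<unknown>;
}

const INTENTS = ["question", "request", "complaint", "feedback", "other"] as const;
const URGENCIES = ["low", "normal", "high"] as const;

interface IntentResult {
  intent: (typeof INTENTS)[number];
  urgency: (typeof URGENCIES)[number];
}

// Stand-in for schema.parse: returns a typed value or throws.
function parseIntent(raw: unknown): IntentResult {
  const record = (typeof raw === "object" && raw !== null ? raw : {}) as Record<string, unknown>;
  const intent = INTENTS.find((value) => value === record["intent"]);
  const urgency = URGENCIES.find((value) => value === record["urgency"]);
  if (!intent || !urgency) {
    throw new Error("Model output did not match the intent schema");
  }
  return { intent, urgency };
}

// The factory: the prompt is assembled once, and every call site shares it.
function createSketchClassifier<T>(config: {
  port: JsonPort;
  parse: (raw: unknown) => T;
  rubric: string;
}): (input: { content: string }) => Promise<T> {
  const instructions = `Classify the content using this rubric:\n${config.rubric.trim()}`;
  return async ({ content }) =>
    config.parse(await config.port.generateJson(instructions, content));
}

// A canned port so the sketch runs without a provider.
const cannedPort: JsonPort = {
  async generateJson() {
    return { intent: "request", urgency: "high" };
  },
};

const classifyIntentSketch = createSketchClassifier({
  port: cannedPort,
  parse: parseIntent,
  rubric: "request: wants something done",
});
```

Because the rubric and parser live inside the factory call, a prompt improvement is a one-line change that every caller of classifyIntentSketch picks up.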

Architecture Overview

Before:

Application code
  ├─ direct SDK call
  ├─ direct SDK call
  ├─ direct SDK call
  └─ model router leaking SDK types

After:

Application code
  ↓
Capabilities
  ↓
LLM Port
  ↓
Adapters and Provider Registry
  ↓
LLM providers

The key shift:

Application code stops calling models directly. It calls capabilities.

Packages

Package	Purpose
`@llm-ports/core`	Port interfaces, registry, routing, cost gating, validation strategies, content blocks
`@llm-ports/capabilities`	Reusable LLM operation factories
`@llm-ports/adapter-openai`	OpenAI SDK adapter with `baseURL` support for compatible providers
`@llm-ports/adapter-anthropic`	Anthropic SDK adapter
`@llm-ports/adapter-google`	Google Gemini native adapter (@google/genai SDK) — full multimodal, bundled pricing
`@llm-ports/adapter-ollama`	Ollama native adapter with model management
`@llm-ports/adapter-vercel`	Vercel AI SDK adapter for migration and compatibility

@llm-ports/observability (quality tracking hooks, sinks, deterministic edit-diff helpers) is planned for v0.2.

Examples

Seven runnable examples in examples/, each its own pnpm workspace package with a README walking through the code:

Example	What it shows
`basic`	The smallest possible end-to-end. One adapter, one task type, one `generateText` call. The 60-second-setup demo.
`multi-provider`	Fallback chain (Anthropic primary → OpenAI backup), USD cost gating per provider, capability factory.
`email-triage`	The most common production use case, condensed into ~150 lines. Inbound email → classify (intent + urgency + sentiment) → policy gate → draft brand-voiced reply → queue for human review. Capability composition story.
`streaming-chat`	Express server with three routes: `POST /chat` (one-shot), `POST /chat/stream` (Server-Sent Events), `POST /chat/agent` (tool-augmented). The most common LLM UX patterns in ~30 lines of glue.
`extract-from-pdf`	Document extraction: raw OCR'd invoice text → fully-typed structured object via Zod. Demonstrates `generateStructured`, validation-retry-with-feedback, and the `createExtractor` factory.
`agent-with-approval`	Tool-use agent with first-class security primitives. `destructive`, `requiresConfirmation`, `maxOutputBytes` flags + an approval-gate wrapper. The differentiation example.
`migrate-from-vercel-ai`	Two migration paths for users on Vercel AI SDK: (a) wrap your existing model factories with `@llm-ports/adapter-vercel`, (b) replace `@ai-sdk/*` with native llm-ports adapters. Side-by-side before/after diffs.

Each example is runnable from the monorepo root:

pnpm --filter @llm-ports/example-<name> start

Set the relevant API key (ANTHROPIC_API_KEY, OPENAI_API_KEY) before running. Each example's README documents which keys it needs.

Supported Use Cases

Use llm-ports when you need:

multi-provider LLM routing
LLM fallback chains
TypeScript LLM abstraction
OpenAI and Anthropic provider switching
cost control for production LLM applications
reusable prompt capabilities
structured output validation and recovery
tool-use security in agent workflows
observability for LLM cost, latency, and quality
vendor-neutral AI architecture

When to Use This

Use llm-ports if:

you use 2 or more LLM providers
you may switch providers later
SDK upgrades have caused multi-file changes
prompt logic is duplicated
cost control matters
you want business logic decoupled from provider SDKs

Skip it if:

you have 1 or 2 LLM calls
you are only prototyping
you are intentionally building around one provider-specific feature
you want a full agent framework, memory layer, RAG framework, or hosted gateway

Comparison

Tool	How `llm-ports` relates
Vercel AI SDK	Vercel unifies provider calls. `llm-ports` adds registry, fallback chains, USD cost gating, validation recovery, and capability factories on top.
LiteLLM	LiteLLM is a Python-first HTTP proxy. `llm-ports` is TypeScript and runs in-process with no extra network hop.
Portkey	Portkey is a commercial hosted gateway. `llm-ports` is MIT, in-process, and has no hosted dependency.
LangChain.js	LangChain is a framework. `llm-ports` is a lightweight architecture and control layer.
LlamaIndex.TS	LlamaIndex is retrieval-first. `llm-ports` handles LLM invocation, routing, fallback, and cost control.
Mastra	Mastra is agent-first with built-in memory and workflow primitives. `llm-ports` provides lower-level LLM primitives beneath that layer.

Status

llm-ports is pre-release. The core architecture is stable and the offline regression suite is comprehensive (250+ tests, latency p99 under 1 ms, no doc-rot detected across 110+ snippets). Some adapter and agent paths are still being hardened.

Fourteen medium-impact alpha-bake issues (#1, #3, #4, #5, #6, #9, #12, #14, #16, #19, #20, #21, #24, #32) shipped between 0.1.0-alpha.1 and 0.1.0-alpha.13 and are now closed. The alpha line completes the v0.1 surface:

Gemini multi-turn runAgent and native responseSchema
Runtime model discovery: LLMPort.listModels() across 4 adapters, plus Registry.checkPricingFreshness()
useStrictResponseFormat on adapter-openai for Cerebras strict-JSON
dangerouslyAllowBrowser opt-in on the openai and anthropic adapters
reasoningEffort parameter for reasoning-depth control on o-series, gpt-5-nano, and Groq gpt-oss-120b
Capability factories that propagate reasoningEffort, signal, and forceProviderAlias to the underlying port call
An expanded attemptValidationRepair pass that catches markdown-wrapped enums, trailing punctuation, stringified-JSON-as-object, and array-with-single-object misreads

The full per-surface inventory lives at the v0.1 status page.
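The four misreads named for attemptValidationRepair are easy to picture with a toy repair pass. The sketch below is an assumption-laden illustration, not the library's algorithm: cleanEnum, unwrapObject, and repairIntent are hypothetical, and the enum values are borrowed from the intent example earlier in this README.

```typescript
const INTENT_VALUES: readonly string[] = ["question", "request", "complaint", "feedback", "other"];
const URGENCY_VALUES: readonly string[] = ["low", "normal", "high"];

// Markdown-wrapped enums and trailing punctuation: strip wrapper characters,
// but only accept the result when it is a known enum value.
function cleanEnum(value: unknown, allowed: readonly string[]): unknown {
  if (typeof value !== "string") return value;
  const cleaned = value.replace(/^[\s*_`"']+/, "").replace(/[\s*_`"'.,;:!?]+$/, "");
  return allowed.indexOf(cleaned) !== -1 ? cleaned : value;
}

// Stringified JSON and single-object arrays: unwrap to the inner value.
function unwrapObject(value: unknown): unknown {
  let current = value;
  if (typeof current === "string") {
    try {
      current = JSON.parse(current);
    } catch {
      return value; // not JSON, leave untouched
    }
  }
  if (Array.isArray(current) && current.length === 1) {
    current = current[0];
  }
  return current;
}

// Applies both repairs to an intent-shaped payload; free-text fields are left alone.
function repairIntent(raw: unknown): unknown {
  const candidate = unwrapObject(raw);
  if (typeof candidate !== "object" || candidate === null || Array.isArray(candidate)) {
    return raw;
  }
  const record = candidate as Record<string, unknown>;
  return {
    ...record,
    intent: cleanEnum(record["intent"], INTENT_VALUES),
    urgency: cleanEnum(record["urgency"], URGENCY_VALUES),
  };
}

// A payload that shows all four problems at once.
const messy = '[{"intent":"**request**","urgency":"high.","reasoning":"Wants action."}]';
const repaired = repairIntent(messy) as Record<string, unknown>;
```

Restricting the cleanup to known enum values is a deliberate choice in this sketch: it keeps the period on the free-text reasoning field instead of stripping punctuation everywhere.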

What's still open:

Some compat-provider models (Groq, Together AI, Fireworks, Clarifai, SambaNova) may require a pricingOverrides entry to satisfy the registry's pricing-validation step. Bundled pricing tables cover OpenAI, Anthropic, Google, and Ollama by default. Worked examples for Clarifai's Qwen3.6 35B A3B FP8 and SambaNova's MiniMax-M2.7 are in the openai adapter docs.
Vercel adapter runAgent is single-turn only (multi-turn lands in v0.2).
The registry walks the fallback chain on budget gating and on runtime errors (alpha.7+; the default predicate matches ProviderUnavailableError). This is configurable via runtimeFallback: "none" | "default" | "aggressive" | { shouldFallback }, where the "aggressive" preset was added in alpha.25. Streaming methods walk the chain only on stream-creation failure, not mid-iteration.
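The streaming caveat in the last item is easier to see in code. The sketch below is hypothetical (resolvePredicate, openStream, and StreamOpener are not library names) and models only the "none", "default", and custom-predicate options. The "aggressive" preset from the alpha.25 note is left out because its predicate is not described in this README.

```typescript
type ShouldFallback = (error: unknown) => boolean;
type RuntimeFallbackOption = "none" | "default" | { shouldFallback: ShouldFallback };

// Default predicate: fall back only when the provider is unavailable (matched by name).
const isProviderUnavailable: ShouldFallback = (error) =>
  error instanceof Error && error.name === "ProviderUnavailableError";

function resolvePredicate(option: RuntimeFallbackOption = "default"): ShouldFallback {
  if (option === "none") return () => false;
  if (option === "default") return isProviderUnavailable;
  return option.shouldFallback;
}

type StreamOpener = () => Promise<AsyncIterable<string>>;

// Walks the chain while opening a stream. Once a stream has been returned,
// errors raised during iteration reach the caller; there is no mid-stream fallback.
async function openStream(
  openers: StreamOpener[],
  option?: RuntimeFallbackOption,
): Promise<AsyncIterable<string>> {
  const shouldFallback = resolvePredicate(option);
  let lastError: unknown = new Error("Empty provider chain");
  for (const open of openers) {
    try {
      return await open(); // only stream creation is guarded
    } catch (error) {
      lastError = error;
      if (!shouldFallback(error)) throw error;
    }
  }
  throw lastError;
}

// Demo fixtures: a primary that fails at creation and a backup that streams two chunks.
function unavailableError(): Error {
  const error = new Error("primary is down");
  error.name = "ProviderUnavailableError";
  return error;
}

async function* backupChunks(): AsyncGenerator<string> {
  yield "hello";
  yield " world";
}

async function collectFromBackup(): Promise<string> {
  const stream = await openStream([
    async () => { throw unavailableError(); },
    async () => backupChunks(),
  ]);
  let text = "";
  for await (const chunk of stream) text += chunk;
  return text;
}
```

With the default option the creation failure moves on to the backup; with "none" the same failure is rethrown immediately.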

If you hit something not listed here, please open an issue — the bug-report template captures the version + repro shape we need.

Installation

llm-ports is in alpha. All 7 packages plus the new @llm-ports/migrate codemod ship at v0.1.0-alpha.20.1. Stable v0.1 lands after a short alpha bake — see the v0.1 status page for what's stable today vs still being hardened.

npm install @llm-ports/core@0.1.0-alpha.20.1

Install adapters as needed:

npm install @llm-ports/adapter-anthropic@0.1.0-alpha.20.1
npm install @llm-ports/adapter-openai@0.1.0-alpha.20.1
npm install @llm-ports/adapter-google@0.1.0-alpha.20.1
npm install @llm-ports/adapter-ollama@0.1.0-alpha.20.1
npm install @llm-ports/adapter-vercel@0.1.0-alpha.20.1
npm install @llm-ports/capabilities@0.1.0-alpha.20.1

(Scoped under @llm-ports. Versioned together via changesets.)

Peer dependency: zod >=3.24.0 <5. Bring your own SDKs (@anthropic-ai/sdk, openai, ollama, ai).

Pinning during the alpha series

Recommended: pin to the exact alpha version, not the @alpha dist-tag, while we're still shape-locking. The @alpha tag tracks the latest published prerelease; a pnpm install or npm update can therefore jump you across breaking changes silently. Exact pins lock the version until you deliberately bump it, at which point you read MIGRATION.md and apply the per-release migration.

// package.json — recommended during alphas
{
  "dependencies": {
    "@llm-ports/core": "0.1.0-alpha.20.1",
    "@llm-ports/adapter-anthropic": "0.1.0-alpha.20.1"
  }
}

The @alpha tag is fine for experimentation:

npm install @llm-ports/core@alpha

When you bump, the @llm-ports/core postinstall emits a one-line banner pointing at the migration page. To upgrade across multiple alphas mechanically, use the bundled codemod:

npx @llm-ports/migrate@alpha alpha-19-to-alpha-20 --dry-run    # preview
npx @llm-ports/migrate@alpha alpha-19-to-alpha-20 --write      # apply

Documentation

Documentation site (auto-deployed from docs/ on every push to main):

https://baabakk.github.io/llm-ports/

Pages:

Getting Started
Concepts: ports, adapters, task routing, cost gating, content blocks, validation strategies
Guides: multi-provider routing, local-to-cloud, cost control, custom adapters, observability, security
Capabilities: one page per capability
Adapters: one page per adapter and feature matrix
Migration: from Vercel AI SDK, LangChain.js, and direct provider SDKs

Security

Tool use without a threat model is dangerous.

llm-ports treats security as a first-class part of the API:

destructive tool markers
confirmation-required actions
max output byte limits
redaction capability
explicit guidance for prompt injection and tool abuse

See SECURITY.md.
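As a rough picture of how these primitives combine, here is a hypothetical approval gate. The flag names destructive, requiresConfirmation, and maxOutputBytes come from this README; GatedTool, Approver, withApprovalGate, and the choice to truncate oversized output rather than reject it are assumptions made for the sketch.

```typescript
interface GatedTool {
  toolName: string;
  destructive?: boolean;
  requiresConfirmation?: boolean;
  maxOutputBytes?: number;
  execute(args: Record<string, unknown>): Promise<string>;
}

// Asks a human (or a policy) whether the call may proceed.
type Approver = (toolName: string, args: Record<string, unknown>) => Promise<boolean>;

// UTF-8 size of one code point, so the limit counts bytes rather than characters.
function utf8Size(codePoint: number): number {
  if (codePoint < 0x80) return 1;
  if (codePoint < 0x800) return 2;
  if (codePoint < 0x10000) return 3;
  return 4;
}

// Keeps whole characters until the next one would exceed the byte limit.
function truncateToBytes(text: string, maxBytes: number | undefined): string {
  if (maxBytes === undefined) return text;
  let used = 0;
  let result = "";
  for (const char of text) {
    const size = utf8Size(char.codePointAt(0) ?? 0);
    if (used + size > maxBytes) break;
    used += size;
    result += char;
  }
  return result;
}

// Wraps a tool so flagged calls need approval and output is size-capped.
function withApprovalGate(tool: GatedTool, approve: Approver): GatedTool {
  return {
    ...tool,
    async execute(args) {
      const needsApproval = tool.destructive === true || tool.requiresConfirmation === true;
      if (needsApproval && !(await approve(tool.toolName, args))) {
        // Return a message instead of throwing so the model can see the refusal.
        return `Tool "${tool.toolName}" was not approved and did not run.`;
      }
      return truncateToBytes(await tool.execute(args), tool.maxOutputBytes);
    },
  };
}
```

Capping output by bytes limits how much tool-returned text, which may carry injected instructions, flows back into the model's context.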

Roadmap

v0.1: core, adapters, cost gating, 7 capability factories
v0.2: expanded capabilities and observability package
v0.3: additional adapters and markdown skill format evaluation

Follow Releases

llm-ports is pre-release. To get notified when v0.1 lands on the latest tag (and for every minor release after):

Click the Watch button at the top of the GitHub repo
Choose Custom
Enable Releases

You'll get an email or notification only when a real version ships. No PR or commit noise.