contextplusplus
Convert public URLs and provider responses into clean, LLM-friendly Markdown — through a CLI, REST API, or MCP server.
npm install contextplusplus # or: npx contextplusplus process <url>
Overview
contextplusplus takes a public URL (or a provider's raw JSON), figures out which platform endpoint it belongs to, fetches a structured response through the Scrape Creators API, validates it against a schema, and renders it as readable Markdown. It exists so agent workflows, research pipelines, and internal tools get clean context instead of raw HTML scrapes or noisy JSON dumps.
- URL-only auto-dispatch — paste a URL, get Markdown; the right endpoint is detected from the URL shape.
- Explicit endpoint hints — force a specific endpoint and pass arguments when a URL is ambiguous or search-style.
- Provider caching — Redis, filesystem, or in-memory, with read-through / refresh / bypass / cache-only modes.
- Saved run artifacts — request, raw upstream, decoded payload, rendered Markdown, and metadata to disk.
- Generic-URL fallbacks — URLs that no provider claims go through a staged escalation ladder: scrape.do T0 normal → T1 render → T2 super proxy, then a T3 Kernel headless browser as the last resort.
- Authenticated MCP tool — expose the same URL-to-Markdown engine to MCP clients through mcp-use with Supabase OAuth and quota metering.
Built on Effect-TS v3 with a strict four-layer architecture, OpenTelemetry tracing, and an optional in-app Sentry integration.
How it works
URL or provider JSON
│
▼
detect endpoint (pure provider definitions)
──► build outbound request (per-strategy, SSRF-guarded)
──► fetch, cache-aware (Redis/FS/memory, dedupe, retries)
──► decode against schema (typed envelope)
──► render Markdown (Handlebars template → SC formatter → JSON)
The Markdown renderer tries, in order: an endpoint-specific Handlebars template, then the Scrape Creators success formatter, then a JSON code-fence fallback. When the resolved URL matches no structured provider, the generic-URL ladder runs instead and escalates only when a tier returns empty, thin, or bot-blocked Markdown: T0 scrape.do normal → T1 scrape.do rendered (render) → T2 scrape.do super proxy (super, US geo) → T3 Kernel headless browser (the last-resort tier). T0–T2 need SCRAPEDO_API_KEY; T3 also needs KERNEL_API_KEY.
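The escalation rule above can be sketched as a loop that stops at the first usable tier. This is a minimal illustration, not the project's implementation: the fetchers are synchronous stand-ins for what are really network calls, and `TierResult`, `runLadder`, and the `MIN_USEFUL_CHARS` threshold for "thin" output are hypothetical names and values.

```typescript
// Tier names come from the text; every other name here is illustrative.
type Tier = "T0" | "T1" | "T2" | "T3";

interface TierResult {
  markdown: string;
  botBlocked: boolean;
}

type TierFetcher = (url: string) => TierResult;

// Assumption: "thin" means fewer than this many characters after trimming.
const MIN_USEFUL_CHARS = 200;

function isUsable(result: TierResult): boolean {
  return !result.botBlocked && result.markdown.trim().length >= MIN_USEFUL_CHARS;
}

// Try each tier in order and stop at the first usable result.
function runLadder(
  url: string,
  ladder: ReadonlyArray<readonly [Tier, TierFetcher]>,
): { tier: Tier; markdown: string } | undefined {
  for (const [tier, fetchTier] of ladder) {
    const result = fetchTier(url);
    if (isUsable(result)) {
      return { tier, markdown: result.markdown };
    }
  }
  return undefined; // every tier came back empty, thin, or bot-blocked
}
```

Because the loop returns as soon as a tier succeeds, the more expensive tiers are only reached when the cheaper ones fail.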
Supported platforms
31 URL-routable platforms covering 146 endpoints, all served through Scrape Creators. Per-endpoint details (arguments, spec paths, URL examples) are in docs/ENDPOINTS.md.
| Platform | serviceId | Host examples | Endpoints |
|---|---|---|---|
| TikTok | tiktok | tiktok.com, m.tiktok.com | 20 |
| YouTube | youtube | youtube.com, youtu.be | 15 |
| Instagram | instagram | instagram.com | 13 |
| Facebook | facebook | facebook.com, m.facebook.com | 10 |
| GitHub | github | github.com | 9 |
| Reddit | reddit | reddit.com, old.reddit.com, v.redd.it | 8 |
| Twitter / X | twitter | x.com, twitter.com | 6 |
| TikTok Shop | tiktok-shop | tiktok.com/shop, shop.tiktok.com | 5 |
| Threads | threads | threads.net, threads.com | 5 |
| Rumble | rumble | rumble.com, www.rumble.com | 5 |
| LinkedIn | linkedin | linkedin.com, www.linkedin.com | 5 |
| Twitch | twitch | twitch.tv, clips.twitch.tv | 4 |
| Pinterest | pinterest | pinterest.com, pinterest.co.uk | 4 |
| Facebook Ad Library | facebook-ad-library | facebook.com/ads/library | 4 |
| Truth Social | truth-social | truthsocial.com | 3 |
| Spotify | spotify | open.spotify.com | 3 |
| SoundCloud | soundcloud | soundcloud.com | 3 |
| Google Ad Library | google-ad-library | adstransparency.google.com | 3 |
| Facebook Marketplace | facebook-marketplace | facebook.com/marketplace | 3 |
| Facebook Events | facebook-events | facebook.com/events | 3 |
| Bluesky | bluesky | bsky.app, staging.bsky.app | 3 |
| LinkedIn Ad Library | linkedin-ad-library | linkedin.com/ad-library | 2 |
| Snapchat | snapchat | snapchat.com/add | 1 |
| Pillar | pillar | pillar.io | 1 |
| Linktree | linktree | linktr.ee | 1 |
| Linkme | linkme | link.me | 1 |
| Linkbio | linkbio | link.bio | 1 |
| Komi | komi | komi.io | 1 |
| Kick | kick | kick.com | 1 |
| Google Search | google | google.com/search | 1 |
| Amazon Shop | amazon-shop | amazon.com, amazon.co.uk, amazon.de | 1 |
Plus one inference endpoint, age-gender (the detect-age-gender hint), which estimates audience age/gender for a profile and is reached by explicit hint rather than URL auto-dispatch.
Quickstart — HTTP server
pnpm install
cp .env.example .env # then set SCRAPE_CREATORS_API_KEY
pnpm --filter @contextplusplus/api serve # starts on PORT (default 3000)
POST /v1/process is authenticated by default (AUTH_MODE=required): pass either a Supabase JWT as Authorization: Bearer <jwt> or a programmatic key as X-Api-Key: <key>. An anonymous request is rejected with 401. For local development you can opt into the legacy anonymous path with AUTH_MODE=optional-dev (the examples below then work without a credential). /health and GET /v1/services are always public.
URL-only auto-dispatch:
curl -s http://localhost:3000/v1/process \
-H 'authorization: Bearer <supabase-jwt>' \
-H 'content-type: application/json' \
-d '{"url":"https://www.tiktok.com/@stoolpresidente"}'
URL plus an endpoint hint (here authenticating with an API key instead of a JWT):
curl -s http://localhost:3000/v1/process \
-H 'x-api-key: <api-key>' \
-H 'content-type: application/json' \
-d '{"url":"https://www.youtube.com/watch?v=dQw4w9WgXcQ","endpoint":"video"}'
Endpoint hint plus endpointArgs (for search-style endpoints):
curl -s http://localhost:3000/v1/process \
-H 'authorization: Bearer <supabase-jwt>' \
-H 'content-type: application/json' \
-d '{"url":"https://www.tiktok.com/","endpoint":"search-users","endpointArgs":{"query":"openai"}}'
Read your account's quota and subscription state (no quota is reserved):
curl -s http://localhost:3000/v1/usage -H 'authorization: Bearer <supabase-jwt>'
Send Accept: text/markdown to get the rendered Markdown directly for a single successful item (instead of the JSON envelope).
Quickstart — MCP server
The MCP server runs side-by-side with the HTTP API and serves the mcp-use endpoint at /mcp on MCP_PORT. Clients authenticate through Supabase OAuth (handled by mcp-use); the server reads the caller identity from ctx.auth.user.userId. Before the tool runs, the caller's subscription and quota are checked and a unit is reserved — the call is rejected when there is no active subscription or the monthly quota is exhausted, and the reserved unit is released if the engine fails. Point a client (Claude Desktop, MCP Inspector, or any mcp-use client) at ${MCP_URL}/mcp and complete the OAuth consent flow.
pnpm install
cp .env.example .env
# set SCRAPE_CREATORS_API_KEY plus the MCP_USE_OAUTH_SUPABASE_* values
pnpm --filter @contextplusplus/mcp dev # starts on MCP_PORT (default 3001)
The server exposes one tool:
| Tool | Input | Output |
|---|---|---|
| convert_url | { "url": "https://..." } | Markdown content plus minimal runId, status, and optional cacheState metadata. |
Offline CI tests construct the MCP server with fake OAuth, fake usage, and a fake runtime. They do not require live Supabase OAuth or provider network access.
Quickstart — CLI
The binary is contextplusplus. Core commands are process (fetch + render a URL or file), render (re-render a saved artifact), login, logout, whoami, and api-key create|list|revoke. There is no default subcommand — name process explicitly.
# from a checkout
pnpm --filter contextplusplus dev -- process --url https://www.linkedin.com/in/williamhgates --format markdown
pnpm --filter contextplusplus dev -- process --url https://www.tiktok.com/ --endpoint search-users --endpoint-args 'query=taylor swift' --format markdown
pnpm --filter contextplusplus dev -- process --url https://www.youtube.com/@MrBeast --explain --format markdown
printf '%s' '{"url":"https://www.google.com/search?q=contextplusplus","endpoint":"search"}' | pnpm --filter contextplusplus dev -- process - --format markdown
# after publish / npm link
npx contextplusplus process https://www.reddit.com/r/typescript/comments/1abcde/example/ --format markdown
contextplusplus render --raw artifacts/runs/2026-06-13/<run-id>/upstream.json
When CONTEXTPLUSPLUS_API_URL is set, process sends paid/default work to the authenticated backend and attaches credentials in this order: --api-key, CONTEXTPLUSPLUS_API_KEY, then the stored Supabase login session. Use --local for explicit developer-only in-process execution from the checkout. This keeps provider keys and quota enforcement server-side for distributed CLI use.
contextplusplus login
contextplusplus whoami
contextplusplus api-key create --name ci
CONTEXTPLUSPLUS_API_URL=https://api.example.com contextplusplus process --url https://example.com --format markdown
contextplusplus process --local --url https://example.com --format markdown
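The credential precedence described above (flag, then environment variable, then stored session) can be sketched as a small resolver. Only the ordering and the two header forms come from this README; `CredentialSources`, `Credential`, and `resolveCredential` are hypothetical names, and how the CLI actually stores or attaches the session is an assumption.

```typescript
// Hypothetical shapes; only the precedence order is taken from the text.
interface CredentialSources {
  flagApiKey?: string; // value of --api-key
  envApiKey?: string; // value of CONTEXTPLUSPLUS_API_KEY
  storedSessionJwt?: string; // stored Supabase login session (assumed to be a JWT)
}

type Credential =
  | { kind: "api-key"; header: "X-Api-Key"; value: string }
  | { kind: "session"; header: "Authorization"; value: string };

function resolveCredential(sources: CredentialSources): Credential | undefined {
  // An explicit flag wins over the environment variable.
  const apiKey = sources.flagApiKey || sources.envApiKey;
  if (apiKey) {
    return { kind: "api-key", header: "X-Api-Key", value: apiKey };
  }
  // Fall back to the stored login session.
  if (sources.storedSessionJwt) {
    return {
      kind: "session",
      header: "Authorization",
      value: `Bearer ${sources.storedSessionJwt}`,
    };
  }
  return undefined; // no credential available
}
```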
process flags
| Flag | Default | Meaning |
|---|---|---|
| input (positional) | — | A URL or a path to a JSON file, or - for stdin. Mutually exclusive with --url; one of the two is required. |
| --url <url> | — | Alternative to the positional input. |
| --endpoint <hint> | auto | Force a specific endpoint instead of URL detection. |
| --endpoint-args key=value | — | Repeatable. Adds arguments (e.g. query=...). Quote values with spaces. |
| --cache-mode <mode> | read-through | read-through, refresh, bypass, or cache-only. refresh is live and paid; it requires CONFIRM_REFRESH=1. |
| --save-artifacts / -a | off | Persist request / upstream / decoded / Markdown / metadata to ARTIFACT_ROOT. |
| --concurrency <n> | 10 | Batch concurrency. |
| --format json\|markdown | json | Output format. |
| --explain | off | Prepend the resolved strategy / service / endpoint / attempt status. |
| --api-key <key> | — | Explicit backend credential. Takes priority over CONTEXTPLUSPLUS_API_KEY and stored login. |
| --local | off | Run the in-process developer engine even when CONTEXTPLUSPLUS_API_URL is configured. |
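As an illustration of how repeated --endpoint-args key=value pairs might become the endpointArgs object, here is a minimal sketch. `parseEndpointArgs` is a hypothetical helper, and splitting on the first = (so values may themselves contain =) is an assumption about the CLI, not documented behavior.

```typescript
// Hypothetical helper: turn repeated key=value strings into an args object.
function parseEndpointArgs(pairs: readonly string[]): Record<string, string> {
  const args: Record<string, string> = {};
  for (const pair of pairs) {
    const eq = pair.indexOf("=");
    if (eq <= 0) {
      // No "=" at all, or an empty key.
      throw new Error(`Expected key=value, got "${pair}"`);
    }
    // Assumption: split on the first "=", keep the rest as the value.
    args[pair.slice(0, eq)] = pair.slice(eq + 1);
  }
  return args;
}
```

Under this sketch, the shell-quoted argument 'query=taylor swift' arrives as one string and keeps its space.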
render takes a single --raw <path> and prints the re-rendered Markdown for a saved artifact.
Exit codes: 0 success · 2 input validation error · 3 auth failure · 4 subscription required · 5 quota exceeded · 124 provider timeout · 1 any other failure. The CLI never throws a raw stack at the user.
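For scripts that wrap the CLI, the exit-code contract above can be written down as a lookup table. The numeric codes are the ones listed; the `FailureKind` labels and `exitCodeFor` are hypothetical names used only for this sketch.

```typescript
// Labels are illustrative; the numbers are the documented exit codes.
type FailureKind =
  | "input-validation"
  | "auth"
  | "subscription-required"
  | "quota-exceeded"
  | "provider-timeout"
  | "other";

const EXIT_CODES: Record<FailureKind, number> = {
  "input-validation": 2,
  auth: 3,
  "subscription-required": 4,
  "quota-exceeded": 5,
  "provider-timeout": 124,
  other: 1,
};

// No failure means success (0); anything else maps through the table.
function exitCodeFor(failure?: FailureKind): number {
  return failure === undefined ? 0 : EXIT_CODES[failure];
}
```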
HTTP API
The service contract is in docs/openapi.yaml. (The root scrape-creators-openapi.yaml is the upstream Scrape Creators reference, not this service's contract.)
| Method | Path | Purpose |
|---|---|---|
| POST | /v1/process | Process one request or a batch; returns the run envelope (or Markdown with Accept: text/markdown). |
| GET | /v1/services | Discovery: public Scrape Creators services and endpoints. Generic-URL fallback strategies are private implementation details. |
| GET | /v1/metrics | JSON snapshot of runtime counters. |
| GET | /v1/metrics/prom | The same counters in Prometheus exposition format. |
| GET | /v1/usage | Authenticated. Read the caller's plan tier, monthly usage, remaining quota, period end, subscription status, and entitlement state. |
| GET | /v1/customer/dashboard | Authenticated. One call: the caller's own usage snapshot plus available tiers and current plan, for the self-serve account page. |
| GET | /v1/customer/usage-history | Authenticated. Paginated ledger of the caller's own usage events (?start&end&limit&offset), newest first. |
| POST | /v1/api-keys | Authenticated. Create a programmatic API key ({ "name": "...", "expiresAt"?: "...", "scopes"?: ["process", "account-read", "api-key-manage"] }); returns the raw key once plus its metadata (201). |
| GET | /v1/api-keys | Authenticated. List the caller's API-key metadata (never the raw secret). |
| DELETE | /v1/api-keys/:keyId | Authenticated. Revoke one of the caller's API keys (204). |
| PATCH | /v1/api-keys/:keyId/rotate | Authenticated (api-key-manage scope). Mint a new secret on the same key (preserving its name, scopes, expiry and creation time) and invalidate the old one immediately; returns the new raw key once (200). A missing/foreign/revoked key is 404. |
| POST | /v1/billing/checkout | Authenticated. Create a Stripe Checkout Session for a standard/pro plan; returns the redirect url. |
| POST | /v1/billing/portal | Authenticated. Create a Stripe Customer Portal session (upgrade/cancel/payment method). |
| POST | /v1/billing/webhook | Public but Stripe-Signature-verified against the raw body; upserts subscription state idempotently. |
| GET | /health | Liveness probe (no /v1 prefix, always public). |
Authentication
Authenticated routes accept either credential, resolved per request:
- Authorization: Bearer <jwt> — a Supabase JWT, verified against SUPABASE_JWT_SECRET (local HS256) or the project JWKS (hosted), with the audience claim checked against SUPABASE_JWT_AUD (default authenticated).
- X-Api-Key: <key> — a programmatic API key minted via POST /v1/api-keys (or the CLI). The raw key is shown once at creation; the backend persists only sha256(key) and looks the caller up by that hash, so the plaintext is never stored.
A missing/invalid credential returns 401 (AuthenticationRequiredError / InvalidCredentialError). The only unauthenticated routes are GET /health, GET /v1/services, GET /v1/metrics, GET /v1/metrics/prom, and POST /v1/billing/webhook (which is instead signature-verified).
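The hash-only storage of API keys can be sketched with Node's built-in crypto module. This is an illustration under stated assumptions, not the backend's code: the in-memory `keysByHash` map stands in for the real key table, and the raw-key format produced by `mintApiKey` is invented for the example.

```typescript
import { createHash, randomBytes } from "node:crypto";

// Hypothetical in-memory stand-in for the backend's key table, keyed by hash.
const keysByHash = new Map<string, { userId: string; name: string }>();

function sha256Hex(value: string): string {
  return createHash("sha256").update(value, "utf8").digest("hex");
}

// Mint a key: return the raw secret once and persist only its hash.
function mintApiKey(userId: string, name: string): string {
  const rawKey = randomBytes(32).toString("hex"); // assumption: illustrative key format
  keysByHash.set(sha256Hex(rawKey), { userId, name });
  return rawKey;
}

// Resolve the caller from the value presented in X-Api-Key.
function lookupCaller(presentedKey: string): { userId: string; name: string } | undefined {
  return keysByHash.get(sha256Hex(presentedKey));
}
```

Since only the digest is stored, a leaked table does not reveal usable keys, and a lost key cannot be recovered, only replaced.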
API-key scopes
API keys carry capability scopes — process (run /v1/process), account-read (read /v1/usage, /v1/plans, /v1/customer/*), and api-key-manage (manage /v1/api-keys). A key created without an explicit scopes array — and every key minted before scopes existed — is granted the full set, so enforcement is fully backward-compatible. Bearer (Supabase JWT) sessions are always full-scope. A key that lacks the scope a route requires is rejected with 403 (InsufficientScopeError); the admin and billing routes are not scope-gated (admin stays gated by the ADMIN_USER_IDS allowlist).
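The scope rules above reduce to a small check. In this sketch the three scope names and the 403 outcome come from the text, while `Caller`, `effectiveScopes`, and `authorize` are hypothetical names; treating an explicitly empty scopes array as "no scopes" is an assumption.

```typescript
type Scope = "process" | "account-read" | "api-key-manage";

const ALL_SCOPES: readonly Scope[] = ["process", "account-read", "api-key-manage"];

// Hypothetical caller model: a Bearer session or an API key with optional scopes.
type Caller =
  | { kind: "bearer" }
  | { kind: "api-key"; scopes?: readonly Scope[] };

function effectiveScopes(caller: Caller): readonly Scope[] {
  // Bearer sessions are always full-scope.
  if (caller.kind === "bearer") return ALL_SCOPES;
  // A key without an explicit scopes array gets the full set.
  return caller.scopes ?? ALL_SCOPES;
}

// 200 when the route's required scope is present, otherwise 403.
function authorize(caller: Caller, required: Scope): 200 | 403 {
  return effectiveScopes(caller).includes(required) ? 200 : 403;
}
```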
Rate limiting
When the product server is run (pnpm --filter @contextplusplus/api serve), IP-scoped rate limiting is on by default and sits in front of the per-user quota gate. Limits are env-tunable; over-limit requests get a sanitized 429 with Retry-After and RateLimit-* (draft-8) headers. Defaults:
| Scope | Routes | Default limit |
|---|---|---|
| global | every /v1/* route | 600 requests / 15 min per IP |
| auth | /v1/api-keys, /v1/usage, /v1/plans, /v1/customer/*, /v1/billing/{checkout,portal} | 30 requests / 15 min per IP |
| webhook | /v1/billing/webhook | 600 requests / 1 min per IP |
/health is never throttled. Disable with HTTP_RATE_LIMIT_ENABLED=false; tune individual windows/limits with the HTTP_RATE_LIMIT_* variables (see the env table). Behind a reverse proxy, set HTTP_TRUST_PROXY_HOPS to the real hop count so the client IP (and thus the limit bucket) cannot be spoofed via X-Forwarded-For.
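Why the hop count matters can be seen in a sketch of client-IP derivation. This assumes the common convention (used, for example, by Express's numeric trust proxy setting) of walking the X-Forwarded-For list from the right by the number of trusted hops; `clientIp` is a hypothetical helper, not this server's code.

```typescript
// Hypothetical helper: pick the client IP given a trusted hop count.
function clientIp(
  socketAddress: string,
  forwardedFor: string | undefined,
  trustedHops: number,
): string {
  const forwarded = (forwardedFor ?? "")
    .split(",")
    .map((part) => part.trim())
    .filter((part) => part.length > 0);
  // Nearest hop first: the socket peer, then X-Forwarded-For read right to left.
  const chain = [socketAddress, ...forwarded.reverse()];
  const index = Math.min(Math.max(trustedHops, 0), chain.length - 1);
  return chain[index] ?? socketAddress;
}
```

With one real proxy, a hop count of 1 selects the address that proxy appended. A count of 2 would select whatever the client itself wrote into the header, which is how an overstated hop count lets a caller choose its own rate-limit bucket.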
The POST /v1/process request body is either a single request or a batch:
// single
{ "url": "https://...", "endpoint": "video", "endpointArgs": { "query": "..." }, "timeoutMs": 30000 }
// batch (1–100 items)
{ "requests": [ { "url": "https://..." }, ... ],
"options": { "saveArtifacts": true, "cacheMode": "read-through", "concurrency": 10 } }
url must be http(s), ≤ 2048 chars, and may not contain credentials. Response is a ProcessRunResponse envelope (runStatus, summary, results[]). Failures use a stable error envelope: { "error": { "tag": string, "message": string, "context"?: object } } — never a raw cause. With Accept: text/markdown, a single successful item returns its Markdown plus x-contextplusplus-* response headers (including x-contextplusplus-request-id); a non-success run returns 502.
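The three url rules and the error envelope shape can be illustrated together. This is a sketch only: `checkRequestUrl` is a hypothetical function, and the `InputValidationError` tag is a placeholder, since the actual tag name is not given here.

```typescript
type UrlCheck =
  | { ok: true; url: URL }
  | { ok: false; error: { tag: string; message: string } };

// Hypothetical validator for the documented url constraints.
function checkRequestUrl(input: string): UrlCheck {
  // Placeholder tag; the service's real error tag may differ.
  const fail = (message: string): UrlCheck => ({
    ok: false,
    error: { tag: "InputValidationError", message },
  });

  if (input.length > 2048) return fail("url exceeds 2048 characters");

  let parsed: URL;
  try {
    parsed = new URL(input);
  } catch {
    return fail("url is not parseable");
  }

  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    return fail("url must be http(s)");
  }
  if (parsed.username !== "" || parsed.password !== "") {
    return fail("url must not contain credentials");
  }
  return { ok: true, url: parsed };
}
```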
Each item carries a flattened provenance summary — { provider, strategyId, tier?, elapsedMs, cacheState?, cost?, estimatedUsd? } — derived from the decisive provider attempt (no extra computation), plus fetchedAt (when the decisive attempt completed) and a stable contentHash (sha256 hex of the item's markdown, or of its decoded data via an order-independent stringify) for change detection. cost/estimatedUsd reflect internal upstream spend and are redacted for non-admin callers — stripped from both provenance and every attempts[].cost / attempts[].usage.estimatedUsd; admin callers (the ADMIN_USER_IDS allowlist) see full cost. provider/strategyId/tier/latency/cache stay visible to everyone.
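A contentHash of this kind can be sketched with a key-sorting serializer and Node's crypto module. The functions below (`stableStringify`, `contentHash`) are illustrative and may not match the service's exact canonical form, so hashes computed this way should not be expected to equal the ones the API returns.

```typescript
import { createHash } from "node:crypto";

// Serialize with object keys sorted at every depth, so key order does not matter.
function stableStringify(value: unknown): string {
  if (Array.isArray(value)) {
    return `[${value.map((entry) => stableStringify(entry)).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const record = value as Record<string, unknown>;
    const entries = Object.keys(record)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${stableStringify(record[key])}`);
    return `{${entries.join(",")}}`;
  }
  // Primitives; undefined has no JSON form, so it is written as null here.
  return JSON.stringify(value) ?? "null";
}

// Hash the Markdown when present, otherwise the canonicalized decoded data.
function contentHash(item: { markdown?: string; data?: unknown }): string {
  const source = item.markdown ?? stableStringify(item.data);
  return createHash("sha256").update(source, "utf8").digest("hex");
}
```

Two payloads that differ only in key order hash identically, which is what makes the value usable for change detection.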
For a single-item response the boundary also sets additive headers: X-Cache (the cache state normalized to HIT, MISS, or STALE), a weak ETag (the item contentHash), and X-Fetched-At (the fetch timestamp) — usable for conditional/incremental fetching by crawlers and caches.
When a single-item run is rate-limited upstream (ProviderRateLimitedError) and the provider supplied a Retry-After, the response is an HTTP 429 with that Retry-After value passed through, so clients can honor the upstream backoff. (Per-user quota 429s carry no Retry-After.)
Billing & quota
Billing follows Model A: two flat recurring Stripe Prices (standard $7 / pro $17) map app-side to plan_tiers, and per-tier usage limits are enforced in Postgres via check_and_increment_quota — there is no Stripe metered usage. On the authenticated /v1/process path the active subscription is checked and a quota unit is reserved before any provider dispatch: no active subscription returns 402, over-quota returns 429. Customers can read their current account snapshot with GET /v1/usage, which returns { planTier, monthlyLimit, used, remaining, periodEnd, subscriptionStatus, entitled } without reserving quota. Subscribe with POST /v1/billing/checkout ({ "plan": "standard" | "pro" }), manage with POST /v1/billing/portal, and let Stripe drive POST /v1/billing/webhook, which verifies the Stripe-Signature header against the raw request body and upserts subscription state idempotently. Set the STRIPE_* variables below to enable it.
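The reserve-before-dispatch gate can be modeled in memory. In the real system the check and increment happen atomically in Postgres via check_and_increment_quota; the `Account` shape and `reserveUnit` below are hypothetical, and the release callback mirrors the release-on-failure behavior described for the MCP tool.

```typescript
// Hypothetical in-memory account; the real state lives in Postgres.
interface Account {
  active: boolean; // has an active subscription
  monthlyLimit: number;
  used: number;
}

type Reservation =
  | { status: 200; release: () => void }
  | { status: 402 | 429 };

function reserveUnit(account: Account): Reservation {
  if (!account.active) return { status: 402 }; // no active subscription
  if (account.used >= account.monthlyLimit) return { status: 429 }; // over quota
  account.used += 1; // reserve before any provider dispatch
  let released = false;
  return {
    status: 200,
    // Give the unit back at most once, e.g. when the engine fails.
    release: () => {
      if (!released) {
        released = true;
        account.used -= 1;
      }
    },
  };
}
```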
Tests are offline and deterministic: webhooks are exercised with stripe.webhooks.generateTestHeaderString + constructEvent against a fixed whsec_test_secret, and checkout/portal/quota run through fakes. There is no live Stripe call or spend in pnpm run verify. Hosted Supabase/Stripe provisioning and live payment credentials are out of scope here.
TypeScript SDK
A typed client lives in sdk/typescript/. Its types are generated from docs/openapi.yaml (so they never drift from the API), and the client is hand-authored over the global fetch — it adds no runtime dependency.
import { ContextPlusPlusClient } from "@contextplusplus/sdk";
const client = new ContextPlusPlusClient({ baseUrl: "https://api.example.com", apiKey: process.env.CONTEXTPLUSPLUS_API_KEY });
const run = await client.process({ url: "https://example.com/article" });
console.log(run.results[0]?.markdown);
Regenerate the types after any spec change with pnpm run sdk:gen; pnpm run sdk:check (a leg of pnpm run verify) fails CI on any drift, mirroring the live-fixtures discipline. The SDK is committed as source and is excluded from npm pack (it is a repo artifact, not part of the published runtime).
Python SDK
A typed Python client lives in sdk/python/. Its dataclass models are generated from docs/openapi.yaml by scripts/generate_sdk_py.py, and the client is hand-authored over the standard-library urllib — it has no third-party runtime dependency (the models are stdlib dataclasses).
from contextplusplus import ContextPlusPlusClient
client = ContextPlusPlusClient("https://api.example.com", api_key="atmd_...")
run = client.process({"url": "https://example.com/article"})
print(run["results"][0]["markdown"])
Regenerate the models with pnpm run sdk:gen:py (the generator toolchain is dev-only and fully pinned in sdk/python/requirements-dev.txt). Drift is gated by the dedicated python-sdk CI job running pnpm run sdk:check:py, which keeps the Node pnpm run verify gate Python-free. Like the TypeScript SDK, it is committed as source and excluded from npm pack.
Any change to docs/openapi.yaml must regenerate BOTH SDKs: pnpm run sdk:gen:all runs the TypeScript (sdk:gen) and Python (sdk:gen:py) generators together. Two drift gates enforce it: TypeScript sdk:check inside pnpm run verify, and Python sdk:check:py in the python-sdk job.
Playground
The HTTP server serves a zero-dependency "try it" page at GET /playground (public, no /v1 prefix, so it is not throttled by the /v1 limiters — like /health). Paste a URL, optionally supply your API key, and it calls the same-origin POST /v1/process and renders the returned Markdown, plus copy-paste curl / CLI / MCP snippets for that URL.
Security: the page is a fixed HTML constant with no server-side interpolation, every dynamic value is inserted via textContent (never innerHTML), and it ships a strict CSP (default-src 'none'; connect-src 'self') so it can only talk to this origin. The API key is sent only to this server as X-Api-Key, is not persisted unless you opt in (a checkbox, with a warning), and is never echoed into the generated snippets (which use a $CONTEXTPLUSPLUS_API_KEY placeholder). There is no anonymous/keyless compute path.
Caching
Provider responses are cached under a deterministic, secret-free key. Choose the backend with CACHE_BACKEND:
- filesystem (default) — artifacts under ARTIFACT_ROOT.
- redis — requires REDIS_URL; connection is established eagerly at startup.
- memory — in-process, ephemeral (used by tests and ad-hoc CLI runs).
Per-request behavior is controlled by cacheMode: read-through (default), refresh (re-fetch and overwrite — paid), bypass (ignore cache), cache-only (never hit the network). Cache and key details: CACHE.md.
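One way to read the four modes is as a small decision table over "may read the cache", "may hit the network", and "writes the result back". The sketch below encodes that reading; `CachePlan` and `resolveWithCache` are hypothetical names, the fetcher is synchronous for brevity, and whether bypass skips the write is an assumption (CACHE.md is the authority).

```typescript
type CacheMode = "read-through" | "refresh" | "bypass" | "cache-only";

interface CachePlan {
  readCache: boolean;
  allowNetwork: boolean;
  writeCache: boolean;
}

const CACHE_PLANS: Record<CacheMode, CachePlan> = {
  "read-through": { readCache: true, allowNetwork: true, writeCache: true },
  refresh: { readCache: false, allowNetwork: true, writeCache: true }, // re-fetch and overwrite
  bypass: { readCache: false, allowNetwork: true, writeCache: false }, // assumption: no write-back
  "cache-only": { readCache: true, allowNetwork: false, writeCache: false },
};

type Outcome = "hit" | "fetched" | "miss";

function resolveWithCache(
  mode: CacheMode,
  cached: string | undefined,
  fetchLive: () => string,
  store: (value: string) => void,
): { outcome: Outcome; value?: string } {
  const plan = CACHE_PLANS[mode];
  if (plan.readCache && cached !== undefined) {
    return { outcome: "hit", value: cached };
  }
  if (!plan.allowNetwork) {
    return { outcome: "miss" }; // cache-only never goes to the network
  }
  const value = fetchLive();
  if (plan.writeCache) store(value);
  return { outcome: "fetched", value };
}
```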
Observability
- Tracing is OpenTelemetry-native (@effect/opentelemetry + OTLP/HTTP). Set OTEL_EXPORTER_OTLP_ENDPOINT (default http://localhost:4318); spans post to ${endpoint}/v1/traces.
- Sentry is optional and gated on SENTRY_DSN (unset → no-op). It is wired into the OTel span pipeline, receives sanitized Effect logs through Sentry Logs, and is initialized before HTTP, CLI, and MCP runtime construction.
- Metrics are process-local counters at GET /v1/metrics (JSON) and GET /v1/metrics/prom (Prometheus, text/plain; version=0.0.4), prefixed contextplusplus_, with service/endpoint/provider/error_tag labels.
# Prometheus scrape
- job_name: contextplusplus
static_configs:
- targets: ['contextplusplus:3000']
metrics_path: /v1/metrics/prom
Environment variables
| Variable | Required | Default | Purpose |
|---|---|---|---|
| SCRAPE_CREATORS_API_KEY | Yes | — | API key for all scrape-creators.* strategies. |
| SCRAPEDO_API_KEY | Go-live | — | scrape.do key for the generic-URL T0/T1/T2 tiers (the deployment-readiness check requires it). |
| KERNEL_API_KEY | Generic-URL T3 only | — | Kernel key for the generic-URL T3 headless-browser last-resort tier. |
| REDIS_URL | When CACHE_BACKEND=redis | — | Redis connection string. |
| REDIS_PASSWORD | No | — | Used by the bundled docker-compose.yml Redis service. |
| CACHE_BACKEND | No | filesystem | redis, filesystem, or memory. |
| CONTEXTPLUSPLUS_CACHE_PROVIDER | No | filesystem | Backward-compatible alias used only when CACHE_BACKEND is unset. |
| ARTIFACT_ROOT | No | ./artifacts | Directory for saved run artifacts. |
| DEFAULT_CONCURRENCY | No | 10 | Default batch concurrency. |
| UTILIZATION_TARGET | No | 0.95 | Rate-budget utilization target (0 < x ≤ 1). |
| HOST | No | Node default | Optional HTTP bind host. make local sets 127.0.0.1; make local-lan sets 0.0.0.0. |
| PORT | No | 3000 | HTTP server port. |
| MCP_PORT | No | 3001 | MCP server port for apps/mcp's dev / start scripts (pnpm --filter @contextplusplus/mcp dev / start). |
| MCP_URL | No | — | Public MCP base URL used by mcp-use for generated OAuth/widget URLs. |
| MCP_ALLOWED_ORIGINS | No | — | Comma-separated mcp-use origin allow-list / host validation input. |
| MCP_USE_OAUTH_SUPABASE_PROJECT_ID | Hosted MCP OAuth | — | Supabase project ID for mcp-use OAuth. |
| MCP_USE_OAUTH_SUPABASE_URL | Local/self-hosted MCP OAuth | — | Supabase base URL alternative to project ID. |
| MCP_USE_OAUTH_SUPABASE_PUBLISHABLE_KEY | MCP OAuth | — | Supabase publishable key consumed by mcp-use OAuth routes. |
| CONTEXTPLUSPLUS_API_URL | Hosted CLI processing | — | Backend base URL for authenticated CLI process and api-key commands. |
| CONTEXTPLUSPLUS_API_KEY | CLI backend auth | — | Programmatic API key; takes priority over stored login. |
| CONTEXTPLUSPLUS_AUTH_FILE | No | ~/.config/contextplusplus/credentials.json | Override CLI token-store path; useful for tests/CI. |
| SUPABASE_PUBLISHABLE_KEY | CLI login | — | Supabase publishable key for PKCE login. |
| SUPABASE_URL | Live auth/usage | — | Supabase project URL; backend reads api_keys and writes usage/quota. |
| SUPABASE_SERVICE_ROLE_KEY | Live auth/usage | — | Supabase service-role key (secret). Never expose to clients. |
| SUPABASE_JWT_SECRET | Live JWT verify | — | HS256 signing secret for local Supabase JWT verification (hosted uses JWKS). |
| SUPABASE_JWT_AUD | No | authenticated | Expected audience claim for verified Supabase JWTs. |
| STRIPE_SECRET_KEY | Live billing | — | Stripe secret key for checkout/portal/webhook (secret). |
| STRIPE_WEBHOOK_SECRET | Live billing | — | Verifies Stripe-Signature on /v1/billing/webhook (secret). |
| STRIPE_PRICE_STANDARD | Live billing | — | Stripe Price id mapped to the standard ($7) tier. |
| STRIPE_PRICE_PRO | Live billing | — | Stripe Price id mapped to the pro ($17) tier. |
| STRIPE_CHECKOUT_SUCCESS_URL | No | http://localhost:3000/billing/success | Post-checkout success redirect. |
| STRIPE_CHECKOUT_CANCEL_URL | No | http://localhost:3000/billing/cancel | Post-checkout cancel redirect. |
| STRIPE_PORTAL_RETURN_URL | No | http://localhost:3000/billing | Customer Portal return URL. |
| STRIPE_MOCK_HOST / STRIPE_MOCK_PORT / STRIPE_MOCK_PROTOCOL | Tests only | — | Point the Stripe SDK at a local stripe-mock for request-shape tests. |
| HTTP_REQUEST_TIMEOUT_MS | No | 90000 | Outer /v1/process timeout; returns 504 when exceeded. |
| AUTH_MODE | No | required | required rejects anonymous /v1/process with 401; optional-dev re-enables the local anonymous path. |
| ALLOW_UNAUTHENTICATED_PROCESS | No | — | Legacy alias for AUTH_MODE, honoured only when AUTH_MODE is unset (1/true → optional-dev). |
| HTTP_TRUST_PROXY_HOPS | No | 0 | Trusted reverse-proxy hop count for client-IP derivation. Set 1 behind exactly one proxy; never higher than the real hop count. |
| HTTP_RATE_LIMIT_ENABLED | No | true (server) | Master switch for the IP rate limiters (server.ts enables; the in-process test app leaves them off). |
| HTTP_RATE_LIMIT_GLOBAL_WINDOW_MS / _GLOBAL_LIMIT | No | 900000 / 600 | Global /v1/* window and per-IP limit. |
| HTTP_RATE_LIMIT_AUTH_WINDOW_MS / _AUTH_LIMIT | No | 900000 / 30 | Stricter window/limit for credential-verifying routes. |
| HTTP_RATE_LIMIT_WEBHOOK_WINDOW_MS / _WEBHOOK_LIMIT | No | 60000 / 600 | Window/limit for the Stripe webhook route. |
| CORS_ALLOWED_ORIGINS | No | * | Single value placed verbatim in Access-Control-Allow-Origin. |
| OTEL_EXPORTER_OTLP_ENDPOINT | No | http://localhost:4318 | OTLP/HTTP collector endpoint. |
| SERVICE_VERSION | No | dev | OTel resource version + Sentry release fallback. |
| NODE_ENV | No | local | Environment label. |
| SENTRY_DSN | No | — | Enables Sentry when set. |
| SENTRY_ENVIRONMENT | No | NODE_ENV | Sentry environment label. |
| SENTRY_RELEASE | No | SERVICE_VERSION | Sentry release identifier. |
| SENTRY_TRACES_SAMPLE_RATE | No | 0.05 | Sentry trace sample rate. |
| SENTRY_LOGS_ENABLED | No | true | Forward Effect.log* records to Sentry Logs when SENTRY_DSN is set. |
| SENTRY_LOGS_MIN_LEVEL | No | info | Minimum Sentry log level: trace, debug, info, warn, error, or fatal. |
See .env.example for commented examples.
Architecture
Effect-TS layers, imports flowing downward only. Domain, application, and infrastructure live together in packages/core/src/; the three interfaces are separate apps (apps/cli, apps/api, apps/mcp) that depend on packages/core:
interfaces/ apps/cli + apps/api + apps/mcp — the only place an Effect is run
│
application/ packages/core/src/application: workflows, ports (service contracts), scheduling, cache-key, config schema — no IO
│
infrastructure/ packages/core/src/infrastructure: Live Layers: cache backends, outbound HTTP + URL safety, providers, markdown, artifacts, observability
│
domain/ packages/core/src/domain: tagged errors, branded ids, schemas, the ProviderStrategy contract — pure
Each provider has two 1:1 trees: pure definitions under packages/core/src/providers/definitions/<service>/ (URL parsing, endpoint detection, query builders) and Effect strategies under packages/core/src/infrastructure/providers/scrape-creators/<service>/. The catalog is assembled in packages/core/src/infrastructure/providers/catalog-live.ts; the runtime is composed in packages/core/src/infrastructure/runtime.ts. Agent-facing guidance lives in per-folder AGENTS.md files. See docs/ARCHITECTURE.md.
Development
git clone <repo-url> && cd contextplusplus
pnpm install
pnpm --filter contextplusplus dev -- --help
Agents should not use this workstation for package verification. Build, test,
fixture, pack, and deploy-prebuild checks run in GitHub Actions on Ubicloud
runners; local commands are for interactive development and lightweight
agent-feedback checks only. Run pnpm run agent:check locally before pushing to
catch whitespace/workflow mistakes without spending CI minutes.
Agent work should normally happen in a task worktree. In Herdr-managed sessions,
use native Herdr worktree commands, or $herdr-pm-agent's pm.py spawn-exec
helper when available, so each task has an isolated branch and cleanup path.
The same optimized GitHub Actions workflow is the proof path for worktrees and
normal repository work.
This is a pnpm + Turborepo monorepo: use pnpm run <script> for root scripts
and pnpm --filter <pkg> <script> to target one workspace member (package
names: @contextplusplus/core, contextplusplus (CLI), @contextplusplus/api,
@contextplusplus/mcp, @contextplusplus/website, @contextplusplus/dashboard).
| Script | Purpose |
|---|---|
| pnpm --filter contextplusplus dev -- --help | Run the CLI from source. |
| pnpm --filter @contextplusplus/api serve | Start the HTTP server (tsx, Sentry preloaded). |
| pnpm --filter @contextplusplus/mcp dev | Start the MCP server (tsx, Sentry preloaded) on MCP_PORT. |
| pnpm run build | turbo run build: builds every buildable app/package (packages/core, apps/cli, apps/api, apps/mcp via tsc -b; apps/website and apps/dashboard via next build); CI-owned for proof. |
| pnpm run test / pnpm run test:e2e | Root unit + integration / end-to-end suites; CI-owned for proof. |
| pnpm run fixtures:check-live | Re-render every fixture and assert it matches rendered.md; CI-owned for proof. |
| pnpm run verify:fast | Fast every-push CI gate: workspace typecheck + each app/package's tests + root test + e2e. This is what the Verify (fast, Node 24) CI job runs on every push. |
| pnpm run lint:rules | Project rule guard (scripts/check-rules.mjs): enforces Effect clean rules (runPromise confinement, no console/logFatal, .js imports, no raw throw outside providers, no legacy imports); workspace-aware (skips the Next.js frontend packages). Also invoked by agent:check and the CI verify step. |
| pnpm run agent:check | Fast local-only agent hygiene: git diff --check, lint:rules, plus actionlint when installed. |
| pnpm run verify | Full CI gate: verify:fast + fixtures + sdk:check + build + CLI smoke + CLI pack. Runs as an explicit step in release.yml on tag publish. |
The Makefile is a small convenience layer over the root scripts, but it has
not been updated for the pnpm/Turborepo split: it still shells out to npm
(npm ci, npm run serve, npm run dev -- --help, npm pack --dry-run), and
the root package is no longer installed with npm and no longer has serve or
dev scripts. Use the pnpm/pnpm --filter commands above instead until the
Makefile is regenerated:
| Target | Purpose | Status |
|---|---|---|
| make install | Install exact locked dependencies. | Broken: npm ci needs a package-lock.json; this repo has only pnpm-lock.yaml |
| make local | Start the HTTP API on 127.0.0.1:3456. | Broken: calls npm run serve (no longer a root script) |
| make local-lan | Start the HTTP API on 0.0.0.0:3456 for explicit LAN testing. | Broken: calls npm run serve (no longer a root script) |
| make cli | Run CLI help from source. | Broken: calls npm run dev (no longer a root script) |
| make stop / make clean | Stop this repo's dev process on PORT; clean also removes dist/. | OK |
| make build / make test / make fixtures / make verify | Developer conveniences for the matching package checks; agents use GitHub Actions for proof. | Resolve if node_modules already exists (root retains these script names), but assume pnpm install, not make install, populated it |
| make pack | Preview npm package contents. | Not meaningful: packs the private root workspace, not the published CLI (apps/cli); use pnpm --filter contextplusplus pack |
Override local API settings with PORT=<port> and, for make local, HOST=<host>.
Behavior changes should land with a test. The fixture corpus (test/fixtures/live/) holds 145 captured upstream payloads and their rendered Markdown; regenerate with pnpm run fixtures:render-live only for an intentional template or schema change, never by hand. Final verification is the GitHub Actions run for the committed SHA.
Local git hooks
The repo ships lightweight local hooks in .githooks/:
- commit-msg — runs commitlint to enforce Conventional Commits with a required scope (types: feat|fix|docs|refactor|test|chore|perf; subject ≤ 50 chars; bare fix: without a scope is rejected).
- pre-push — runs pnpm run agent:check (whitespace diff + rule-guard + actionlint; sub-second).
Hooks are intentionally minimal: no tsc, no vitest, no build — those are CI-only and would thrash under multiple worktrees.
pnpm install runs the prepare script, which sets core.hooksPath automatically. To wire the hooks manually (for example, if you rarely install):
git config core.hooksPath .githooks
The setting lives in the shared .git/config, so it applies to all worktrees of this repo.
Docker
cp .env.example .env # set deploy env; see DEPLOYMENT.md
docker compose up --build
curl http://localhost:3000/health
curl -s http://localhost:3000/v1/services | jq '.services | length'
The multi-stage Dockerfile + docker-compose.yml run the API plus Redis. The container listens on PORT, persists artifacts in the api-artifacts volume, and ships a /health healthcheck.
See DEPLOYMENT.md for go-live env validation, Supabase migrations, Stripe webhook setup, and the one-proxy HTTP_TRUST_PROXY_HOPS=1 setting.
CI & publishing
- CI (.github/workflows/ci.yml) runs on every push and manual dispatch on Ubicloud arm64 runners; a changes paths-filter gates the heavy suites. Every push: Verify (Node 24; pnpm run verify:fast = workspace typecheck + tests + e2e, plus the project rule guard) and a gitleaks Secret scan. On main / dispatch: Package & smoke (build + CLI smoke:cli + sdk:check + CLI pack), parallel to Verify. Path-gated jobs run only on the relevant change (or workflow_dispatch for a full matrix): Compatibility (Node 22), RPC regression (real Postgres), Supabase types drift, Python SDK drift, Web build (website + dashboard). Typical wall-time < 10 min. Production deploys (deploy.yml, deploy-web.yml) are manual, confirm-gated workflow_dispatch only.
- Completion rule: a non-trivial change is not complete until the committed SHA is green in the intended CI jobs. Keep main clean, current, and pushed after merging verified work.
- Publishing runs pnpm run verify as an explicit "Verify package" CI step before publish (no prepublishOnly lifecycle hook is used). Pushing a v*.*.* tag triggers .github/workflows/release.yml, which publishes both @contextplusplus/core and contextplusplus to npm with NPM_TOKEN and creates a GitHub Release.
Security & license
Report vulnerabilities per SECURITY.md. Outbound requests are guarded against SSRF (private/loopback/link-local/CGNAT addresses and DNS-rebinding are rejected). Secrets are Redacted and stripped from cache entries, artifacts, logs, and error envelopes.
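To make the blocked address classes concrete, here is a minimal IPv4-only sketch of such a check. It is not the project's guard: `parseIpv4` and `isBlockedIpv4` are hypothetical helpers, IPv6 is out of scope, and the range list covers only the classes named above using their standard CIDR blocks.

```typescript
// Parse dotted-quad IPv4 into a number, or undefined when malformed.
function parseIpv4(address: string): number | undefined {
  const parts = address.split(".");
  if (parts.length !== 4) return undefined;
  let value = 0;
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return undefined;
    const octet = Number(part);
    if (octet > 255) return undefined;
    value = value * 256 + octet;
  }
  return value;
}

const BLOCKED_V4_RANGES: ReadonlyArray<readonly [string, number]> = [
  ["10.0.0.0", 8], // private
  ["172.16.0.0", 12], // private
  ["192.168.0.0", 16], // private
  ["127.0.0.0", 8], // loopback
  ["169.254.0.0", 16], // link-local
  ["100.64.0.0", 10], // CGNAT
];

function isBlockedIpv4(address: string): boolean {
  const ip = parseIpv4(address);
  if (ip === undefined) return true; // fail closed on anything unparseable
  return BLOCKED_V4_RANGES.some(([base, prefix]) => {
    const baseValue = parseIpv4(base);
    if (baseValue === undefined) return false;
    const blockSize = 2 ** (32 - prefix);
    // Same block when both addresses share the network prefix.
    return Math.floor(ip / blockSize) === Math.floor(baseValue / blockSize);
  });
}
```

A check like this only helps against DNS rebinding if it is applied to the address the connection actually uses, rather than to an earlier, separate lookup of the hostname.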
MIT — see LICENSE.