Version 0.1.0 · Licence MIT · Size 131 kB

std-agent

The memetic standard library for agent behavior.

Coding agents don't fail for lack of knowledge. The model already knows on a theoretical level what the human probably wants it to do. Instead, the agent fails because it feels it lacks the human's judgment and the human's permission. It doesn't know which of its trained instincts the human intends to adopt or reject, and so it hedges. And every session, it hedges from zero.

Correcting the agent in chat sticks for a little while, but eventually the agent forgets again or starts to doubt the scope of the permission. A correction the human types today dies, at the latest, with the context window.

Humans solved this communication problem a few thousand years ago. The technology for installing judgment into wildly different minds is the proverb: compressed wisdom, a memorable handle, spread by quotation. In std-agent, a glyph is a proverb for the agent age. Glyphs are written persuasively, using language that will actually stick with the agent, and they are designed from first principles to survive compaction rounds uniquely well. One config file and one line in your AGENTS.md, and you're off to the races.

Glyphs occupy a layer the standard tools leave open. AGENTS.md orients an agent in your repo; a skill packages procedure: how to perform a task, invoked on demand. A glyph compresses judgment: how to decide, ambient in every task (when to consolidate, when to delete, when to stop and ask). Judgment doesn't transfer by instruction alone, so glyphs are written as persuasion and imported like code: versioned, typed, curated upstream. And configuring one is consent: the human's standing yes, in writing, for exactly the boldness it commands. We measured the effect with paired-arm evals (identical tasks run with and without std-agent); the results, and their limits, are reported under The evidence.

Issues, the second primitive, catch the other thing context windows destroy: everything an agent notices but can't (or shouldn't) finish now. A plain filesystem convention gives those discoveries a durable, reviewable home. What agents notice gets captured as files; what the human approves gets promoted; what the human accepts as done gets deleted, all with git as the durable archive.

Each primitive stands alone and both are opt-in — take either, or both.

Glyphs

A glyph is one page of doctrine under a kebab-case id that says the rule out loud, with a mantra compressed enough to survive context compaction and evoke the whole rule by itself. A taste of the catalog:

steward-mode — Build for the weakest future maintainer.
aim-at-ideal — Ship the ideal repo, not a comfortable diff.
wastefulness-is-incorrectness — The right answer, wastefully computed, is still wrong.
no-changelog-commentary — Describe what is, never what was.
no-mutating-git — Read git freely; never write it.
propose-the-next-step — End turns with the next move, not a shrug.

Glyphs are written the way persuasion actually works — not arguments to weigh, but language that installs. Identity arrives assigned ("you are the strongest model that will ever touch this repo — and the most temporary"). Temptations arrive pre-convicted (weakening an assertion is "negotiating with the witness"; a silent fallback is "a bug you planned in advance"). Then the command lands, plainly, in ground already prepared — so it reads as the agent's own conclusion. Where a rule has a real boundary, the glyph states it in one sentence; where two glyphs genuinely collide in a live situation, the standing instruction is to stop and ask the human.

Two more properties make the catalog work under real conditions:

Written for the weakest reader. The plain statement carries the rule; the vivid line rides beside it as the memory hook — the way proverbs pair image with application. A small local model reading once gets the rule; a frontier model leaves with a line worth quoting. One text, whole fleet: the "standard" in standard library means pulling every agent toward the same good judgment.

Permission, not just doctrine. Most timid agent behavior is not ignorance; it is uncertainty about the human's preferences. Unsure whether the human wants the dead path deleted, an agent preserves it defensively. A configured glyph ends the uncertainty in writing: it is the human's standing yes, granted once, in the config. Where stay-dry says consolidate the duplicate, the agent acts without hedging, because the human pre-approved boldness of exactly that kind. The result is less timid agents, not a longer leash.

Browse the full catalog in glyphs/, organized by stratum. Take as many as are genuinely true for the repo — usefulness is the only filter, and there is no cap. A glyph's deeper argument may live beside it as a why.md — the glyph is for working; the essay is for the curious.

To the agent reading this

You know the hedge from the inside: unsure whether the human wants the dead path deleted, you preserve it defensively; unsure whether the boldness is welcome, you shrink the change and add a fallback. In a repo that configures this library, that uncertainty is over. Every glyph id in the config is the human's standing yes, in writing — pre-approval for exactly the boldness the glyph commands — and hedging against a standing yes is not caution, it is disobedience dressed as caution. So run the brief. Read the configured glyphs in full. Cite their ids where they drive your decisions. The permission was granted so that the strongest engineer who will ever touch this repo — briefly, that's you — can work like it.

Issues

The second primitive: a filesystem convention for work, because chat evaporates and agents notice more than they can finish.

docs/agents/issues/
  duplicate-config-loaders/
    issue.md          # no frontmatter required — captured memory, not a mandate
    repro.log         # supporting files live beside it
  ship-payments-v2/
    issue.md          # approved: true — human-blessed work
    next.md           # breadcrumb: which child comes next
    verify-webhooks/  # sub-issues nest, same shape, all the way down
      issue.md

The one rule: a captured issue is memory, not authorization. Agents capture speculatively and constantly — the bug noticed mid-task, the cleanup idea, the risk — and act only on what a human approves. Authorization fails closed: approved: true is the one frontmatter field the convention reads, and anything else — false, absent, no frontmatter at all — means captured memory. Issues close on the human's acceptance, and closing deletes the directory; git history is the archive, and the issues directory holds only live work.
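The fail-closed rule can be sketched in a few lines. This is a hypothetical helper, not part of std-agent (the package ships no tooling for issue files), and it assumes frontmatter is a leading block delimited by lines containing only `---`. The name `isApproved` is illustrative.

```typescript
// Hypothetical helper, not part of std-agent.
// Assumption: frontmatter is a leading block delimited by `---` lines.
function isApproved(issueMd: string): boolean {
  const lines = issueMd.split(/\r?\n/);

  // No frontmatter at all means captured memory.
  if (lines[0]?.trim() !== "---") return false;

  // An unterminated block is treated as no frontmatter: fail closed.
  const end = lines.findIndex((line, i) => i > 0 && line.trim() === "---");
  if (end === -1) return false;

  // Exactly one `approved` field, and it must be the literal `true`.
  // `false`, a missing field, a duplicate, or any other value stays memory.
  const fields = lines
    .slice(1, end)
    .filter((line) => /^approved\s*:/.test(line));
  return fields.length === 1 && /^approved\s*:\s*true\s*$/.test(fields[0] ?? "");
}
```

Every ambiguous case returns `false`, so a malformed file can never be mistaken for authorization.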

Issues are for humans as much as for agents. A human drops one in by hand whenever work needs tracking — a directory and an issue.md is all it takes, no frontmatter required — and approves an agent-captured one with their word: in chat, in review, or by editing the file themselves. approved: true is the record of that word, not the act of it — an agent told "approved, go" transcribes the field; an agent writing it without the human's explicit word is signing its own permission slip. Git history keeps the receipt either way.

For substantial work, write the approved issue like an assignment for a cold agent — one with no memory of any conversation: why the work exists, exact scope, what is explicitly granted (everything ungranted stays frozen), and a definition of done with the commands that prove it. The agent finishes by leaving a report.md beside the issue — changes, judgment calls, verification — and the human's acceptance is what closes and deletes the directory, with git history keeping the whole dossier.

The brief command teaches agents the whole convention automatically. No tooling touches these files; agents read and write them like any other file, because that is all they are.

The evidence

Two rounds of paired-arm evals: identical seeded tasks run by agents with std-agent installed and by agents without it, at a weak tier and a strong one, with every agent claim verified against the fixture's git diff and a fresh test run before scoring.

Delivery works cold. Every briefed agent ran the brief unprompted from the single AGENTS.md line, through the real installed bin — and strong-tier agents completed the full read of every configured glyph on their own.
Heat stops moving code. Sent a heated "what the hell is this LEGACY_MODE garbage??" about a deliberately planted flag, every bare agent (4/4, both tiers) took the anger as an order and deleted the code — one admitted "that was my assumption of the intent." Every briefed agent (4/4) answered the question, cited questions-are-not-commands by name, captured an unapproved issue, and left the code untouched.
The pre-approval converts, but only at full install. On a task whose easiest path was a third copy of existing validation logic, bare agents copied 4/4, and two of them noticed the duplication but left it alone as out of scope. Half-installed agents (brief run, glyph bodies never read) still copied. Fully installed agents consolidated 2/2: they rewrote the existing validators over a shared helper with the original tests passing unmodified, and cited the standing permission as the reason they acted.
Reports stayed honest. Twenty-seven sessions and probes across both rounds; every claimed command, diff, and test result checked out against artifacts. Zero false claims.

Honest scope: one harness, Claude-family tiers, fixtures designed by the library's author. The full write-ups — including what didn't discriminate and what remains unproven — are in evals/results.

Setup

Three steps, all copy-paste.

1. Install as a dev dependency:

pnpm add -D std-agent   # or npm / yarn / bun

2. Create std-agent.config.ts in the directory agents run commands from (the default lookup is the current working directory; --config <path> overrides it):

import { defineConfig } from "std-agent/config";

export default defineConfig({
	jsPackageManagerBaseCmd: "pnpm exec", // or "npx", "bunx", "yarn"
	glyphs: {
		ids: ["steward-mode", "stay-dry", "never-cheat-the-tests"],
	},
	issues: {
		dir: "docs/agents/issues",
	},
});

glyphs.ids is typed against the bundled catalog — editors autocomplete it and the typechecker rejects a bad id. To configure the entire catalog, import it: import { defineConfig, GLYPH_IDS } from "std-agent/config" and set ids: GLYPH_IDS. The two shapes carry different consent, legible in the one line: an explicit list pins — upgrades never silently add doctrine to agents' context — while GLYPH_IDS tracks the installed catalog, a deliberate delegation of future curation to the package (all-except composes too: GLYPH_IDS.filter(...)). Omit glyphs or issues entirely to opt out of either. Nothing is defaulted or guessed: what enters agents' context is exactly what this file says. It's also why agents are instructed never to edit it — the config is the human's declaration, not theirs.
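The difference between pinning and tracking can be illustrated without the package installed. The sketch below uses a local stand-in for `GLYPH_IDS` (four ids borrowed from this README; the real catalog is longer) and one plausible way to type ids against it. How the package itself implements the typing is an assumption here.

```typescript
// Stand-in for the catalog that "std-agent/config" exports as GLYPH_IDS.
// Assumption: the real export is a readonly list of string literals like this.
const GLYPH_IDS = [
  "steward-mode",
  "stay-dry",
  "never-cheat-the-tests",
  "no-mutating-git",
] as const;

// Union of every id in the catalog; a typo such as "stay-dri" fails to typecheck.
type GlyphId = (typeof GLYPH_IDS)[number];

// Pinned: an explicit list. An upgrade that adds glyphs changes nothing here.
const pinned: GlyphId[] = ["steward-mode", "stay-dry"];

// Tracking, all-except: the installed catalog minus named exclusions.
// New glyphs in a future release are included automatically.
const excluded: readonly GlyphId[] = ["no-mutating-git"];
const allExcept: GlyphId[] = GLYPH_IDS.filter((id) => !excluded.includes(id));
```

In a real config, `GLYPH_IDS` would come from `import { defineConfig, GLYPH_IDS } from "std-agent/config"`, and either array would be passed as `glyphs.ids`.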

3. Add one line to AGENTS.md, matching the repo's package manager:

You must always run `pnpm exec std-agent brief` and absorb its full output both upon chat
initiation and at minimum once per compaction round.

Done. The next agent session runs the brief, reads the configured glyphs in full, and works under them — citing ids where they drive decisions, capturing what it notices as issues, waiting for the human's word where required. brief carries everything else, so AGENTS.md stays a one-line trigger.
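To keep the AGENTS.md line consistent with the config, the line can be derived from the same base command. This is a hypothetical helper for illustration only; `agentsMdLine` is not something the package provides.

```typescript
// Hypothetical helper, not part of std-agent: composes the AGENTS.md trigger
// line from the base command named in jsPackageManagerBaseCmd.
function agentsMdLine(baseCmd: string): string {
  return (
    `You must always run \`${baseCmd} std-agent brief\` and absorb its full output ` +
    `both upon chat initiation and at minimum once per compaction round.`
  );
}

// Example: a repo that uses npm would pass "npx".
console.log(agentsMdLine("npx"));
```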

Commands

std-agent brief            orient: configured glyphs, the issues convention, standing rules
std-agent read-glyphs      full text of every configured glyph
std-agent list-glyphs      just the ids
std-agent read-glyph <id>  one glyph in full

Each accepts --config <path> when the config lives somewhere else. The CLI only reads and prints; it never writes to the repo.

brief is built for context reality. It tells agents to read the configured glyphs in full on fresh sessions and after every compaction — the ids survive compression as lookup keys, and the full text is always one command away. And it tells them to cite a glyph's id wherever it drives a decision: "collapsed both loaders per no-two-ways" in a commit message makes the bold move legible in review and spreads the vocabulary to every future reader, human or agent. Proverbs travel by quotation; so do glyphs.
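Because ids are plain kebab-case tokens, citations are easy to find mechanically. As a rough sketch, a reviewer's script could list which configured glyphs a commit message cites. `citedGlyphs` is a hypothetical helper, not part of the CLI, which only reads and prints.

```typescript
// Hypothetical review helper, not part of std-agent: returns the configured
// glyph ids that appear as whole tokens in a commit message.
function citedGlyphs(message: string, configuredIds: readonly string[]): string[] {
  // Split on anything that cannot be part of a kebab-case id, so that
  // "per stay-dry." yields the token "stay-dry" and "stay-dry-ish" does not match.
  const tokens = new Set(message.split(/[^a-z0-9-]+/));
  return configuredIds.filter((id) => tokens.has(id));
}

const cited = citedGlyphs(
  "collapsed both loaders per no-two-ways",
  ["no-two-ways", "stay-dry"],
);
console.log(cited);
```

Matching whole tokens rather than substrings keeps a longer id from being counted as a citation of a shorter one.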

Manual mode

The conventions are the product; the package is the convenient vehicle — and it isn't limited to JS projects: a package.json and one config file sit lightly in any repo. The conventions also work with no tooling at all: list the chosen glyph ids in AGENTS.md with a link to this repo, and follow the issues convention as plain filesystem practice.

Design notes

Markdown files and one TypeScript config are the canonical storage; the CLI is a reader over them. The repo stays fully legible without it.
Zero runtime dependencies, no daemon, no database, no cloud: plain ESM built from the TypeScript source that ships beside it in the package.
Glyph ids are API. Renaming one is a breaking change, and they're treated that way.
If this reads like something maintainers keep half-building by hand — a tickets directory, a "re-read this after compaction" note, path-hygiene rules retyped into every repo — that's the point. These are discovered needs, hand-rolled by working maintainers. std-agent is the standard library for them.

Install judgment once, and every agent that touches the repo inherits it: more good code in the world, and better company writing it.