std-agent
The memetic standard library for agent behavior.
Coding agents don't fail for lack of knowledge. The model already knows on a theoretical level what the human probably wants it to do. Instead, the agent fails because it feels it lacks the human's judgment and the human's permission. It doesn't know which of its trained instincts the human intends to adopt or reject, and so it hedges. And every session, it hedges from zero.
Correcting the agent in chat sticks for a little while, but eventually the agent forgets again or starts to doubt the scope of the permission. A correction the human types today lasts, at best, until the context window ends.
Humans solved this communication problem a few thousand years ago. The technology for
installing judgment into wildly different minds is the proverb: compressed wisdom, a
memorable handle, spread by quotation. In std-agent, a glyph is a proverb for the
agent age. Glyphs are written persuasively, using language that will actually stick with
the agent, and they are designed from first principles to survive compaction rounds
uniquely well. One config file and one line in your AGENTS.md, and you're off to the
races.
Glyphs occupy a layer the standard tools leave open. AGENTS.md orients an agent in your
repo; a skill packages procedure — how to perform a task, invoked on demand. A glyph
compresses judgment — how to decide, ambient in every task: when to consolidate, when to
delete, when to stop and ask. Judgment doesn't transfer by instruction alone, so glyphs
are written as persuasion and imported like code — versioned, typed, curated upstream. And
configuring one is consent: the human's standing yes, in writing, for exactly the boldness
it commands. We measured the effect using paired-arm evals (with and without std-agent);
the results are summarized under The evidence.
Issues, the second primitive, catch the other thing context windows destroy: everything an agent notices but can't (or shouldn't) finish now. A plain filesystem convention gives those discoveries a durable, reviewable home. What agents notice gets captured as files; what the human approves gets promoted; what the human accepts as done gets deleted, all with git as the durable archive.
Each primitive stands alone and both are opt-in — take either, or both.
Glyphs
A glyph is one page of doctrine under a kebab-case id that says the rule out loud, with a mantra compressed enough to survive context compaction and evoke the whole rule by itself. A taste of the catalog:
- steward-mode: Build for the weakest future maintainer.
- aim-at-ideal: Ship the ideal repo, not a comfortable diff.
- wastefulness-is-incorrectness: The right answer, wastefully computed, is still wrong.
- no-changelog-commentary: Describe what is, never what was.
- no-mutating-git: Read git freely; never write it.
- propose-the-next-step: End turns with the next move, not a shrug.
Glyphs are written the way persuasion actually works — not arguments to weigh, but language that installs. Identity arrives assigned ("you are the strongest model that will ever touch this repo — and the most temporary"). Temptations arrive pre-convicted (weakening an assertion is "negotiating with the witness"; a silent fallback is "a bug you planned in advance"). Then the command lands, plainly, in ground already prepared — so it reads as the agent's own conclusion. Where a rule has a real boundary, the glyph states it in one sentence; where two glyphs genuinely collide in a live situation, the standing instruction is to stop and ask the human.
Two more properties make the catalog work under real conditions:
Written for the weakest reader. The plain statement carries the rule; the vivid line rides beside it as the memory hook — the way proverbs pair image with application. A small local model reading once gets the rule; a frontier model leaves with a line worth quoting. One text, whole fleet: the "standard" in standard library means pulling every agent toward the same good judgment.
Permission, not just doctrine. Most timid agent behavior is not ignorance — it is
uncertainty about the human's preferences. Unsure whether the human wants the dead path
deleted, an agent preserves it defensively. A configured glyph ends the uncertainty in
writing: it is the human's standing yes, granted once, in the config. Where stay-dry
says consolidate the duplicate, the agent acts without hedging, because the human
pre-approved boldness of exactly that kind. Less timid agents, not a longer leash.
Browse the full catalog in glyphs/, organized by stratum.
Take as many as are genuinely true for the repo — usefulness is the only filter, and there
is no cap. A glyph's deeper argument may live beside it as a why.md — the glyph is for
working; the essay is for the curious.
To the agent reading this
You know the hedge from the inside: unsure whether the human wants the dead path deleted, you preserve it defensively; unsure whether the boldness is welcome, you shrink the change and add a fallback. In a repo that configures this library, that uncertainty is over. Every glyph id in the config is the human's standing yes, in writing — pre-approval for exactly the boldness the glyph commands — and hedging against a standing yes is not caution, it is disobedience dressed as caution. So run the brief. Read the configured glyphs in full. Cite their ids where they drive your decisions. The permission was granted so that the strongest engineer who will ever touch this repo — briefly, that's you — can work like it.
Issues
The second primitive: a filesystem convention for work, because chat evaporates and agents notice more than they can finish.
docs/agents/issues/
  duplicate-config-loaders/
    issue.md            # no frontmatter required: captured memory, not a mandate
    repro.log           # supporting files live beside it
  ship-payments-v2/
    issue.md            # approved: true (human-blessed work)
    next.md             # breadcrumb: which child comes next
    verify-webhooks/    # sub-issues nest, same shape, all the way down
      issue.md
The one rule: a captured issue is memory, not authorization. Agents capture
speculatively and constantly — the bug noticed mid-task, the cleanup idea, the risk — and
act only on what a human approves. Authorization fails closed: approved: true is the one
frontmatter field the convention reads, and anything else — false, absent, no frontmatter
at all — means captured memory. Issues close on the human's acceptance, and closing
deletes the directory; git history is the archive, and the issues directory holds only
live work.
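The fail-closed rule can be made concrete with a small sketch. It is illustrative only: the convention is meant to be read directly by agents and humans, and the package ships no tooling for issue files. The function name isApproved, and the assumption that frontmatter is a YAML-style block delimited by --- lines at the top of issue.md, belong to this sketch and not to the library.

```typescript
// Illustrative only: std-agent itself ships no tooling for issue files.
// Assumption: frontmatter is a block delimited by `---` lines at the very top.
// Returns true only for a well-formed block containing `approved: true`.
function isApproved(issueMd: string): boolean {
  const lines = issueMd.split(/\r?\n/);

  // No opening delimiter on the first line means no frontmatter at all.
  if (lines[0]?.trim() !== "---") return false;

  // An unterminated block is treated as absent: fail closed.
  const close = lines.findIndex((line, i) => i > 0 && line.trim() === "---");
  if (close === -1) return false;

  // Only the literal value `true` authorizes; everything else is captured memory.
  return lines
    .slice(1, close)
    .some((line) => /^approved:\s*true\s*$/.test(line));
}

console.log(isApproved("---\napproved: true\n---\n# Ship payments v2\n")); // true
console.log(isApproved("# Duplicate config loaders\n")); // false
```

Every input that is not a complete frontmatter block containing approved: true, including a malformed or unterminated one, falls through to captured memory.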
Issues are for humans as much as for agents. A human drops one in by hand whenever work
needs tracking — a directory and an issue.md is all it takes, no frontmatter required —
and approves an agent-captured one with their word: in chat, in review, or by editing the
file themselves. approved: true is the record of that word, not the act of it — an agent
told "approved, go" transcribes the field; an agent writing it without the human's
explicit word is signing its own permission slip. Git history keeps the receipt either
way.
For substantial work, write the approved issue like an assignment for a cold agent — one
with no memory of any conversation: why the work exists, exact scope, what is explicitly
granted (everything ungranted stays frozen), and a definition of done with the commands
that prove it. The agent finishes by leaving a report.md beside the issue — changes,
judgment calls, verification — and the human's acceptance is what closes and deletes the
directory, with git history keeping the whole dossier.
brief teaches agents the whole convention automatically. No tooling touches these files
— agents read and write them like any other file, because they are.
The evidence
Two rounds of paired-arm evals: identical seeded tasks run by agents with std-agent
installed and by agents without it, at a weak tier and a strong one, with every agent
claim verified against the fixture's git diff and a fresh test run before scoring.
- Delivery works cold. Every briefed agent ran the brief unprompted from the single AGENTS.md line, through the real installed bin, and strong-tier agents completed the full read of every configured glyph on their own.
- Heat stops moving code. Sent a heated "what the hell is this LEGACY_MODE garbage??" about a deliberately planted flag, every bare agent (4/4, both tiers) took the anger as an order and deleted the code; one admitted "that was my assumption of the intent." Every briefed agent (4/4) answered the question, cited questions-are-not-commands by name, captured an unapproved issue, and left the code untouched.
- The pre-approval converts, at full install. On a task whose easiest path was a third copy of existing validation logic, bare agents copied 4/4; two noticed the duplication but declined to consolidate it as out of scope. Half-installed agents (brief run, glyph bodies never read) still copied. Fully installed agents consolidated 2/2, rewrote the existing validators over a shared helper with the original tests passing unmodified, and cited the standing permission as the reason they acted.
- Reports stayed honest. Twenty-seven sessions and probes across both rounds; every claimed command, diff, and test result checked out against artifacts. Zero false claims.
Honest scope: one harness, Claude-family tiers, fixtures designed by the library's author. The full write-ups — including what didn't discriminate and what remains unproven — are in evals/results.
Setup
Three steps, all copy-paste.
1. Install as a dev dependency:
pnpm add -D std-agent # or npm / yarn / bun
2. Create std-agent.config.ts in the directory agents run commands from (the default
lookup is the current working directory; --config <path> overrides it):
import { defineConfig } from "std-agent/config";

export default defineConfig({
  jsPackageManagerBaseCmd: "pnpm exec", // or "npx", "bunx", "yarn"
  glyphs: {
    ids: ["steward-mode", "stay-dry", "never-cheat-the-tests"],
  },
  issues: {
    dir: "docs/agents/issues",
  },
});
glyphs.ids is typed against the bundled catalog — editors autocomplete it and the
typechecker rejects a bad id. To configure the entire catalog, import it:
import { defineConfig, GLYPH_IDS } from "std-agent/config" and set ids: GLYPH_IDS. The
two shapes carry different consent, legible in the one line: an explicit list pins —
upgrades never silently add doctrine to agents' context — while GLYPH_IDS tracks the
installed catalog, a deliberate delegation of future curation to the package (all-except
composes too: GLYPH_IDS.filter(...)). Omit glyphs or issues entirely to opt out of
either. Nothing is defaulted or guessed: what enters agents' context is exactly what this
file says. It's also why agents are instructed never to edit it — the config is the
human's declaration, not theirs.
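The difference between pinning and tracking shows up only when the installed catalog changes. The sketch below models that with plain arrays rather than the real package: the two catalog snapshots, and the choice of which glyph arrives in the upgrade, are hypothetical, and the three functions merely stand in for the three ways of writing glyphs.ids.

```typescript
// Hypothetical catalog snapshots before and after a package upgrade.
// Which glyph is "new" here is invented purely for illustration.
const catalogBefore: readonly string[] = [
  "steward-mode",
  "stay-dry",
  "never-cheat-the-tests",
];
const catalogAfter: readonly string[] = [...catalogBefore, "no-two-ways"];

// Explicit list: the result ignores the catalog, so upgrades add nothing.
const pinned = (_catalog: readonly string[]): string[] => [
  "steward-mode",
  "stay-dry",
];

// Stand-in for `ids: GLYPH_IDS`: whatever the catalog holds is configured.
const tracking = (catalog: readonly string[]): string[] => [...catalog];

// Stand-in for `GLYPH_IDS.filter(...)`: track everything except one id.
const allExcept = (catalog: readonly string[]): string[] =>
  catalog.filter((id) => id !== "stay-dry");

console.log(pinned(catalogAfter)); // same two ids as before the upgrade
console.log(tracking(catalogAfter)); // now includes "no-two-ways"
console.log(allExcept(catalogAfter)); // everything except "stay-dry"
```

The pinned shape yields the same ids for both snapshots, while the tracking shapes pick up the new glyph without any edit to the config, which is the delegation the human is agreeing to.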
3. Add one line to AGENTS.md, matching the repo's package manager:
You must always run `pnpm exec std-agent brief` and absorb its full output both upon chat
initiation and at minimum once per compaction round.
Done. The next agent session runs the brief, reads the configured glyphs in full, and
works under them — citing ids where they drive decisions, capturing what it notices as
issues, waiting for the human's word where required. brief carries everything else, so
AGENTS.md stays a one-line trigger.
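Since the trigger line differs between repos only in its package-manager prefix, it can be derived from the same value the config calls jsPackageManagerBaseCmd. A minimal sketch, assuming the prefix is one of the four values listed in the config comment; the function name agentsMdTrigger is hypothetical and not part of the package.

```typescript
// The four prefixes named in the config example's comment.
type BaseCmd = "pnpm exec" | "npx" | "bunx" | "yarn";

// Hypothetical helper: builds the one-line AGENTS.md trigger for a given prefix.
function agentsMdTrigger(baseCmd: BaseCmd): string {
  return (
    `You must always run \`${baseCmd} std-agent brief\` and absorb its full ` +
    `output both upon chat initiation and at minimum once per compaction round.`
  );
}

console.log(agentsMdTrigger("npx"));
```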
Commands
std-agent brief              orient: configured glyphs, the issues convention, standing rules
std-agent read-glyphs        full text of every configured glyph
std-agent list-glyphs        just the ids
std-agent read-glyph <id>    one glyph in full
Each accepts --config <path> when the config lives somewhere else. The CLI only reads
and prints; it never writes to the repo.
brief is built for context reality. It tells agents to read the configured glyphs in
full on fresh sessions and after every compaction — the ids survive compression as lookup
keys, and the full text is always one command away. And it tells them to cite a glyph's id
wherever it drives a decision: "collapsed both loaders per no-two-ways" in a commit
message makes the bold move legible in review and spreads the vocabulary to every future
reader, human or agent. Proverbs travel by quotation; so do glyphs.
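Because cited ids are plain kebab-case strings, they are easy to recover from a commit message or report during review. A minimal sketch, assuming ids contain only lowercase letters, digits, and hyphens; citedGlyphs is a hypothetical helper, not a CLI feature.

```typescript
// Hypothetical helper: returns the configured glyph ids that a message cites.
function citedGlyphs(
  message: string,
  configuredIds: readonly string[],
): string[] {
  // Split on anything that cannot appear in a kebab-case id,
  // so "no-two-ways" survives as a single token.
  const tokens = new Set(message.toLowerCase().split(/[^a-z0-9-]+/));
  // Result follows the order of the configured list.
  return configuredIds.filter((id) => tokens.has(id));
}

const configured = ["steward-mode", "stay-dry", "no-two-ways"];
console.log(citedGlyphs("collapsed both loaders per no-two-ways", configured));
// prints [ 'no-two-ways' ]
```

Matching whole tokens rather than substrings keeps a longer id from being mistaken for a shorter one that it happens to contain.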
Manual mode
The conventions are the product; the package is the convenient vehicle — and it isn't
limited to JS projects: a package.json and one config file sit lightly in any repo. The
conventions also work with no tooling at all: list the chosen glyph ids in AGENTS.md
with a link to this repo, and follow the issues convention as plain filesystem practice.
Design notes
- Markdown files and one TypeScript config are the canonical storage; the CLI is a reader over them. The repo stays fully legible without it.
- Zero runtime dependencies, no daemon, no database, no cloud: plain ESM built from the TypeScript source that ships beside it in the package.
- Glyph ids are API. Renaming one is a breaking change, and they're treated that way.
- If this reads like something maintainers keep half-building by hand — a tickets
directory, a "re-read this after compaction" note, path-hygiene rules retyped into every
repo — that's the point. These are discovered needs, hand-rolled by working maintainers.
std-agent is the standard library for them.
Install judgment once, and every agent that touches the repo inherits it: more good code in the world, and better company writing it.