agent-fender
AI wrote your agent code. Who checked it for 7 critical safety gaps? agent-fender did. And if we found gaps, the companion library patches them in 4 lines.
Quick Start (Skill — Recommended)
Choose one:
npx skills add K-Carb/agent-fender # skills.sh registry
npx agent-fender-skill # npm (auto-installs to ~/.claude/skills/)
npm install -g agent-fender-skill # npm global install
Open Claude Code in your agent project. Say:
"audit my agent code for safety gaps"
You'll get a report card like:
## Agent Safety Audit | # | Guard | Status | Detail | |---|----------------|--------|-------------------------------------| | 1 | LLM timeout | ✗ | Line 23: ollama.chat() has no timeout | | 2 | Loop limit | ✓ | Line 15: loop_count < MAX_ITER | | 3 | Tool timeout | ✗ | Line 45: execute_tool() has no timeout| | 4 | Dangerous tools| ✗ | No approval before delete_record | | 5 | Injection scan | ✗ | User input goes directly to LLM | | 6 | Audit trail | ✗ | print() only, no structured logging | | 7 | Token budget | ✗ | No per-invocation token limit set | Coverage: 1/7 — 6 guards missing.Fix them by choosing:
- Option A:
pip install git+https://github.com/K-Carb/agent-fender.git— production-ready, zero deps (see below) - Option B: Copy inline guard patterns — no dependency (see skills/agent-fender/references/inline-patterns.md)
- Option A:
Quick Start (Library — Standalone)
pip install git+https://github.com/K-Carb/agent-fender.git
import asyncio
from agent_fender import AgentFender, FenderConfig
config = FenderConfig(
max_loop_count=3,
max_tool_failures=2,
dangerous_tools=frozenset({"delete_file", "drop_table"}),
llm_timeout_s=60.0,
tool_timeout_s=30.0,
token_budget=100_000, # Stop after 100K tokens
)
fender = AgentFender(config)
# Replace these with your real LLM and tool functions
async def my_llm(**kwargs):
return {"message": {"content": "Response from LLM"}}
def my_tool(name, args):
return f"Tool {name} completed"
async def main():
# 1. Circuit breaker — prevents infinite loops + enforces token budget
tokens_used = 0
breaker = fender.preflight(loop_count=2, tool_failures=0, tokens_used=tokens_used)
if breaker.should_break:
return breaker.fallback_reply
# 2. Safe LLM — timeout + error classification
result = await fender.safe_llm(my_llm, messages=[{"role": "user", "content": "hi"}])
if not result.success:
return result.user_message # error_type: timeout | connection | response
tokens_used += fender.count_tokens(str(result.data))
# 3. Dangerous tool gating — intercept before execution
approval = fender.check_tools(["delete_file"])
if approval.requires_approval:
print(f"Approval needed: {approval.message}")
# 4. Safe tool — timeout + error classification
tr = await fender.safe_tool(my_tool, "search_files", '{"query": "*.log"}')
print(f"Tool result: {tr.data}")
asyncio.run(main())
Python 3.10+ required. Zero dependencies. On Python 3.10, sync function timeouts shorter than ~1 second may not be reliably enforced due to an asyncio limitation (fixed in 3.11). This does not affect normal usage — typical LLM and tool timeouts are 30-120s. Async functions and all other guards work identically across all supported versions.
What Problems Does This Solve?
| Developer says | Root cause | agent-fender component |
|---|---|---|
| "Why is it spinning forever?" | LLM or tool has no timeout | safe_llm() + safe_tool() |
| "Why is my bill so high?" | Agent loops infinitely | preflight() loop_count |
| "Why was that file deleted?" | Dangerous tool ran silently | check_tools() |
| "Why does it keep retrying after failure?" | Tool failures accumulate | preflight() tool_failures |
| "Why does it work sometimes and not others?" | Errors swallowed without classification | LLMResult.error_type |
Full failure mode catalog: docs/failure-modes.md
The 7 Guards
Defined by the Agent Safety Specification. The library implements all 7 guards.
| # | Guard | Severity | What it does |
|---|---|---|---|
| 1 | LLM timeout + error classification | Critical | Every LLM call has a timeout; errors are timeout / connection / response |
| 2 | Loop limit | Critical | Every agent loop has a max iteration cap |
| 3 | Tool timeout + error classification | Critical | Every tool call has a timeout; errors are timeout / execution_error |
| 4 | Dangerous tool gating | High | Write/delete/execute operations intercepted before execution |
| 5 | Injection detection | High | User input scanned for prompt injection patterns before reaching the LLM |
| 6 | Audit trail | Medium | Structured tracking of all calls, errors, and decisions |
| 7 | Token budget control | Critical | Per-invocation token consumption limit set via token_budget; stops the agent before it burns budget |
Design Principles
- Zero dependencies — pure Python stdlib. No LangGraph, no Pydantic, no Ollama lock-in.
- Pure functions — every component is independently testable. No framework graph required.
- Result pattern — all return values are dataclasses. AI copilots understand the type signatures.
- Facade API —
AgentFenderprovides a 4-step API covering the full agent lifecycle. - Skill-first distribution — the Claude Code skill finds problems; the library fixes them.
How This Compares
agent-fender is the only library that combines all 7 guards in one zero-dependency package, AND the only one with a Claude Code skill for agent code auditing.
| Feature | agent-fender | agentguard-llm | Aura Guard |
|---|---|---|---|
| Zero dependencies | |||
| Circuit breaker | |||
| LLM timeout + classification | ? | ||
| Tool timeout + classification | ? | ||
| Dangerous tool gating | |||
| Injection detection | |||
| Deduplication | |||
| Audit trail | |||
| Retry with backoff | |||
| Token budget control | |||
| Claude Code skill | |||
| Code audit (push model) |
agent-fender's unique advantage is the skill-library combination: the skill finds your agent's safety gaps during development, and the library fixes them with all 7 guards in one package. Other libraries wait for you to find them on PyPI.
AI tools can generate guard code in seconds. But generated code has no tests, no edge case coverage, no guarantee it catches all 7 gaps. agent-fender ships 106 tests across every guard — certainty, not just code.
Real-World Usage
See examples/ for framework integration patterns with LangGraph, CrewAI, and AutoGen — each runs without API keys and demonstrates all 4 integration points.
Documentation
| Document | Description |
|---|---|
| AGENT_SAFETY_SPEC.md | Authoritative definition of the 7 agent safety guards |
| docs/failure-modes.md | 8 real-world agent failure scenarios and how agent-fender prevents each |
| skills/agent-fender/references/library-integration.md | Full 4-step integration guide for the Python library |
| skills/agent-fender/references/inline-patterns.md | Minimal inline guard implementations (no dependency) |
| skills/agent-fender/references/audit-examples.md | Annotated audit results for 3 common agent patterns |
| examples/minimal_agent.py | Working end-to-end example |
| examples/ | Framework integration examples: LangGraph, CrewAI, AutoGen |
Contributing
See CONTRIBUTING.md.
License
MIT