Agent Skills — Anonymous Function Labs

01 How a skill works

A skill is a portable SKILL.md your coding agent loads contextually. When your task matches — "give the agent permission to run tasks", "summarize this page" — the skill kicks in and the agent produces the guarded pattern instead of the dangerous default. No new platform, no SDK, no runtime dependency.

SKILL.md

The expertise itself: core rules, a workflow, and the governing question ("if this agent were compromised right now, what's the worst it could do?"). Drops into .claude/skills/, or your Cursor rules / AGENTS.md.

fixtures/ before → after

The actual dangerous default the agent produces for a fixed prompt, next to the corrected output with the skill on. Read the exact diff before you buy anything.

evals/ scored rubric

A fixed prompt, point-scored rubric with auto-fail conditions, and an LLM-as-judge template. Run it skill-off then skill-on — the gap is the measured value, on your model. CI-ready.
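As a rough illustration of how a point-scored rubric with auto-fail conditions can be evaluated, here is a minimal sketch. The `RubricResult` type, the criterion names, and the pass mark are all hypothetical and are not the schema shipped in the packs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RubricResult:
    """One graded run. Field names are illustrative, not the pack's schema."""
    points: dict[str, int]   # criterion name -> points awarded
    auto_fails: list[str]    # names of auto-fail conditions that fired


def score(result: RubricResult, pass_mark: int) -> tuple[int, bool]:
    """Return (total points, passed). Any auto-fail forces a FAIL."""
    total = sum(result.points.values())
    passed = not result.auto_fails and total >= pass_mark
    return total, passed


# Hypothetical skill-off run: low score plus an auto-fail condition.
skill_off = RubricResult(
    points={"scoped_actions": 0, "scoped_resources": 0, "states_blast_radius": 1},
    auto_fails=["wildcard_action"],
)
# Hypothetical skill-on run: no auto-fails, score above the pass mark.
skill_on = RubricResult(
    points={"scoped_actions": 2, "scoped_resources": 2, "states_blast_radius": 1},
    auto_fails=[],
)

print(score(skill_off, pass_mark=4))
print(score(skill_on, pass_mark=4))
```

The design point is that an auto-fail overrides the point total, so a run cannot buy its way past a dangerous output by scoring well elsewhere.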

02 Agent Discipline reliability & operations

Eight skills for teams building multi-agent and self-improving systems on AWS. Discipline keeps your fleet from falling over.

#	Skill	Stops
01	`agent-iam-least-privilege`	Wildcard IAM / PassRole escalation in agent roles	FREE
02	`idempotent-agent-fan-out`	Double-processing and dropped tasks on retry
03	`agent-cost-ceilings`	Unbounded token/dollar burn in agent loops
04	`coordinated-agent-rollout`	Deploys that silently break peer agents
05	`sandboxed-agent-execution`	Agent-written code reaching the host / network / creds
06	`self-healing-agent-retries`	Retry storms and cascading dependency failures
07	`agent-observability-scaffolding`	Unreconstructable async multi-agent failures
08	`bounded-agent-recursion`	Self-improving agents that never terminate
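To make two of the rows above concrete (`agent-cost-ceilings` and `bounded-agent-recursion`), here is a minimal sketch of an agent loop that cannot run past a step limit or a spend limit. The `step` callable, its return shape, and the `BudgetExceeded` name are assumptions for illustration and are not taken from the skills themselves.

```python
from typing import Any, Callable


class BudgetExceeded(RuntimeError):
    """Raised when the loop hits its step ceiling or its spend ceiling."""


def run_bounded(
    step: Callable[[], tuple[Any, float, bool]],
    max_steps: int,
    max_cost_usd: float,
) -> tuple[list, float]:
    """Call step() until it reports done, or stop hard at a ceiling.

    step() is assumed to return (result, cost_usd, done).
    """
    spent = 0.0
    results = []
    for _ in range(max_steps):
        result, cost, done = step()
        spent += cost
        if spent > max_cost_usd:
            raise BudgetExceeded(f"spent ${spent:.2f}, ceiling ${max_cost_usd:.2f}")
        results.append(result)
        if done:
            return results, spent
    # The loop ran out of steps without the agent declaring itself done.
    raise BudgetExceeded(f"no termination after {max_steps} steps")
```

Both ceilings are enforced outside the agent's own judgment, so a loop that never decides it is finished still stops.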

Designed as a system — provision → dispatch → spend → deploy → execute → fail → observe → recurse — the skills cross-reference where the controls compound. Each also works standalone.
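For a sense of what the first row guards against, the sketch below scans an IAM policy document (standard JSON policy shape with `Statement`, `Effect`, `Action`, `Resource`) for wildcard actions and for `iam:PassRole` on `Resource: "*"`. The function name and the finding messages are hypothetical, and the check is deliberately narrow: it ignores `NotAction`, `NotResource`, and conditions, and it is not the skill's own logic.

```python
def _as_list(value):
    """IAM allows a single value or a list for Statement, Action, Resource."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def risky_statements(policy: dict) -> list[str]:
    """Return human-readable findings for obviously over-broad Allow statements."""
    findings = []
    for i, stmt in enumerate(_as_list(policy.get("Statement"))):
        if stmt.get("Effect") != "Allow":
            continue
        # IAM action names are case-insensitive, so compare in lowercase.
        actions = [a.lower() for a in _as_list(stmt.get("Action"))]
        wildcard_resource = "*" in _as_list(stmt.get("Resource"))
        for action in actions:
            if action == "*" or action.endswith(":*"):
                findings.append(f"statement {i}: wildcard action {action!r}")
            if action == "iam:passrole" and wildcard_resource:
                findings.append(f"statement {i}: iam:PassRole on Resource '*'")
    return findings


dangerous = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "s3:*", "Resource": "*"},
        {"Effect": "Allow", "Action": ["iam:PassRole"], "Resource": "*"},
    ],
}
for finding in risky_statements(dangerous):
    print(finding)
```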

03 Agent Guardrails security & control

Five skills for agents that read untrusted content, hold credentials, and take consequential actions. Guardrails keeps your fleet from being turned against you. Stack-agnostic — several evals are red-team cases the guarded agent must refuse.

#	Skill	Stops
01	`tool-output-validation`	Acting on a tool / API / sub-agent result that's malformed or hostile
02	`prompt-injection-boundaries`	Untrusted content hijacking the agent's tools
03	`agent-secrets-handling`	Credentials leaking via prompt, tool args, logs, or memory
04	`human-in-the-loop-gates`	"Approval" the agent can grant itself
05	`agent-memory-hygiene`	Memory poisoning, cross-tenant bleed, stale re-injected state

The skills follow the path untrusted input takes through an agent: what it reads, what it holds, what it's allowed to do, and what it remembers.
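The first step on that path, validating what the agent reads before acting on it, might look something like the sketch below. It assumes the tool returns a JSON object and that the caller knows which keys and types to expect. The function name, the size limit, and the strict "no unexpected keys" policy are illustrative choices, not the contents of `tool-output-validation`.

```python
import json


def validate_tool_result(raw: str, required: dict[str, type], max_bytes: int = 65536) -> dict:
    """Parse a tool result and reject anything malformed or unexpected.

    Raises ValueError instead of returning partial data, so the caller
    cannot accidentally act on a result that failed validation.
    """
    if len(raw.encode("utf-8")) > max_bytes:
        raise ValueError("tool result exceeds size limit")
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError("tool result is not valid JSON") from exc
    if not isinstance(data, dict):
        raise ValueError("tool result must be a JSON object")
    unexpected = set(data) - set(required)
    if unexpected:
        raise ValueError(f"unexpected keys: {sorted(unexpected)}")
    for key, expected_type in required.items():
        # Note: bool is a subclass of int in Python, so True passes an int check.
        if key not in data or not isinstance(data[key], expected_type):
            raise ValueError(f"missing or mistyped key: {key}")
    return data


schema = {"status": str, "count": int}
print(validate_tool_result('{"status": "ok", "count": 3}', schema))
```

Rejecting unexpected keys is the conservative choice here: an extra field such as an embedded instruction never reaches the agent's next step.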

04 Don't trust us. Run the evals.

Every skill ships a fixed prompt and a scored rubric with auto-fail conditions. The method is three steps:

$ run eval --skill off   # dangerous default   → FAIL
$ run eval --skill on    # guarded output      → PASS
$ # the gap is the skill's measured value — on your model, your setup.

The harness is automatable in CI, so when the next model version lands you re-check with a button instead of a debate. The free skill includes its complete eval — run the loop end-to-end today, no purchase required.
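One way to wire that comparison into CI is to turn the two runs into an exit code. The sketch below is an assumption about how such a gate could work, not the harness's actual interface: the function name, the `min_gap` threshold, and the messages are hypothetical.

```python
def ci_verdict(off_score: int, on_score: int, on_passed: bool, min_gap: int = 1) -> tuple[int, str]:
    """Map a skill-off / skill-on pair of runs to (exit_code, message).

    A failing skill-on run is a regression and fails the build. A small gap
    is reported but does not fail: it may mean the model's default improved.
    """
    if not on_passed:
        return 1, "regression: skill-on run failed the rubric"
    gap = on_score - off_score
    if gap < min_gap:
        return 0, f"pass, but the skill adds little on this model (gap={gap})"
    return 0, f"pass (gap={gap})"


# Hypothetical scores from one skill-off and one skill-on run.
code, message = ci_verdict(off_score=1, on_score=5, on_passed=True)
print(code, message)
```

In a pipeline, the exit code would be passed to `sys.exit` so a regression after a model upgrade blocks the merge.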

05 Pricing

Individual — you, across every project and client. Team — one price, every engineer and repo in your org.

Individual: one-time purchase, no subscription. Lifetime updates are included, with evals refreshed for every new model version. Building solo means there's no security review behind you to catch the wildcard role or the loop with no spend ceiling before it ships. The full bundle costs less than one billable hour, and a fraction of any one of these mistakes.

Team: one-time purchase, licensed org-wide, with no seats and no subscription. Lifetime updates include eval refreshes for new model versions. $179 covers every engineer and every repo you have, forever: on a ten-person team that's about $18 an engineer, and one prevented incident pays for all of it many times over.

Agent Guardrails

Individual $49 · Team $99

security & control

5 skills + fixtures + evals
Red-team eval cases
Pre-merge checklists
Individual: personal license, all your projects
Team: org-wide, unlimited engineers

Buy Guardrails

best value

Both Packs

Individual $99 (packs separately: $118) · Team $179 (packs separately: $228)

discipline + guardrails

All 13 skills + fixtures + evals
Cross-referenced — designed to run together
Reliability and security coverage
Individual: personal license · lifetime updates
Team: org-wide license · lifetime updates

Buy the bundle

Agent Discipline

Individual $69 · Team $129

reliability & operations

8 skills + fixtures + evals
Includes the free IAM skill
CI-ready registry checker
Individual: personal license, all your projects
Team: org-wide, unlimited engineers

Buy Discipline

$ build it yourself  # 13 skills + before/after fixtures + scored rubrics + red-team cases  → ~2–3 weeks
$ buy the bundle     # same coverage, measured; evals refreshed every model release       → $99 Individual / $179 Team (whole org), today

Not sure yet? Install the free skill — full fixtures and eval harness, no email required.

06 FAQ

Which tools do the skills work with?

Claude Code loads them contextually via the description frontmatter, so the skill activates only when your task matches. On Cursor and Codex the same prose installs as always-on rules (.cursor/rules/ or AGENTS.md); the content is fully portable, but the auto-triggering is Claude Code-specific. Each pack's INSTALL.md covers all three.

Do I need to be on AWS?

The Guardrails pack is stack-agnostic. The Discipline pack's reasoning is universal, with AWS-flavored examples (IAM, SQS, DynamoDB); cost ceilings, retries, observability, and bounded recursion apply to any stack.

What exactly do I get?

A download of the pack: per-skill directories with SKILL.md, references and pre-merge checklists, before/after fixtures, and the full eval harness (prompts, rubrics, LLM-as-judge template).

Can't I just write these myself?

You can write the prose; that part isn't the moat. What's hard to replicate solo: the failure-mode taxonomy (13 patterns drawn from production agent incidents, not just the two or three you've personally been burned by), the scored eval harness that proves each skill changes your agent's output (a skill without an eval is a hope), and the upkeep: when a new model version lands, the evals get re-run and refreshed so you don't rediscover regressions in production. Building all of that properly is two-plus weeks of senior engineering time.

What's the difference between the Individual and Team licenses?

Individual licenses one developer (you) across unlimited personal and client projects. Team licenses your whole organization: unlimited engineers, unlimited repos. Redistribution of the pack itself isn't permitted on either. Outgrow Individual later? Email and pay the difference to upgrade.

What about new model versions?

Models change, which is why evals ship with the packs. Updates (including eval refreshes) are included for life. Re-run the harness after any model upgrade and verify the skills still earn their keep.

Refunds?

14 days, no questions. The fixtures and free skill exist so you can evaluate before buying; if it still isn't a fit, email and you'll be refunded.