Skills Are Harness Engineering You Can Do in a Markdown File

The previous post established the formula: Agent = Model + Harness. Everything in an AI agent that is not the model — the loop, the tools, the memory, the verification — is harness. That framing is useful but can feel abstract if you are not building agent infrastructure from scratch.

There is one part of the harness that does not require writing infrastructure code. It does not require modifying the agent loop, implementing middleware, or deploying sandboxes. It requires writing a markdown file.

That part is skills.

What a skill is

A skill is a packaged set of instructions that gets injected into an agent's context when it is relevant. In its simplest form, it is a SKILL.md file: YAML frontmatter describing when to use it, followed by markdown instructions the agent follows when the skill is triggered.

---
name: code-review
description: Reviews code for quality, maintainability, and common mistakes.
---

When reviewing code, always:
1. Check for logic that could be simplified
2. Identify missing error handling
3. Flag violations of the project's coding conventions
4. Run tests before presenting results

That is a skill. It lives in a directory, gets discovered by the agent, and surfaces when the context matches. No API. No schema. No deployment. A folder and a markdown file.

Claude Code, DeepAgents, Codex, and a growing number of agent harnesses support the same pattern. Anthropic has formalised it as the Agent Skills open standard, and the format works across Claude Code, Cursor, Gemini CLI, and others. The ecosystem now includes thousands of community-built skills covering everything from security auditing to architecture diagramming to deployment checklists.

Where skills sit in the harness

In Birgitta Böckeler's harness engineering framework, skills are guides — feedforward controls that steer the agent before it acts. They increase the probability that the agent gets it right on the first attempt, rather than catching errors after the fact.

But skills are not just static text injection. The interesting design decision is when they surface.

The naive approach is to load every skill into the system prompt at session start. This wastes context. If you have thirty skills and the agent only needs one for the current task, you have burned tokens on twenty-nine irrelevant instruction sets — and you have increased the risk of the agent confusing unrelated instructions.

The better approach is progressive disclosure. LangChain's SkillsMiddleware implements this explicitly: at startup, only skill metadata (name, description) loads into the prompt. When the agent identifies a task that matches a skill, it reads the full SKILL.md and any supporting files. The rest stay on disk, consuming zero tokens.
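The mechanics can be sketched in a few lines. This is a minimal illustration of progressive disclosure, not LangChain's actual implementation: it assumes skills live in `skills/<name>/SKILL.md` with the frontmatter format shown above, and the function names are hypothetical.

```python
# Sketch of progressive disclosure: load only skill metadata at session
# start, and read a skill's full instructions from disk on demand.
# Directory layout and helper names are illustrative assumptions.
from pathlib import Path


def parse_frontmatter(text: str) -> dict:
    """Extract simple key: value pairs from the ----delimited header."""
    _, header, _ = text.split("---", 2)
    meta = {}
    for line in header.strip().splitlines():
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    return meta


def load_skill_index(skills_dir: str) -> dict[str, dict]:
    """At startup: read only name and description, keeping the prompt small."""
    index = {}
    for path in Path(skills_dir).glob("*/SKILL.md"):
        meta = parse_frontmatter(path.read_text())
        index[meta["name"]] = {"description": meta["description"], "path": path}
    return index


def load_skill_body(index: dict, name: str) -> str:
    """On demand: pull the full instructions into context when a task matches."""
    text = index[name]["path"].read_text()
    _, _, body = text.split("---", 2)
    return body.strip()
```

The index (names and one-line descriptions) is what the agent sees at session start; the full instruction body costs tokens only when a matching task triggers the skill.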

Claude Code does something similar. Skills can be invoked explicitly via slash commands (/code-review) or triggered automatically when the agent recognises a relevant task. The agent reads the SKILL.md from the filesystem at the point of use, bringing its instructions into the context window on demand.

This is not a cosmetic distinction. It is a context engineering decision. Context is the scarcest resource in an agent system, and skills that manage their own context footprint are fundamentally more useful than skills that occupy space permanently.

What makes a good skill

The pattern that emerges from the community is clear: the best skills encode domain knowledge that the model does not have and cannot infer.

A skill that says "write clean code" adds nothing — the model already tries to do that. A skill that says "in this project, all database migrations must be backward-compatible, split into add-column and make-non-nullable phases, and tested against the staging schema before merge" encodes institutional knowledge that no model has. The gap between generic capability and project-specific correctness is exactly what skills fill.
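As an illustration, that migration rule could be captured in the same SKILL.md format shown earlier. The skill name and wording here are hypothetical, a sketch of how the prose rule translates into a file:

```
---
name: db-migrations
description: Enforces this project's backward-compatible migration workflow.
---

When writing a database migration:
1. Keep every migration backward-compatible with the running application
2. Split schema changes into an add-column phase and a later make-non-nullable phase
3. Test the migration against the staging schema before merge
```

Nothing in that file is something a model could infer from general training; all of it is project policy.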

The strongest skill categories are:

1. Repetitive workflows: deployment checklists, code review passes, PR templates
2. Architectural constraints: module boundaries, dependency rules, API conventions
3. Tool-specific knowledge: how your team uses Datadog, how your CI pipeline works, what your monitoring alerts mean
4. Output formats: how diagrams should look, how documentation should be structured, what a commit message should contain

Each of these shares a property: they are things a senior engineer knows from experience that a new team member would need to be told. Skills are the mechanism for telling the agent.

Skills as the entry point to harness engineering

This is the point that matters for most people reading about harnesses for the first time.

The harness engineering literature — Böckeler's guides-and-sensors framework, LangChain's middleware architecture, Anthropic's initialiser-plus-coding-agent pattern — describes infrastructure-level work. System prompts, middleware hooks, compaction strategies, verification pipelines. All valuable, all requiring engineering effort.

Skills bypass that barrier. Writing a SKILL.md file is closer to writing documentation than writing infrastructure. If you can describe how a task should be done — step by step, with the right level of specificity — you can build a skill. The harness handles discovery, injection, and execution.

This makes skills the most accessible form of harness engineering. You are not modifying the agent loop. You are not writing middleware. You are encoding domain knowledge in a format the agent can consume, and the existing harness machinery does the rest.

The progression is natural: start with skills (encode what the agent should know), then add computational sensors (linters, tests, structural checks that verify the agent's output), then consider inferential controls (review agents, LLM-as-judge) and middleware customisation. Each step adds capability. The first step — skills — requires nothing more than a markdown file and a clear understanding of what the agent needs to know that it does not already know.

The deeper pattern

Skills make explicit something that was always true about software engineering: the hardest knowledge to transfer is not technical — it is institutional. How this team does things. Why that architectural decision was made. What broke last time someone tried the obvious approach.

Senior engineers carry this knowledge implicitly. They apply it without thinking about it, often without being able to articulate it. Junior engineers acquire it slowly, through months of code review and production incidents and tribal conversation.

Skills are the mechanism for making that knowledge explicit and machine-readable. Not as documentation that humans read and sometimes follow, but as instructions that an agent reads and reliably executes. The agent does not forget, does not skip steps, does not decide that this time the checklist does not apply.

That reframing — from "I should write this down" to "I should write this as a skill" — is the smallest possible shift with the largest possible leverage on agent quality. The model is the same for everyone. The skills are yours.
