The last three chapters introduced the foundations of a disciplined software workflow: version control, packaging, tests, and debugging. This chapter adds AI coding agents to that workflow. At first glance, capable agents appear to make the earlier material less important. If an assistant can generate code, write tests, explain errors, and open pull requests, why invest so much effort in process?
The answer is that AI increases the value of engineering discipline rather than replacing it. A fast agent can produce a large amount of plausible code, but it cannot guarantee that the code matches your intent, fits your architecture, or is safe to run. The more capable the assistant becomes, the more important it is to surround it with a harness that distinguishes correct code from code that is merely convincing. In practice, that harness is built from specifications, tests, version control, review, and tool permissions.
Throughout the chapter we will use GitHub Copilot in Visual Studio Code, but the ideas are tool-agnostic. The same concerns appear with Codex, Claude Code, Cursor, Windsurf, and other agentic environments.
From prompts to harnesses
Between 2023 and 2026, the dominant framing for working with coding models shifted twice.
- Prompt engineering focused on phrasing: how to ask for the right thing.
- Context engineering focused on selection: which files, tests, decisions, and examples the model should see.
- Harness engineering focuses on the environment around the model: executable specs, reusable artifacts, tool approvals, and workflows that make mistakes visible before they become expensive.
This chapter sits in that third era. The central question is no longer “how do I word the prompt?” It is “what environment makes the agent reliable enough to help me?”
Common failure modes in AI coding
AI coding assistants fail in recurring ways. The details vary from one model or editor to another, but the patterns are stable enough that you should plan for them in advance.
| # | Failure mode | What it looks like |
|---|---|---|
| 1 | Misalignment / goal drift | The agent solves a different problem from the one you intended. |
| 2 | Hallucinated capabilities | The model invents flags, APIs, arguments, or libraries that do not exist in your environment. |
| 3 | Context loss | Earlier decisions drop out of the agent's working context, and it later contradicts them. |
| 4 | Context bloat | Too much irrelevant text is included, so the important constraints are diluted. |
| 5 | Over-engineering | The assistant adds abstractions, files, or patterns that were never requested. |
| 6 | Codebase entropy | Repeated AI edits slowly erode consistency, naming conventions, and architecture. |
| 7 | Decision amnesia | Rationale is never captured, so the same design question gets re-opened every session. |
| 8 | Destructive actions | Broad permissions let the agent delete files, alter history, or change systems it should only inspect. |
| 9 | Prompt injection and poisoned memory | Instructions embedded in docs, tools, or retrieved text redirect the agent away from your real goal. |
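Several of these failure modes can be caught mechanically. As one illustration, a minimal sketch of a guard against hallucinated capabilities (row 2) is shown here: before trusting a name that an agent proposes, check that it actually exists in the current environment. The helper name `capability_exists` and the invented names `dump_pretty` and `jsonx_fast_tools` are hypothetical and exist only for this example.

```python
import importlib


def capability_exists(module_name: str, attribute: str) -> bool:
    """Return True if `module_name` imports here and exposes `attribute`."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module itself is not installed (or was invented).
        return False
    return hasattr(module, attribute)


# A real standard-library function.
print(capability_exists("json", "dumps"))
# A plausible-sounding attribute that the json module does not provide.
print(capability_exists("json", "dump_pretty"))
# A module name made up for this example.
print(capability_exists("jsonx_fast_tools", "dumps"))
```

A check like this only confirms that a name exists. It says nothing about whether the call does what the agent claims, which is why tests remain necessary.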
What this chapter adds
The response to these failure modes is not a single magic prompt. It is a small set of engineering practices and artifacts that let you control the collaboration.
In ch4/ch41.md we focus on two development practices:

- Spec-Driven Development, to clarify the behavior before asking for code.
- Test-Driven Development, to make that behavior executable and verifiable.
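To make the pairing concrete, here is a small sketch of a specification written first and then turned into executable cases. The `slugify` task, the three rules, and the helper `failing_cases` are all assumptions chosen for illustration; they are not taken from the later sections.

```python
import re

# Specification, written before any implementation is requested:
#   1. The output is lowercase.
#   2. Each run of non-alphanumeric characters becomes a single hyphen.
#   3. The output has no leading or trailing hyphens.
SPEC_CASES = [
    ("Hello World", "hello-world"),
    ("  AI   Coding Agents! ", "ai-coding-agents"),
    ("already-a-slug", "already-a-slug"),
    ("", ""),
]


def slugify(title: str) -> str:
    """Candidate implementation; it could come from a person or an agent."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def failing_cases(func, cases):
    """Return (input, expected, actual) for every case that `func` fails."""
    return [(arg, want, func(arg)) for arg, want in cases if func(arg) != want]


# An empty list means the candidate satisfies every case in the spec.
print(failing_cases(slugify, SPEC_CASES))
```

The design choice is that the cases, not the prose, decide whether a candidate is accepted. An implementation that merely looks right, such as one that only lowercases its input, is rejected by the same check.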
In ch4/ch42.md we focus on the artifacts that make those practices sustainable:

- AGENTS.md, to store persistent conventions and working agreements.
- Prompt and skill files, to turn repeated instructions into reusable assets.
- Tools and MCP servers, to give the agent controlled access to external information and actions.
- Approvals and least privilege, to keep automation inside safe boundaries.
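The idea behind approvals and least privilege can be sketched as a default-deny policy: a short allowlist of read-only commands runs without a prompt, and everything else waits for a human. The policy sets and the function `decide` below are hypothetical; real tools, including Copilot in Visual Studio Code, have their own approval mechanisms, which this sketch does not reproduce. It is also not a security boundary, since some allowed programs accept options that write files.

```python
import shlex

# Hypothetical policy: programs the agent may run without asking.
READ_ONLY_PROGRAMS = {"ls", "cat", "grep"}
# Hypothetical policy: git subcommands that normally only inspect state.
READ_ONLY_GIT = {"status", "diff", "log", "show"}
# Characters that could chain, redirect, or substitute commands in a shell.
SHELL_METACHARACTERS = set(";|&<>`$()\n")


def decide(command: str) -> str:
    """Return 'allow' for allowlisted commands and 'ask' for everything else."""
    if SHELL_METACHARACTERS & set(command):
        return "ask"
    try:
        parts = shlex.split(command)
    except ValueError:
        # Unbalanced quotes or similar: do not guess, ask a human.
        return "ask"
    if not parts:
        return "ask"
    if parts[0] in READ_ONLY_PROGRAMS:
        return "allow"
    if parts[0] == "git" and len(parts) > 1 and parts[1] in READ_ONLY_GIT:
        return "allow"
    # Default deny: anything not explicitly allowed needs approval.
    return "ask"


for cmd in ["git status", "git push --force", "cat notes.txt; rm notes.txt"]:
    print(cmd, "->", decide(cmd))
```

Defaulting to "ask" means that a command the policy author never anticipated is paused rather than executed, which is the behavior least privilege is meant to provide.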
References
Beck, K. (2002). Test Driven Development: By Example. Addison-Wesley.
Building Effective Agents — Anthropic Engineering, 2024.
AGENTS.md — open standard for repository-level agent instructions.
OWASP Top 10 for Agentic Applications 2026.
Characterizing Faults in Agentic AI — arXiv 2603.06847, March 2026.
Thoughtworks Technology Radar Vol. 33 (Q4 2025): Spec-Driven Development and Context Engineering for Coding Agents.