Agentic Workflows
Agentic coding means the model plans, edits files, runs commands, and iterates — not just suggests the next line.
Last reviewed: June 2026
Agent UI labels and CLI flags change between tool versions. Confirm against Cursor docs and Claude Code docs before training your team.
Prerequisites
Read Prompting for Code and Verifying AI Output. Agent modes amplify both good prompts and bad scope.
The Plan → Implement → Review Loop
flowchart LR
plan[Plan mode] --> approve[Human approves scope]
approve --> implement[Agent implements]
implement --> verify[Lint test build]
verify -->|fail| debug[Debug with error output]
debug --> implement
verify -->|pass| review[Human diff review]
review --> merge[Merge]
Never skip human review on production code. Agents are fast drafters, not accountable engineers.
Cursor Walkthrough: Fix a Failing Test
- Open Plan mode (or Ask first if the failure is unclear). Paste the exact CI error and name the test file.
- Approve a narrow plan — e.g. "Change only
src/utils/formatDate.test.tsandsrc/utils/formatDate.ts." - Switch to Agent mode with the approved plan attached. Example prompt:
Implement the approved plan. Do not edit files outside src/utils/formatDate*.
Run npm test -- formatDate after changes.
- If tests fail, switch to Debug mode or paste stderr into Agent:
Tests still fail with:
<paste exact output>
Fix only formatDate.ts — do not refactor other modules.
- Review the diff line by line before commit. Run pre-merge checklist.
Cursor reference: Agent mode, Plan mode, @-mentions.
Claude Code Walkthrough: Same Task in the Terminal
# Start in repo root
claude
# Plan first — no edits until you approve
> Read src/utils/formatDate.test.ts and the failure below.
> Propose a plan limited to formatDate.ts and its test. Do not implement yet.
> Failure: Expected "Jan 1, 2026" received "1/1/2026"
After reviewing the plan:
> Implement the plan. Run npm test -- formatDate when done.
If the agent loops:
> Stop. Revert changes to files outside src/utils/formatDate*.
> Show git diff --stat only.
Claude Code reference: Claude Code overview.
When Agents Help
| Task | Why agents work |
|---|---|
| Repetitive refactors with clear rules | Pattern is well-defined |
| Test fixes after CI failure | Error output is concrete |
| Scaffolding CRUD with existing patterns | Templates exist in repo |
| Documentation passes | Low risk; easy to verify |
| Migration scripts with dry-run | Verifiable output |
When Agents Hurt
| Task | Why to stay manual |
|---|---|
| Auth / payments / crypto | High security stakes |
| Visual CSS/layout polish | Models miss pixel nuance |
| Architecture decisions | Needs human tradeoff judgment |
| Vague "make it better" requests | Scope creep → common mistakes |
| Unfamiliar legacy codebase | Agent lacks historical context |
Subagents and Parallel Work
Tools like Cursor can spawn subagents for parallel exploration — e.g., one agent searches for usages while another drafts a fix.
Guidelines:
- Give each subagent a narrow mission and file boundary
- Merge results through a single human-reviewed diff
- Do not let subagents edit the same file concurrently
Example Cursor prompt:
Subagent A: List all imports of formatDate across src/ — report only, no edits.
Subagent B: Draft fix in src/utils/formatDate.ts only.
I will merge results manually.
Background Agents
Run long tasks (full test suite fixes, doc generation) in background while you continue.
Set guardrails:
- Max files or directories
- Required commands before completion (
npm test) - Timeout and cancel if looping
Failure Recovery Playbook
Signs you're stuck in a refactor spiral:
- Agent "fixes" lint by rewriting unrelated files
- Each fix introduces a new test failure elsewhere
- Diff grows while the original bug remains
Annotated recovery example
Situation: You asked to fix one failing test. The agent touched 14 files and tests still fail.
| Step | Action |
|---|---|
| 1 | git stash or git checkout -- . to last green commit |
| 2 | Re-open Plan mode with: file list, test name, exact error only |
| 3 | Reject plans that mention files outside the boundary |
| 4 | Agent implements → you run npm test locally |
| 5 | If still failing, paste only new stderr — not "try again" |
| 6 | Human reviews diff ≤2 files before merge |
Bad vs good recovery prompt
Bad: "Keep trying until tests pass."
Good: "Revert unrelated changes. Edit only formatDate.ts. Error: AssertionError: expected 'Jan 1'. Run npm test -- formatDate. Stop if more than 2 files change."
CI Integration
Claude Code SDK and similar tools can run in GitHub Actions to automate triage, summaries, and release notes. Always require passing CI and human approval before merge.
PR Summarization Workflow
This workflow posts an AI-generated summary as a PR comment whenever a pull request is opened or updated:
# .github/workflows/ai-pr-summary.yml
name: AI PR Summary
on:
pull_request:
types: [opened, synchronize]
jobs:
summarize:
runs-on: ubuntu-latest
permissions:
pull-requests: write
contents: read
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Generate PR summary
env:
ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
PR_NUMBER: ${{ github.event.pull_request.number }}
run: |
DIFF=$(git diff origin/${{ github.base_ref }}...HEAD -- '*.ts' '*.tsx' '*.js' '*.jsx' | head -c 12000)
SUMMARY=$(node -e "
const Anthropic = require('@anthropic-ai/sdk');
const client = new Anthropic();
(async () => {
const msg = await client.messages.create({
model: 'claude-haiku-4-20250514',
max_tokens: 512,
messages: [{
role: 'user',
content: 'Summarize this diff for a PR description in 3-5 bullet points. Focus on what changed and why, not how.\n\n' + process.env.DIFF
}]
});
console.log(msg.content[0].text);
})();
" DIFF="$DIFF")
gh pr comment $PR_NUMBER --body "**AI Summary (verify before merging)**\n\n$SUMMARY"
Failing Test Triage Workflow
When CI fails, this workflow attaches an AI diagnosis to the PR. A human still decides whether to act on it.
# .github/workflows/ai-test-triage.yml
name: AI Test Triage
on:
workflow_run:
workflows: ["CI"]
types: [completed]
jobs:
triage:
if: ${{ github.event.workflow_run.conclusion == 'failure' }}
runs-on: ubuntu-latest
permissions:
pull-requests: write
steps:
- uses: actions/checkout@v4
- name: Install deps and capture test output
run: npm ci && npm test 2>&1 | tail -100 > test-output.txt || true
- name: Post AI triage comment
env:
ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: |
FAILURES=$(cat test-output.txt)
PR_NUMBER=$(gh pr list --head ${{ github.head_ref }} --json number -q '.[0].number')
TRIAGE=$(node -e "
const Anthropic = require('@anthropic-ai/sdk');
const client = new Anthropic();
(async () => {
const msg = await client.messages.create({
model: 'claude-haiku-4-20250514',
max_tokens: 512,
messages: [{
role: 'user',
content: 'These CI tests failed. Identify the root cause and suggest a minimal fix. Output a brief diagnosis and one concrete suggestion.\n\n' + process.env.FAILURES
}]
});
console.log(msg.content[0].text);
})();
" FAILURES="$FAILURES")
gh pr comment $PR_NUMBER --body "**AI Test Triage (human review required)**\n\n$TRIAGE"
Key rules for CI AI workflows:
- Store API keys in GitHub Secrets — never in workflow YAML
- Pin model IDs; a model upgrade could change output format
- Mark AI comments clearly ("AI Summary", "AI Triage") so reviewers know what they're reading
- Never auto-merge based on AI output — require a human approval gate
See Claude Code SDK and Testing AI-Generated Code.
For Teams
| Policy area | Recommendation |
|---|---|
| Background agents | Allowed on docs/tests only unless explicitly approved for production paths |
| Required CI | lint, test, build must pass before merge — agent output is not exempt |
| PR labels | Use ai-assisted for traceability; require human reviewer on auth/billing paths |
| File boundaries | Encode in Project Rules — max scope per agent task |
| Incident response | If agent commits secrets, rotate keys per Team AI Policy |
Rollout checklist:
- [ ] Shared rules file with agent boundaries
- [ ] Plan-before-Agent norm documented
- [ ] Pre-merge verification required on all AI-assisted PRs
- [ ] Background agents disabled or scoped for junior contractors