
Agentic Workflows

Agentic coding means the model plans, edits files, runs commands, and iterates, rather than only suggesting the next line.

Last reviewed: June 2026

Agent UI labels and CLI flags change between tool versions. Confirm against Cursor docs and Claude Code docs before training your team.

Prerequisites

Read Prompting for Code and Verifying AI Output. Agent modes amplify both good prompts and bad scope.

The Plan → Implement → Review Loop

flowchart LR
    plan[Plan mode] --> approve[Human approves scope]
    approve --> implement[Agent implements]
    implement --> verify[Lint test build]
    verify -->|fail| debug[Debug with error output]
    debug --> implement
    verify -->|pass| review[Human diff review]
    review --> merge[Merge]

Never skip human review on production code. Agents are fast drafters, not accountable engineers.
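
The loop above can be sketched as a small state machine. This is an illustrative model only: the stage names mirror the flowchart nodes, and `nextStage`, `simulate`, and the `maxSteps` cap are hypothetical helpers, not part of Cursor or Claude Code.

```typescript
// Stages mirror the flowchart nodes; the names are illustrative, not a tool API.
type Stage = "plan" | "approve" | "implement" | "verify" | "debug" | "review" | "merge";

// Only the verify stage branches, so only it consumes a pass/fail result.
function nextStage(stage: Stage, verifyPassed = false): Stage {
  switch (stage) {
    case "plan": return "approve";
    case "approve": return "implement";
    case "implement": return "verify";
    case "verify": return verifyPassed ? "review" : "debug";
    case "debug": return "implement";
    case "review": return "merge";
    case "merge": return "merge"; // terminal stage
  }
}

// Walks the loop, feeding one verification result per visit to "verify".
// maxSteps caps the walk so a run that never passes cannot loop forever.
function simulate(verifyResults: boolean[], maxSteps = 50): Stage[] {
  const results = [...verifyResults];
  const path: Stage[] = ["plan"];
  let stage: Stage = "plan";
  while (stage !== "merge" && path.length < maxSteps) {
    const passed = stage === "verify" ? (results.shift() ?? false) : false;
    stage = nextStage(stage, passed);
    path.push(stage);
  }
  return path;
}

// First verification fails, second passes.
console.log(simulate([false, true]).join(" > "));
```

Note that "review" is only reachable through a passing "verify", and "merge" only through "review", which is the property the warning above is protecting.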

Cursor Walkthrough: Fix a Failing Test

  1. Open Plan mode (or Ask first if the failure is unclear). Paste the exact CI error and name the test file.
  2. Approve a narrow plan — e.g. "Change only src/utils/formatDate.test.ts and src/utils/formatDate.ts."
  3. Switch to Agent mode with the approved plan attached. Example prompt:
Implement the approved plan. Do not edit files outside src/utils/formatDate*.
Run npm test -- formatDate after changes.
  4. If tests fail, switch to Debug mode or paste stderr into Agent:
Tests still fail with:
<paste exact output>
Fix only formatDate.ts — do not refactor other modules.
  5. Review the diff line by line before committing. Run the pre-merge checklist.

Cursor reference: Agent mode, Plan mode, @-mentions.
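
A file boundary such as "do not edit files outside src/utils/formatDate*" can also be checked mechanically after the agent finishes. The sketch below assumes the boundary is treated as a plain path prefix and that the list of changed files comes from something like `git diff --name-only`; `filesOutsideBoundary` and the `parseDate.ts` path are hypothetical.

```typescript
// Returns the changed files that fall outside every allowed path prefix.
function filesOutsideBoundary(changed: string[], allowedPrefixes: string[]): string[] {
  return changed.filter(
    (file) => !allowedPrefixes.some((prefix) => file.startsWith(prefix)),
  );
}

const changed = [
  "src/utils/formatDate.ts",
  "src/utils/formatDate.test.ts",
  "src/utils/parseDate.ts", // hypothetical out-of-scope edit
];

const violations = filesOutsideBoundary(changed, ["src/utils/formatDate"]);
console.log(violations); // ["src/utils/parseDate.ts"]
```

A non-empty result is a signal to reject the diff before reading it in detail.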

Claude Code Walkthrough: Same Task in the Terminal

# Start in repo root
claude

# Plan first — no edits until you approve
> Read src/utils/formatDate.test.ts and the failure below.
> Propose a plan limited to formatDate.ts and its test. Do not implement yet.
> Failure: Expected "Jan 1, 2026" received "1/1/2026"

After reviewing the plan:

> Implement the plan. Run npm test -- formatDate when done.

If the agent loops:

> Stop. Revert changes to files outside src/utils/formatDate*.
> Show git diff --stat only.

Claude Code reference: Claude Code overview.
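
For context, the failure in this walkthrough (expected "Jan 1, 2026", received "1/1/2026") is the kind of narrow, single-file fix that suits an agent. The real `formatDate.ts` is not shown in this guide, so the before/after below is a hypothetical reconstruction that assumes an en-US locale and a JavaScript runtime with full ICU data.

```typescript
// Hypothetical "before": the default en-US date style is numeric.
function formatDateBefore(date: Date): string {
  return date.toLocaleDateString("en-US");
}

// Hypothetical "after": request an abbreviated month name explicitly.
function formatDate(date: Date): string {
  return new Intl.DateTimeFormat("en-US", {
    month: "short",
    day: "numeric",
    year: "numeric",
  }).format(date);
}

const newYear = new Date(2026, 0, 1); // local-time Jan 1, 2026 (months are zero-based)
console.log(formatDateBefore(newYear)); // "1/1/2026"
console.log(formatDate(newYear)); // "Jan 1, 2026"
```

The whole change fits in one function, which is why a plan limited to `formatDate.ts` and its test is a reasonable scope.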

When Agents Help

| Task | Why agents work |
| --- | --- |
| Repetitive refactors with clear rules | Pattern is well-defined |
| Test fixes after CI failure | Error output is concrete |
| Scaffolding CRUD with existing patterns | Templates exist in repo |
| Documentation passes | Low risk; easy to verify |
| Migration scripts with dry-run | Verifiable output |

When Agents Hurt

| Task | Why to stay manual |
| --- | --- |
| Auth / payments / crypto | High security stakes |
| Visual CSS/layout polish | Models miss pixel nuance |
| Architecture decisions | Needs human tradeoff judgment |
| Vague "make it better" requests | Scope creep → common mistakes |
| Unfamiliar legacy codebase | Agent lacks historical context |

Subagents and Parallel Work

Tools like Cursor can spawn subagents for parallel exploration — e.g., one agent searches for usages while another drafts a fix.

Guidelines:

  • Give each subagent a narrow mission and file boundary
  • Merge results through a single human-reviewed diff
  • Do not let subagents edit the same file concurrently

Example Cursor prompt:

Subagent A: List all imports of formatDate across src/ — report only, no edits.
Subagent B: Draft fix in src/utils/formatDate.ts only.
I will merge results manually.
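
The "no concurrent edits to the same file" guideline can be checked before any subagent starts. The sketch below is a hypothetical planning helper, not a Cursor feature: `SubagentTask` and `findEditConflicts` are illustrative names, and report-only agents are modeled as having an empty `editableFiles` list.

```typescript
interface SubagentTask {
  name: string;
  mission: string;
  editableFiles: string[]; // empty for report-only agents
}

// Maps each file to the subagents allowed to edit it,
// then keeps only the files claimed by more than one subagent.
function findEditConflicts(tasks: SubagentTask[]): Map<string, string[]> {
  const owners = new Map<string, string[]>();
  for (const task of tasks) {
    for (const file of task.editableFiles) {
      owners.set(file, [...(owners.get(file) ?? []), task.name]);
    }
  }
  const conflicts = new Map<string, string[]>();
  owners.forEach((names, file) => {
    if (names.length > 1) conflicts.set(file, names);
  });
  return conflicts;
}

const tasks: SubagentTask[] = [
  { name: "A", mission: "List imports of formatDate across src/", editableFiles: [] },
  { name: "B", mission: "Draft fix", editableFiles: ["src/utils/formatDate.ts"] },
];
console.log(findEditConflicts(tasks).size); // 0: only B may edit
```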

Background Agents

Run long tasks (full test suite fixes, doc generation) in the background while you continue other work.

Set guardrails:

  • Max files or directories
  • Required commands before completion (npm test)
  • Timeout and cancel if looping
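
One way to make these guardrails concrete is to write them down as data and check a run against them. The shapes below (`Guardrails`, `RunSnapshot`, `guardrailViolations`) are assumptions for illustration; real tools expose their own settings, and the snapshot would have to be assembled from whatever your tool reports.

```typescript
interface Guardrails {
  maxFiles: number;
  allowedDirs: string[];      // path prefixes the agent may touch
  requiredCommands: string[]; // must have run before the task counts as done
  timeoutMs: number;
}

interface RunSnapshot {
  changedFiles: string[];
  commandsRun: string[];
  elapsedMs: number;
}

// Returns one human-readable message per violated guardrail.
function guardrailViolations(run: RunSnapshot, rules: Guardrails): string[] {
  const problems: string[] = [];
  if (run.changedFiles.length > rules.maxFiles) {
    problems.push(`changed ${run.changedFiles.length} files (max ${rules.maxFiles})`);
  }
  const stray = run.changedFiles.filter(
    (file) => !rules.allowedDirs.some((dir) => file.startsWith(dir)),
  );
  if (stray.length > 0) {
    problems.push(`outside allowed directories: ${stray.join(", ")}`);
  }
  const missing = rules.requiredCommands.filter((cmd) => !run.commandsRun.includes(cmd));
  if (missing.length > 0) {
    problems.push(`required commands not run: ${missing.join(", ")}`);
  }
  if (run.elapsedMs > rules.timeoutMs) {
    problems.push("timeout exceeded; cancel the run");
  }
  return problems;
}
```

An empty result means the run stayed inside its limits; it does not replace the human diff review.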

Failure Recovery Playbook

Signs you're stuck in a refactor spiral:

  • Agent "fixes" lint by rewriting unrelated files
  • Each fix introduces a new test failure elsewhere
  • Diff grows while the original bug remains
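
These signs can be turned into a rough heuristic if you record the diff size and failing-test count after each agent iteration. The sketch below is one possible rule under that assumption; `Iteration`, `isRefactorSpiral`, and the default window of three iterations are illustrative choices, not a standard.

```typescript
interface Iteration {
  filesChanged: number; // size of the diff after this iteration
  failingTests: number; // failing tests after this iteration
}

// Flags a spiral when, over the last `window` iterations, the diff has grown
// while the number of failing tests has not gone down.
function isRefactorSpiral(history: Iteration[], window = 3): boolean {
  if (history.length < window) return false;
  const recent = history.slice(-window);
  const first = recent[0];
  const last = recent[recent.length - 1];
  return last.filesChanged > first.filesChanged && last.failingTests >= first.failingTests;
}

const history: Iteration[] = [
  { filesChanged: 1, failingTests: 1 },
  { filesChanged: 4, failingTests: 1 },
  { filesChanged: 14, failingTests: 2 },
];
console.log(isRefactorSpiral(history)); // true
```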

Annotated recovery example

Situation: You asked to fix one failing test. The agent touched 14 files and tests still fail.

| Step | Action |
| --- | --- |
| 1 | Run git stash or git checkout -- . to discard the agent's uncommitted edits and return to the last green commit |
| 2 | Re-open Plan mode with: file list, test name, exact error only |
| 3 | Reject plans that mention files outside the boundary |
| 4 | Agent implements → you run npm test locally |
| 5 | If still failing, paste only new stderr — not "try again" |
| 6 | Human reviews diff ≤2 files before merge |

Bad vs good recovery prompt

Bad: "Keep trying until tests pass."

Good: "Revert unrelated changes. Edit only formatDate.ts. Error: AssertionError: expected 'Jan 1'. Run npm test -- formatDate. Stop if more than 2 files change."

CI Integration

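The Claude Code SDK and similar tools can run in GitHub Actions to automate triage, summaries, and release notes. The example workflows in this section call the Anthropic API through the @anthropic-ai/sdk Node package. Always require passing CI and human approval before merge.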

PR Summarization Workflow

This workflow posts an AI-generated summary as a PR comment whenever a pull request is opened or updated:

# .github/workflows/ai-pr-summary.yml
name: AI PR Summary

on:
  pull_request:
    types: [opened, synchronize]

jobs:
  summarize:
    runs-on: ubuntu-latest
    permissions:
      pull-requests: write
      contents: read
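    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0  # full history so the base branch is available for the diff

      # The SDK is not preinstalled on the runner; the script below requires it.
      - name: Install Anthropic SDK
        run: npm install --no-save @anthropic-ai/sdk

      - name: Generate PR summary
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          PR_NUMBER: ${{ github.event.pull_request.number }}
          BASE_REF: ${{ github.base_ref }}
        run: |
          # Export DIFF so the Node process can read it from process.env
          export DIFF="$(git diff "origin/$BASE_REF...HEAD" -- '*.ts' '*.tsx' '*.js' '*.jsx' | head -c 12000)"
          # Confirm the model ID below against Anthropic's current model list before use
          SUMMARY=$(node -e "
            const Anthropic = require('@anthropic-ai/sdk');
            const client = new Anthropic();
            (async () => {
              const msg = await client.messages.create({
                model: 'claude-haiku-4-20250514',
                max_tokens: 512,
                messages: [{
                  role: 'user',
                  content: 'Summarize this diff for a PR description in 3-5 bullet points. Focus on what changed and why, not how.\n\n' + process.env.DIFF
                }]
              });
              console.log(msg.content[0].text);
            })();
          ")
          # printf emits real newlines; "\n" inside a double-quoted --body is posted literally
          printf '**AI Summary (verify before merging)**\n\n%s\n' "$SUMMARY" | gh pr comment "$PR_NUMBER" --body-file -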
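
The workflow above caps the diff with `head -c 12000`, which can cut a line in half. If you move the prompt-building into a script, a line-aware cap is a small improvement. The helper below is a hypothetical sketch; `truncateDiff` and the "[diff truncated]" marker are illustrative, and it counts string characters rather than bytes.

```typescript
// Truncates a diff to at most maxChars characters, cutting at the last complete
// line when possible, and appends a marker so the model knows input is partial.
function truncateDiff(diff: string, maxChars: number): string {
  if (diff.length <= maxChars) return diff;
  const cut = diff.lastIndexOf("\n", maxChars - 1);
  const kept = cut > 0 ? diff.slice(0, cut) : diff.slice(0, maxChars);
  return kept + "\n[diff truncated]";
}

console.log(truncateDiff("aaa\nbbb\nccc", 6)); // "aaa" followed by the marker line
```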

Failing Test Triage Workflow

When CI fails, this workflow attaches an AI diagnosis to the PR. A human still decides whether to act on it.

# .github/workflows/ai-test-triage.yml
name: AI Test Triage

on:
  workflow_run:
    workflows: ["CI"]
    types: [completed]

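jobs:
  triage:
    if: ${{ github.event.workflow_run.conclusion == 'failure' }}
    runs-on: ubuntu-latest
    permissions:
      pull-requests: write
      contents: read  # needed by actions/checkout once permissions are restricted
    steps:
      - uses: actions/checkout@v4
        with:
          # workflow_run checks out the default branch unless a ref is given
          ref: ${{ github.event.workflow_run.head_sha }}

      # Caution: this step runs code from the failing branch; limit it to trusted branches.
      - name: Install deps and capture test output
        run: |
          npm ci && npm test 2>&1 | tail -100 > test-output.txt || true
          npm install --no-save @anthropic-ai/sdk

      - name: Post AI triage comment
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          # github.head_ref is empty on workflow_run events; read the branch from the event payload
          HEAD_BRANCH: ${{ github.event.workflow_run.head_branch }}
        run: |
          # Export FAILURES so the Node process can read it from process.env
          export FAILURES="$(cat test-output.txt)"
          PR_NUMBER=$(gh pr list --head "$HEAD_BRANCH" --json number -q '.[0].number // empty')
          # A failing run on a branch without an open PR has nothing to comment on
          if [ -z "$PR_NUMBER" ]; then echo "No open PR for $HEAD_BRANCH"; exit 0; fi
          # Confirm the model ID below against Anthropic's current model list before use
          TRIAGE=$(node -e "
            const Anthropic = require('@anthropic-ai/sdk');
            const client = new Anthropic();
            (async () => {
              const msg = await client.messages.create({
                model: 'claude-haiku-4-20250514',
                max_tokens: 512,
                messages: [{
                  role: 'user',
                  content: 'These CI tests failed. Identify the root cause and suggest a minimal fix. Output a brief diagnosis and one concrete suggestion.\n\n' + process.env.FAILURES
                }]
              });
              console.log(msg.content[0].text);
            })();
          ")
          # printf emits real newlines; "\n" inside a double-quoted --body is posted literally
          printf '**AI Test Triage (human review required)**\n\n%s\n' "$TRIAGE" | gh pr comment "$PR_NUMBER" --body-file -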

Key rules for CI AI workflows:

  • Store API keys in GitHub Secrets — never in workflow YAML
  • Pin model IDs; a model upgrade could change output format
  • Mark AI comments clearly ("AI Summary", "AI Triage") so reviewers know what they're reading
  • Never auto-merge based on AI output — require a human approval gate
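
The last two rules can be expressed as code if you build your own tooling around these workflows. The sketch below is illustrative only: `PullRequestState`, `canMerge`, and `labelAiComment` are hypothetical names, and in practice the approval gate would normally be enforced by branch protection rather than by a script.

```typescript
interface PullRequestState {
  ciPassed: boolean;
  humanApprovals: number;
  aiRecommendsMerge: boolean; // recorded for traceability, deliberately ignored below
}

// The gate depends only on CI and human approval, never on AI output.
function canMerge(pr: PullRequestState): boolean {
  return pr.ciPassed && pr.humanApprovals >= 1;
}

// Prefixes every AI-generated comment with a clear label.
function labelAiComment(kind: "AI Summary" | "AI Triage", body: string): string {
  return `**${kind} (human review required)**\n\n${body}`;
}

console.log(canMerge({ ciPassed: true, humanApprovals: 0, aiRecommendsMerge: true })); // false
console.log(labelAiComment("AI Triage", "Likely cause: locale-dependent date format."));
```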

See Claude Code SDK and Testing AI-Generated Code.

For Teams

| Policy area | Recommendation |
| --- | --- |
| Background agents | Allowed on docs/tests only unless explicitly approved for production paths |
| Required CI | lint, test, build must pass before merge — agent output is not exempt |
| PR labels | Use ai-assisted for traceability; require human reviewer on auth/billing paths |
| File boundaries | Encode in Project Rules — max scope per agent task |
| Incident response | If agent commits secrets, rotate keys per Team AI Policy |

Rollout checklist:

  • [ ] Shared rules file with agent boundaries
  • [ ] Plan-before-Agent norm documented
  • [ ] Pre-merge verification required on all AI-assisted PRs
  • [ ] Background agents disabled or scoped for junior contractors