AI in CI/CD
CI/CD is where agent speed meets production gates — useful for review comments and failure triage, dangerous when agents merge or deploy without human approval.
Last reviewed: June 2026
GitHub Actions, Codex Action, and Claude Code SDK APIs change frequently. Pin action SHAs and verify vendor docs before org-wide rollout.
What to automate vs not
| Automate (non-blocking) | Keep human-gated |
|---|---|
| PR review comments | Merge approval |
| Suggest test cases | Production deploy |
| Summarize failure logs | Schema migrations |
Label ai-assisted PRs | Secret rotation |
| Lint/format suggestions on bot branches | Infrastructure Terraform apply |
Policy belongs in Team AI Policy and AI Code Review.
Common patterns
PR review bot (comment-only)
# .github/workflows/ai-review.yml (illustrative — pin exact action version)
name: AI Review
on:
pull_request:
types: [opened, synchronize]
jobs:
review:
runs-on: ubuntu-latest
permissions:
pull-requests: write
contents: read
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Run review tool
env:
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
run: |
# Vendor CLI or custom script — posts comments, does not approve
echo "Run org-approved review CLI here"
Never store API keys in workflow files. Use GitHub Secrets. Comments only — do not set required checks on AI alone.
Failure triage on CI red
On test failure, post workflow summary with AI-generated hypothesis:
- Capture failed job log (truncate to token budget)
- Send to API with structured output — see Structured Outputs
- Post summary as job annotation or PR comment
- Developer verifies — AI does not auto-fix on main
Claude Code / Codex in CI
| Tool | CI fit |
|---|---|
| Codex GitHub Action | PR review comments (verify current action) |
| Claude Code SDK | Custom pipelines — script headless agent with scoped repo checkout |
| Aider | Rare in CI — better interactive |
Headless agents need read-only checkout and explicit file allowlists.
Security in CI
| Risk | Mitigation |
|---|---|
| Secret leak in logs | Mask secrets; redact tool output |
| Fork PR exfiltration | Do not run AI on external forks with secrets |
| Prompt injection via PR title/body | Sanitize; system prompt ignores instruction overrides |
Over-privileged GITHUB_TOKEN | Minimum permissions per job |
| Supply chain | Pin action SHAs; vet third-party review actions |
See MCP Security if CI agents use MCP.
Context limits in CI
| Source | Token cost |
|---|---|
| Full monorepo | Too large — use diff only |
git diff main...HEAD | Preferred for review |
| Failed test log (last 200 lines) | Enough for triage |
| AGENTS.md | Include for stack context |
See Context Engineering.
Observability
Log per run:
- Model ID and token usage
- Job URL and PR number
- Outcome (comment posted / skipped / error)
Route to same dashboards as product LLM features — LLM Observability.
Production concerns
| Concern | What to do |
|---|---|
| Cost | Trigger on PR open/sync only; skip draft PRs |
| Latency | Async jobs; do not block merge on AI |
| Flaky AI | Retry once; fail open (no comment) vs fail closed |
| Compliance | Enterprise API plans; zero retention contracts |