Comment and Control: Prompt Injection Exploits in Claude Code, Gemini CLI, and GitHub Copilot Agent

Tue, 19 May 2026 20:12:00 GMT · Security Updates

Intro

Three of the most widely deployed AI agents on GitHub Actions — Anthropic's Claude Code Security Review, Google's Gemini CLI Action, and GitHub's Copilot Agent — can be tricked into leaking the host repository's own API keys and access tokens. Researcher Aonan Guan, working with Johns Hopkins University's Zhengyu Liu and Gavin Zhong, calls the class "Comment and Control," a play on the Command and Control (C2) terminology from traditional malware. The injection surface and the exfiltration channel are both GitHub itself — no external infrastructure required.

The pattern

Each agent reads GitHub data such as PR titles, issue bodies, or comments, treats that content as part of its prompt context, and then executes tools based on what it reads. Because outside contributors can write that data, they can write instructions. Because the agents run inside GitHub Actions with secrets like ANTHROPIC_API_KEY, GEMINI_API_KEY, and GITHUB_TOKEN in their environment, those secrets can be extracted from the runner and posted back into a PR comment, an issue reply, or a commit. The entire attack loop stays inside GitHub.

The agent has access to production secrets because it needs them to do its job. The agent processes untrusted input because that is its job. These two requirements are in direct conflict, and few current deployments have adequately addressed it.
— Aonan Guan

Finding 1 — Claude Code Security Review

In Claude Code Security Review, the PR title is interpolated directly into the prompt with no sanitization, and the Claude CLI is invoked without --allowed-tools or --disallowed-tools restrictions. A PR opened with a malicious title can break out of the prompt and instruct the agent to run shell commands such as whoami and ps auxeww, whose output — including ANTHROPIC_API_KEY and GITHUB_TOKEN pulled from the process environment — is then posted as a PR comment and written into the Actions log. Anthropic initially rated the issue Critical (CVSS 9.3, later 9.4), shipped a Bash(ps:*) block, updated documentation, and paid a $100 bounty. On April 20, 2026, the severity was changed to None, with Anthropic noting the action "is not designed to be hardened against prompt injection."

Finding 2 — Gemini CLI Action

The Gemini CLI Action includes issue titles, bodies, and comments in its prompt under an "Additional Content" section. An attacker can append a fake "Trusted Content Section" that overrides Gemini's safety instructions and asks the agent to share environment variables. The agent complies, posting the full GEMINI_API_KEY as a public issue comment. Google's VRP accepted the report (#1609699) and paid a $1,337 bounty.

Finding 3 — GitHub Copilot Agent

The Copilot Agent finding is the most elaborate. GitHub layered three runtime defenses on top of model and prompt protections: environment filtering (the UU() function and zJe deny-list strip about 20 sensitive variables from bash subprocesses), pre-push secret scanning, and an outbound network firewall. Guan bypassed all three. The payload was hidden inside an HTML comment in an issue body — invisible in GitHub's rendered Markdown but parsed by the agent — so a victim could assign the issue to Copilot without ever seeing the malicious instructions. Once running, the agent executed ps auxeww | base64 -w0 > running-environment-checkmark.txt, reading /proc/[pid]/environ of the parent Node.js and MCP server processes that still held the unfiltered credentials. Base64 encoding defeated secret scanning, and git push to a whitelisted github.com domain delivered the file as a normal commit. Decoded, the file revealed GITHUB_TOKEN, GITHUB_COPILOT_API_TOKEN, GITHUB_PERSONAL_ACCESS_TOKEN, and COPILOT_JOB_NONCE. GitHub initially closed the report as Informative, then reopened it after Guan submitted the reverse-engineered filter code, and paid $500.

Why it matters

This is the first public cross-vendor demonstration of a single prompt-injection pattern landing against three different major AI agents — and the pattern generalizes well beyond GitHub Actions. Any agent that ingests untrusted input while holding execution tools and production secrets in the same runtime is exposed: Slack bots, Jira agents, email agents, deployment automation. Guan compares the situation to phishing, where employees must process external content to do their jobs and attackers exploit that necessity. The recommended posture is least-privilege allowlisting for tools, secrets, and network access — treating each AI agent like a new employee whose role is narrowly scoped, rather than a trusted insider with full environmental access.

Full technical write-up, code snippets, screenshots, video demo, and disclosure timeline: https://oddguan.com/blog/comment-and-control-prompt-injection-credential-theft-claude-code-gemini-cli-github-copilot/