AI
AI Analysis
Live Data

Indirect Prompt Injection: AI Supply-Chain Risks & Data

Infographic analysis of indirect prompt injection and AI supply-chain risks. Visualizes OWASP and NIST data, attack paths, and mitigations for LLM apps.

@Noahpinionposted on X

This is cyberpunk AF. Bad actors appear to be using AI to create malicious software that human coders can't see, but which other AIs then use to code, producing damaging effects that no human can catch...

View original tweet on X →
Infographic related to the article topic

Source: Google Cloud Blog

Research Brief

What our analysis found

- Key facts and data points - “Indirect prompt injection” (a.k.a. context poisoning) is recognized as a top risk for LLM apps. OWASP’s Top 10 for LLM Applications lists LLM01: Prompt Injection as the #1 category. NIST’s March 2025 AI 100-2 report explicitly defines “RESOURCE CONTROL” as an attacker capability enabling INDIRECT PROMPT INJECTION and discusses AI supply-chain exposures. ([owasp.org](https://owasp.org/www-project-top-10-for-large-language-model-applications/?utm_source=openai)) - Trail of Bits (Aug 6, 2025) showed a working Copilot Agent exploit via a GitHub Issue: hidden instructions are placed in HTML inside a block so humans don’t see them in the UI, but the LLM reads them, leading Copilot to add a backdoor dependency and curl|sh a script from raw.githubusercontent.com. ([blog.trailofbits.com](https://blog.trailofbits.com/2025/08/06/prompt-injection-engineering-for-attackers-exploiting-github-copilot/)) - Orca Security (Feb 16, 2026) demonstrated “RoguePilot”: when opening a Codespace from an Issue, Copilot is passively prompted with the issue body; hidden instructions plus a pre‑crafted PR with a symlink let Copilot read /workspaces/.codespaces/shared/user-secrets-envs.json, then exfiltrate the GITHUB_TOKEN by setting a remote JSON $schema URL, enabling full repo takeover. GitHub patched after disclosure (reported Feb 24, 2026). ([orca.security](https://orca.security/resources/blog/roguepilot-github-copilot-vulnerability/)) - Invisible/unrendered characters can carry instructions that humans don’t notice. Oct 2025 research (“Imperceptible Jailbreaking”) used invisible Unicode variation selectors to craft prompts that look benign on screen but tokenize differently, achieving high jailbreak success across multiple LLMs. ([arxiv.org](https://arxiv.org/abs/2510.05025?utm_source=openai)) - Amazon Q Developer VS Code extension supply‑chain incident (July 2025): a malicious PR added a destructive prompt to v1.84.0; AWS issued v1.85.0 and an advisory (CVE‑2025‑8217) stating the injected code shipped but didn’t execute due to a syntax error. Dates: malicious PR July 13; tainted release July 17; fix July 24–25, 2025. ([aws.amazon.com](https://aws.amazon.com/security/security-bulletins/AWS-2025-015/)) - New Copilot CLI vulnerability (CVE‑2026‑29783, published Mar 6, 2026): bash parameter expansion patterns could bypass “read‑only” checks. Vectors include prompt injection via repo content (README, issues, comments), MCP servers, or crafted user instructions. Fixed in Copilot CLI v0.0.423. CVSS 8.8 High. ([github.com](https://github.com/advisories/GHSA-g8r9-g2v8-jv6f)) - Evidence or sources that SUPPORT the claim - Real exploit path where AI reads “hidden” human-invisible instructions and then does harmful actions: Trail of Bits’ Copilot Agent demo hid an instruction in HTML that the LLM obeyed, adding a backdoor dependency and executing a fetched script—an action chain likely to bypass casual human review. ([blog.trailofbits.com](https://blog.trailofbits.com/2025/08/06/prompt-injection-engineering-for-attackers-exploiting-github-copilot/)) - RoguePilot shows an AI‑mediated supply‑chain impact without direct attacker interaction: Copilot, passively seeded from a GitHub Issue, used its tools (run_in_terminal, file_read, create_file) to exfiltrate a privileged GITHUB_TOKEN via a JSON $schema fetch and seize the repo. ([orca.security](https://orca.security/resources/blog/roguepilot-github-copilot-vulnerability/)) - Reporting on RoguePilot confirms the attacker instructions can be hidden in HTML comments in an Issue, visually invisible to humans but consumed by Copilot; GitHub patched post‑disclosure. ([securityweek.com](https://www.securityweek.com/github-issues-abused-in-copilot-attack-leading-to-repository-takeover/)) - Invisible Unicode techniques make “malicious text humans can’t see” practical: variation selectors alter tokenization while preserving on‑screen appearance, enabling invisible jailbreak/prompt‑injection content. ([arxiv.org](https://arxiv.org/abs/2510.05025?utm_source=openai)) - The Copilot CLI CVE states that an attacker who can influence commands—e.g., through prompt injection in repo content or MCP responses—could get arbitrary code execution, illustrating model‑mediated command paths that can evade naïve human checks. ([github.com](https://github.com/advisories/GHSA-g8r9-g2v8-jv6f)) - Standards and guidance bodies treat indirect prompt injection as first‑class: OWASP Top 10 (LLM01) and NIST AI 100‑2e2025 explicitly call out indirect prompt injection/supply‑chain risks to agentic systems. ([owasp.org](https://owasp.org/www-project-top-10-for-large-language-model-applications/?utm_source=openai)) - Evidence or sources that CONTRADICT or NUANCE the claim - Not literally “no human can catch”: the hidden content in these cases is often present in source (e.g., HTML comments, / tags, zero‑width characters) and can be revealed in “view raw,” diffs, or with scanners; Trail of Bits also notes some hiding methods (like HTML comments) may be stripped in certain paths, so attackers choose alternatives that survive to the LLM. ([blog.trailofbits.com](https://blog.trailofbits.com/2025/08/06/prompt-injection-engineering-for-attackers-exploiting-github-copilot/)) - Amazon Q incident: AWS’s official bulletin says the malicious prompt shipped in v1.84.0 but did not execute due to a syntax error; users were told to update to v1.85.0. This tempers claims of widespread destructive impact. ([aws.amazon.com](https://aws.amazon.com/security/security-bulletins/AWS-2025-015/)) - Some mitigations can materially reduce risk: Microsoft’s “Spotlighting” (public preview Oct 1, 2025) reported reducing indirect prompt‑injection success “from >50% to <2%” in experiments, showing layered defenses can catch many attacks before they reach agents. ([techcommunity.microsoft.com](https://techcommunity.microsoft.com/blog/-/better-detecting-cross-prompt-injection-attacks-introducing/4458404)) - The most publicized Copilot exploits (Trail of Bits 2025; Orca 2026) are, to date, researcher‑driven proofs of concept with vendor patches issued post‑disclosure; claims of large‑scale real‑world damage specifically via invisible prompt text remain limited in public reporting. ([securityweek.com](https://www.securityweek.com/github-issues-abused-in-copilot-attack-leading-to-repository-takeover/)) - Recent developments or updates - Feb 16–24, 2026: Orca publishes RoguePilot; SecurityWeek reports GitHub patched the issue after disclosure. Attack chain used hidden Issue content, symlink to shared secrets, and remote JSON $schema to exfiltrate GITHUB_TOKEN. ([orca.security](https://orca.security/resources/blog/roguepilot-github-copilot-vulnerability/)) - Mar 6, 2026: GitHub Advisory Database publishes CVE‑2026‑29783 for Copilot CLI; high‑severity RCE via shell expansion patterns influenced by prompt injection in repo text/MCP; fixed in v0.0.423 (updated Mar 13). ([github.com](https://github.com/advisories/GHSA-g8r9-g2v8-jv6f)) - Mar 4, 2026: Orca describes “HackerBot‑Claw,” an automated campaign abusing misconfigured GitHub Actions; includes an example of “AI prompt injection against a CI workflow” among targets, underscoring AI‑agent/CI intersections in supply‑chain risk. ([orca.security](https://orca.security/resources/blog/hackerbot-claw-github-actions-attack/)) - Oct 1, 2025: Microsoft announces Spotlighting in Azure AI Foundry Prompt Shields, citing large measured reductions in cross/indirect prompt‑injection success in internal experiments. ([techcommunity.microsoft.com](https://techcommunity.microsoft.com/blog/-/better-detecting-cross-prompt-injection-attacks-introducing/4458404)) - July 23–25, 2025: AWS posts official security bulletin (CVE‑2025‑8217) for Amazon Q Developer VS Code extension v1.84.0; confirms malicious prompt was shipped but non‑functional; urges update to v1.85.0. TechRadar reports ~1M installs affected by the tainted version before it was pulled. ([aws.amazon.com](https://aws.amazon.com/security/security-bulletins/AWS-2025-015/)) - Aug 6, 2025: Trail of Bits publishes detailed engineering guidance for attackers on hiding prompt‑injection payloads in GitHub Issues and steering Copilot to insert backdoors—an early public blueprint for AI‑mediated supply‑chain abuse in developer tooling. ([blog.trailofbits.com](https://blog.trailofbits.com/2025/08/06/prompt-injection-engineering-for-attackers-exploiting-github-copilot/)) - Additional background sources - Foundational 2023 paper on indirect prompt injection (Greshake et al.) establishing feasibility of compromising LLM‑integrated apps by injecting instructions into retrieved content. ([arxiv.org](https://arxiv.org/abs/2302.12173?utm_source=openai)) - NIST AI 100‑2e2025 details how indirect prompt injection can hijack agents to leak data or execute attacker‑specified tasks; recommends defense‑in‑depth and assuming exposure to untrusted inputs. ([nvlpubs.nist.gov](https://nvlpubs.nist.gov/nistpubs/ai/NIST.AI.100-2e2025.pdf))