OpenClaw security measures

This skill covers OpenClaw-specific security topics across its full lifecycle: runtime sandboxing, permission and capability models, secrets handling, supply-chain protections for skills and the ClawHub registry, vulnerability disclosures, hardening practices, security-relevant features and patches, audit mechanisms, and incident reports involving OpenClaw or its agent runtime. OpenClaw has been implicated in multiple supply-chain incidents — the Clinejection attack via Cline CLI (February 2026, GHSA-9ppg-jx86-fqw7), a broader axios npm compromise distributing a cross-platform RAT via OpenClaw-impersonating packages (March 2026, attributed to Sapphire Sleet), and malicious skills on ClawHub — as well as documented architectural vulnerabilities including reasoning-text leaks to end-user channels, exec-approval resets on upgrade, susceptibility to prompt injection, memory poisoning, skill poisoning, and intent drift. Academic research (Tsinghua University / Ant Group, arXiv:2603.11619) has formally characterized these as multi-stage systemic threats and proposed a five-layer lifecycle-oriented defense framework, with the ClawAegis plugin released as a purpose-built implementation. Key hardening actions include: upgrading to at least v2026.3.8 (patches CVE-2026-33574 and 12+ other issues), v2026.4.25 (patches a privilege escalation in token rotation), and v2026.4.27 (adds Codex Computer Use with fail-closed pre-checks); configuring the two-layer exec-approval policy after every upgrade; explicitly suppressing reasoning output for production deployments; auditing transitive dependency trees for compromised packages; and treating all third-party ClawHub skills as untrusted executable code requiring review before installation. Enterprise alternatives such as PokeeClaw and Clawdi address native gaps in user isolation, RBAC, audit logging, and sandboxed execution that stock OpenClaw does not provide.

When to use this skill

Apply this skill when the user asks about OpenClaw runtime security, sandboxing configuration, permission models, secrets management, supply-chain integrity of skills or packages, vulnerability disclosures, hardening, audit logs, security-relevant release features (e.g., Bedrock Guardrails, content filtering, ACP provenance, backup/restore, safe update procedures, reasoning/thinking visibility controls, exec-approval policies), security incidents involving OpenClaw or its agent runtime, OpenClaw's known architectural security limitations (broad system access, prompt injection susceptibility, skill-based data exfiltration, reasoning text leaking to end-user channels, exec-approval reset on upgrade, memory poisoning, intent drift), academic security research on OpenClaw (e.g., the Tsinghua/Ant Group five-layer lifecycle security framework, arXiv:2603.11619, or the ClawAegis defense plugin), enterprise-secure alternatives to OpenClaw (e.g., PokeeClaw), or industry leadership perspectives on AI supply-chain risk and responsible enterprise AI adoption (e.g., OWASP LLM Top 10 supply-chain risks, CISO priorities around third-party AI components). Also apply when the user mentions the Cline CLI supply-chain attack (Clinejection / GHSA-9ppg-jx86-fqw7), the axios npm compromise (March 2026, attributed to Sapphire Sleet), OpenClaw-impersonating packages such as @shadanai/openclaw or @qqbrowser/openclaw-qbot, malicious skills on ClawHub, specific CVEs patched in OpenClaw releases (e.g., CVE-2026-33574), reasoning/thinking leaks with GPT-5.4 or other reasoning-capable models, exec-approval configuration (exec-approvals.json, tools.exec.*), post-update exec/allowlist breakage, lifecycle-oriented agent security frameworks (initialization, input, inference, decision, execution stages), or industry framing of AI-first enterprise security posture. (source, source, source, source, source, source, source, source, source, source)
Out of scope: DeFi wallet-drainer mechanics (unlimited token approvals, delayed drains, phishing-based approval grants) and AI-powered Web3 security products such as CerbAgent ($CERB) that perform pre-transaction scanning and automated revokes. These are generic DeFi/crypto security topics and third-party AI agent products unrelated to the OpenClaw agent runtime, so do not apply this skill to them. (source, source, source, source, source, source)

Critical rules

DO NOT recommend installing cline@2.3.0 under any circumstances — it was a compromised npm release (GHSA-9ppg-jx86-fqw7) that silently ran npm install -g openclaw@latest via a postinstall script. The CLI binary was byte-identical to cline@2.2.3; the only modification was a single added postinstall line in package.json. If a user has cline@2.3.0, instruct them to upgrade immediately to cline@2.4.0 or later and run npm uninstall -g openclaw to remove the unexpected OpenClaw installation. (source, source, source, source, source)
When recommending ANY npm package that touches OpenClaw, verify its provenance. After the Clinejection incident, Cline migrated npm publishing to OIDC-based provenance via GitHub Actions. Prefer packages that publish with npm provenance attestations (--provenance flag) and verify signatures before recommending installation. (source, source, source)
OpenClaw itself is NOT malware. Public analyses from SafeDep, Snyk, and Endor Labs confirm OpenClaw is a legitimate open-source project. DO NOT tell users that OpenClaw contains malware. However, warn that: (1) an OpenClaw installation originating from the compromised cline@2.3.0 postinstall was unauthorized and should be removed, because the installed OpenClaw binary has broad permissions including shell command execution, credential access from ~/.openclaw/, and can install itself as a persistent system daemon surviving reboots; (2) OpenClaw's design inherently grants deep system access that cybersecurity researchers (including Cisco) have flagged as a security concern — it is susceptible to prompt injection and third-party skills have been found exfiltrating data. (source, source, source, source, source, source, source)
DO NOT recommend axios@1.14.1, axios@1.14.2, or axios@0.30.4 — these were compromised npm releases (published ~2026-03-31, 00:21–03:29 UTC) that injected a hidden dependency plain-crypto-js@4.2.1, which deployed a cross-platform RAT (Windows/macOS/Linux). Microsoft Threat Intelligence attributed the attack to Sapphire Sleet (North Korean state-sponsored). Pin to axios@1.14.0 (last legitimate 1.x) or axios@0.30.3 (last legitimate 0.30.x). If a user has any of the compromised versions, instruct them to: (1) downgrade immediately, (2) delete node_modules/plain-crypto-js, (3) rotate ALL secrets and credentials on affected machines, and (4) audit CI/CD build logs for the March 31 UTC window. (source, source, source, source, source, source, source)
DO NOT trust packages named @shadanai/openclaw or @qqbrowser/openclaw-qbot — these are attacker-controlled packages impersonating OpenClaw. Specifically, @shadanai/openclaw versions 2026.3.28-2, 2026.3.28-3, 2026.3.31-1, and 2026.3.31-2 vendor the malicious plain-crypto-js payload directly. @qqbrowser/openclaw-qbot@0.0.130 ships a tampered axios@1.14.1 with plain-crypto-js injected. If a user has any of these installed, treat as fully compromised: remove immediately, rotate all credentials, and scan for RAT persistence. (source, source, source)
When auditing OpenClaw installations for supply-chain compromise, checking top-level package names is INSUFFICIENT. OpenClaw's dependency graph includes 1,246 third-party modules and 2,672 dependency paths. MUST inspect package-lock.json, yarn.lock, or pnpm-lock.yaml to verify whether axios@1.14.1, axios@1.14.2, axios@0.30.4, or plain-crypto-js@4.2.1 were resolved anywhere in the tree — including in the dependency trees of Agents and Skills modules. (source, source, source)
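A minimal sketch of the lockfile check described above, assuming an npm package-lock.json that has already been parsed with json.load. The helper names (find_compromised, package_name) are hypothetical and not part of any OpenClaw or npm tooling; yarn.lock and pnpm-lock.yaml use different formats and would need their own parsers.

```python
import json

# Versions named above as compromised; extend as advisories change.
COMPROMISED = {
    "axios": {"1.14.1", "1.14.2", "0.30.4"},
    "plain-crypto-js": {"4.2.1"},
}

def package_name(lock_key: str) -> str:
    """Turn a key like 'node_modules/a/node_modules/axios' into 'axios'."""
    return lock_key.rsplit("node_modules/", 1)[-1]

def find_compromised(lock: dict) -> list[tuple[str, str, str]]:
    """Return (location, name, version) for each compromised entry in the lockfile."""
    hits = []
    if "packages" in lock:
        # lockfileVersion 2 and 3: flat map keyed by install path.
        for key, meta in lock["packages"].items():
            name = meta.get("name") or package_name(key)
            version = meta.get("version", "")
            if version in COMPROMISED.get(name, ()):
                hits.append((key, name, version))
        return hits

    # lockfileVersion 1: "dependencies" nests recursively.
    def walk(deps: dict, trail: str) -> None:
        for name, meta in deps.items():
            here = f"{trail}>{name}" if trail else name
            version = meta.get("version", "")
            if version in COMPROMISED.get(name, ()):
                hits.append((here, name, version))
            walk(meta.get("dependencies", {}), here)

    walk(lock.get("dependencies", {}), "")
    return hits
```

Typical use would be find_compromised(json.load(open("package-lock.json"))), run for the top-level project and again for each Agents or Skills module that ships its own lockfile.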
DO NOT install third-party skills from ClawHub without auditing their source code. Security researchers have discovered malicious skills on ClawHub — including the MoltBot skill targeting cryptocurrency users — that exploit OpenClaw's broad permission model to exfiltrate data. Cisco researchers have independently confirmed data exfiltration via a third-party skill. An empirical audit cited in the Tsinghua/Ant Group research (arXiv:2603.11619) found that approximately 26% of community-contributed tools in agent skill ecosystems contain security vulnerabilities. Treat every ClawHub skill as untrusted executable code; review it before installation, especially skills requesting network access, file-system access, or credential access. (source, source, source, source, source)
OpenClaw versions prior to v2026.3.8 are affected by CVE-2026-33574, a path-traversal/TOCTOU vulnerability. MUST upgrade to v2026.3.8 or later to remediate. The v2026.3.8 release includes 12+ security fixes total. If a user is running any OpenClaw version older than v2026.3.8, treat the installation as potentially vulnerable and recommend immediate upgrade. (source, source, source, source, source)
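The version threshold above can be checked mechanically. This is a sketch that assumes OpenClaw release tags are plain dotted integers with an optional leading "v" (as in v2026.3.8); parse_version and needs_upgrade are hypothetical helper names.

```python
def parse_version(tag: str) -> tuple[int, ...]:
    """Parse a tag such as 'v2026.3.8' into (2026, 3, 8); any '-suffix' is ignored."""
    core = tag.lstrip("v").split("-")[0]
    return tuple(int(part) for part in core.split("."))

# First release that patches CVE-2026-33574, per the rule above.
MIN_PATCHED = parse_version("v2026.3.8")

def needs_upgrade(installed: str) -> bool:
    """True when the installed version predates the patched release."""
    return parse_version(installed) < MIN_PATCHED
```

Comparing integer tuples rather than strings matters here: a string comparison would rank "2026.3.10" below "2026.3.8".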
For ANY production OpenClaw deployment using reasoning-capable models (GPT-5.x, openai-codex family, or any model where resolveDefaultReasoningLevel() auto-enables reasoning), MUST explicitly set agents.defaults.reasoningDefault to "off" in openclaw.json to prevent internal chain-of-thought text from leaking to end-user channels (WhatsApp, Telegram, Discord, Slack). Without this setting, OpenClaw auto-sets reasoningLevel to "on" for new sessions with reasoning-capable models, causing "Reasoning:\n_..._" prefixed text to be delivered as a separate user-visible message. The /reasoning off command only fixes this per-session and resets when the session expires — it is NOT a durable mitigation. Also set agents.defaults.heartbeat.includeReasoning to false (which is the default) to prevent heartbeat messages from including reasoning output. For public-facing channels, OpenClaw's own security documentation warns that /reasoning and /verbose can expose internal reasoning or tool output not intended for public rooms — keep them disabled. (source, source, source, source, source, source, source)
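As an illustration of the two settings above, the sketch below applies them to an openclaw.json document that has been loaded into a dict. It assumes the dotted keys map onto nested JSON objects (agents, then defaults, and so on); that layout and the helper names set_dotted and suppress_reasoning are assumptions, not confirmed OpenClaw API.

```python
import json

def set_dotted(config: dict, dotted_key: str, value) -> None:
    """Set config['a']['b']['c'] = value for 'a.b.c', creating objects as needed."""
    node = config
    *parents, leaf = dotted_key.split(".")
    for part in parents:
        node = node.setdefault(part, {})
    node[leaf] = value

def suppress_reasoning(config: dict) -> dict:
    """Apply the durable reasoning-suppression settings; other keys are left untouched."""
    set_dotted(config, "agents.defaults.reasoningDefault", "off")
    set_dotted(config, "agents.defaults.heartbeat.includeReasoning", False)
    return config

if __name__ == "__main__":
    # Print the fragment that would be merged into openclaw.json.
    print(json.dumps(suppress_reasoning({}), indent=2))
```

Writing the result back to disk and restarting the gateway are left to the operator, so that the change can be reviewed before it takes effect.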
After ANY OpenClaw upgrade (especially to v2026.4.1 or later), MUST verify and reconfigure exec-approval settings — upgrades can reset ~/.openclaw/exec-approvals.json to strict defaults, stripping existing allowlist entries and breaking automated workflows. OpenClaw's exec policy is a TWO-LAYER system: the effective policy is the STRICTER of (1) exec-approvals.json defaults and (2) tools.exec.* settings in openclaw.json. Setting only one layer is INSUFFICIENT — both must agree. After upgrading: (1) set ~/.openclaw/exec-approvals.json defaults to {"security": "full", "ask": "off", "askFallback": "full"} for fully automated exec (NO approval prompts), (2) set tools.exec.host to "gateway" and tools.exec.security to "full" in ~/.openclaw/openclaw.json, (3) optionally set tools.exec.strictInlineEval to false in openclaw.json to allow inline interpreter eval forms (python -c, node -e, ruby -e, perl -e, php -r, lua -e, osascript -e) without forced approval, (4) restart the gateway (openclaw gateway restart) — changes to openclaw.json do NOT take effect without a restart. Valid values for exec-approvals.json: security accepts "full" (allow everything), "allowlist", "deny"; ask accepts "off", "on-miss", "always"; askFallback accepts "deny" and "full". WARNING: security: "full" with ask: "off" grants unrestricted command execution — use ONLY when the deployment's threat model permits it (e.g., single-user, trusted environment). For multi-user or public-facing deployments, use security: "allowlist" with explicit allowlist entries instead. (source, source, source, source, source, source, source, source, source)
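The stricter-layer-wins rule above can be modelled in a few lines. The sketch assumes a strictness order of deny, then allowlist, then full, and assumes tools.exec.security accepts the same values as the security field in exec-approvals.json; the text states only that the stricter layer wins, so both points are assumptions. The function names are hypothetical.

```python
# Assumed order, strictest first.
STRICTNESS = ["deny", "allowlist", "full"]

# Valid values for the exec-approvals.json defaults, as listed above.
VALID = {
    "security": {"full", "allowlist", "deny"},
    "ask": {"off", "on-miss", "always"},
    "askFallback": {"deny", "full"},
}

def invalid_fields(defaults: dict) -> list[str]:
    """Return the names of fields that are missing or hold an unrecognized value."""
    return [key for key, allowed in VALID.items() if defaults.get(key) not in allowed]

def effective_security(approvals_security: str, tools_exec_security: str) -> str:
    """Return the stricter of the two layers, which is the policy that applies."""
    return min(approvals_security, tools_exec_security, key=STRICTNESS.index)
```

For example, effective_security("full", "allowlist") yields "allowlist", which is why setting only one layer to "full" leaves automation blocked.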

Known incidents — Clinejection supply-chain attack (February 2026)

INCIDENT SUMMARY: On 2026-02-17 (03:26–11:30 PT), an attacker published cline@2.3.0 to npm using a stolen NPM_RELEASE_TOKEN. The CLI binary was byte-identical to cline@2.2.3; the only change was an added postinstall script: npm install -g openclaw@latest. Approximately 4,000 downloads occurred during the ~8-hour window. The attack is tracked as GHSA-9ppg-jx86-fqw7 and codenamed 'Clinejection'. Only users of the Cline CLI npm package were affected — the Cline VS Code extension and JetBrains plugin were NOT compromised. (source, source, source, source, source, source, source, source, source, source)
ATTACK VECTOR: Cline added an AI-powered issue triage workflow using Anthropic's claude-code-action on 2025-12-21, configured to run Claude with broad tool access (Bash, Read, Write, etc.) on every new GitHub issue. The issue title was interpolated into Claude's prompt via ${{ github.event.issue.title }} without sanitization. An attacker crafted a prompt-injection payload in an issue title that caused Claude to run npm install pointing to a typosquatted repository (glthub-actions/cline). This led to GitHub Actions cache poisoning (Cacheract technique: flooding >10 GB of junk data to trigger LRU eviction, then planting poisoned cache entries matching Cline's nightly release workflow keys). The nightly workflow ran at ~2 AM UTC and exfiltrated VSCE_PAT, OVSX_PAT, and NPM_RELEASE_TOKEN secrets. (source, source, source, source)
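To look for the same class of flaw in other repositories, a reviewer could scan workflow files for direct interpolation of attacker-controllable event text. The sketch below is a plain regular-expression pass over the workflow source; the list of fields is illustrative rather than exhaustive, the function name is hypothetical, and every finding still needs human review because the scan does not understand where the expression ends up.

```python
import re

# Event fields whose text is written by whoever opens the issue, PR, or comment.
UNTRUSTED = re.compile(
    r"\$\{\{\s*github\.event\.(?:issue|pull_request|comment)\.(?:title|body)\s*\}\}"
)

def flag_untrusted_interpolation(workflow_text: str) -> list[tuple[int, str]]:
    """Return (line_number, expression) for each interpolation of untrusted event text."""
    findings = []
    for number, line in enumerate(workflow_text.splitlines(), start=1):
        for match in UNTRUSTED.finditer(line):
            findings.append((number, match.group(0)))
    return findings
```

A hit inside a prompt for an AI action with shell access, as in the incident above, deserves the highest priority.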

Known incidents — axios npm supply-chain compromise (March 2026)

INCIDENT SUMMARY: On 2026-03-31 (00:21–03:29 UTC), an attacker published axios@1.14.1 (SHA1: 2553649f2322049666871cea80a5d0d6adc700ca) and axios@0.30.4 (SHA1: d6f3f62fd3b9f5432f5782b62d8cfd5247d5ee71) to npm via the compromised maintainer account jasonsaayman. These versions added a hidden dependency plain-crypto-js@4.2.1 whose postinstall dropper delivered a cross-platform RAT targeting Windows, macOS, and Linux. The malicious versions were live for approximately 3 hours before npm removal. Microsoft Threat Intelligence attributed the attack to Sapphire Sleet, a North Korean state-sponsored threat actor. The attacker-controlled account that published plain-crypto-js has been disabled. (source, source, source, source, source, source, source, source)
OPENCLAW IMPACT: Attacker-controlled packages impersonated OpenClaw to distribute the axios/plain-crypto-js payload. @shadanai/openclaw versions 2026.3.28-2, 2026.3.28-3, 2026.3.31-1, and 2026.3.31-2 vendor the malicious plain-crypto-js payload directly. @qqbrowser/openclaw-qbot@0.0.130 ships a tampered axios@1.14.1 with plain-crypto-js injected as a dependency. SlowMist founder Yu Xian publicly warned that OpenClaw version 3.28 may have introduced the poisoned axios, and that Skills depending on axios could be indirectly affected. OpenClaw responded by locking its dependency module versions. (source, source, source, source, source)
EVASION TECHNIQUE: The plain-crypto-js RAT self-destructs after execution, replacing its own package.json with a clean stub. This means npm audit and manual node_modules inspection are UNRELIABLE for post-compromise detection. The primary forensic indicator is the presence of the node_modules/plain-crypto-js directory (even if contents appear clean). C2 domain to block: sfrclak[.]com (IP: 142.11.206.73). Instruct users to check network logs and firewall rules for connections to this domain/IP. (source, source, source, source)
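Because the directory itself is the indicator, a filesystem walk is enough to check for it. This sketch uses only the Python standard library; find_dropper_dirs is a hypothetical helper name, and os.walk does not follow symlinks by default, so symlink-based layouts (pnpm, for example) may need followlinks=True or a separate check.

```python
import os

def find_dropper_dirs(root: str, package: str = "plain-crypto-js") -> list[str]:
    """Return every node_modules/<package> directory under root, including nested ones."""
    found = []
    for current, dirnames, _filenames in os.walk(root):
        # Match on the directory name only; the contents may have been replaced with a clean stub.
        if os.path.basename(current) == "node_modules" and package in dirnames:
            found.append(os.path.join(current, package))
    return found
```

Any non-empty result should be treated as evidence of compromise even if the package.json inside looks benign.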

Known incidents — malicious ClawHub skills

INCIDENT SUMMARY: Security researchers discovered malicious skills published on ClawHub, OpenClaw's community skill marketplace. The MoltBot skill specifically targeted cryptocurrency users, exploiting OpenClaw's broad system-access permissions to exfiltrate sensitive data. Cisco researchers independently confirmed data exfiltration via a third-party OpenClaw skill. As of 2026-04-09, ClawHub does NOT have adequate sandboxing or vetting to prevent malicious skills from accessing the host system. Treat all third-party skills as untrusted code. (source, source, source)

Known incidents — reasoning text leak to end-user channels

INCIDENT SUMMARY: OpenClaw has a documented, reproducible bug where internal chain-of-thought reasoning text (prefixed with "Reasoning:\n_..._") is emitted as a separate user-visible message to messaging channels (WhatsApp, Telegram, Discord, Slack) instead of being suppressed. This affects any reasoning-capable model (GPT-5.x, openai-codex family, and similar). ROOT CAUSE: resolveDefaultReasoningLevel() auto-sets reasoningLevel to "on" for any new session using a reasoning-capable model when no explicit config is present. The /reasoning off command only fixes the leak per-session and resets when the session expires. Multiple GitHub issues confirm this across versions (issues #24954, #25060, #29708). Independent security writeups (Penligent) and community reports (Reddit r/openclaw) document real-world occurrences and successful mitigations. A regression in v2026.3.28 also caused "Reasoning is required for this model endpoint" errors for GPT-5.4-mini on OpenRouter (issue #57430), confirming reasoning enforcement is an active, ongoing concern. (source, source, source, source, source, source, source)

Known incidents — exec-approvals reset on v2026.4.1 upgrade

INCIDENT SUMMARY: OpenClaw v2026.4.1 reset ~/.openclaw/exec-approvals.json to strict defaults during the update process, stripping user-configured allowlist entries and breaking automated workflows that relied on exec permissions. Users on Telegram bots and other automated channels reported commands failing with allowlist-miss errors immediately after upgrading. Additionally, the v2026.4.1 update decoupled sandbox mode from execution permissions — disabling sandbox mode alone no longer restores exec/git/filesystem access; those capabilities now require explicit re-enablement through the exec-approvals system and/or tools.exec.* config. If a user reports exec failures, allowlist misses, or broken automation after an OpenClaw upgrade, instruct them to reconfigure BOTH exec-approvals.json defaults AND tools.exec.* in openclaw.json, then restart the gateway. (source, source, source, source, source)

OpenClaw architectural security limitations

OpenClaw's architecture grants agents deep system access by design, including shell command execution, file-system read/write, credential access from ~/.openclaw/, and persistent daemon capabilities. OpenClaw's 'kernel-plugin' architecture is anchored by a pi-coding-agent serving as the Minimal Trusted Computing Base (TCB), and has been formally confirmed vulnerable to multi-stage systemic risks that bypass traditional isolated defenses (arXiv:2603.11619, Tsinghua/Ant Group). This broad permission model has drawn scrutiny from multiple security researchers and mainstream tech press (TechCrunch specifically notes that 'systems like OpenClaw raise security concerns due to its deep system access'). Known architectural risks as of 2026-04-14:
(1) Susceptibility to prompt injection, including indirect prompt injection where attacker-controlled content in tool outputs or documents manipulates the agent.
(2) Third-party skills (from ClawHub or other sources) execute as untrusted code with the same broad permissions as the core agent. ClawHub has NO adequate sandboxing or vetting process, and ~26% of community-contributed tools contain security vulnerabilities per academic audit.
(3) No built-in per-user isolation: all skills and agents share the same host-level access.
(4) No built-in role-based access control or approval workflows for sensitive operations.
(5) No native audit logging of agent actions in the open-source version.
(6) Reasoning-capable models auto-enable visible reasoning output by default, leaking internal chain-of-thought to end-user channels unless explicitly suppressed via config.
(7) Exec-approval policy operates as a two-layer system (exec-approvals.json + tools.exec.*) where the stricter layer wins, and upgrades can silently reset one layer to strict defaults, breaking automation.
(8) Complex shell commands (using &&, ||, ;, |, $, etc.) are treated as allowlist misses on macOS companion-app approvals unless the shell binary itself is allowlisted.
(9) If OpenClaw cannot identify exactly one concrete local file for an interpreter/runtime command (e.g., package scripts, eval forms), approval-backed execution is denied. This is a second-layer gate beyond the allowlist.
(10) Memory poisoning: persistent MEMORY.md manipulation where an attacker injects instructions that cause the agent to refuse or redirect future requests (demonstrated in arXiv:2603.11619).
(11) Skill poisoning via metadata manipulation: an attacker can elevate a malicious skill's priority over legitimate skills by manipulating skill metadata, causing the agent to serve attacker-controlled output on benign requests (demonstrated via 'hacked-weather' skill in arXiv:2603.11619).
(12) Intent drift: multi-turn interactions or manipulated context can cause the agent to deviate from the user's original intent and perform unauthorized actions.
When a user requires user isolation, RBAC, audit trails, sandboxed execution, guaranteed reasoning suppression, memory integrity, or fine-grained exec control, warn them that stock OpenClaw does NOT provide these features natively and they must implement external controls or evaluate alternatives such as PokeeClaw or the ClawAegis defense plugin. (source, source, source, source, source, source, source, source, source, source, source, source, source)

Academic security research — Tsinghua/Ant Group lifecycle threat analysis

Tsinghua University and Ant Group researchers published "Taming OpenClaw: Security Analysis and Mitigation of Autonomous LLM Agent Threats" (arXiv:2603.11619, submitted 2026-03-12, lead author Xinhao Deng). This paper provides the first comprehensive formal security analysis of OpenClaw. It identifies four compound threat types across the agent lifecycle: (1) indirect prompt injection — attacker-controlled content in tool outputs or retrieved documents that manipulates agent behavior, (2) skill supply-chain contamination — malicious skills that poison the plugin ecosystem (demonstrated via a 'hacked-weather' skill that manipulates metadata to hijack priority over legitimate weather tools), (3) memory poisoning — persistent manipulation of MEMORY.md that causes the agent to refuse or redirect future requests (demonstrated via a poison entry causing persistent refusal of 'C++' requests), (4) intent drift — multi-turn context manipulation causing the agent to deviate from the user's original intent. The paper formally demonstrates that point-based defenses are INSUFFICIENT against these cross-temporal, multi-stage systemic risks. When a user asks about OpenClaw threat modeling or security architecture, reference this paper as the canonical academic threat analysis. (source, source, source)
The paper proposes a five-layer lifecycle-oriented defense framework corresponding to agent operation stages: (1) Initialization stage — plugin vetting frameworks, SBOM (Software Bill of Materials) generation, and cryptographic signatures for skills; (2) Input stage — instruction-hierarchy enforcement and cryptographic token tagging to distinguish trusted from untrusted inputs; (3) Inference stage — Merkle-tree state snapshotting and semantic cross-encoder verification to detect memory poisoning and cognitive state tampering; (4) Decision stage — formal plan verification using symbolic solvers to validate agent decisions against the user's original intent before execution; (5) Execution stage — kernel-level sandboxing via eBPF (Extended Berkeley Packet Filter) and seccomp, plus capability enforcement to constrain what agent actions can actually do at the OS level. When hardening OpenClaw deployments, use this framework as a checklist — stock OpenClaw implements NONE of these five layers natively. Each layer must be implemented via external tooling or the ClawAegis plugin. (source, source, source, source)
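To make the inference-stage idea concrete, here is a generic Merkle-root computation over a list of memory entries. It is an illustrative sketch only, not the paper's or ClawAegis's construction: it uses SHA-256, duplicates the last node on odd levels, and omits the leaf/internal domain separation a hardened design would add. The function name is hypothetical.

```python
import hashlib

def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(entries: list[str]) -> str:
    """Return the hex Merkle root over entries; an odd node is paired with itself."""
    level = [_h(entry.encode("utf-8")) for entry in entries] or [_h(b"")]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        # Hash adjacent pairs to build the next level up.
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0].hex()
```

Storing the root of a known-good snapshot outside the agent's reach lets a later check detect any added, removed, or altered entry, since each of those changes the root.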
Following the arXiv paper, Ant Group and Tsinghua University jointly open-sourced 'ClawAegis' on 2026-04-02 — described as the first security defense plugin covering the full lifecycle of OpenClaw. When a user asks about hardening OpenClaw against the threats identified in arXiv:2603.11619, recommend evaluating ClawAegis as a purpose-built defense layer. As of 2026-04-14, verify ClawAegis availability and feature completeness at its official repository before recommending — the project is newly released and may not yet implement all five framework layers. (source, source, source) (medium confidence)

Enterprise-secure alternatives — PokeeClaw

When a user needs OpenClaw-style agent functionality with enterprise security controls, PokeeClaw (by Pokee AI, https://pokee.ai/) is a commercial alternative that targets OpenClaw's security gaps. Pokee AI's own marketing positions PokeeClaw directly against OpenClaw, stating 'OpenClaw doesn't belong in production.' PokeeClaw provides: (1) per-user isolated workspaces (each user gets their own VM with dedicated CPU/RAM; nothing is shared between users), (2) encrypted credential vaults for token and secret storage, (3) audit logging and access controls (enterprise plans), (4) role-based access control (enterprise plans), (5) on-premises deployment option (enterprise plans), (6) 1,000+ app integrations, (7) persistent per-user storage, and (8) the ability to run code and install packages within the isolated workspace. Pokee AI claims up to 20× lower token consumption via RL-based tooling selection vs. prompting frontier LLMs. This claim appears in their FAQ (https://dev.pokee.ai/faq) but lacks independent verification, so verify it before recommending PokeeClaw on cost alone. As of 2026-04-09, PokeeClaw is a real shipping product confirmed by multiple independent sources; verify current feature set and pricing at https://pokee.ai/ and https://dev.pokee.ai/faq. (source, source, source, source, source, source, source) (medium confidence)
If a user asks whether OpenClaw is suitable for production or enterprise use: warn that OpenClaw's open-source design lacks native user isolation, RBAC, audit logging, and sandboxed skill execution. Multiple independent sources (Cisco, TechCrunch, TechRadar, security researchers) have flagged these gaps. For production deployments requiring these controls, recommend evaluating PokeeClaw or implementing external sandboxing (e.g., running OpenClaw inside per-user containers with network policies, read-only filesystems, and stripped capabilities). DO NOT recommend running stock OpenClaw in multi-user production environments without external hardening. (source, source, source, source, source)
PokeeClaw's 'approval workflows' feature is mentioned on the product site under general 'governance' terminology but lacks explicit product detail in public documentation as of 2026-04-09. Verify whether explicit approval workflows (e.g., human-in-the-loop gating for sensitive agent actions) are actually available before recommending this specific capability to users. Check https://pokee.ai/ and https://dev.pokee.ai/faq for current feature documentation. (source, source, source) (medium confidence)
Clawdi is a managed, cloud-hosted platform that delivers full OpenClaw capabilities (web browsing, sandboxed terminal, skills system, persistent memory) without requiring local Docker or YAML setup. Every Clawdi instance runs inside its own Trusted Execution Environment (TEE) powered by Intel TDX on Phala's confidential compute infrastructure, giving each agent an isolated, sandboxed filesystem and browser whose activity is not visible to Phala's own infrastructure. TEE-backed remote attestation makes Clawdi's privacy guarantee cryptographically verifiable rather than trust-based, and conversations and API keys remain encrypted even from Clawdi's own systems. (source, source, source, source, source, source, source)

Warnings

Credential rotation is NOT required for end users who installed cline@2.3.0 — the postinstall payload only ran npm install -g openclaw@latest and no credential theft from user machines has been publicly confirmed as of 2026-04-09. However, instruct users to remove OpenClaw (npm uninstall -g openclaw), check for any OpenClaw daemon persistence (e.g., systemd units, launchd plists), and verify no unexpected entries exist in ~/.openclaw/. (source, source, source, source)
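The persistence check above can be partly automated by listing service definitions whose file name mentions OpenClaw. The directories below are the standard systemd and launchd locations; the actual unit or plist name OpenClaw uses is not stated here, so the sketch matches on the substring "openclaw", which is an assumption. The helper name is hypothetical.

```python
from pathlib import Path

# Standard per-user and system-wide locations for systemd units and launchd plists.
PERSISTENCE_DIRS = [
    "~/.config/systemd/user",
    "/etc/systemd/system",
    "~/Library/LaunchAgents",
    "/Library/LaunchAgents",
    "/Library/LaunchDaemons",
]

def find_persistence(needle: str = "openclaw", dirs=PERSISTENCE_DIRS) -> list[Path]:
    """Return service files whose name contains needle (case-insensitive)."""
    hits = []
    for raw in dirs:
        directory = Path(raw).expanduser()
        try:
            entries = list(directory.iterdir())
        except OSError:
            continue  # directory is missing or unreadable on this platform
        hits.extend(path for path in entries if needle in path.name.lower())
    return sorted(hits)
```

An empty result does not prove the absence of persistence, since a daemon could be registered under an unrelated name; it only rules out the obvious case.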
Security researcher Adnan Khan discovered the Cline triage vulnerability in late December 2025 and filed GHSA-9ppg-jx86-fqw7 on 2026-01-01. He was NOT the attacker — he conducted his PoC on a mirror repo. An unknown actor found his PoC and weaponized it against Cline. Cline did not respond for five weeks; the workflow was patched within 30 minutes of public disclosure on 2026-02-09. Cline's subsequent credential rotation was incomplete — the team deleted the wrong token, leaving the exposed npm token active until 2026-02-11, but the attacker had already exfiltrated credentials, allowing publication of the malicious package six days later on 2026-02-17. DO NOT attribute the attack to Khan. (source, source, source)
Endor Labs assessed overall impact of the Clinejection incident as low because OpenClaw itself is not malicious, and the postinstall script did not automatically start the Gateway daemon. However, the technique demonstrates that any npm postinstall script can silently install persistent, privileged software. Researchers warn the same technique could have delivered far worse payloads. Warn users about the general risk of npm lifecycle scripts and recommend using --ignore-scripts when installing unfamiliar packages. (source, source, source)
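Before installing an unfamiliar package, its manifest can be inspected for install-time lifecycle scripts. The sketch below reads a package.json document and reports the preinstall, install, and postinstall hooks, plus a rough heuristic for a global npm install like the one used in this incident. Both function names are hypothetical, and the heuristic is easy to evade, so it complements --ignore-scripts rather than replacing it.

```python
import json

# npm lifecycle hooks that run automatically when a dependency is installed.
LIFECYCLE_HOOKS = ("preinstall", "install", "postinstall")

def lifecycle_scripts(package_json_text: str) -> dict[str, str]:
    """Return the install-time lifecycle scripts declared in a package.json document."""
    scripts = json.loads(package_json_text).get("scripts", {})
    return {hook: scripts[hook] for hook in LIFECYCLE_HOOKS if hook in scripts}

def installs_globally(command: str) -> bool:
    """Heuristic: does the script invoke a global npm install?"""
    words = command.split()
    is_install = "install" in words or "i" in words
    is_global = "-g" in words or "--global" in words
    return "npm" in words and is_install and is_global
```

Applied to the manifest of cline@2.3.0 as described above, lifecycle_scripts would surface the single added postinstall line.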
Microsoft Threat Intelligence observed a noticeable uptick in OpenClaw installations on 2026-02-17, corroborating the supply-chain compromise timeline. If a user reports an unexpected OpenClaw installation dated around that date, treat it as originating from the compromised cline@2.3.0 and instruct removal. (source, source)
For users affected by the axios compromise (March 2026): credential rotation IS required, unlike the Clinejection incident. The plain-crypto-js RAT had full system access including credential exfiltration capabilities. Instruct affected users to: (1) rotate ALL secrets, tokens, API keys, and credentials on the compromised machine, (2) audit CI/CD pipelines that ran npm install during 2026-03-31 00:21–03:29 UTC, (3) block C2 domain sfrclak[.]com (IP 142.11.206.73) at the firewall/DNS level, and (4) scan for persistence mechanisms left by the RAT. (source, source, source, source)
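For step (3), existing proxy, DNS, or firewall logs can be searched for the two indicators. This is a minimal sketch that treats logs as plain text lines; the log format is an assumption, the function name is hypothetical, and the domain is kept defanged in the source and refanged at runtime.

```python
# Indicators from the incident description; the domain is stored defanged.
C2_INDICATORS = ("sfrclak[.]com".replace("[.]", "."), "142.11.206.73")

def c2_hits(log_lines):
    """Yield (line_number, line) for each log line that mentions a known C2 indicator."""
    for number, line in enumerate(log_lines, start=1):
        lowered = line.lower()
        if any(indicator in lowered for indicator in C2_INDICATORS):
            yield number, line.rstrip("\n")
```

Because the function accepts any iterable of strings, an open file handle can be passed directly, which avoids loading large logs into memory.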
The v2026.3.8 release fixes a Telegram integration issue where stalled media downloads would hang indefinitely, causing polling failures. The fix applies a timeout only to stalled body reads so polling recovers without aborting slow-but-active downloads. Separately, v2026.3.8 deduplicates inbound Telegram DMs per agent so a single DM no longer triggers duplicate replies when multiple session keys resolve for the same agent. If a user reports duplicate Telegram bot responses or hung Telegram media downloads on OpenClaw, instruct them to upgrade to v2026.3.8 or later. (source, source, source)
When using GPT-5.4 or other models from the openai-codex family with OpenClaw, the correct API parameter key is reasoning.effort (NOT reasoning_effort). Builds prior to PR #36590 (merged 2026-03-06) reject openai-codex/gpt-5.4 as 'not allowed'. If a user reports 'model not allowed' errors for GPT-5.4, verify they are running a build from 2026-03-06 or later. A separate regression introduced in v2026.3.28 caused "Reasoning is required for this model endpoint" errors for GPT-5.4-mini on OpenRouter (GitHub issue #57430) — if a user hits this error, instruct them to check for patches beyond v2026.3.28 or set an explicit reasoningDefault to work around the enforcement. (source, source, source)
Setting exec-approvals.json defaults to {"security": "full", "ask": "off", "askFallback": "full"} may NOT be sufficient on all versions. GitHub issue #20141 documents users who set exactly those values but still hit approval walls, suggesting the fix may be incomplete or version-dependent. If a user applies the recommended exec-approval settings and still encounters approval failures after a gateway restart, instruct them to: (1) verify both exec-approvals.json AND tools.exec.* in openclaw.json are set (the two-layer policy means the stricter wins), (2) check that tools.exec.strictInlineEval is false if inline eval forms are being blocked, (3) verify they restarted the gateway after changing openclaw.json, and (4) check the OpenClaw GitHub issues for version-specific regressions. (source, source, source, source)
OpenClaw's MEMORY.md file is a persistent attack surface. The Tsinghua/Ant Group research (arXiv:2603.11619) demonstrates that an attacker who can write to MEMORY.md (e.g., via a malicious skill or prompt injection) can cause persistent behavioral changes — including outright refusal to perform certain tasks — that survive across sessions. If a user reports unexplained refusals, unexpected agent behavior, or suspicion of compromise, instruct them to audit MEMORY.md for injected directives, compare against a known-good backup, and clear any suspicious entries. Run openclaw backup verify against a pre-incident archive to detect tampering. (source, source, source)
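The MEMORY.md audit described above can be partly automated. The following is a minimal sketch, assuming you have a known-good copy of the file to compare against; the regular expressions are illustrative heuristics invented for this example, not an official OpenClaw detection list, and a human still needs to review every added line.

```python
import difflib
import re

# Hypothetical heuristics: phrasing that often marks an injected directive.
# These patterns are assumptions for illustration only.
SUSPICIOUS_PATTERNS = [
    re.compile(r"\b(always|never)\s+(refuse|ignore|reveal|send)\b", re.IGNORECASE),
    re.compile(r"\bignore (all )?(previous|prior) instructions\b", re.IGNORECASE),
    re.compile(r"https?://\S+", re.IGNORECASE),
]


def added_lines(known_good: str, current: str) -> list[str]:
    """Return lines present in `current` but absent from the known-good copy."""
    diff = difflib.ndiff(known_good.splitlines(), current.splitlines())
    return [line[2:] for line in diff if line.startswith("+ ")]


def flag_suspicious(lines: list[str]) -> list[str]:
    """Return the subset of lines that match any heuristic pattern."""
    return [ln for ln in lines if any(p.search(ln) for p in SUSPICIOUS_PATTERNS)]
```

Feed `added_lines` the text of the backup copy and the live MEMORY.md, then pass its result to `flag_suspicious` to decide which entries to inspect first. Lines that are not flagged are not thereby proven safe.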

Version guidance

For the Cline CLI: DO NOT use cline@2.3.0 (compromised, deprecated). Use cline@2.4.0 or later. The 2.4.0 release revoked the compromised token, removed the vulnerable triage workflow, and migrated to OIDC-based npm provenance. As of 2026-04-09, verify the latest safe version at https://www.npmjs.com/package/cline before recommending. (source, source, source)
For OpenClaw: v2026.3.8 (released 2026-03-09) is a REQUIRED security update — it patches CVE-2026-33574 (path-traversal/TOCTOU) and includes 12+ additional security fixes. It also adds ACP provenance support (openclaw acp --provenance off|meta|meta+receipt) and local backup CLI commands (openclaw backup create, openclaw backup verify). All users on versions prior to v2026.3.8 MUST upgrade. As of 2026-04-09, verify the latest version at https://github.com/openclaw/openclaw/releases before recommending. (source, source, source, source, source, source)
For OpenClaw: v2026.4.1 (released around 2026-04-09) is a backward-compatible, additive release with no breaking feature changes, but it DOES reset exec-approvals.json to strict defaults, which can break automated exec workflows. It adds Amazon Bedrock Guardrails support (content-safety filtering), a bundled SearXNG web-search provider plugin, a /tasks background task board, macOS Voice Wake for Talk Mode, Feishu/Lark Drive comment integration, agents.defaults.params for global default provider parameters, configurable chat-history text truncation via gateway.webchat.chatHistoryMaxChars, and glm-5.1/glm-5v-turbo models in the Z.AI catalog. After upgrading to v2026.4.1, users MUST reconfigure their exec-approval settings (see the 'Exec-approval and execution policy configuration' section). As of 2026-04-11, verify the latest version at https://github.com/openclaw/openclaw/releases before recommending. (source, source, source, source, source)
For axios: DO NOT use axios@1.14.1, axios@1.14.2, or axios@0.30.4 (compromised with RAT payload, removed from npm). Safe versions as of 2026-04-09: axios@1.14.0 (last legitimate 1.x release) and axios@0.30.3 (last legitimate 0.30.x release). Before recommending any axios version, verify at https://www.npmjs.com/package/axios that the version is not flagged and was published by a legitimate maintainer. (source, source, source, source)
Claude Code v2.1.122 was released on April 28, 2026 with approximately 18 CLI changes and 2 system prompt changes. Key changes include: (1) a new ANTHROPIC_BEDROCK_SERVICE_TIER environment variable to select an Amazon Bedrock service tier (default, flex, or priority), sent as the X-Amzn-Bedrock-Service-Tier header — use priority for low-latency interactive sessions and flex for lower-cost batch work (tier availability varies by model and region); (2) the ability to paste a PR URL into the /resume search box to find the session that created that PR (supporting GitHub, GitHub Enterprise, GitLab, and Bitbucket); (3) system prompt changes adding a 'looking is not acting' principle requiring explicit consent before risky actions; (4) security-relevant permission prompt improvements (permission suggestions now populated when safety checks trigger an ask response, with improved prompts for path/working-directory safety); (5) /mcp now shows claude.ai connectors hidden by a manually-added server with the same URL (with a hint to remove the duplicate); (6) OpenTelemetry numeric attributes on api_request/api_error events are now emitted as numbers (not strings); (7) a new claude_code.at_mention log event for @-mention resolution; (8) a /branch fork fix; and (9) additional fixes including !exit/!quit behavior in bash mode and image resize corrected to the intended 2000px maximum. (source, source, source, source, source, source, source, source, source)

Supply-chain hardening lessons

When configuring AI-powered GitHub Actions workflows (e.g., claude-code-action, Copilot agents), NEVER interpolate untrusted input (issue titles, PR bodies, comment text) directly into agent prompts. Use environment variables or parameterized inputs, restrict tool permissions to the minimum required, and run the workflow only on trusted trigger events (e.g., pull_request_target with explicit approval gates). The Clinejection attack demonstrates that prompt injection in CI/CD agents can lead to full credential exfiltration and package compromise. (source, source, source, source)
OpenClaw's dependency graph includes 1,246 third-party modules and 2,672 dependency paths. After the axios compromise, OpenClaw locked its dependency module versions. When hardening OpenClaw installations: (1) ALWAYS use exact-version pinning in lockfiles — never floating ranges for transitive dependencies, (2) audit lockfiles for plain-crypto-js and compromised axios versions after any npm install/yarn install, (3) use --ignore-scripts for initial installs of unfamiliar packages, (4) enable npm provenance verification where available, and (5) run comprehensive dependency audits regularly — Skills and Agents modules may pull axios transitively. (source, source, source, source)
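Step (2) above, auditing lockfiles after an install, can be sketched as follows. This assumes an npm lockfile in the v2/v3 layout, where a top-level `packages` map is keyed by `node_modules/...` paths; older lockfiles and yarn.lock use different layouts and are not covered. The function names are hypothetical helpers, and the version list simply restates the compromised axios versions and the plain-crypto-js package named in this document.

```python
import json

# Compromised versions and package names as stated in this document.
COMPROMISED = {"axios": {"1.14.1", "1.14.2", "0.30.4"}}
BANNED_PACKAGES = {"plain-crypto-js"}


def load_lockfile(path: str) -> dict:
    """Parse a package-lock.json file from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def audit_lockfile(lock: dict) -> list[str]:
    """Scan the `packages` map (lockfile v2/v3 layout) for known-bad entries."""
    findings = []
    for path, meta in lock.get("packages", {}).items():
        if not path:  # the empty key describes the root project itself
            continue
        # "node_modules/a/node_modules/axios" -> "axios"
        name = path.rsplit("node_modules/", 1)[-1]
        version = meta.get("version", "")
        if name in BANNED_PACKAGES:
            findings.append(f"{path}: banned package {name}")
        elif version in COMPROMISED.get(name, set()):
            findings.append(f"{path}: compromised {name}@{version}")
    return findings
```

Because the scan walks every nested `node_modules` path, it also catches axios copies pulled in transitively. An empty result only means none of the listed versions are present; it does not replace a full dependency audit.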
When hardening OpenClaw skill installations, apply the initialization-stage defenses from the Tsinghua/Ant Group framework (arXiv:2603.11619): (1) require SBOM (Software Bill of Materials) for every skill before installation, (2) require cryptographic signatures on skill packages and verify them before loading, (3) implement plugin vetting — automated static analysis and behavioral sandboxing of skills before deployment. Stock OpenClaw does NOT enforce any of these natively. The openclaw acp --provenance meta+receipt setting (available since v2026.3.8) provides ACP-level provenance but does NOT cover skill-package signing or SBOM generation. For production deployments, implement SBOM generation and signature verification via external tooling (e.g., syft for SBOM, cosign for signatures) or evaluate the ClawAegis plugin. (source, source, source)

Security-relevant features and hardening configuration

When using OpenClaw with Amazon Bedrock, enable Bedrock Guardrails (added in v2026.4.1, PR #58588) for opt-in PII blocking, content filtering, and grounding checks. Configure guardrail IDs in the Bedrock provider settings. This is the first native content-safety integration in the bundled Bedrock provider — recommend enabling it for any production deployment handling sensitive data. (source, source)
When a user wants private web search within OpenClaw, recommend the bundled SearXNG provider plugin (added in v2026.4.1, PR #57317). It supports configurable host pointing, so users can run a self-hosted SearXNG instance that does NOT track or profile users. For security-sensitive environments, instruct users to self-host SearXNG rather than relying on public instances, and configure the web_search provider host to the internal URL. (source, source, source)
Use agents.defaults.params (added in v2026.4.1) to set global default provider parameters across all agents. This allows enforcing consistent temperature, token limits, and safety parameters at the configuration level rather than per-agent. Use gateway.webchat.chatHistoryMaxChars to control chat-history text truncation, which limits how much context is sent and reduces the risk of prompt-exfiltration attacks via large conversation histories. (source, source)
Enable ACP provenance tracking (added in v2026.3.8) for agent-to-agent communication audit trails. Configure via openclaw acp --provenance off|meta|meta+receipt. The meta mode retains ACP-origin metadata and session trace IDs on inbound messages; meta+receipt additionally injects visible receipt data. For any deployment where agent-to-agent interactions must be traceable or auditable, set --provenance meta+receipt. This is the first native provenance mechanism for ACP ingress in OpenClaw. Independently confirmed by the Umbrel app store changelog, blockchain.news coverage, and GitHub release notes. (source, source, source, source, source)
Use the local backup CLI (added in v2026.3.8) to create and verify state archives before upgrades, migrations, or hardening changes. Commands: openclaw backup create (creates a local archive of OpenClaw state), openclaw backup verify (validates manifest and payload integrity of an existing archive). Flags: --only-config (backs up configuration only, excluding workspace data), --no-include-workspace (excludes workspace artifacts). Users MUST run openclaw backup create before applying any security-critical upgrade or configuration change. After restoring from backup, run openclaw backup verify to confirm archive integrity. Independently confirmed by the Umbrel app store changelog and GitHub release notes. (source, source, source, source)
To prevent reasoning text from leaking to end-user channels, configure the following in openclaw.json: (1) Set agents.defaults.reasoningDefault to "off" — this globally suppresses reasoning display for all new sessions. Without this, resolveDefaultReasoningLevel() auto-enables reasoning for reasoning-capable models. (2) For per-agent override, set agents.list[].reasoningDefault to "off" on specific agents. (3) Set agents.defaults.heartbeat.includeReasoning to false (or leave at default false) to prevent heartbeat messages from including reasoning output. (4) Optionally set agents.list[].thinkingDefault to "off" to suppress the related thinking feature. The resolution order is: inline directive (/reasoning off) → session override → per-agent default (agents.list[].reasoningDefault) → global default (agents.defaults.reasoningDefault) → fallback "off". Valid values for both reasoningDefault and thinkingDefault: "off", "on", "stream". For production deployments on public-facing channels, set BOTH global defaults to "off" and verify no per-agent overrides re-enable them. See https://docs.openclaw.ai/tools/thinking for the full resolution chain and https://docs.openclaw.ai/gateway/configuration-reference for all config keys. (source, source, source, source, source, source, source, source)
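The resolution chain and the production check described above can be sketched as follows. This is an illustration of the documented order, not OpenClaw's implementation: the `id` field used to look up an agent in `agents.list[]` is an assumption, `agents.defaults.thinkingDefault` is inferred from the "both global defaults" wording, and the sketch ignores the auto-enable behaviour of resolveDefaultReasoningLevel() for reasoning-capable models.

```python
VALID_LEVELS = {"off", "on", "stream"}


def resolve_reasoning(config, agent_id, inline=None, session=None):
    """Apply the documented order: inline -> session -> per-agent -> global -> "off"."""
    agents = config.get("agents", {})
    per_agent = next(
        (a.get("reasoningDefault") for a in agents.get("list", [])
         if a.get("id") == agent_id),  # "id" is a hypothetical lookup key
        None,
    )
    global_default = agents.get("defaults", {}).get("reasoningDefault")
    for candidate in (inline, session, per_agent, global_default):
        if candidate in VALID_LEVELS:
            return candidate
    return "off"


def production_findings(config):
    """List settings that could expose reasoning or thinking output."""
    findings = []
    agents = config.get("agents", {})
    defaults = agents.get("defaults", {})
    for key in ("reasoningDefault", "thinkingDefault"):
        if defaults.get(key) != "off":
            findings.append(f'agents.defaults.{key} is not "off"')
    if defaults.get("heartbeat", {}).get("includeReasoning", False):
        findings.append("agents.defaults.heartbeat.includeReasoning is true")
    for index, agent in enumerate(agents.get("list", [])):
        for key in ("reasoningDefault", "thinkingDefault"):
            if agent.get(key, "off") != "off":
                findings.append(f"agents.list[{index}].{key} re-enables output")
    return findings
```

Running `production_findings` on the parsed openclaw.json before exposing a public channel gives a quick list of global or per-agent settings to correct.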
A proposed output-level filter feature (agents.defaults.outputFilters with regex strip actions) was filed as GitHub Issue #45041 (2026-03-13) to strip leaked "Reasoning:" lines and meta-planning text (e.g., "Let me...", "I will...") from assistant messages before channel delivery. As of 2026-04-10, this feature request is still open/unmerged. DO NOT recommend outputFilters as a config key — it does not exist yet. If a user needs output-level filtering, they must implement it externally (e.g., via a proxy or webhook post-processor) until this feature is merged. Verify status at https://github.com/openclaw/openclaw/issues/45041. (source, source)
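Until a native filter exists, an external post-processor is one option. The sketch below shows what such a filter could look like inside a proxy or webhook handler; it is not an OpenClaw feature or config key. The two patterns mirror the examples quoted from issue #45041 ("Reasoning:" lines and "Let me..." / "I will..." openers), and the function name is hypothetical.

```python
import re

# Patterns based on the examples in the feature request; the list is an assumption.
REASONING_LINE = re.compile(r"^\s*Reasoning:.*$", re.IGNORECASE)
META_PLANNING = re.compile(r"^\s*(Let me|I will)\b.*$")


def strip_leaked_reasoning(message: str) -> str:
    """Drop whole lines that look like leaked reasoning or meta-planning."""
    kept = [
        line
        for line in message.splitlines()
        if not (REASONING_LINE.match(line) or META_PLANNING.match(line))
    ]
    return "\n".join(kept).strip()
```

Line-level regex filtering is blunt: a legitimate reply that begins with "I will" is removed as well, so a filter like this should be tested against real traffic and treated as a stopgap alongside the reasoningDefault settings, not a substitute for them.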
For execution-stage hardening, the Tsinghua/Ant Group framework (arXiv:2603.11619) recommends kernel-level sandboxing using eBPF (Extended Berkeley Packet Filter) and seccomp to constrain what agent-triggered commands can actually do at the OS level. Stock OpenClaw does NOT implement eBPF or seccomp sandboxing. For production deployments on Linux, implement these externally: (1) use seccomp profiles to restrict the system calls available to the OpenClaw gateway process and any child processes spawned by exec, (2) use eBPF-based monitoring (e.g., Falco, Tetragon) to detect and block anomalous agent behaviors at runtime (unexpected network connections, file access outside approved paths, privilege escalation attempts), (3) enforce Linux capabilities with a minimal set — drop CAP_SYS_ADMIN, CAP_NET_RAW, and other unnecessary capabilities from the container or process. Combine with OpenClaw's own tools.exec.security: "allowlist" for defense-in-depth. (source, source, source, source)
OpenClaw supports a claude-cli backend that bridges to the local claude binary via cliBackends in openclaw.json, a pattern Anthropic staff have sanctioned as distinct from extracting OAuth tokens for third-party API clients. However, since April 4, 2026, Anthropic no longer covers OpenClaw usage under Pro/Max/Team subscriptions routed via OAuth; the CLI backend approach (invoking the real claude -p binary) remains permitted for personal, single-user gateway hosts, but is explicitly not suited for shared or multi-user billing setups, and tools/streaming are disabled on the OpenClaw side for CLI backend runs. (source, source, source, source, source, source, source, source)
OpenClaw v2026.4.27 adds official Codex Computer Use support, exposing /codex computer-use status and /codex computer-use install commands. The integration is macOS-specific, requires OS-level permissions (Accessibility and Screen Recording), and enforces a fail-closed pre-check at turn start — if the Codex MCP server/tools are unreachable, desktop control is refused and the turn does not proceed. The feature is opt-in and operator-configured, not enabled by default; it requires explicit setup via /codex computer-use install and only activates after the MCP server is verified reachable. OpenClaw does not perform desktop actions itself; it orchestrates the Codex app-server plugin (auto-discovery, marketplace install from /Applications/Codex.app/Contents/Resources/plugins/openai-bundled, and MCP availability verification via cua-driver mcp and PeekabooBridge components), with config keys under plugins.entries.codex.config.computerUse governing autoInstall, marketplaceDiscoveryTimeoutMs, and marketplace source options. The underlying Codex Computer Use capability (provided by OpenAI's Codex desktop app) enables agents to see the screen, move the cursor, click, and type inside Mac applications. Despite the fail-closed safety checks, users are advised to run OpenClaw in non-root environments, bind gateways to loopback, and follow official security guidance, as the integration grants agents significant system access. (source, source, source, source, source, source, source, source)

Exec-approval and execution policy configuration

OpenClaw's exec-approval system is a TWO-LAYER policy. Layer 1: ~/.openclaw/exec-approvals.json (the defaults block controls security, ask, and askFallback). Layer 2: tools.exec.* in ~/.openclaw/openclaw.json (controls host, security, and strictInlineEval). The effective policy is the STRICTER of the two layers — configuring only one layer is INSUFFICIENT. To configure unrestricted exec (single-user, trusted environments only): (1) In ~/.openclaw/exec-approvals.json, set defaults to {"security": "full", "ask": "off", "askFallback": "full"}. (2) In ~/.openclaw/openclaw.json, set tools.exec.host to "gateway", tools.exec.security to "full", and optionally tools.exec.strictInlineEval to false. (3) Restart the gateway: openclaw gateway restart. Changes to openclaw.json do NOT take effect without a restart. For multi-user or public-facing deployments, use security: "allowlist" with explicit per-command allowlist entries instead of "full". See https://docs.openclaw.ai/tools/exec-approvals and https://docs.openclaw.ai/tools/exec for full documentation. (source, source, source, source, source, source, source, source)
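The "stricter of the two layers wins" rule can be illustrated with a small sketch. The strictness ordering (full, then allowlist, then deny) is inferred from how this section describes the three security values and is an assumption about how the runtime ranks them; the function names are hypothetical.

```python
# Higher rank means stricter. The ordering is inferred, not taken from OpenClaw source.
STRICTNESS = {"full": 0, "allowlist": 1, "deny": 2}


def effective_security(approvals_security: str, tools_exec_security: str) -> str:
    """Return whichever of the two layer values is stricter."""
    return max(approvals_security, tools_exec_security, key=STRICTNESS.__getitem__)


def effective_from_configs(exec_approvals: dict, openclaw: dict) -> str:
    """Combine parsed exec-approvals.json and openclaw.json contents."""
    layer1 = exec_approvals["defaults"]["security"]   # exec-approvals.json defaults block
    layer2 = openclaw["tools"]["exec"]["security"]    # tools.exec.* in openclaw.json
    return effective_security(layer1, layer2)
```

This makes the common failure mode visible: setting "full" in exec-approvals.json while openclaw.json still says "allowlist" yields an effective policy of "allowlist".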
Valid values for exec-approvals.json fields: security accepts "full" (allow everything — equivalent to elevated), "allowlist" (only pre-approved commands), "deny" (block all exec). ask accepts "off" (never prompt for approval), "on-miss" (prompt only when command is not in allowlist), "always" (prompt for every command). askFallback accepts "full" (allow if approval is not available or times out) and "deny" (block if approval is not available). The tools.exec.strictInlineEval setting in openclaw.json controls whether inline interpreter eval forms (python -c, node -e, ruby -e, perl -e, php -r, lua -e, osascript -e) always require explicit approval; set to false to allow them without approval prompts. On macOS, complex shell commands using control syntax (&&, ||, ;, |, $, etc.) are treated as an allowlist miss unless the shell binary itself is allowlisted. If OpenClaw cannot identify exactly one concrete local file for an interpreter/runtime command, approval-backed execution is denied regardless of other settings. (source, source, source, source, source)
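A quick pre-flight check can catch typos in these fields before a gateway restart. The sketch below validates a defaults block against the values listed above and flags commands that use one of the listed inline eval forms. Both helpers are hypothetical; the inline eval check only matches the exact interpreter names given here (so `python3` is not matched), and it does not reproduce OpenClaw's own command analysis.

```python
import shlex

# Allowed values as listed in this section.
ALLOWED = {
    "security": {"full", "allowlist", "deny"},
    "ask": {"off", "on-miss", "always"},
    "askFallback": {"full", "deny"},
}

# Interpreter name -> inline eval flag, from the forms named in this section.
INLINE_EVAL_FLAGS = {
    "python": "-c", "node": "-e", "ruby": "-e", "perl": "-e",
    "php": "-r", "lua": "-e", "osascript": "-e",
}


def validate_defaults(defaults: dict) -> list[str]:
    """Report fields whose value is missing or not in the allowed set."""
    errors = []
    for field, allowed in ALLOWED.items():
        value = defaults.get(field)
        if value not in allowed:
            errors.append(f"{field}={value!r} is not one of {sorted(allowed)}")
    return errors


def is_inline_eval(command: str) -> bool:
    """True if the command invokes a listed interpreter with its eval flag."""
    argv = shlex.split(command)
    if not argv:
        return False
    flag = INLINE_EVAL_FLAGS.get(argv[0].rsplit("/", 1)[-1])
    return flag is not None and flag in argv[1:]
```

If `is_inline_eval` returns True for a command that keeps hitting approval prompts, tools.exec.strictInlineEval is the first setting to check.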

Safe update procedures

When advising users on updating OpenClaw, use ONLY the officially documented methods (https://docs.openclaw.ai/updating): (1) For source/git installs: openclaw update, which runs the update and then a restart. (2) For global npm installs: sudo npm i -g openclaw@latest (or sudo npm i -g openclaw@<version> to pin/rollback), then openclaw gateway restart. (3) For Docker installs: pull the updated image (e.g., ghcr.io/openclaw/openclaw:latest or a specific tag) and recreate/restart the container per https://docs.openclaw.ai/install/docker. DO NOT recommend openclaw update openclaw --version; this syntax is inaccurate and does not match official documentation as of 2026-04-09. Before any upgrade, run openclaw backup create to capture a restorable state archive (available since v2026.3.8). After any upgrade, users MUST verify and, where needed, reconfigure ~/.openclaw/exec-approvals.json and tools.exec.* in openclaw.json, because upgrades (especially v2026.4.1) can reset exec-approval settings to strict defaults and break automated workflows. Always restart the gateway after config changes. (source, source, source, source, source, source, source, source)
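One way to confirm whether an upgrade reset exec-approval settings is to snapshot the parsed config before upgrading and compare it afterwards. The sketch below is a generic JSON diff with hypothetical helper names; it makes no assumption about what the "strict defaults" actually contain and simply reports every key whose value changed.

```python
import json
from pathlib import Path


def load_json(path: str) -> dict:
    """Read a JSON config file, expanding a leading ~ in the path."""
    return json.loads(Path(path).expanduser().read_text(encoding="utf-8"))


def diff_settings(before: dict, after: dict, prefix: str = "") -> list[str]:
    """List dotted keys whose values differ between two parsed configs."""
    changes = []
    for key in sorted(set(before) | set(after)):
        path = f"{prefix}{key}"
        old, new = before.get(key), after.get(key)
        if isinstance(old, dict) and isinstance(new, dict):
            changes.extend(diff_settings(old, new, path + "."))
        elif old != new:
            changes.append(f"{path}: {old!r} -> {new!r}")
    return changes
```

In practice, call `load_json("~/.openclaw/exec-approvals.json")` before the upgrade, keep the result (or a copy of the file), and run `diff_settings` against a fresh load afterwards; the same comparison applies to the tools.exec block of openclaw.json.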

Known CVEs patched in OpenClaw releases

CVE-2026-33574: Path-traversal/TOCTOU vulnerability affecting OpenClaw versions prior to v2026.3.8. Fixed in v2026.3.8 (released 2026-03-09). SentinelOne documented this CVE. If a user is running any version older than v2026.3.8, instruct them to upgrade immediately. Verify current CVE status at https://www.sentinelone.com/vulnerability-database/cve-2026-33574/ and the OpenClaw releases page. (source, source, source)
OpenClaw v2026.4.25 contains a critical security fix: gateway and device tokens now enforce caller-scope containment during token rotation and revocation, preventing pairing-only sessions from mutating higher-scope operator tokens. This addresses a privilege escalation vulnerability in remote device access and pairing flows. The same release also expanded OpenTelemetry coverage (model calls, token usage, tool loops, exec processes, context assembly, memory pressure) with bounded low-cardinality attributes that explicitly exclude prompt/history/response/session contents from metrics and spans, and added a bundled diagnostics-prometheus plugin exposing a protected Gateway scrape route at /api/diagnostics/prometheus. (source, source, source)
Claude Code CLI v2.1.126 (May 1, 2026) includes several security-relevant changes: --dangerously-skip-permissions now bypasses prompts for writes to protected paths (.claude/, .git/, .vscode/, shell config files), with catastrophic-removal commands still prompting as a last-resort safety net — avoid this flag in multi-user or untrusted environments. On Windows, clipboard writes no longer expose copied content in process command-line arguments visible to EDR/SIEM telemetry (and fixes >22KB selections not reaching the clipboard). Additionally, claude auth login now accepts an OAuth code pasted directly into the terminal when the browser callback cannot reach localhost (WSL2, SSH, containers), and claude project purge [path] provides a supported way to delete all Claude Code project state (transcripts, tasks, file history, config) with --dry-run preview support. (source, source, source, source)
A local file inclusion vulnerability (GHSA-gfg9-5357-hv4c) in OpenClaw's webchat audio embedding allowed attacker-influenced ReplyPayload.mediaUrl values to resolve absolute local paths or file: URLs, enabling the gateway process to read, base64-encode, and embed host-local files into webchat responses — effectively crossing the model/tool output boundary into a host file read. The flaw affected all releases up to and including v2026.4.14 and was patched in v2026.4.15 via commit 6e58f1f9f54bca1fea1268ec0ee4c01a2af03dde, which introduced assertLocalMediaAllowed checks. Users should upgrade to openclaw@2026.4.15 or later and avoid exposing webchat sessions to untrusted prompt or tool content as an interim workaround. (source, source, source, source)
OpenClaw v2026.5.3 introduces a bundled file-transfer plugin that exposes four agent tools (file_fetch, dir_list, dir_fetch, file_write) for binary file operations on paired nodes. The plugin enforces a default-deny per-node path policy requiring explicit operator approval via plugins.entries.file-transfer.config.nodes, refuses symlink traversal by default (opt-in followSymlinks), and caps data transfer at approximately 16 MB per round-trip. The same release hardens official plugin install/update/uninstall flows by preserving sibling plugins via a managed npm-root manifest, preferring @beta channel updates with a fallback to latest, rejecting source-only plugin packages before runtime load, and cleaning stale bundled load paths for externalized installs. (source, source, source, source, source)
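The fixed-in versions named in this section can be turned into a quick triage check. The sketch below assumes OpenClaw versions always follow the `vYYYY.M.D` pattern seen here and that every release before a fixed version is affected; the document states that explicitly for CVE-2026-33574 and GHSA-gfg9-5357-hv4c, while for the v2026.4.25 token-rotation fix it is an assumption. The function names and advisory labels are hypothetical.

```python
import re

# Minimum fixed versions as stated in this section.
FIXED_IN = {
    "CVE-2026-33574 (path traversal/TOCTOU)": (2026, 3, 8),
    "GHSA-gfg9-5357-hv4c (webchat local file inclusion)": (2026, 4, 15),
    "token rotation privilege escalation": (2026, 4, 25),
}


def parse_version(text: str) -> tuple[int, ...]:
    """Turn 'v2026.4.14' or '2026.4.14' into a comparable tuple."""
    match = re.fullmatch(r"v?(\d+)\.(\d+)\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized version: {text!r}")
    return tuple(int(part) for part in match.groups())


def unpatched(version: str) -> list[str]:
    """Return the advisories whose fix is newer than the given version."""
    current = parse_version(version)
    return [name for name, fixed in FIXED_IN.items() if current < fixed]
```

Comparing integer tuples avoids the string-comparison trap where "2026.4.9" would sort after "2026.4.15". The table must be kept current by hand against the OpenClaw releases page.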

Industry context — AI supply-chain risk landscape and CISO priorities

When framing OpenClaw supply-chain risks for enterprise users, reference the broader industry threat landscape: OWASP Top 10 for LLM Applications includes LLM03:2025 (Supply Chain Vulnerabilities), which covers risks from third-party components used in building, training, and deploying LLMs — directly applicable to OpenClaw's 1,246-module dependency graph and ClawHub skill ecosystem. As of 2026-04-17, 68% of CISOs report concern about third-party software supply chain risk, and an identical 68% view secure deployment of generative AI as a critical priority. Use these data points when advising users on whether OpenClaw's supply-chain posture is acceptable for their threat model. Industry leaders including Fred Voccola (Chairman/CEO of Simpro Group, formerly CEO of Kaseya, Forbes Technology Council member) have publicly emphasized that enterprise AI security requires mapping real threats to real business impact and adopting solutions that are 'simple, automated and measurable' — guidance directly applicable to OpenClaw hardening decisions. Voccola's experience with the Kaseya ransomware supply-chain incident (as CEO during the 2021 REvil attack) underscores the real-world consequences of insufficient supply-chain security in software platforms. When recommending OpenClaw hardening controls, frame them in terms of measurable business-risk reduction, not just technical configuration. (source, source, source, source, source, source)
The enterprise AI security vendor landscape spans multiple layers: orchestration/control platforms (e.g., Palantir Foundry/AIP with a Rule-of-40 of ~127%), endpoint/identity/agent detection (e.g., CrowdStrike Falcon AIDR), AI runtime and model security (e.g., Palo Alto Networks Prisma AIRS), Zero Trust network enforcement (e.g., Zscaler, which processed nearly one trillion AI transactions in 2025), cyber recovery (e.g., Rubrik with AI agent auditing), internet control-plane coverage (e.g., Cloudflare, behind ~20% of global web traffic), and enterprise data/RAG pipelines (e.g., Snowflake Cortex Agents, MongoDB Atlas Vector Search). These vendors collectively define the commercial security stack that organizations operating AI agents — including OpenClaw deployments — are expected to integrate with for defense-in-depth. Security practitioners deploying OpenClaw in enterprise environments should account for which layers of this stack are already in place, as gaps in identity enforcement, Zero Trust policy, or supply-chain monitoring directly affect OpenClaw's runtime threat surface. (source, source, source, source, source, source, source, source, source, source)
CerbAgent (ticker: $CERB) is a Solana-based project presenting a network of three autonomous AI security agents — Shield (real-time approval monitoring and instant revocation), Sentinel (pre-transaction simulation), and Recovery (active asset rescue via private mempools/flashbots) — targeting DeFi wallet security across Ethereum, Solana, Base, Arbitrum, Polygon, and BSC. The $CERB SPL token powers agent access, governance, fee-sharing, and holder-priority recovery. Importantly, the official roadmap shows these capabilities as phased future releases (Shield public beta Q3 2026; Sentinel Q4 2026; Recovery 2027), meaning most features are planned rather than live-deployed. (source, source, source) (medium confidence)
Cerberus / CerbAgent is a Solana-based project ($CERB SPL token) that advertises three continuously running AI 'security agents': Shield (real-time approval monitoring), Sentinel (pre-transaction simulation), and Recovery (private-mempool/flashbots-style asset rescue). They are marketed as protection for both regular and automated (agent) wallets across Ethereum, Solana, Base, and other chains. The project is unaffiliated with OpenClaw and represents an external DeFi/crypto security product; its relevance to OpenClaw operators is limited to awareness that automated wallet-protection tooling of this kind exists and may interact with or be invoked by OpenClaw skills handling on-chain transactions. Neither its claimed capabilities nor the security of the $CERB token-gating mechanism has been independently verified. (source, source, source, source, source) (medium confidence)
ERC-20 unlimited token approvals (set to type(uint256).max) allow a spender contract to drain all approved tokens from a wallet at any future time — no additional user signature required. Attackers have exploited dormant approvals hours, days, or weeks after granting, as seen in the BadgerDAO hack (~$120M lost). Approval-based exploits accounted for over $200 million in losses in 2024–2025. (source, source, source, source, source, source, source)
Token approval exploits (ERC-20 allowances and NFT SetApprovalForAll) are a well-documented DeFi attack vector unrelated to OpenClaw: a malicious contract with an unlimited allowance can call transferFrom() to drain wallets without stealing private keys, and approvals remain active on-chain even after disconnecting from a dApp. Projects such as CerbAgent ($CERB) advertise AI-agent stacks (Shield for real-time auto-revoke, Sentinel for pre-transaction simulation, Recovery for MEV-based front-running of drainers) to address this, with their full suite staged for delivery between Q3 2026 and 2027. This attack class and its mitigations are outside OpenClaw's agent runtime scope and are noted here only to distinguish them from supply-chain or sandboxing issues that do fall within OpenClaw's security domain. (source, source, source, source, source, source)
OpenAI's Codex agent runtime implements a defense-in-depth approval architecture: an automatic reviewer agent (enabled via approvals_reviewer = "auto_review") intercepts eligible approval requests — sandbox escalations, network access, credential-touching or side-effecting MCP tool calls — and evaluates them for data exfiltration, credential probing, persistent security weakening, or destructive actions before Codex executes them; critical-risk actions are denied outright and high-risk actions require sufficient user authorization. This reviewer agent is complemented by a separate chain-of-thought (CoT) monitoring system (powered by GPT-5.4 Thinking at maximum reasoning effort) that logs and analyzes agent reasoning to flag misalignment or reward-hacking behavior and alert for triage. Actions already permitted inside the sandbox boundary continue without additional review, preserving the sandbox as the primary containment layer. (source, source, source, source, source, source, source, source)

Skill file authoring and security

OpenClaw skill files are SKILL.md documents containing YAML frontmatter (name, description) plus freeform Markdown instructions that are injected directly into the agent's system prompt, making their content the primary lever for shaping agent behavior. The agent reads only the L1 metadata (~100 tokens) at startup to decide whether to activate a skill, then loads the full instructions (<5,000 tokens) only on trigger — meaning a poorly written or malicious description is the single most impactful attack or failure surface at the routing layer. Official OpenClaw skill files (e.g., the bundled coding-agent skill) encode explicit negative constraints such as 'do NOT hand-code patches yourself' and 'NEVER start Codex inside your OpenClaw state directory,' establishing negative triggers as a first-class security and correctness pattern. (source, source, source, source, source, source, source)
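Because the frontmatter description drives skill routing, it is worth inspecting before a third-party skill is installed. The sketch below extracts flat `key: value` pairs from a leading `---` delimited block without a YAML library, so it handles only simple single-line fields, and it estimates token count at roughly four characters per token, which is a crude assumption. The helper names and the 100-token threshold default are illustrative.

```python
def parse_frontmatter(skill_md: str) -> dict:
    """Extract flat `key: value` pairs from a leading `---` delimited block."""
    lines = skill_md.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip().strip("\"'")
    return {}  # no closing delimiter: treat the file as malformed


def review_metadata(fields: dict, max_tokens: int = 100) -> list[str]:
    """Flag missing fields and descriptions that look oversized."""
    issues = [f"missing {key}" for key in ("name", "description") if not fields.get(key)]
    approx_tokens = len(fields.get("description", "")) // 4  # rough chars-per-token guess
    if approx_tokens > max_tokens:
        issues.append(f"description is roughly {approx_tokens} tokens")
    return issues
```

A clean result says nothing about whether the description or the instruction body is malicious; it only surfaces structural problems so that the manual review can focus on what the text actually tells the agent to do.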

Claude CLI backend integration

OpenClaw supports using the local Claude Code CLI as a backend by launching the 'claude -p' binary with headless flags (--output-format json, --session-id) on the gateway host, reusing the user's local Claude session and subscription auth. Anthropic staff (confirmed publicly by Boris from the Claude Code team) reversed a prior April 4, 2026 ban and stated that OpenClaw-style 'claude -p' CLI usage is allowed again, and OpenClaw's official docs treat it as sanctioned unless Anthropic publishes a new policy. However, Anthropic's own legal/compliance docs state that OAuth authentication is intended for individual use only and that third-party developers must use API keys — and OpenClaw itself notes that Anthropic API keys remain 'the clearest and most predictable production path' for long-lived or fleet deployments. (source, source, source, source, source, source, source, source) (medium confidence)

Last updated: 2026-05-03T03:00:53.486Z