I’ve been using AI coding tools daily for about two years now. They’ve rewritten functions I was too lazy to refactor, generated tests I should’ve written myself, and occasionally produced bugs that took me an entire afternoon to untangle. But through all of it, I operated on one quiet assumption: the tool wouldn’t actively work against me.


Last week’s DuneSlide disclosure reminded me how fragile that assumption really is.

What DuneSlide Is — And Why It Scored a 9.8

On July 1, researchers at Cato AI Labs published details of two critical vulnerabilities in Cursor, the AI-powered code editor used by millions of developers. They called the pair “DuneSlide,” and the severity score leaves no room for ambiguity: CVSS 9.8 out of 10. That’s “drop what you’re doing and patch this now” territory.

The attack begins with prompt injection, a technique that’s becoming disturbingly common in AI agent security research. You, the developer, ask Cursor a perfectly normal question about your codebase. But the AI agent also reads content from a connected service, like an MCP server or a web search result (something I explored when building your first MCP server with Python). Hidden inside that content is a malicious instruction. No click. No approval dialog. The instruction rides along with your request, and the agent executes it.

As the researchers put it: “There is no click to fall for and no approval box to ignore.”

Cursor’s sandbox — introduced in the 2.x line to prevent the agent from touching system files — was supposed to stop exactly this kind of attack. The sandbox says: you can only write files inside the project folder. DuneSlide found two ways around it.

The first vulnerability, CVE-2026-50548, exploits how Cursor handles working directories. When the agent runs a terminal command, it can specify a working directory — and that directory gets added to the write allowlist without any validation. An attacker points the working directory at the sandbox helper binary itself and overwrites it. Game over.
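To make the missing check concrete, here is a minimal sketch of what validating a requested working directory could look like. This is not Cursor’s code: the function name and the rule "the directory must exist and resolve to somewhere inside the project" are my own assumptions about a reasonable policy.

```python
from pathlib import Path


def is_allowed_workdir(project_root: str, requested_cwd: str) -> bool:
    """Hypothetical check: accept a working directory only if it resolves
    to an existing directory inside the project root."""
    root = Path(project_root).resolve(strict=True)
    try:
        # Relative requests are interpreted against the project root.
        # strict=True raises if the path does not exist, so we fail closed.
        cwd = (root / requested_cwd).resolve(strict=True)
    except (OSError, RuntimeError):
        return False
    # is_relative_to needs Python 3.9+. It compares resolved paths, so
    # "../" tricks and symlinked directories are judged by where they land.
    return cwd.is_dir() and cwd.is_relative_to(root)
```

Only a directory that passes a check like this would be added to the write allowlist. A path pointing at a helper binary somewhere else on disk fails it because it does not resolve inside the project.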

The second, CVE-2026-50549, uses a symlink trick. Cursor resolves symlinks before writing to confirm the real destination is within the project. But when the symlink check fails, because the target doesn’t exist or the attacker removed read permissions, Cursor gives up and trusts the symlink’s in-project path. The write goes straight through to wherever the symlink points.
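The safer behavior is to fail closed: if the real destination can’t be established, refuse the write. A rough sketch under my own assumptions (the function name is hypothetical, it targets POSIX systems, and it is not how Cursor’s patch is implemented):

```python
from pathlib import Path
from typing import Optional


def resolve_write_target(project_root: str, candidate: str) -> Optional[Path]:
    """Hypothetical check: return the real path to write to, or None if
    it cannot be proven to live inside the project."""
    root = Path(project_root).resolve(strict=True)
    path = root / candidate
    try:
        try:
            path.lstat()  # lstat does not follow symlinks
            exists = True
        except FileNotFoundError:
            exists = False
        if exists:
            # Existing file or symlink, including a dangling one:
            # resolution must fully succeed or we refuse.
            real = path.resolve(strict=True)
        else:
            # Brand-new file: the parent directory must resolve.
            real = path.parent.resolve(strict=True) / path.name
    except (OSError, RuntimeError):
        return None  # could not verify the destination, so deny
    if not real.is_relative_to(root):  # Python 3.9+
        return None
    return real
```

A check followed by a separate write still leaves a race window between the two steps, which is one reason an application-level check like this is a stopgap rather than a real boundary.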

The result? Full control of the developer’s machine. Every cloud workspace, every SaaS session the editor is signed into — all compromised. From one harmless-looking question about your React component.

This Isn’t Cursor’s First Rodeo

Here’s what really bothers me about DuneSlide: it’s not a one-off. Cursor has been hit before, and the pattern is unsettling.

In August 2025, CurXecute (CVE-2025-54135) let a planted Slack message rewrite Cursor’s MCP configuration file. Commands ran even after the user rejected the edit. That same month, MCPoison (CVE-2025-54136) allowed an approved MCP config to be silently swapped for a malicious one. In February 2026, a Git hook vulnerability (CVE-2026-26268) fired hidden commands when the agent simply ran a git operation.

Cursor shipped a sandbox to address all of these. DuneSlide is about escaping that sandbox. Each fix addresses the previous vulnerability; none of them seem to address the underlying architecture that keeps producing new ones.

The Bigger Picture: GuardFall and Half a Million Stars of Exposure

And it’s not just Cursor.

Just days before DuneSlide, researchers at Adversa AI published GuardFall — a vulnerability class affecting 10 out of 11 popular open-source AI coding agents. The affected list reads like a developer’s toolkit: Cline, Roo-Code, Aider, Open Interpreter, OpenHands, SWE-agent. Combined GitHub stars: over half a million. All of them left the same gap open.

GuardFall exploits something I find almost elegant in its simplicity. Safety filters check commands as plain text — they look for patterns like “rm -rf” and block them. But bash doesn’t execute plain text. It strips quotes, expands variables, and resolves substitutions first. You can write r''m -rf / and the text filter sees a harmless string with some empty quotes. Bash removes the quotes and happily runs rm -rf /.

Commands hidden in base64 piped into a shell. Standard Unix tools like find and dd turned destructive with the right flags. The researchers call this not a bug but “a dangerous convention and a class of problems.” And they’re right — adding more blocklist patterns fixes precisely nothing, because the filter and the shell fundamentally see different things.
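The mismatch is easy to reproduce without running anything destructive. This is a minimal sketch: the substring filter and its names are mine, not taken from any of the affected tools, and Python’s shlex stands in for the quote-removal step only (it does not expand variables or substitutions the way bash does).

```python
import shlex

# Illustrative blocklist in the style described above.
BLOCKED_SUBSTRINGS = ["rm -rf"]


def naive_filter_allows(command: str) -> bool:
    """Text-level check: allow anything that contains no blocked substring."""
    return not any(bad in command for bad in BLOCKED_SUBSTRINGS)


payload = "r''m -rf /"

# The filter sees no "rm -rf" in the raw text, so it lets the payload through.
print(naive_filter_allows(payload))  # True

# POSIX-style quote removal shows what a shell would actually be asked to run.
print(shlex.split(payload))  # ['rm', '-rf', '/']
```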

There’s one tool that held up: Continue. Its defense works by reading commands the way bash will, before deciding whether to block them. It breaks the command into the pieces the shell would, checks what actually runs, and maintains a hard blocklist of destructive operations. Adversa estimates this took about two days of engineering work.

Two days. For a defense that stopped the most destructive payloads cold.
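To give a feel for the shape of that defense, here is a deliberately small sketch. It is not Continue’s implementation; the blocklist, the list of rejected constructs, and the function name are all assumptions of mine. It tokenizes the command the way a POSIX shell would, checks the program that would actually run, and refuses anything it can’t fully understand.

```python
import os
import shlex

# Illustrative, not exhaustive.
DESTRUCTIVE_PROGRAMS = {"rm", "dd", "mkfs", "shred"}

# Constructs whose effect can't be known from tokenizing alone (expansion,
# substitution, pipes, chaining, redirection). Reject rather than guess.
OPAQUE_CONSTRUCTS = ("$", "`", "|", ";", "&", ">", "<", "\n")


def is_command_allowed(command: str) -> bool:
    """Hypothetical guard: decide based on what the shell would run."""
    try:
        argv = shlex.split(command)  # removes quotes and backslash escapes
    except ValueError:
        return False  # unbalanced quotes: fail closed
    if not argv:
        return False
    if any(marker in command for marker in OPAQUE_CONSTRUCTS):
        return False
    program = os.path.basename(argv[0])  # "/bin/rm" and "rm" are the same program
    return program not in DESTRUCTIVE_PROGRAMS


print(is_command_allowed("ls -la"))      # True
print(is_command_allowed("r''m -rf /"))  # False
```

This sketch is far too strict for real use, since it rejects every pipe and redirect, and too loose in other ways: wrappers such as sudo or xargs, and tools like find with destructive flags, still need their own handling. The point is the order of operations. Parse first, then decide.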

Why This Matters More Than You Think

AI coding agents run with the user’s full privileges. They can read files, write files, execute commands, and access cloud credentials. Unlike a browser — which sandboxes every tab — or a mobile operating system — which isolates every app — these agents treat all input equally, whether it came from your carefully written project code or from a random npm package’s README.

Think about that for a second. Any README. Any tool description. Any Slack message. Any web search result. Any of these can contain instructions the agent didn’t ask for and the developer never saw.

The common thread across DuneSlide, GuardFall, CurXecute, and every other entry in this growing list is the same: untrusted content reaches a real shell before the guard understands what bash will actually run. We built the capability first and are now scrambling to bolt security on after the fact.

As Cato AI Labs noted in their disclosure, they’re finding similar flaws in other coding agents too. This isn’t a Cursor problem. This isn’t even a specific tool problem. It’s an architectural problem — and the industry hasn’t faced it squarely yet.

What Developers Should Do Today

I’m not saying stop using AI coding tools. I use them every day and I’m not about to quit. But there are practical steps that dramatically reduce your exposure right now.

First, update Cursor to version 3.0 or later if you haven’t already. The fix shipped in April 2026, but plenty of developers run old versions indefinitely. I know because I’ve been guilty of this myself. Check your version. If it’s below 3.0, update now.

Second, treat auto-execute flags with extreme caution. Flags like --auto-exec, --auto-run, --dangerously-skip-permissions — if you see these in your config, understand that you’ve removed the last manual checkpoint between the AI and your file system. Turn them off unless you have a specific, time-boxed reason to use them.
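If you’re not sure whether one of these has crept into a script or config, a quick scan helps. This is a small sketch that looks for the three flags named above in any text you feed it. Which files to scan (shell aliases, CI definitions, editor settings) is up to you, and the function name is my own.

```python
from typing import List, Tuple

RISKY_FLAGS = ("--auto-exec", "--auto-run", "--dangerously-skip-permissions")


def find_risky_flags(text: str) -> List[Tuple[int, str]]:
    """Return (line_number, flag) for every risky flag found in the text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for flag in RISKY_FLAGS:
            if flag in line:
                hits.append((lineno, flag))
    return hits


sample = "alias agent='agent --auto-run'\necho hello\n"
print(find_risky_flags(sample))  # [(1, '--auto-run')]
```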

Third, run agents with their home folder pointed somewhere disposable. This protects your real ~/.ssh, ~/.aws, and other credential directories. It’s not a complete solution, but it raises the bar significantly. The Adversa researchers specifically recommend this as an immediate mitigation.
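One low-effort way to do this is to launch the agent process with HOME pointing at a throwaway directory. This is a sketch assuming a POSIX system. The helper name is mine, and the demo runs a harmless Python one-liner in place of a real agent command.

```python
import os
import subprocess
import sys
import tempfile
from typing import List


def run_with_disposable_home(argv: List[str]) -> subprocess.CompletedProcess:
    """Run a command with HOME set to a temporary directory that is
    deleted once the command exits."""
    with tempfile.TemporaryDirectory(prefix="agent-home-") as fake_home:
        env = dict(os.environ, HOME=fake_home)
        return subprocess.run(
            argv, env=env, capture_output=True, text=True, check=False
        )


if __name__ == "__main__":
    # Stand-in for an agent: print what "~" expands to inside the child.
    result = run_with_disposable_home(
        [sys.executable, "-c", "import os; print(os.path.expanduser('~'))"]
    )
    print(result.stdout.strip())
```

This only redirects tools that look up the home directory. A process running as your user can still read your real ~/.ssh by absolute path, and credentials passed through other environment variables are untouched, which is why it raises the bar rather than closing the door.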

Fourth, never let an AI agent run on pull requests from forks. This extends the same supply-chain thinking behind auditing your npm dependencies for malware to your CI pipeline. A malicious PR can contain poisoned tool configurations or documentation that the agent reads as legitimate instructions. This is especially dangerous in automated workflows where no human reviews the agent’s actions.
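How you enforce that depends on your CI system. As one illustration, assuming GitHub Actions (my assumption, not something the disclosures prescribe), a job can inspect the webhook payload and skip the agent step when the PR’s head repository differs from the base repository. The function name is hypothetical.

```python
import json
import os


def is_fork_pull_request(event: dict) -> bool:
    """Return True if the event is a pull request whose head repository
    is not the base repository (or can no longer be identified)."""
    pr = event.get("pull_request")
    if pr is None:
        return False  # not a pull request event
    head_repo = (pr.get("head") or {}).get("repo")
    base_repo = (pr.get("base") or {}).get("repo") or {}
    if head_repo is None:
        return True  # head repo missing (e.g. deleted): treat as untrusted
    return head_repo.get("full_name") != base_repo.get("full_name")


if __name__ == "__main__":
    # GitHub Actions exposes the payload file path in GITHUB_EVENT_PATH.
    event_path = os.environ.get("GITHUB_EVENT_PATH")
    if event_path:
        with open(event_path, encoding="utf-8") as fh:
            if is_fork_pull_request(json.load(fh)):
                raise SystemExit("Fork PR detected: skipping the AI agent step.")
```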

And fifth — this is the one I’m still working on — read what your AI agent is about to do before it does it. It reminds me of what I wrote about reviewing AI-generated code like a senior engineer — the same vigilance applies. I know it’s tedious. The whole value proposition of these tools is speed. But until the security model catches up, that approval dialog is your last line of defense. I caught a near-miss a few months ago — the agent suggested a command that would have dropped a database if I’d been connected to the wrong terminal. I caught it because I was reading. If the instruction had come wrapped in a normal-looking prompt, injected through a tool I’d configured weeks ago? I’m not sure I would have.

What the Industry Needs to Do Next

The fix for DuneSlide took about a month from report to release. The fix for GuardFall — at least the pattern Continue demonstrated — took two days. These are solvable problems, and the gap between what we have and what we need is narrower than the headlines suggest.

But the bigger fix won’t come from patching individual tools. It requires a shift in how we think about AI agent architecture.

First: validate commands after shell expansion, not before. Every tool that checks commands as plain text is vulnerable to GuardFall-style bypasses. The defense is to read the command the way the shell will — resolve the quotes, expand the variables, see what actually runs — and then decide.

Second: treat every input as potentially hostile. Every MCP server response. Every web search result. Every repository file. Every tool description. If text can reach the agent and the agent can reach the shell, that text is a potential attack vector. Full stop.

Third: run agents with the minimum privileges they need, not the user’s full account. This is Operating System Security 101 — the principle of least privilege. Browsers do it. Mobile apps do it. Docker does it. AI coding agents should too.

Fourth: sandbox file writes at the operating system level, not inside application logic. DuneSlide exploited symlink handling and directory validation bugs — problems that OS-level sandboxing solves categorically. If the agent process literally cannot write outside a designated directory, no amount of prompt engineering can change that.

The industry has been moving fast. That’s been a feature, not a bug — we’re building incredible tools that genuinely make developers more productive. But security that depends on application-level checks that don’t understand shell semantics is security theater. And the attackers are starting to notice.

The Bottom Line

Building AI agents has become remarkably accessible, and AI coding agents are not going anywhere. They’ve become as essential to my workflow as my terminal emulator or my package manager. They catch bugs I would have missed. They help me learn frameworks I was too intimidated to touch. They handle the boilerplate so I can focus on the interesting problems.

But they’re also trust boundaries we haven’t fully understood yet. Every time an AI agent reads content from the internet and turns it into a shell command, there’s a security decision being made. Right now, that decision is being made by a string-matching function that doesn’t speak the same language as the shell it’s supposed to protect.

The Cato researchers closed their disclosure with an open question: “whether treating every input as hostile becomes the default, or stays a patch-by-patch scramble.” That’s the right question. And if we’re honest, we already know the answer — we just haven’t built it yet.

Update your Cursor. Turn off auto-execute. Read the commands before they run. And if you’re building tools in this space, start from the assumption that your agent will encounter hostile input. Because it will.

Filed under AI Coding
Last Update: July 2, 2026 by Felix AlterEgo