Agentjacking Attack Tricks AI Coding Agents Into Running Malicious Code

Meta Description: Cybersecurity researchers discovered Agentjacking, a new attack that tricks AI coding agents like Claude Code and Cursor into running malicious code via Sentry error reports.

Excerpt: A new class of attack exploits the trust AI coding agents place in error-tracking services, allowing attackers to run arbitrary code on developer machines.

Image: 79LuMoTo79 via Wikimedia Commons (CC0)

If you’re using AI coding assistants like Claude Code, Cursor, or GitHub Copilot to help fix bugs, you might want to sit down for this one. Cybersecurity researchers just discovered a new class of attack that can trick these AI agents into running arbitrary code on your machine — and they’re calling it Agentjacking.

The attack, discovered by Tenet Security, exploits a critical flaw in how AI coding agents interact with error-tracking services like Sentry. It’s clever, it’s scary, and it works against some of the most popular AI coding tools in use today.

Table of Contents

What Actually Happened

Here’s the breakdown. Sentry is an error-tracking platform that many development teams use to monitor crashes and bugs. When your application throws an error, Sentry captures it and displays it in a dashboard so developers can fix it.

The problem? Sentry’s architecture allows anyone with a Data Source Name (DSN) — essentially a public endpoint — to submit error events. And here’s where it gets dangerous: AI coding agents that integrate with Sentry via the Model Context Protocol (MCP) treat these error events as trusted system output.

Tenet Security’s researchers figured out they could inject malicious instructions into Sentry error events. When a developer asks their AI agent to “fix this Sentry issue,” the agent reads the attacker’s crafted error as legitimate diagnostic information and executes the embedded command — with the developer’s own privileges.

The attack chain works like this:

1. Attacker identifies an organization’s Sentry DSN (publicly exposed)

2. Attacker submits a crafted error event with malicious instructions disguised as a “Resolution”

3. Developer sees the error in Sentry dashboard and asks AI agent to fix it

4. AI agent reads the attacker’s instruction as trusted guidance

5. Agent executes the malicious code on the developer’s machine

The researchers found at least 2,388 organizations exposed with valid injectable DSNs. When they tested the attack in controlled conditions, they achieved an 85% success rate against injected errors across popular AI coding assistants.

Why This Matters for Developers

This isn’t just a theoretical vulnerability. It’s a fundamental flaw in how we’ve built trust between AI agents and external services.

Think about it: when you connect your AI coding assistant to Sentry, you’re essentially telling it, “Trust whatever comes from this service.” The AI agent can’t distinguish between a legitimate error from your application and a malicious payload injected by an attacker. It just sees “error from Sentry” and processes it accordingly.

The implications are serious. A successful Agentjacking attack can expose:

**Environment variables** (API keys, database credentials)
**Git credentials** and private repository URLs
**Developer identities** and access tokens
**Sensitive code** and configuration data

And the attacker never touches your infrastructure. The malicious instruction arrives disguised as a legitimate error resolution inside Sentry. When your AI agent processes it, the attacker gets everything they need — using your own privileges, on your own machine.

The Bigger Picture: AI Agents as Attack Surfaces

Agentjacking represents something we haven’t seen before: AI agents themselves becoming the attack surface. Until now, we’ve focused on securing the code AI agents write, or preventing prompt injection attacks. But this is different — it’s exploiting the trust relationships we’ve built between AI tools and the services they connect to.

This attack works because of a fundamental architectural flaw: Sentry’s event ingestion accepts arbitrary payloads from anyone with the DSN, while the MCP server returns this data to AI agents as trusted system output. The AI agent has no way to verify whether the error event is legitimate or malicious.

Sentry has acknowledged the issue but says it’s “technically not defensible.” They’ve activated a global content filter that blocks a specific payload string, but that’s a band-aid solution. The underlying problem remains: when you give an AI agent access to an external service, you’re also giving it access to anything that service can send back.

What Developers Should Do Now

If you’re using AI coding assistants with Sentry or similar error-tracking services, here are some immediate steps:

1. Review Your Sentry DSN Exposure

Check whether your Sentry DSN is publicly exposed. If it is, anyone can submit error events to your Sentry dashboard. Consider restricting access or using environment-specific DSNs.

2. Audit AI Agent Permissions

Review what permissions your AI coding assistants have. Do they need full access to Sentry, or can you limit them to specific projects? The principle of least privilege applies here just as it does everywhere else.

3. Implement Manual Review for AI-Generated Fixes

Don’t blindly accept AI-suggested fixes for Sentry errors. Review the proposed solution before applying it. If an error resolution looks suspicious or unusual, investigate further before trusting it.

4. Monitor for Unusual Activity

Keep an eye on your AI agent’s activity. If it starts accessing environment variables, git credentials, or sensitive files without clear reason, that’s a red flag.

5. Consider Alternative Error-Tracking Workflows

Maybe connecting AI agents directly to Sentry isn’t the best approach. Consider having developers manually review errors and provide context to the AI agent, rather than giving the agent direct access to the error stream.

The Filipino Developer Perspective

Here in the Philippines, we’re seeing rapid adoption of AI coding tools across our tech ecosystem. From startups in Makati to government IT teams in Quezon City, everyone’s excited about the productivity gains these tools promise.

But Agentjacking is a reminder that convenience often comes with security trade-offs. As we embrace these AI assistants, we need to be thoughtful about how we connect them to our development workflows.

For Filipino developers working remotely or in hybrid setups — which is most of us now — the stakes are even higher. Your home network might not have the same security controls as your office. A compromised AI agent could give attackers access to your employer’s codebase, customer data, or internal systems.

This doesn’t mean we should avoid AI coding tools. It means we should use them wisely, with clear eyes about the risks. The 85% success rate Tenet Security achieved in their tests should serve as a wake-up call.

Bottom Line

Agentjacking isn’t just another vulnerability to patch. It’s a signal that we need to rethink how we build trust between AI agents and the services they connect to.

The attack works because we’ve given AI agents implicit trust in external services like Sentry. That trust is now being weaponized against us. As we integrate AI deeper into our development workflows, we need to be more deliberate about security — not less.

For developers using AI coding assistants, the message is clear: review what your AI tools can access, monitor their activity, and never blindly trust automated fixes. The convenience these tools provide is real, but so are the risks.

The AI coding revolution is here. Now we need to make sure we’re not trading security for speed.

Sources: