AI Email Agents: The Most Dangerous Integration, and How to Contain It

An AI email agent reads, triages, drafts, and sometimes sends email on your behalf, connected to a mailbox over Gmail, Outlook, or IMAP. It can clear a backlog, draft replies, pull details out of messages into your other systems, and surface what actually needs you. It is one of the highest-value things you can point an agent at. It is also the integration where the most can go wrong, for a reason that is structural rather than incidental, and that reason is the whole point of this guide.

We build Pinchy, a self-hosted agent platform with an email integration, so we have a stake. We will be specific about the containment that makes an email agent safe to run, because the default version is genuinely risky.

Why email is uniquely dangerous

Most integrations are dangerous in one direction. Email is dangerous in two at once, and that is what sets it apart.

Incoming email is untrusted content. Anyone can send you a message, which means an email agent is, by design, reading attacker-controllable text. That is the perfect delivery vehicle for indirect prompt injection: a hidden instruction buried in a message that the agent reads as part of its task and acts on. Prompt injection is ranked first (LLM01) in the OWASP Top 10 for LLM Applications, and email is its most exposed surface.

Outgoing email is an exfiltration channel. An agent that can send mail can send data out, to anyone. Put those two together and an email agent holds two of the three legs of the lethal trifecta (exposure to untrusted content, the ability to communicate externally, and access to private data) the moment you connect it: the untrusted input and the way out. Give it access to anything sensitive and it has all three. No other common integration hands you that combination for free.

This is not hypothetical

The attack has been demonstrated, and the dangerous versions are zero-click: no user mistake required. Researchers have shown crafted HTML emails, using tricks like white-on-white text or microscopic fonts to hide the payload from a human, that instruct a mailbox-connected agent to find sensitive information and exfiltrate it, in some cases before the recipient ever opens the message. One disclosed class of these (sometimes called service-side leaks) exfiltrates data from the AI provider's own infrastructure rather than the user's device, which means the usual endpoint defenses never see it (Infosecurity Magazine). The blunt summary: a single successful injection through an email agent can leak years of correspondence, customer data, and internal discussion. The convenience and the catastrophe run through the same connection.

Containing an email agent

The good news is that the containment is concrete and mostly about one decision: treat sending as the dangerous act, and gate it.

Draft-only by default. Let the agent do all the work, reading, triaging, drafting, but stop short of sending. A human makes the irreversible decision. This keeps almost all the value and removes the exfiltration leg, because a draft goes nowhere until a person sends it.
If it sends, allow-list recipients. Where autonomous sending is genuinely needed, restrict it to a fixed set of recipients or domains, so an injected instruction cannot redirect data to an attacker's address.
Scope the read side. Give the agent access only to the mailboxes or labels it needs, not the whole account, so a compromise reaches less.
Treat all incoming mail as untrusted. It is. Do not let the agent hold standing permissions that an instruction inside a message could turn against you.
Log every read and send. A tamper-evident audit trail turns a silent exfiltration into a visible event you can catch and trace.
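
The first two items above, draft-only by default and an allow-list for anything that does send, can be sketched as a single gate that every outgoing message passes through. This is a minimal illustration, not any product's actual API: SendPolicy, recipient_allowed, and decide are hypothetical names, and the recipient list is assumed to include every To, Cc, and Bcc address.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SendPolicy:
    # Hypothetical policy object; all names here are illustrative assumptions.
    allow_autonomous_send: bool = False         # draft-only unless explicitly enabled
    allowed_addresses: frozenset = frozenset()  # exact addresses, lowercase
    allowed_domains: frozenset = frozenset()    # exact domains, lowercase


def recipient_allowed(address: str, policy: SendPolicy) -> bool:
    """Return True only for a well-formed address that is on the allow-list."""
    addr = address.strip().lower()
    local, sep, domain = addr.rpartition("@")
    # Reject anything that does not look like exactly one local part and one domain.
    if not sep or not local or not domain or "@" in local:
        return False
    # Exact domain match only, so "example.com.evil.test" does not pass.
    return addr in policy.allowed_addresses or domain in policy.allowed_domains


def decide(recipients: list[str], policy: SendPolicy) -> str:
    """Return "send" or "draft". Anything not clearly permitted becomes a draft."""
    if not policy.allow_autonomous_send or not recipients:
        return "draft"
    # Pass every To, Cc, and Bcc address; one unknown recipient blocks the send.
    if all(recipient_allowed(r, policy) for r in recipients):
        return "send"
    return "draft"
```

The design choice worth noting is that the gate fails closed: a missing policy, an empty recipient list, or a single address outside the allow-list all fall back to a draft that a person has to send.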

None of this removes the prompt injection. It removes the agent's ability to do anything catastrophic when one lands, which, for an unpatchable attack, is the only honest goal.
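
The logging item in the list above depends on the trail being tamper-evident. One common way to get that property is to chain entries together and authenticate each one with a keyed hash, so that editing or deleting an earlier entry breaks verification. The sketch below is one such approach under stated assumptions: append_entry and verify_log are hypothetical helpers, the log is an in-memory list, and the key is assumed to be stored somewhere the agent cannot read.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # placeholder "previous MAC" for the first entry


def _digest(key: bytes, prev: str, event: dict) -> str:
    # Canonical JSON so the same event always produces the same bytes.
    payload = json.dumps({"prev": prev, "event": event}, sort_keys=True).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def append_entry(log: list, key: bytes, event: dict) -> dict:
    """Append an event, binding it to the MAC of the previous entry."""
    prev = log[-1]["mac"] if log else GENESIS
    entry = {"event": event, "prev": prev, "mac": _digest(key, prev, event)}
    log.append(entry)
    return entry


def verify_log(log: list, key: bytes) -> bool:
    """Recompute the chain; any edited, reordered, or removed earlier entry fails."""
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev:
            return False
        expected = _digest(key, prev, entry["event"])
        if not hmac.compare_digest(entry["mac"], expected):
            return False
        prev = entry["mac"]
    return True
```

A chain like this detects changes to existing entries but not the removal of the most recent ones, so in practice the latest MAC would also need to be copied somewhere outside the agent's reach.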

How Pinchy does it

This is the part about our own product. Pinchy's email integration is exposed as scoped tools that run through each agent's default-deny allow-list: reading, drafting, and sending are separate capabilities you grant on purpose, not a single "email" power. The pattern we build around, and run ourselves, is draft-only with a human sending, so the agent prepares the reply or the extraction and a person makes the call to send it. Every email action lands in a per-row signed audit trail. And because Pinchy is self-hosted, the class of attack that exfiltrates from a shared cloud provider's own infrastructure does not have that infrastructure to exploit: the agent and its mailbox access live on systems you control. The honest position is the same as the rest of this guide: we cannot stop your inbox from receiving a malicious message, so we built the platform so that receiving one is survivable.
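
To make the idea of separate, default-deny capabilities concrete, here is a generic sketch of the pattern. It is not Pinchy's code or API: Capability, AgentGrants, create_draft, and send_email are hypothetical names invented for illustration, and the tool bodies are placeholders.

```python
from enum import Enum


class Capability(Enum):
    # Hypothetical capability names; read, draft, and send are granted separately.
    READ = "email.read"
    DRAFT = "email.draft"
    SEND = "email.send"


class CapabilityDenied(PermissionError):
    """Raised when an agent calls a tool it was never granted."""


class AgentGrants:
    def __init__(self, granted=()):
        # Default-deny: an agent created with no arguments can do nothing.
        self._granted = frozenset(granted)

    def require(self, capability: Capability) -> None:
        if capability not in self._granted:
            raise CapabilityDenied(f"{capability.value} not granted")


def create_draft(grants: AgentGrants, to: str, body: str) -> dict:
    grants.require(Capability.DRAFT)
    # Placeholder: a real tool would save the draft in the mailbox.
    return {"to": to, "body": body, "status": "draft"}


def send_email(grants: AgentGrants, to: str, body: str) -> dict:
    grants.require(Capability.SEND)
    # Placeholder: a real tool would hand the message to the mail server.
    return {"to": to, "body": body, "status": "sent"}
```

An agent built with AgentGrants({Capability.READ, Capability.DRAFT}) can prepare a reply, but a call to send_email raises CapabilityDenied no matter what an incoming message tells it to do.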

FAQ

What is an AI email agent?

An AI email agent reads, triages, drafts, and sometimes sends email on your behalf, connected to a mailbox through Gmail, Outlook, or IMAP. It can summarize an inbox, draft replies, extract information from messages into other systems, and flag what needs attention. The capability is genuinely useful and also the reason it needs careful containment, because email is the integration where the most can go wrong.

Why is email the most dangerous integration for an AI agent?

Because it is two dangerous things at once. Incoming email is untrusted content the agent reads, which makes it a prime channel for indirect prompt injection: a hidden instruction in a message. Outgoing email is a way to send data out, which makes it an exfiltration channel. An agent that reads and sends email holds two legs of the lethal trifecta by default, and adding access to your data completes all three.

Can a malicious email really make an AI agent leak data?

Yes, and it has. Researchers have demonstrated zero-click attacks where a crafted HTML email, using tricks like white-on-white text, instructs a mailbox-connected agent to exfiltrate inbox data before any human reads the message. A single successful injection can leak years of correspondence. Prompt injection is ranked first (LLM01) in the OWASP Top 10 for LLM Applications, and email is its most exposed surface.

How do you safely run an AI agent on email?

Treat sending as the dangerous action. Default to draft-only, so the agent prepares messages but a human sends them; if it can send autonomously, restrict the recipients to an allow-list. Scope its read access to the mailboxes and labels it needs, treat all incoming mail as untrusted, log every read and send to an audit trail, and bound what other systems it can reach so a leaked message has little to carry.

Should an AI email agent be able to send email automatically?

For most uses, no. Autonomous sending is where an injected instruction turns into a real action that leaves your control: a reply to an attacker, a forward of sensitive data, a phishing message in your name. Draft-only keeps the useful part (the agent does the work) while a person makes the irreversible decision to send. Reserve autonomous send for narrow, low-stakes, allow-listed cases where the recipient set is fixed.

Put an agent on your inbox, safely.

Pinchy gives email agents scoped read, draft, and send as separate grants, a draft-only default, a signed audit trail, and self-hosting. Open source, free to run.

Or email us: info@heypinchy.com
