Air-Gapped AI Agents: Running Agents With No Route to the Internet

An air-gapped AI agent runs with no route to the public internet. The model that powers it, the data it works on, and the tools it calls all live inside a sealed network, so no request and no data ever crosses the boundary. The defining property is simple: every dependency the agent needs at runtime is already inside the enclave, with no path out.

We build Pinchy, a self-hosted AI agent platform that can run fully offline, so we have a stake in this. The most useful thing this guide can do is clear up a common and expensive confusion, which the next section covers.

Self-hosted is not the same as air-gapped

This is the distinction that catches teams out. "Self-hosted" means the application runs on your own servers. It says nothing about where the intelligence comes from. A self-hosted agent that calls OpenAI, Anthropic, or Google for its model is sending your prompt, and whatever context it carries, out to that provider on every single message. The app is on your infrastructure. Your data is not staying there.

Air-gapping adds the missing requirement: the model itself has to be local. The agent reasons using an open-weight model served inside your network, so the prompt never leaves. Self-hosting controls where the code runs. Air-gapping controls where the thinking happens, and for sensitive data that is the line that actually matters. A platform can be genuinely self-hosted and still not air-gapped, and a lot of "your data never leaves" marketing quietly relies on you not noticing the difference. And location alone does not close the gap: a US provider running a datacenter in the EU still falls under the US CLOUD Act, so where the servers sit is not the same as who can be compelled to hand the data over. A local model inside your own network never raises the question.

Why air-gap an AI agent

Air-gapping is not paranoia. It fits specific situations:

Required by mandate. Defense and government work with classified or controlled unclassified information runs on physically air-gapped networks because the rules say so. There is no cloud-API option to evaluate.
Too sensitive to expose. Healthcare and finance increasingly deploy generative AI without ever touching the public internet, keeping patient and customer data off any external service by construction rather than by promise.
Simpler to prove. The least obvious reason is often the deciding one. Even where no regulation demands air-gapping, the audit burden of proving that a connected system is secure can exceed the operational burden of keeping it disconnected (TrueFoundry). "It has no internet connection" is a much shorter compliance conversation than "here is how we secure its internet connection."

What it actually takes

An air-gapped agent is more than a model on a GPU. Every component in the path has to be local, or the gap is not really there:

The model, served locally. An open-weight model (the Llama, Mistral, Qwen, and similar families) running on your own hardware, for example through a local runner like Ollama, with no external API in the loop.
Local knowledge. If the agent reads your documents, the ingestion, embedding, vector storage, and retrieval all run inside the enclave too. A local model that calls out to a hosted vector database is not air-gapped.
Inward-only tools. The agent's tools reach systems inside the network and nothing outside it.
No phone-home. The platform itself must not call out, for telemetry, license checks, or updates. A single activation ping breaks the air gap as surely as a model API call.
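
The checklist above can be turned into a simple configuration audit. The sketch below is illustrative only: the enclave ranges, component names, and URLs are hypothetical, and it deliberately treats hostnames as not provably local, because resolving them would depend on DNS. It complements network-level egress blocking rather than replacing it.

```python
import ipaddress
from urllib.parse import urlparse

# Hypothetical enclave ranges; replace with your own network plan.
ENCLAVE_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("127.0.0.0/8"),
]


def is_inside_enclave(url: str) -> bool:
    """Return True only if the URL's host is a literal IP inside the enclave."""
    host = urlparse(url).hostname
    if host is None:
        return False
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # A hostname, not an IP literal: this sketch refuses to trust DNS.
        return False
    return any(address in network for network in ENCLAVE_NETWORKS)


def find_outbound(endpoints: dict[str, str]) -> list[str]:
    """Names of components whose endpoint is not provably inside the enclave."""
    return sorted(
        name for name, url in endpoints.items() if not is_inside_enclave(url)
    )


# Hypothetical component configuration for one agent deployment.
endpoints = {
    "model": "http://127.0.0.1:11434",
    "vector_store": "http://10.0.4.12:6333",
    "telemetry": "https://telemetry.example.com/v1",
}

print(find_outbound(endpoints))  # ['telemetry']
```

Any name this returns is a component that would break the air gap, whether it is a model API, a hosted vector database, or a telemetry ping.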

The honest trade-off is model quality. The largest frontier models are cloud-only, so going air-gapped means using the best open-weight model your hardware can serve instead of the absolute state of the art. For a great many tasks (drafting, extraction, classification, agentic workflows over your own data) the gap is small and shrinking, and for the workloads that require air-gapping it is not a choice anyway.

Air-gapping does not remove the governance problem

It is tempting to think a disconnected agent is a safe agent. It is not. It is a contained one. Air-gapping closes the path for data to leave, which removes one leg of the lethal trifecta (access to private data, exposure to untrusted content, and a way to send data out), but the other risks live entirely inside the boundary. An offline agent can still read records it should not, change data it was not meant to, or follow a malicious instruction hidden in a document that is already inside the enclave. The exfiltration route is gone; the over-access and the prompt-injection problems are not.

So the same controls apply offline as on: a default-deny permission model so the agent can only touch what its job needs, and a tamper-evident audit trail so every action is recorded and provable. Air-gap is the outer boundary. Governance is what keeps order inside it. You want both.
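
A default-deny permission model can be sketched in a few lines. This is a minimal illustration under our own assumptions, not any product's API: the names AgentPolicy, grant, and perform are hypothetical, and a real system would scope grants per agent and per user.

```python
from dataclasses import dataclass, field


@dataclass
class AgentPolicy:
    """Allow-list of (action, resource) pairs; anything absent is denied."""

    allowed: set[tuple[str, str]] = field(default_factory=set)

    def grant(self, action: str, resource: str) -> None:
        self.allowed.add((action, resource))

    def is_allowed(self, action: str, resource: str) -> bool:
        return (action, resource) in self.allowed


def perform(policy: AgentPolicy, action: str, resource: str) -> str:
    """Run an action only if it was explicitly granted beforehand."""
    if not policy.is_allowed(action, resource):
        raise PermissionError(f"denied: {action} on {resource}")
    return f"ok: {action} on {resource}"


policy = AgentPolicy()  # starts empty, so every action is denied
policy.grant("read_file", "reports/q3.txt")

print(perform(policy, "read_file", "reports/q3.txt"))
try:
    perform(policy, "send_email", "ceo@example.com")
except PermissionError as error:
    print(error)
```

The design choice that matters is the empty starting set: the agent gains a capability only through an explicit grant, so a forgotten rule fails closed instead of open.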

How Pinchy does it

This part is about our own product. Pinchy is model-agnostic, which includes running entirely on local models via Ollama, so an agent can reason without any cloud API in the path. It is self-hosted, and it is built to make a real air gap possible: license validation is fully offline, there is no license server, no activation ping, and no telemetry, so the platform itself does not phone home. Run it with local models on a disconnected network and nothing crosses the boundary, while the governance layer keeps working inside it. Concretely, every capability (send an email, read a file, change a record) is denied by default until you grant it to the agent, one at a time (the default-deny allow-list). Every action lands HMAC-signed in the audit trail, one row at a time, so a record altered after the fact shows up at once.
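
To show why per-row HMAC signing makes a later alteration visible, here is a generic sketch using Python's standard library. It is not Pinchy's actual implementation: the row fields, the key handling, and the function names are assumptions made for illustration.

```python
import hashlib
import hmac
import json


def sign_row(key: bytes, row: dict) -> str:
    """HMAC-SHA256 over a canonical JSON encoding of the row."""
    payload = json.dumps(row, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def append_row(log: list[dict], key: bytes, row: dict) -> None:
    """Store a copy of the row together with its signature."""
    log.append({"row": dict(row), "sig": sign_row(key, row)})


def find_tampered(log: list[dict], key: bytes) -> list[int]:
    """Indexes of rows whose stored signature no longer matches their content."""
    return [
        index
        for index, entry in enumerate(log)
        if not hmac.compare_digest(entry["sig"], sign_row(key, entry["row"]))
    ]


key = b"demo-key-keep-the-real-one-secret"  # hypothetical key for the example
log: list[dict] = []
append_row(log, key, {"agent": "billing", "action": "read_file", "resource": "invoice-17"})
append_row(log, key, {"agent": "billing", "action": "change_record", "resource": "invoice-17"})

log[1]["row"]["action"] = "read_file"  # someone edits history after the fact
print(find_tampered(log, key))  # [1]
```

Signing each row independently detects an edited row but not a deleted one. Catching deletions would need something extra, such as including the previous row's signature in each new row.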

To be straight about it: if you point Pinchy at a cloud model provider, it is self-hosted but not air-gapped, and your prompts go to that provider like any other client. The offline guarantee is real, but it is a guarantee about a local-model deployment, not about self-hosting alone. That is the same distinction this whole page is built around, and we hold our own claims to it.

FAQ

What is an air-gapped AI agent?

An air-gapped AI agent runs with no route to the public internet. The model, the data it works on, and the tools it uses all live inside a sealed network, so no request and no data ever leaves the boundary. Every runtime dependency is already inside the enclave. It is the strongest form of data isolation available, used where data exposure is not an acceptable risk.

Is a self-hosted AI agent the same as an air-gapped one?

No, and the difference matters. A self-hosted agent runs the application on your own servers, but if it calls a cloud model API (OpenAI, Anthropic, Google) for its intelligence, your data still leaves your network on every request. Self-hosting controls where the app runs. Air-gapping additionally requires the model itself to be local, so that nothing, not even the prompt, crosses the boundary.

Can an AI agent run completely offline?

Yes, if every component is local: an open-weight model served locally (for example via Ollama), local document storage and retrieval, and tools that only reach systems inside the network. You trade access to the largest frontier cloud models for full sovereignty. For many regulated and sensitive workloads that trade is the entire point, and capable open-weight models make it practical.

Why do organizations air-gap their AI?

Defense and government workloads with classified or controlled unclassified information run on physically air-gapped networks by requirement. Healthcare and finance air-gap to keep sensitive data off any external service. And even where no rule demands it, the audit burden of proving a connected system is secure can exceed the operational burden of simply keeping it disconnected, so air-gapping often simplifies the compliance story rather than complicating it.

Does air-gapping remove the need for agent permissions and audit?

No. Air-gapping closes the path for data to leave the network, which is one leg of the risk, but a disconnected agent can still read more than it should, change records, or be steered by a malicious document already inside the enclave. Permissions and a tamper-evident audit trail still apply offline. Air-gap is a boundary, not a substitute for governance inside it.

Run agents that never touch the internet.

Pinchy runs on local models with no telemetry and offline license validation, so a local-model deployment stays fully air-gapped. Open source, self-hosted, free to run.

Or email us: info@heypinchy.com

Related Pages

Self-Hosted AI Agents

Local Models via Ollama

Air-Gapped LLM Hardware

AI Agent Governance

GDPR & AI Agents