Guide
For some data, "we host it ourselves" is not enough. If the agent still calls a cloud model to think, your data leaves the building on every message. An air-gapped agent keeps the model, the data, and the tools inside a sealed network, with nothing crossing the boundary. This guide covers what that actually takes, when it is worth it, and the trap of confusing self-hosted with offline.
An air-gapped AI agent runs with no route to the public internet. The model that powers it, the data it works on, and the tools it calls all live inside a sealed network, so no request and no data ever crosses the boundary. The defining property is simple: every dependency the agent needs at runtime is already inside the enclave, with no path out.
We build Pinchy, a self-hosted AI agent platform that can run fully offline, so we have a stake in this. The most useful thing this guide can do is clear up a common and expensive confusion, which the next section covers.
This is the distinction that catches teams out. "Self-hosted" means the application runs on your own servers. It says nothing about where the intelligence comes from. A self-hosted agent that calls OpenAI, Anthropic, or Google for its model is sending your prompt, and whatever context it carries, out to that provider on every single message. The app is on your infrastructure. Your data is not staying there.
Air-gapping adds the missing requirement: the model itself has to be local. The agent reasons using an open-weight model served inside your network, so the prompt never leaves. Self-hosting controls where the code runs. Air-gapping controls where the thinking happens, and for sensitive data that is the line that actually matters. A platform can be genuinely self-hosted and still not air-gapped, and a lot of "your data never leaves" marketing quietly relies on you not noticing the difference. And location alone does not close the gap: a US provider running a datacenter in the EU still falls under the US CLOUD Act, so where the servers sit is not the same as who can be compelled to hand the data over. A local model inside your own network never raises the question.
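Air-gapping is not paranoia; it fits specific situations: defense and government workloads that handle classified or controlled unclassified information, healthcare and finance teams that need to keep sensitive data off any external service, and cases where proving a connected system is secure costs more than simply keeping it disconnected.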
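An air-gapped agent is more than a model on a GPU. Every component in the path has to be local, or the gap is not really there: an open-weight model served inside the network, local document storage and retrieval, and tools that only reach systems inside the boundary.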
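As a rough illustration of "every component has to be local", the sketch below checks whether each configured endpoint resolves only to private or loopback addresses. The component names and URLs are hypothetical, and the check is only a configuration sanity test: a private address does not prove there is no route out, so the real enforcement still has to happen at the network level.

```python
import ipaddress
import socket
from urllib.parse import urlparse


def is_inside_boundary(url: str) -> bool:
    """Return True only if every address the host resolves to is private or loopback."""
    host = urlparse(url).hostname
    if host is None:
        return False
    try:
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror:
        # An unresolvable name is treated as a failure, not as "safe".
        return False
    # sockaddr[0] is the address string; drop any IPv6 scope suffix such as "%eth0".
    addresses = {info[4][0].split("%")[0] for info in infos}
    if not addresses:
        return False
    for text in addresses:
        addr = ipaddress.ip_address(text)
        if not (addr.is_private or addr.is_loopback):
            return False
    return True


def components_outside(components: dict[str, str]) -> list[str]:
    """Return the names of components whose endpoint is not inside the boundary."""
    return sorted(
        name for name, url in components.items() if not is_inside_boundary(url)
    )


if __name__ == "__main__":
    # Hypothetical deployment manifest, for illustration only.
    manifest = {
        "model": "http://127.0.0.1:11434",
        "documents": "http://10.0.0.5:9200",
        "mail_tool": "https://8.8.8.8/send",
    }
    print(components_outside(manifest))
```

Running a check like this before the agent starts catches the most common mistake, a single component still pointed at a public service, but it is a complement to a sealed network rather than a replacement for one.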
The honest trade-off is model quality. The largest frontier models are cloud-only, so going air-gapped means using the best open-weight model your hardware can serve instead of the absolute state of the art. For a great many tasks (drafting, extraction, classification, agentic workflows over your own data) the gap is small and shrinking, and for the workloads that require air-gapping it is not a choice anyway.
It is tempting to think a disconnected agent is a safe agent. It is not; it is a contained one. Air-gapping closes the path for data to leave, which is one leg of the lethal trifecta (access to private data, exposure to untrusted content, and a way to send data out), but the other risks live entirely inside the boundary. An offline agent can still read records it should not, change data it was not meant to, or follow a malicious instruction hidden in a document that is already inside the enclave. The exfiltration route is gone; the over-access and the prompt-injection problems are not.
So the same controls apply offline as online: a default-deny permission model so the agent can only touch what its job needs, and a tamper-evident audit trail so every action is recorded and provable. Air-gap is the outer boundary. Governance is what keeps order inside it. You want both.
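A default-deny permission model can be sketched in a few lines. This is a minimal illustration, not any particular product's implementation; the class name, the `guarded` helper, and the capability strings are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, TypeVar

T = TypeVar("T")


@dataclass
class AgentPolicy:
    """Allow-list of capabilities. Anything not listed is denied."""

    allowed: set[str] = field(default_factory=set)

    def grant(self, capability: str) -> None:
        self.allowed.add(capability)

    def is_allowed(self, capability: str) -> bool:
        return capability in self.allowed


def guarded(policy: AgentPolicy, capability: str, action: Callable[[], T]) -> T:
    """Run the action only if the capability was explicitly granted."""
    if not policy.is_allowed(capability):
        raise PermissionError(f"capability not granted: {capability}")
    return action()


if __name__ == "__main__":
    policy = AgentPolicy()      # starts empty, so everything is denied
    policy.grant("read_file")   # grant exactly one capability
    print(guarded(policy, "read_file", lambda: "file contents"))
    try:
        guarded(policy, "send_email", lambda: "sent")
    except PermissionError as exc:
        print(exc)
```

The point of the design is that the empty policy is the safe one: forgetting to configure something results in a refusal, not in silent access.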
This part is about our own product. Pinchy is model-agnostic, which includes running entirely on local models via Ollama, so an agent can reason without any cloud API in the path. It is self-hosted, and it is built to make a real air gap possible: license validation is fully offline, there is no license server, no activation ping, and no telemetry, so the platform itself does not phone home. Run it with local models on a disconnected network and nothing crosses the boundary, while the governance layer keeps working inside it. Concretely, every capability (send an email, read a file, change a record) is denied by default until you grant it to the agent, one at a time; this is the default-deny allow-list. Every action lands in the audit trail as its own HMAC-signed row, so a record altered after the fact shows up at once.
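To show why a per-row HMAC makes later edits visible, here is a minimal sketch using Python's standard `hmac` module. It is an illustration of the general technique and not Pinchy's actual implementation; the row fields, the key handling, and the function names are assumptions.

```python
import hashlib
import hmac
import json


def sign_row(key: bytes, row: dict) -> str:
    """Compute an HMAC-SHA256 signature over a canonical JSON encoding of the row."""
    payload = json.dumps(row, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def append_row(key: bytes, trail: list, row: dict) -> None:
    """Store the row together with its signature."""
    trail.append({"row": dict(row), "sig": sign_row(key, row)})


def find_tampered(key: bytes, trail: list) -> list[int]:
    """Return the indexes of rows whose stored signature no longer matches."""
    return [
        index
        for index, entry in enumerate(trail)
        if not hmac.compare_digest(entry["sig"], sign_row(key, entry["row"]))
    ]


if __name__ == "__main__":
    key = b"example-key-kept-outside-the-audit-store"  # hypothetical key
    trail: list = []
    append_row(key, trail, {"agent": "support", "action": "read_file", "target": "a.txt"})
    append_row(key, trail, {"agent": "support", "action": "change_record", "target": "42"})
    trail[1]["row"]["target"] = "43"  # simulate an edit after the fact
    print(find_tampered(key, trail))
```

Two caveats apply to this sketch. The guarantee depends on the signing key being stored where someone who can edit rows cannot read it. And signing each row independently detects altered rows but not deleted ones; catching deletions would need something extra, such as chaining each signature to the previous row.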
To be straight about it: if you point Pinchy at a cloud model provider, it is self-hosted but not air-gapped, and your prompts go to that provider like any other client. The offline guarantee is real, but it is a guarantee about a local-model deployment, not about self-hosting alone. That is the same distinction this whole page is built around, and we hold our own claims to it.
FAQ
An air-gapped AI agent runs with no route to the public internet. The model, the data it works on, and the tools it uses all live inside a sealed network, so no request and no data ever leaves the boundary. Every runtime dependency is already inside the enclave. It is the strongest form of data isolation available, used where data exposure is not an acceptable risk.
No, and the difference matters. A self-hosted agent runs the application on your own servers, but if it calls a cloud model API (OpenAI, Anthropic, Google) for its intelligence, your data still leaves your network on every request. Self-hosting controls where the app runs. Air-gapping additionally requires the model itself to be local, so that nothing, not even the prompt, crosses the boundary.
Yes, if every component is local: an open-weight model served locally (for example via Ollama), local document storage and retrieval, and tools that only reach systems inside the network. You trade access to the largest frontier cloud models for full sovereignty. For many regulated and sensitive workloads that trade is the entire point, and capable open-weight models make it practical.
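For a sense of what "served locally via Ollama" looks like from the agent's side, the sketch below sends a prompt to Ollama's `/api/generate` endpoint on its default local port using only the standard library. The model name `llama3.2` is an assumption; use whichever model has already been pulled into the enclave.

```python
import json
import urllib.request

# Ollama listens on port 11434 by default; the address stays on the local host.
OLLAMA_URL = "http://127.0.0.1:11434/api/generate"


def build_request(
    prompt: str, model: str = "llama3.2", url: str = OLLAMA_URL
) -> urllib.request.Request:
    """Build a non-streaming generate request for a local Ollama server."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = "llama3.2", timeout: float = 120.0) -> str:
    """Send the prompt and return the generated text from the "response" field."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]


if __name__ == "__main__":
    # Requires a running Ollama server with the model already available locally.
    print(generate("Summarize the retention policy in one sentence."))
```

Nothing in this request names an external host, which is the whole point: the prompt goes to a process on the same machine or network segment and nowhere else.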
Defense and government workloads with classified or controlled unclassified information run on physically air-gapped networks by requirement. Healthcare and finance air-gap to keep sensitive data off any external service. And even where no rule demands it, the audit burden of proving a connected system is secure can exceed the operational burden of simply keeping it disconnected, so air-gapping often simplifies the compliance story rather than complicating it.
No. Air-gapping closes the path for data to leave the network, which is one leg of the risk, but a disconnected agent can still read more than it should, change records, or be steered by a malicious document already inside the enclave. Permissions and a tamper-evident audit trail still apply offline. Air-gap is a boundary, not a substitute for governance inside it.
Pinchy runs on local models with no telemetry and offline license validation, so a local-model deployment stays fully air-gapped. Open source, self-hosted, free to run.
Email us: info@heypinchy.com