Self-Hosted
Your servers. Your rules. No data leaves your network. Deploy a full AI agent platform on your own infrastructure.
The Problem
Every prompt, every document, every conversation — sent to servers you don't control, in jurisdictions you didn't choose.
Your agents depend on someone else's API. Pricing changes? Service outage? You're at their mercy.
GDPR, CLOUD Act, industry regulations — cloud AI makes compliance a moving target. Self-hosting makes it simple.
The Solution
Pinchy deploys on your infrastructure — bare metal, VM, or cloud instances you control. The data stays where you put it.
On-prem, private cloud, or your own VPS. You decide where Pinchy runs.
Conversations, documents, agent memory — everything stays on your network. Period.
Use any LLM — OpenAI, Anthropic, or local models via Ollama and llama.cpp. Go fully air-gapped: no internet connection needed, no data leaves the server. For finance, healthcare, and public sector, this is a requirement.
Update when you want. Test before you deploy. No forced changes, no surprises.
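As a rough sketch, pointing the stack at a local model served by Ollama could look like the snippet below. The variable names are illustrative only, not Pinchy's actual configuration keys; check .env.example for the real ones.

```shell
# Hypothetical .env entries for an air-gapped setup (names are assumptions)
LLM_PROVIDER=ollama
# Ollama's default local API endpoint
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=llama3
```

With a local provider configured this way, no prompt or document ever needs to cross the network boundary.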
How It Works
$ # GitHub repo coming soon — book a demo for early access
$ cd pinchy-stack
$ cp .env.example .env
$ docker compose up -d
✓ pinchy-gateway started
✓ pinchy-web started
✓ pinchy-db started
🦞 Pinchy is running at https://pinchy.yourcompany.com
Ready for connections.
Full Control
All agent conversations, documents, and memory are stored in your own database. Back up, migrate, or delete — your call.
Connect any LLM provider or run local models. No vendor lock-in. Switch models without changing agents.
Planned: Role-based permissions, SSO integration, per-agent access control. You'll define who can do what.
Pull new versions when you're ready. Test in staging first. Roll back if needed. No forced updates.
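The update flow above maps onto standard Docker Compose commands; the staging and rollback steps are a sketch, and the image tag shown is illustrative rather than a real Pinchy release.

```shell
# Fetch the new images, then recreate only the containers that changed
docker compose pull
docker compose up -d

# To roll back: pin the previous image tag in your compose file
# (e.g. image: pinchy/gateway:<previous-tag>), then recreate
docker compose up -d --force-recreate
```

Because the database lives in your own stack, you can snapshot it before any upgrade and restore it if the new version misbehaves.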
Book a call — let's talk about your AI agent needs and how Pinchy can help.
Book a Call →
Or email us: hey@clemenshelm.com