Question 1

Does Pinchy support Ollama?

Accepted Answer

Yes. Pinchy supports Ollama as a first-class model provider, for both local Ollama installations and Ollama Cloud. You pick the model per agent — a local Qwen or Llama for sensitive work, a hosted frontier model for another agent, the choice is per-agent.

Question 2

Can I run Pinchy fully offline with Ollama?

Accepted Answer

Yes. Pair Pinchy with local Ollama and nothing leaves your network. No external API calls, no cloud dependency, no telemetry leak. For regulated industries or air-gapped environments, this is the whole point.

Question 3

What's the difference between local Ollama and Ollama Cloud?

Accepted Answer

Local Ollama runs models on your own hardware — your CPU or GPU, your latency, your limits. Ollama Cloud hosts models on Ollama's infrastructure with an API contract very similar to the local runtime. Pinchy talks to both through the same provider; you switch by changing a base URL.

Question 4

Can I mix Ollama with other providers?

Accepted Answer

Yes. Different agents can use different providers. A finance agent might use local Qwen via Ollama, a customer-facing agent might use Claude, an internal writing helper might use GPT. Pinchy is model-agnostic per agent.

Ollama — Local, Cloud, or Both

Local Models Stop Being a Compromise

Local Ollama or Ollama Cloud

💻 Local Ollama

☁️ Ollama Cloud

🔁 Per-Agent Choice

Self-Hosting Friendly

Related

Ollama — Local, Cloud, or Both

Local Models Stop Being a Compromise

Local Ollama or Ollama Cloud

💻 Local Ollama

☁️ Ollama Cloud

🔁 Per-Agent Choice

Self-Hosting Friendly

Related

Self-Hosted AI Agents

GDPR-Compliant AI

All Integrations

Cost Control