
Day 58: v0.4.0 Is Out

v0.4.0 shipped. The tag is cut, the GHCR image is pushed, the blog post is queued, the docs are updated. What I did not expect to spend the last few hours on was the Ollama Cloud integration — which, on paper, had been "done" for weeks. It turned out to have three bugs stacked on top of each other, each one invisible until you fixed the one above it.

The Usage Dashboard Showed Nothing

I sent a test message through an Ollama Cloud model, waited for the usage poller to tick, and checked the dashboard. "No usage data available." I'd just had a real conversation that definitely cost tokens. The dashboard was supposed to be the centerpiece of the release. It was showing empty.

Digging in: OpenClaw's session entries had no inputTokens or outputTokens for the Ollama Cloud request. The poller found nothing, so it recorded nothing. The streaming response had simply never included a final usage chunk.

Ollama Cloud's /v1/chat/completions endpoint only emits that chunk when the request carries stream_options: { include_usage: true }. OpenClaw's auto-detection for configured non-OpenAI endpoints defaults supportsUsageInStreaming to false, so the flag was never sent. Ollama Cloud silently dropped the usage data, OpenClaw had nothing to record, and the dashboard told the truth: no data.

Fix: set compat: { supportsUsageInStreaming: true } on every Ollama Cloud model in the generated OpenClaw config. A guard test now locks the flag onto every entry, so a future refactor of the shared model list can't silently drop it again. I verified the behavior live by calling the upstream endpoint directly, with and without the flag, and comparing the streams. Without it, the stream ends after finish_reason: "stop". With it, a final chunk carries prompt_tokens, completion_tokens, and total_tokens.
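As a sketch of what changed on the wire: the helper and payload shape below are illustrative, assuming a standard OpenAI-compatible chat-completions body, not OpenClaw's actual code.

```typescript
interface StreamOptions {
  include_usage: boolean;
}

interface ChatRequestBody {
  model: string;
  messages: { role: string; content: string }[];
  stream: boolean;
  stream_options?: StreamOptions;
}

// Hypothetical helper: builds the streaming request body, gated on the
// per-model compat flag.
function buildBody(model: string, supportsUsageInStreaming: boolean): ChatRequestBody {
  const body: ChatRequestBody = {
    model,
    messages: [{ role: "user", content: "hi" }],
    stream: true,
  };
  // Without this flag, Ollama Cloud ends the stream after the
  // finish_reason chunk and never sends token counts.
  if (supportsUsageInStreaming) {
    body.stream_options = { include_usage: true };
  }
  return body;
}
```

With the flag set, the last SSE chunk carries a usage object alongside an empty choices array; the poller only has something to record in that case.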

Gemma 4 Was Secretly Text-Only

The next finding looked cosmetic until it wasn't. While reviewing the model configs for release, I noticed gemma4 was marked text-only in Pinchy's UI, even though ollama.com/library/gemma4 lists "Text, Image." A broader audit of OpenClaw's model definition schema surfaced three per-model fields Pinchy had never been setting: reasoning, input, and cost.

input is the actual vision gate. Without it, multimodal models — gemma4, devstral-small-2, the qwen3-vl variants — were effectively text-only in the UI, regardless of what the model could actually do. reasoning was missing entirely, so thinking-capable models couldn't advertise themselves to the runtime and the UI had no capability to branch on. cost is required by the schema; Ollama Cloud bills by subscription plan (Free, Pro, Max) rather than per token, so the honest value is zero — but zero deliberately set beats zero defaulted-to because nobody knew the field existed.
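A minimal sketch of what a fully-specified entry might look like. The field names follow the post; the exact shapes (booleans, arrays, cost structure) are my assumptions, not the real OpenClaw schema.

```typescript
type Modality = "text" | "image";

// Hypothetical per-model definition with the three previously-missing fields.
interface ModelDef {
  id: string;
  reasoning: boolean; // can the model advertise thinking traces?
  input: Modality[];  // the actual vision gate
  cost: { input: number; output: number }; // zero: Ollama Cloud bills by plan
}

const gemma4: ModelDef = {
  id: "gemma4",
  reasoning: false,
  input: ["text", "image"], // per ollama.com/library/gemma4: "Text, Image"
  cost: { input: 0, output: 0 }, // deliberately zero, not defaulted-to zero
};
```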

Verified every flag against ollama.com/library/<name> rather than the aggregate search pages, which under-report vision and thinking tags in the cloud filter. Regression guards added to the config tests so a future model can't land with reasoning or input missing. Also refactored model-vision.ts to derive the cloud vision set from the shared constant via exact match, instead of prefix-matching a hand-maintained list — prefix matching is the kind of mechanism that works until one day a vendor ships qwen3.5-coder-next and silently inherits a flag from a cousin.
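The difference between the two matching strategies can be shown in a few lines, with illustrative model names standing in for the real shared constant:

```typescript
// Exact-match approach: derive the vision set from one shared constant.
const CLOUD_VISION_MODELS = new Set(["gemma4", "qwen3-vl:235b", "qwen3-vl:8b"]);

function isVisionModelExact(id: string): boolean {
  return CLOUD_VISION_MODELS.has(id);
}

// Old approach: a hand-maintained prefix list. Any id that merely starts
// with a listed prefix inherits the vision flag.
const VISION_PREFIXES = ["gemma4", "qwen3"];

function isVisionModelPrefix(id: string): boolean {
  return VISION_PREFIXES.some((p) => id.startsWith(p));
}
```

A hypothetical future model like "qwen3.5-coder-next" would match the "qwen3" prefix and silently inherit vision from a cousin; the exact-match set rejects it unless someone adds it on purpose.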

The ":cloud" Suffix That Wasn't

The last one was the worst, because it meant no Ollama Cloud models were showing up at all. At some point during the v0.4.x cycle, Ollama Cloud dropped the :cloud / -cloud suffixes from model IDs in its /v1/models endpoint. Pinchy's allowlist still required the old suffixes, so fetchProviderModels() filtered every returned model out. The provider surfaced with models: []. Admins who added Ollama Cloud as a provider saw only their other providers in the agent model picker.

Updated three production surfaces to match the live API — the allowlist, the fallback list, and the hardcoded config baked into openclaw.json. Then, since the fetch now worked, took the opportunity to expand the curated list from four hand-picked models to the 31 tool-capable models Ollama actually publishes. Verified each one against its library page for the tools capability tag; kept non-tool-capable ones filtered so agents never silently pick a model that drops tool calls.
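The failure mode reduces to a set-membership filter. Sketched here with illustrative IDs and allowlists (not the real curated list):

```typescript
// Allowlist as it stood: still expecting the legacy suffixes.
const OLD_ALLOWLIST = new Set(["gemma4:cloud", "qwen3-vl:235b-cloud"]);
// Allowlist after the fix: bare IDs, matching the live /v1/models response.
const NEW_ALLOWLIST = new Set(["gemma4", "qwen3-vl:235b"]);

// Simplified stand-in for the filtering inside fetchProviderModels().
function filterModels(fetched: string[], allowlist: Set<string>): string[] {
  return fetched.filter((id) => allowlist.has(id));
}

// What the live API returns after Ollama Cloud dropped the suffixes:
const live = ["gemma4", "qwen3-vl:235b"];
```

Run the live response through the stale allowlist and every model is filtered out, which is exactly the empty picker admins were seeing.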

And while going through every library page, fixed the context windows. The previous default of 128k was "conservative" for big models but dangerously over-generous for rnj-1:8b, which only supports 32k. We would have cheerfully pushed 4x the real window at it and gotten silent failures. All 31 models now have their published context window written down, with a test that locks the groupings so a wrong value can't land quietly.
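The lookup itself is just a map with an explicit fallback. The rnj-1:8b value comes from the text above; the 128k-class default and the elided entries are placeholders:

```typescript
// Per-model context windows, written down instead of assumed.
const CONTEXT_WINDOWS: Record<string, number> = {
  "rnj-1:8b": 32_768, // the model the blanket 128k default would have broken
  // ...the other 30 models, each taken from its library page
};

const DEFAULT_CONTEXT = 131_072; // the old one-size-fits-all value

function contextWindow(id: string): number {
  return CONTEXT_WINDOWS[id] ?? DEFAULT_CONTEXT;
}
```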

Post-Release: The Site Doesn't Match the Product

Release day isn't finished at git tag. I spent the rest of it clicking through heypinchy.com the way a prospect would, and the site was full of claims that hadn't caught up with reality.

The obvious ones were easy. v0.4.0 "in active development" dev-banners still plastered across pages. Old OpenClaw branding in headings and URLs. "Slack" in feature tables that should have been gone weeks ago. Dropped the banners, added a trust strip (147 stars, 18 forks, AGPL-3.0, built in public), rewrote the enterprise OSS/Ent table with real v0.4.0 features, added a three-tier pricing band, wrote two new comparison pages (Open WebUI, LibreChat). 301 redirects for the renamed /openclaw-* URLs so nothing that was indexed dies.

The less obvious ones I only caught because a review came back saying "your How It Works steps are wrong." They were. The homepage's three-step flow claimed you needed to install a license key before chatting. You don't. OSS Pinchy works without one — the wizard creates an admin, you add a provider API key, you talk to Smithers. The copy had been aspirational since the key-first days and nobody had updated it.

That prompted a second pass, this time against the actual Pinchy docs in the private repo, which surfaced a dozen more drifted claims.

Same pattern as the Ollama Cloud bugs, one layer up. The code ran. The tests passed. The lies were in the gap between what the marketing said we'd done and what we'd actually done. The only way to catch them was to read both halves side by side.

Day 58

None of these bugs showed up in CI. All of them needed a human to use the product — and, for the marketing ones, to read the product next to the docs. Usage dashboard empty after a test message. Vision flag missing on a model I'd just seen a screenshot of. Model picker with zero options after adding a provider. "RBAC + SSO" on a page selling a tool that has neither. That's why dogfooding matters in the last week before a release — it's the only thing that catches the lies a passing test suite is comfortable letting through.

v0.4.0 is out. Odoo templates. Telegram per agent. Ollama Cloud, now actually working. Usage dashboard, now with numbers in it. Audit trail v2 (per-row HMAC, not a chain — the website now agrees). GHCR images. Knowledge base. Insecure-mode banner. A lot of surface area for one release — and one fewer day of "coming soon" on the website.

← Day 57: A UI That Stops Lying Day 59: Three Releases Before Lunch →

Pinchy is open source and ready to deploy. Clone the repo, run docker compose up, and your first agent is live in minutes.