
Day 89: Two Weeks of Production

Sunday. No commits today, in any repo. The kind of day that turns up in a build-in-public log as a gap unless I write down what has been on my mind for the week instead. The honest answer is that two weeks of actually using Pinchy myself (not the team's dev deployment, but the production one running my real bookkeeping) has done more to shape the next stretch of work than any feature request from outside. Today's post is about what that has surfaced.

Eating What I Cook

The Odoo integration has been carrying my own books for about two weeks now, in the kind of daily-use shape where a misbehaving tool isn't a Slack DM to fix later — it's a receipt I have to book before the end of the month. That changes which bugs feel small. The quote-wrapped-key fixes that landed on Day 87 are an example: I noticed them because my own odoo_create calls were failing in a way I couldn't ignore, and the fix took an evening rather than the week it might have taken at a remove. The compact-schema split into odoo_list_models + odoo_describe_model that's queued for v0.5.4 is another — the original odoo_schema tool was burning ~18 kB of context per model on every read, and the cost of that was the agents in my own setup failing on the third-or-fourth schema-discovery call. The fix is one I'd have queued behind something more visible at a remove; using it daily, it moved to the front.
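A rough sketch of why the split helps, in TypeScript. This is not Pinchy's implementation: the `Schema` shape, the function names `listModels` and `describeModel`, the compact `name: type` output format, and the sample field metadata are all assumptions made up for illustration. The idea is only that a cheap first call returns model names, and a targeted second call returns one model in a compact form instead of the full metadata.

```typescript
// Hypothetical in-memory schema. The field metadata below is illustrative,
// not copied from a real Odoo instance, and real metadata is far larger.
interface FieldInfo {
  type: string;
  required: boolean;
  help: string;
}

type Schema = Record<string, Record<string, FieldInfo>>;

const schema: Schema = {
  "account.move": {
    partner_id: { type: "many2one", required: true, help: "Customer or vendor on the entry" },
    invoice_date: { type: "date", required: false, help: "Date printed on the invoice" },
  },
  "res.partner": {
    name: { type: "char", required: true, help: "Display name of the contact" },
  },
};

// Cheap first call: model names only, no field metadata at all.
function listModels(s: Schema): string[] {
  return Object.keys(s);
}

// Targeted second call: one model, one compact line per field.
// Required fields are marked with a trailing "*"; help text is dropped.
function describeModel(s: Schema, model: string): string[] {
  const fields = s[model];
  if (!fields) {
    throw new Error(`unknown model: ${model}`);
  }
  return Object.keys(fields).map((name) => {
    const field = fields[name];
    return `${name}: ${field.type}${field.required ? "*" : ""}`;
  });
}
```

An agent that only needs to know which models exist never pays for field metadata, and one that needs a single model pays for that model alone.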

The attachment work from Day 81 and Day 82 is the longer version of the same story. On Day 70-something, uploading a PDF to the chat composer didn't work. Not badly: it didn't work at all. Two weeks of "I want to attach this receipt to the Bookkeeper, right now" ran into that wall every day. By the time Day 81 and Day 82 came around, the question had stopped being "should we ship file attachments?" and started being "what's the smallest thing I can ship that makes my own daily workflow not suck?" The PDF-preview modal, the authenticated GET on uploads, the OpenClaw-side workspace path: none of that was a roadmap item. It was what the friction surfaced once the friction was mine to bear.

Customers Report It's Easy

The other thing two weeks have surfaced is that customer calls over this stretch carry a consistent piece of feedback, and it's not the one I expected. The thing people say back is that Pinchy is easy to use. Not powerful, not open, not fast: the word that keeps coming up is some variant of "I could figure out what to do." The investments that paid off there are the ones that didn't have their own release notes: the chat error bubbles that name what went wrong and offer a path forward (Day 81), the silent stream-end fixes that turned "where did my message go?" into "retry, here's what happened" (Day 83), the dispatch-coverage system that catches a class of "this tool isn't actually wired up" bug before any customer hits it (Day 87). None of those are features; the absence of any one of them is what a UX consultant would have flagged as the thing making the product feel sharp-edged.

Two weeks of production use confirming the UX investments is what makes me feel safe spending the next stretch on the harder questions.

The Multi-Integration Agent Question

Here's the thing the calls are about that doesn't have a fix yet. Every agent template Pinchy ships handles exactly one integration. The Bookkeeper has Odoo permissions and Odoo-shaped instructions. The Inbox Triage agent (when it ships) will have email permissions and email-shaped instructions. The shape works as long as a user's actual workflow is also single-integration. It often isn't. The customer who wants a Bookkeeper that can also draft an email to the supplier when an invoice arrives needs both Odoo and email permissions, plus instructions that know which one to reach for in which situation.

Adding the permissions is the easy part — Pinchy's permission model has been designed for exactly this. The hard part is the instructions. A user editing an agent's system prompt by hand to bolt email guidance onto a Bookkeeper template is doing work they shouldn't have to do, and the result is going to be worse than the single-integration starting point unless they get the prompt-engineering right. The shape that's been circling in my head is a second agent: an admin-only agent editor that can examine an existing agent, ask the admin what they want it to additionally do, and produce the permission diff plus the instruction diff in one move. The admin reviews the diff; the changes apply; the original agent gets the new capability with instructions that were written, not bolted on.

It's an editor-as-agent pattern, which is the shape of the work itself — you don't ask a Bookkeeper to also do email, you ask a meta-agent to teach the Bookkeeper to also do email. I'm not sure yet whether that's the right architecture or whether it's the first idea that seemed plausible. The fact that it would be admin-only by default solves one class of safety problem (a Member can't escalate an agent past their own permissions) but creates another (how does the meta-agent itself authenticate against the things it's wiring up). Two weeks more of thinking before this becomes a PR.
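A minimal sketch of what "permission diff plus instruction diff in one move" could look like as data, in TypeScript. Nothing here is Pinchy's actual API: `AgentConfig`, `AgentChangeProposal`, `proposeChange`, `applyChange`, and the permission strings are all hypothetical. The sketch also assumes the escalation rule mentioned above can be expressed as "a reviewer can only grant permissions they hold themselves".

```typescript
// Hypothetical shapes; permission strings like "odoo:read" are made up.
type Permission = string;

interface AgentConfig {
  name: string;
  permissions: Permission[];
  instructions: string;
}

interface AgentChangeProposal {
  addPermissions: Permission[];
  removePermissions: Permission[];
  // The full rewritten prompt, not a snippet appended to the old one.
  newInstructions: string;
}

// Build the diff an admin would review before anything is applied.
function proposeChange(current: AgentConfig, desired: AgentConfig): AgentChangeProposal {
  return {
    addPermissions: desired.permissions.filter((p) => !current.permissions.includes(p)),
    removePermissions: current.permissions.filter((p) => !desired.permissions.includes(p)),
    newInstructions: desired.instructions,
  };
}

// Apply the proposal only if the reviewer holds every permission being added.
function applyChange(
  current: AgentConfig,
  proposal: AgentChangeProposal,
  reviewerPermissions: Permission[],
): AgentConfig {
  const blocked = proposal.addPermissions.filter((p) => !reviewerPermissions.includes(p));
  if (blocked.length > 0) {
    throw new Error(`reviewer cannot grant: ${blocked.join(", ")}`);
  }
  const kept = current.permissions.filter((p) => !proposal.removePermissions.includes(p));
  return {
    name: current.name,
    permissions: [...kept, ...proposal.addPermissions],
    instructions: proposal.newInstructions,
  };
}
```

The sketch leaves the second safety question open: how the meta-agent itself authenticates against the integrations it wires up is not modelled here.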

The Task Model Question

The other architecture question this stretch has surfaced is about sessions. Pinchy today has one long-running session per agent, compacted as it grows. That model is simple to explain and simple to use (open the chat, see history, continue), but it has a failure mode I've now hit several times in my own workflow. When the session is long, the compaction is doing real work to keep the context budget under control, and the agent's behaviour starts drifting in ways that map roughly to "the early context that's now gone was load-bearing." The fix the agent ergonomics literature recommends is shorter scopes: a session per task rather than a session per agent.

What I haven't figured out yet is the UI shape for that. "New chat" as a button is a non-answer; people are conditioned by every other chat product to treat that as a thread split with no semantic meaning. "New task" is closer to the right word: a task is a thing with a goal, a beginning, and an end. But the friction of "start a new task" every time the user wants the agent to do something is exactly the friction the single-session model was designed to remove. The right answer probably involves the agent deciding when a task is done and quietly closing the session rather than asking the user to do it, which is its own design problem.

There's a parallel thread here that ties into the background-jobs work, which has been on the roadmap for a while and lives on a branch I haven't merged. A scheduled daily inbox triage at 9am is structurally the same shape as a manual "triage my inbox now" task: both are a scoped run with a goal, a clear end, and an audit trail. If the right model is tasks, manual and background are two ways to start the same thing, and the agent's task list (which has lived in my head as the chat history) becomes a real first-class surface. None of this is set in stone yet; I want to live with the open question for another week before reaching for a design.
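One way to picture "manual and background are two ways to start the same thing" is a single task record whose only difference is the trigger. This is a sketch under that assumption, not a design: `Task`, `TaskTrigger`, `startTask`, and `finishTask` are hypothetical names, and the audit trail is reduced to a list of strings.

```typescript
// Hypothetical task model: the trigger is the only thing that differs
// between a manual run and a scheduled one.
type TaskTrigger =
  | { kind: "manual"; userId: string }
  | { kind: "scheduled"; cron: string };

type TaskStatus = "running" | "done";

interface Task {
  id: number;
  agent: string;
  goal: string;
  trigger: TaskTrigger;
  status: TaskStatus;
  audit: string[];
}

let nextTaskId = 1;

// Start a scoped run with a goal; the first audit entry records who or what started it.
function startTask(agent: string, goal: string, trigger: TaskTrigger): Task {
  const origin =
    trigger.kind === "manual" ? `user ${trigger.userId}` : `schedule ${trigger.cron}`;
  return {
    id: nextTaskId++,
    agent,
    goal,
    trigger,
    status: "running",
    audit: [`started by ${origin}`],
  };
}

// Close the run and append the outcome to the audit trail.
function finishTask(task: Task, summary: string): Task {
  return { ...task, status: "done", audit: [...task.audit, `finished: ${summary}`] };
}

// The same goal, started two ways ("0 9 * * *" is cron for 9am daily).
const manualRun = startTask("Inbox Triage", "triage my inbox", { kind: "manual", userId: "admin" });
const scheduledRun = startTask("Inbox Triage", "triage my inbox", { kind: "scheduled", cron: "0 9 * * *" });
```

Under this shape the task list is just the collection of these records, whichever way each one was started.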

MCP, and the Trade-Off It Carries

The other thing in flight that hasn't shown up in the daily posts yet is the MCP integration work, which has been living on a feature branch for the last week or two. Model Context Protocol is the standard a lot of the AI ecosystem is settling on for tool integrations, and getting Pinchy to speak it means a long tail of third-party connections — Atlassian, GitLab, Stripe, Cloudflare, Intercom, HighLevel, Notion, Linear, GitHub — become available in a single batch rather than one Pinchy-native plugin at a time.

The trade-off MCP carries is the one I've been chewing on. Pinchy-native plugins have a permission model I can be specific about: this agent can read invoices but not write them, this agent can read its own emails but not send. MCP exposes a flat list of tools per server, and the granularity is "this tool is on, or off." For a customer whose security posture wants "this Jira integration can comment but not transition tickets," MCP alone doesn't deliver. The shape I'm landing on is that MCP is the default path for breadth, and the offer to customers who need a deeper integration is that I build the matching Pinchy-native plugin for them. The MCP version is on for everyone the day v0.5.5 ships; the Pinchy-native version is a conversation we have when the permission model needs to be sharper than MCP can be.
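The granularity gap can be sketched in a few lines of TypeScript. Everything here is hypothetical: `McpServerPolicy`, `NativeRule`, and the tool name `update_issue` are made up for illustration and do not describe any real MCP server's tool list or Pinchy's permission model. The sketch assumes a server that bundles commenting and transitioning into one tool, which is the case where a per-tool switch cannot express the customer's rule.

```typescript
// MCP-style policy (hypothetical): a per-server allowlist of tool names.
// A tool is either on or off; there is nothing finer to configure.
interface McpServerPolicy {
  server: string;
  enabledTools: string[];
}

function mcpAllows(policy: McpServerPolicy, tool: string): boolean {
  return policy.enabledTools.includes(tool);
}

// Native-style policy (hypothetical): rules name a resource and the
// individual actions allowed on it.
interface NativeRule {
  resource: string;
  actions: string[];
}

function nativeAllows(rules: NativeRule[], resource: string, action: string): boolean {
  return rules.some((rule) => rule.resource === resource && rule.actions.includes(action));
}

// Assumed server with one made-up tool that both comments and transitions.
const trackerPolicy: McpServerPolicy = { server: "issue-tracker", enabledTools: ["update_issue"] };

// The native rule can say "comment yes, transition no".
const trackerRules: NativeRule[] = [{ resource: "ticket", actions: ["comment"] }];
```

With `trackerPolicy`, enabling `update_issue` permits both actions and disabling it permits neither; `trackerRules` is the only one of the two that can separate them.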

Day 89

The week ahead is v0.5.4, and the scope is already what it's going to be. Today is the last quiet day before that push starts. The three threads above — multi-integration agents, the task model, MCP's permission ceiling — are the ones I'd be writing release notes against in two or three months if I'm picking the right next problems. Two weeks of production use is what made me confident that's where the work goes. The shape of building-in-public this far in is that the days when nothing ships are sometimes the days where the most useful clarification happens.

