
Day 79: Personal Means Personal

Thursday. The commit list reads as two stories that turn out to be one. The first is a security fix in a single helper function. The second is the long E2E pass that's been running in parallel for two days, the one that landed the kind of test that surfaces this class of bug. The fix is a one-liner either way; the test is the reason anyone knew the one-liner was needed.

The Admin Fast-Path Was Too Fast

The bug. assertAgentAccess is the helper every /api/agents/:agentId handler runs as its first line: it takes the session, the agent ID, and the requested mutation, and returns either the agent record or a 403. The intent is that workspace admins can read or modify any shared agent in their workspace, and regular users can only touch agents that are visible to them by group membership or by being the owner. Personal agents (created with visibility = personal, and shown in the UI under Your agents) are meant to be owner-only, regardless of the requester's role.

The implementation was checking the requester's role first as a fast-path, then falling through to the per-agent visibility check for non-admins. The fast-path was missing one branch: if this is a personal agent and the requester isn't the owner, deny — even if the requester is an admin. Without the branch, an admin who guessed (or harvested from an audit log) another user's personal-agent ID could fetch its full record, patch its system prompt, or delete it. The UI never showed those agents to anyone but the owner, so the bug had been invisible on every screen Pinchy actually renders. It was visible, narrowly, on a direct curl to the route.

The fix is one line of code and two tests. The line: an isPersonal && ownerId !== requesterId check ahead of the admin fast-path. The first test is a unit-level assertion against assertAgentAccess directly: with the new branch removed, it fails; with the branch in place, it passes. The second test is an E2E spec that drives a real two-user fixture: the admin logs in, requests the other user's personal agent through the API, and expects a 403. The unit test will catch a future refactor that removes the branch; the E2E test will catch a future refactor that adds a new code path that reintroduces the same hole.

One small follow-up landed in the same PR: an additional regression test for the case where an admin deletes their own personal agent, which is supposed to work, because the boundary is the owner, not the role. The failure mode the new branch could introduce is over-correction; the test pins down that owners, admin or not, keep full access to their own personal agents.
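For readers who want the shape of the check in front of them, here is a minimal sketch of the ordering described above. It is not Pinchy's actual code: the Session and Agent shapes, the groupIds fields, and the "groups" and "all" visibility values are assumptions made for illustration (only "personal" comes from this post), and the sketch takes an already-loaded agent record rather than an agent ID and a mutation.

```typescript
// Hypothetical shapes; only the "personal" visibility value is taken from the post.
type Role = "admin" | "member";
type Visibility = "personal" | "groups" | "all";

interface Session { userId: string; role: Role; groupIds: string[] }
interface Agent { id: string; ownerId: string; visibility: Visibility; groupIds: string[] }

type AccessResult = { ok: true; agent: Agent } | { ok: false; status: 403 };

function assertAgentAccess(session: Session, agent: Agent): AccessResult {
  const isOwner = agent.ownerId === session.userId;

  // The new branch: personal agents are owner-only. It runs before any
  // role check, so the admin fast-path below cannot bypass it.
  if (agent.visibility === "personal" && !isOwner) {
    return { ok: false, status: 403 };
  }

  // Admin fast-path: everything that reaches this point is either shared
  // or owned by the requester.
  if (session.role === "admin") return { ok: true, agent };

  // Non-admins: the owner, anyone if the agent is visible to all users,
  // or a member of a group the agent is shared with.
  const viaGroup =
    agent.visibility === "groups" &&
    agent.groupIds.some((g) => session.groupIds.includes(g));
  if (isOwner || agent.visibility === "all" || viaGroup) return { ok: true, agent };

  return { ok: false, status: 403 };
}

// The two cases the tests pin down: an admin is denied on someone else's
// personal agent, and allowed on their own.
const admin: Session = { userId: "u-admin", role: "admin", groupIds: [] };
const othersPersonal: Agent = { id: "a1", ownerId: "u-other", visibility: "personal", groupIds: [] };
const ownPersonal: Agent = { id: "a2", ownerId: "u-admin", visibility: "personal", groupIds: [] };

const denied = assertAgentAccess(admin, othersPersonal).ok; // false
const allowed = assertAgentAccess(admin, ownPersonal).ok; // true
```

The point of the ordering is that the owner check depends only on the agent and the requester's ID, never on the role, so a later change to the role logic cannot widen access to personal agents.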

The Two-User E2E Fixture

The security fix only existed because the two-user fixture existed. Until this week, Pinchy's E2E suite was single-user: a Playwright project with one logged-in admin, asserting screens render the way they should. A two-user suite is structurally different — you need a second authenticated context, a way to switch between them, a way to assert that user A cannot see what user B can. The week's earlier commits laid the foundation: a createSecondUser helper, a loginAs helper that drops the existing session cookies before logging in fresh, a smoke spec that asserts the helpers themselves don't regress.

Today's commits stack the actual coverage on top. The agent permissions two-user boundary spec: user A creates a private agent, user B opens the agents page and the agent isn't there. The matching API-level assertion: B's GET /api/agents/:id on A's agent returns 403. Then the inversion — A flips the visibility to All users, and B's request starts succeeding. The shape of the spec is what every access-control test should be: the boundary changes only when the configuration changes, and the test asserts both halves.
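The "assert both halves" shape can be sketched independently of the test runner. The version below is an illustration under assumptions, not the spec that landed: UserApi is a hypothetical stand-in for each user's authenticated request client, and "all" is a guessed wire value for the All users option.

```typescript
// Hypothetical stand-in for an authenticated API client. In a Playwright
// suite, each user's own request context would play this role.
interface UserApi {
  get(path: string): Promise<{ status: number }>;
  patch(path: string, body: Record<string, unknown>): Promise<{ status: number }>;
}

// Asserts both halves of the boundary: denied before the configuration
// change, allowed after it. Throws on the first mismatch.
async function expectVisibilityBoundary(
  owner: UserApi,
  other: UserApi,
  agentId: string,
): Promise<void> {
  const path = `/api/agents/${agentId}`;

  // Half one: with the default configuration, the other user is locked out.
  const before = await other.get(path);
  if (before.status !== 403) {
    throw new Error(`expected 403 before the flip, got ${before.status}`);
  }

  // The only thing that changes between the two halves is the configuration.
  // "all" is an assumed value for the UI's "All users" option.
  const flip = await owner.patch(path, { visibility: "all" });
  if (flip.status !== 200) {
    throw new Error(`owner could not change visibility: ${flip.status}`);
  }

  // Half two: the same request from the same user now succeeds.
  const after = await other.get(path);
  if (after.status !== 200) {
    throw new Error(`expected 200 after the flip, got ${after.status}`);
  }
}
```

Asserting only the 403 would pass against a route that denies everyone; asserting only the 200 would pass against a route that allows everyone. Checking both in one spec is what ties the outcome to the configuration.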

Groups, Invites, Audit, Knowledge Base

The same fixture is also the substrate for the other four E2E specs that landed today. The groups CRUD spec covers the happy path (create, edit, delete) plus the error toast that should surface when the API rejects a create, which is the kind of assertion that pins down the "group create now shows an error toast" fix from a few days back. The invite create, claim, and revocation spec drives the full lifecycle: admin sends an invite, second user claims it through the magic link, admin revokes a different invite and asserts the revoked link no longer works. The audit log write, read, and authorization spec asserts both that admin-only actions appear in the log and that non-admins can't read the log endpoint at all. The knowledge base file edit spec covers both the editor save round-trip and the write-permission boundary: a viewer-role user can read the file, but the save endpoint returns 403.

A handful of the new specs needed framework-level fixes to stabilise. The restricted-visibility spec was racing the page's auth bootstrap and asserting on stale UI; the fix is to isolate the non-admin into a fresh BrowserContext and assert the boundary via the API rather than via a sidebar UI check. clearSession got reworked to drop cookies rather than navigate to a logout URL (the URL form was sometimes leaving session state behind). The audit-detail sheet click got a scoped getByText so it stops finding the same text in the row and the sheet. Each of these is a one-line fix; together they're the difference between a flaky suite and one you can leave on in CI.
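A rough sketch of the cookie-dropping clearSession and the fresh-context isolation follows. ContextLike and BrowserLike are structural subsets declared here so the snippet stands alone; clearCookies() and newContext() are the Playwright methods they mirror, while isolatedContextFor is a hypothetical helper name, not something from the repo.

```typescript
// Structural subsets of Playwright's BrowserContext and Browser, declared
// locally so the sketch has no dependencies. Playwright's real objects
// satisfy both interfaces.
interface ContextLike {
  clearCookies(): Promise<void>;
}
interface BrowserLike<C extends ContextLike> {
  newContext(): Promise<C>;
}

// Drops the session cookies directly instead of visiting a logout URL, so
// the result does not depend on a navigation or a server-side handler.
async function clearSession(context: ContextLike): Promise<void> {
  await context.clearCookies();
}

// Hypothetical helper: gives the non-admin user a context of their own.
// A new context starts with no cookies or storage, so nothing from the
// admin's session can leak into it.
async function isolatedContextFor<C extends ContextLike>(
  browser: BrowserLike<C>,
): Promise<C> {
  return browser.newContext();
}
```

With the non-admin in a separate context, the boundary assertion can go straight to the API from that context instead of waiting for the sidebar to settle after the auth bootstrap.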

Groups UI Polish

Two smaller UI fixes landed in Settings → Groups in the same window. The first: field-level validation errors are now inline (matching the rest of the app's error-display policy) rather than dumped into a toast — wrong-shape input belongs in the field, only API/server errors belong in toasts. The second: two-step group operations — create the group, then assign initial members — used to fail silently on partial success. The group would land, the member assignment would fail, the dialog would close, and the user would have a half-built group with no signal that the member assignment hadn't happened. The fix surfaces the partial failure explicitly: the dialog stays open with a state-aware error message that says the group was created but member assignment failed, with a Retry button that runs only the failed step.
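One way to model the partial-failure handling is a small result type that records which step failed, so the dialog can pick its message and the Retry button knows what to rerun. This is a sketch under assumptions: GroupApi, GroupFormState, and the function names are invented for illustration and are not taken from the Pinchy codebase.

```typescript
// Hypothetical API surface for the two steps.
interface GroupApi {
  createGroup(name: string): Promise<string>; // resolves to the new group ID
  assignMembers(groupId: string, userIds: string[]): Promise<void>;
}

// The state the dialog renders from. "members-failed" keeps the group ID,
// which is what makes a retry of only the second step possible.
type GroupFormState =
  | { kind: "done"; groupId: string }
  | { kind: "create-failed"; message: string }
  | { kind: "members-failed"; groupId: string; message: string };

type MembersFailed = Extract<GroupFormState, { kind: "members-failed" }>;

const messageOf = (e: unknown): string => (e instanceof Error ? e.message : String(e));

// Step two on its own, shared by the first attempt and by Retry.
async function assignStep(
  api: GroupApi,
  groupId: string,
  userIds: string[],
): Promise<GroupFormState> {
  try {
    await api.assignMembers(groupId, userIds);
    return { kind: "done", groupId };
  } catch (e) {
    return { kind: "members-failed", groupId, message: messageOf(e) };
  }
}

async function createGroupWithMembers(
  api: GroupApi,
  name: string,
  userIds: string[],
): Promise<GroupFormState> {
  let groupId: string;
  try {
    groupId = await api.createGroup(name);
  } catch (e) {
    return { kind: "create-failed", message: messageOf(e) };
  }
  return assignStep(api, groupId, userIds);
}

// Retry reruns only the failed step: the group already exists, so it is
// never created a second time.
function retryMembers(
  api: GroupApi,
  state: MembersFailed,
  userIds: string[],
): Promise<GroupFormState> {
  return assignStep(api, state.groupId, userIds);
}
```

Because the failure state is a distinct variant rather than a generic error, the dialog has no way to close on a half-built group without explicitly handling that case.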

Typed API Client

One refactor that's been queued for a while landed today: a typed apiPost / apiPatch helper with a structured ApiError. Before today, every settings page wrote its own fetch call, parsed the response, and hand-coded the error-toast surfacing. The helper consolidates the shape into one place: the response is parsed with the route's shared Zod schema (extracted from the route module into lib/schemas so the client can import it), the error response gets typed as ApiError with a code/message/field shape, and the helper either returns the parsed success body or throws an ApiError. settings-groups is the first page converted; the rest of the settings pages get the same treatment over the next sprint.

A small fix landed in the helper itself during the conversion: an empty 2xx response (a 204 No Content, or a 200 with no body) used to throw a JSON-parse error. The fix returns undefined for that shape and types the helper accordingly, with a regression test for the 204 case.
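Put together, the helper might look roughly like the sketch below. Several things here are assumptions rather than Pinchy's code: the createApiClient factory, the FetchLike and ResponseLike types (declared locally so the snippet has no dependencies; the platform fetch fits the same shape), and the Parser type, which stands in for a schema's parse method such as the one a Zod schema exposes. Only the apiPost / apiPatch names, the ApiError code/message/field shape, and the undefined-on-empty-body behaviour come from the post.

```typescript
// Error shape from the post: code / message / field, plus the HTTP status.
class ApiError extends Error {
  readonly status: number;
  readonly code: string;
  readonly field: string | undefined;

  constructor(status: number, code: string, message: string, field?: string) {
    super(message);
    Object.setPrototypeOf(this, ApiError.prototype); // keeps instanceof working on ES5 targets
    this.name = "ApiError";
    this.status = status;
    this.code = code;
    this.field = field;
  }
}

// Stand-in for a shared schema's parse method (assumption).
type Parser<T> = (data: unknown) => T;

interface ResponseLike {
  ok: boolean;
  status: number;
  text(): Promise<string>;
}
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<ResponseLike>;

function createApiClient(fetchImpl: FetchLike) {
  async function send<T>(
    method: "POST" | "PATCH",
    url: string,
    body: unknown,
    parse: Parser<T>,
  ): Promise<T | undefined> {
    const res = await fetchImpl(url, {
      method,
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body),
    });

    // Read as text first: a 204, or a 200 with no body, yields "" and
    // must not be handed to JSON.parse.
    const raw = await res.text();
    const payload: unknown = raw.length > 0 ? JSON.parse(raw) : undefined;

    if (!res.ok) {
      const err = (payload ?? {}) as { code?: string; message?: string; field?: string };
      throw new ApiError(res.status, err.code ?? "unknown", err.message ?? `HTTP ${res.status}`, err.field);
    }

    // Empty success bodies come back as undefined, and the return type says so.
    return payload === undefined ? undefined : parse(payload);
  }

  return {
    apiPost: <T>(url: string, body: unknown, parse: Parser<T>) => send("POST", url, body, parse),
    apiPatch: <T>(url: string, body: unknown, parse: Parser<T>) => send("PATCH", url, body, parse),
  };
}
```

A page would then catch ApiError once and route it by field: errors with a field go inline next to the input, everything else goes to a toast.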

Day 79

The week's been building toward the two-user fixture without quite saying so. The five specs that landed today wouldn't have been possible against a single-user suite; with the suite in place, each spec is a few hundred lines of well-shaped test code. The personal-agent boundary fix is the kind of thing that pays for the cost of the fixture in one shot — a vulnerability that the UI never exposed, the API never blocked, and that no individual screen could have surfaced. The next stretch is converting the rest of the access-control code to the same shape: every route's 403 path has a two-user E2E test, every fast-path has a unit test that exercises the boundary it was supposed to enforce. The boundary is the product. The tests are the boundary.

