Day 14: The Demo That Broke Everything
A live demo, an Anthropic outage, and two GitHub issues filed before the call was over.
Nothing like a live demo to find bugs
I had a demo call with Elliot this afternoon. First real live demo of Pinchy with someone watching.
Things that broke:
First, Anthropic had an outage. Right during the demo. So I switched to OpenAI on the fly. That's when I discovered: Pinchy's custom tools don't work with OpenAI models. The tool descriptions were written in a way that Claude understood perfectly but GPT-4 couldn't figure out. Specifically, the file browsing tools — GPT didn't discover file paths correctly because the descriptions assumed Claude-style reasoning. Issue #4.
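To make the failure concrete, here's a minimal sketch of the kind of rewrite Issue #4 calls for — a file-browsing tool description made provider-agnostic by spelling out the path format instead of leaning on the model to infer it. All names and shapes here (`ToolSpec`, `browse_files`) are illustrative, not Pinchy's actual API:

```typescript
// Illustrative tool-definition shape; not Pinchy's real types.
interface ToolSpec {
  name: string;
  description: string;
  parameters: Record<string, { type: string; description: string }>;
}

// Before: terse description that relies on the model to infer path semantics.
const before: ToolSpec = {
  name: "browse_files",
  description: "Browse the workspace.",
  parameters: { path: { type: "string", description: "The path." } },
};

// After: an explicit contract — path format, an example, and the root case —
// so any model can construct a valid call without Claude-style guessing.
const after: ToolSpec = {
  name: "browse_files",
  description:
    "List files and directories in the user's workspace. " +
    "Call with a POSIX-style relative path; use '.' to list the root.",
  parameters: {
    path: {
      type: "string",
      description:
        "Relative path from the workspace root, e.g. 'src' or 'docs/notes'. " +
        "Never an absolute path. Use '.' for the top level.",
    },
  },
};
```

The fix isn't clever: it's moving everything the model needs out of implied reasoning and into the description itself.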
Then: switching providers revealed that the model list doesn't update properly. You add OpenAI, but the model dropdown still shows only Anthropic models until you hard-refresh. Issue #5.
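One plausible shape for the Issue #5 fix is keying the cached model list on the set of configured providers, so adding OpenAI busts the stale Anthropic-only list without a hard refresh. This is a hypothetical sketch — the names, model IDs, and caching scheme are mine, not Pinchy's:

```typescript
// Illustrative provider/model data; real IDs and storage would differ.
type Provider = "anthropic" | "openai";

const MODELS: Record<Provider, string[]> = {
  anthropic: ["claude-sonnet-4"],
  openai: ["gpt-4"],
};

let cacheKey = "";
let cachedModels: string[] = [];

function listModels(providers: Provider[]): string[] {
  // Derive the cache key from the provider set: any change to the set
  // invalidates the cached list automatically.
  const key = [...providers].sort().join("|");
  if (key !== cacheKey) {
    cacheKey = key;
    cachedModels = providers.flatMap((p) => MODELS[p]);
  }
  return cachedModels;
}
```

With this shape, the dropdown just calls `listModels` with the current provider set; there's no separate "refresh the cache" step to forget.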
I was filing GitHub issues during the demo. Which is exactly as embarrassing and exactly as useful as it sounds.
The real work: onboarding and context
The demo also made something obvious: first impressions matter. When I walked Elliot through creating his first agent, the onboarding flow felt clunky. Too many steps, not enough guidance.
So I started rebuilding it. The PR #2 branch got a major push today:
- Per-user context injection — shared agents now get your personal context via extraSystemPrompt, so they know who they're talking to without duplicating USER.md
- Streamlined onboarding — down to four essential details instead of a wall of options
- Smithers asks better questions — the default agent now runs a brief onboarding interview to learn about you, then hints at Settings for deeper customization
- Setup flow fixes — redirects work properly, greeting text cleaned up, pinchy-files config improved
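The per-user context injection above can be sketched roughly like this — a shared agent keeps one base system prompt, and the current user's context is appended at call time as an extra system prompt instead of being copied into every agent. The function and shapes are illustrative assumptions, not Pinchy's actual code:

```typescript
// Illustrative shapes; Pinchy's real agent/user types will differ.
interface Agent { name: string; systemPrompt: string; }
interface UserContext { name: string; notes: string; }

function buildPrompt(agent: Agent, user?: UserContext): string {
  if (!user) return agent.systemPrompt;
  // Personal context is injected per request, per user — the shared agent
  // itself stores nothing user-specific, so USER.md isn't duplicated.
  const extraSystemPrompt =
    `You are talking to ${user.name}. Context about them:\n${user.notes}`;
  return `${agent.systemPrompt}\n\n${extraSystemPrompt}`;
}
```

The design point is that sharing an agent shares only its behavior; who it's talking to stays a runtime concern.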
4 commits, 9 files, 249 lines
Not a huge commit day by the numbers. But the demo call changed how I think about priorities.
Here's what I'm taking away: demos are the best debugging tool. Better than unit tests, better than staging environments, better than code review. Nothing reveals gaps like someone else using your product in real time.
Without the Anthropic outage, I'd never have switched to OpenAI mid-session. And without that switch, I'd never have found the tool compatibility bug. In my normal workflow, everything runs on Claude and just works. But real users switch providers. Real users hit edge cases. Real users find the bugs you'd never think to test for.
Tomorrow I'll fix the issues from today. The OpenAI tool descriptions need rewriting, and the model cache needs proper invalidation. But today was about discovering what's broken, not about fixing it all at once.
What's next
Fix the demo bugs: OpenAI tool compatibility and model list cache invalidation. Merge PR #2 with the onboarding overhaul. And schedule more demos.
Day 15: The Fix-Everything Day →
Follow the build: github.com/heypinchy/pinchy