Building tools we actually use

We don't build for hypothetical users. We build for ourselves — and then discover other developers have the same problems.

Every product we ship started the same way: something in our workflow was broken, nothing existing fixed it well enough, and we got tired of waiting for someone else to care.

Each product below pairs the frustration with what we built instead:

"Here's my API key..."

Pasting secrets into AI chats, Slack messages, and terminal prompts. Every day.

The solution

NoxKey

A macOS secrets manager with Touch ID. Secrets live in the Keychain. AI agents get encrypted references, never raw values. The secret never enters the conversation.
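A minimal sketch of that reference pattern, assuming a placeholder syntax like `{{noxkey:NAME}}` (the syntax and the `resolve` helper are illustrative, not NoxKey's actual format), with a plain dict standing in for the macOS Keychain:

```python
import re

# Matches opaque references such as {{noxkey:STRIPE_KEY}}.
# Hypothetical syntax, used here only to illustrate the pattern.
REF = re.compile(r"\{\{noxkey:([A-Za-z0-9_]+)\}\}")

def resolve(text: str, keychain: dict) -> str:
    """Swap opaque references for raw values only at execution time."""
    return REF.sub(lambda m: keychain[m.group(1)], text)

keychain = {"STRIPE_KEY": "sk_live_example"}  # stand-in for Keychain storage
command = (
    "curl -H 'Authorization: Bearer {{noxkey:STRIPE_KEY}}' "
    "https://api.example.com/v1/charges"
)

# The conversation only ever contains `command`; the raw value exists
# only in the resolved string handed to the shell at run time.
resolved = resolve(command, keychain)
```

The key property is that the chat transcript holds nothing worth stealing: the reference is useless without local, Touch ID-gated access to the store.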

"Which tab is Claude in?"

Running four terminal tabs. No idea which has an AI agent that's active, idle, or stuck.

The solution

NoxTerm

A native macOS terminal built for AI development. Split panes with live status badges showing what each agent is doing. No more tab hunting.

"The button is broken"

Bug reports with zero context. No screenshot, no URL, no device info. Just a Slack message.

The solution

Blindspot

Embedded bug reporter. Users screenshot, annotate, and submit — right from the app. Full context captured automatically. The report is the fix brief.
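To make "full context" concrete, here is a sketch of the kind of payload such a reporter assembles; the field names and values are hypothetical, not Blindspot's actual schema:

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone

@dataclass
class BugReport:
    message: str        # what the user typed ("The button is broken")
    url: str            # page the report was filed from
    user_agent: str     # browser and device info
    viewport: tuple     # window size at capture time
    screenshot_id: str  # reference to the annotated screenshot
    captured_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

# Everything except `message` is captured automatically at submit time.
report = BugReport(
    message="The button is broken",
    url="https://app.example.com/settings",
    user_agent="Mozilla/5.0 (Macintosh)",
    viewport=(1440, 900),
    screenshot_id="shot_42",
)
payload = asdict(report)  # ready to submit alongside the screenshot
```

Because the environment is attached at capture time, the developer never has to ask "which page?" or "which browser?" before starting on a fix.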

Why dogfooding builds better developer tools

Using the thing you build, every day, in real work. It sounds obvious. Most teams don't do it.

They have QA teams, beta testers, feedback forms. They hear about friction secondhand. We feel it firsthand. That changes what gets prioritized.

"The best feature requests come from the moment you're using your own tool and think: this should be faster." — Internal rule, informally enforced

How often we actually use our own tools

Daily usage across the team

Average interactions per day, measured over 30 days.

  • NoxKey: ~40x
  • NoxTerm: ~30x
  • Blindspot: ~8x

NoxKey gets used 40+ times a day because credential access is constant during development. That volume reveals friction that testing never would. A 0.3-second delay that's invisible in a demo becomes infuriating after the 20th time in an hour.

What dogfooding gives you that user research can't

  • Immediate context. We know exactly what we were doing when something felt wrong. No reproduction steps needed.
  • Emotional weight. We don't just know a feature is slow — we feel it. That urgency drives prioritization better than any ticket.
  • Edge cases. Power users find bugs that casual testing misses. We're the most extreme users of our own tools.
  • Taste. "This works" and "this feels right" are different standards. You can only judge the second through sustained, daily use.

The deployment gate

We don't ship anything we wouldn't use ourselves. If it doesn't survive a week of real use internally, it doesn't go out. Sometimes that means features get cut. Sometimes deadlines slip. But the thing that ships is something we trust with our own work.

And if we're wrong? We'll feel it tomorrow morning. That's the point.