We don't build for hypothetical users. We build for ourselves — and then discover other developers have the same problems.
Every product we ship started the same way: something in our workflow was broken, nothing existing fixed it well enough, and we got tired of waiting for someone else to care.
Hover over each card to see the frustration — and what we built instead:
"Here's my API key..."
Pasting secrets into AI chats, Slack messages, and terminal prompts. Every day.
NoxKey
A macOS secrets manager with Touch ID. Secrets live in the Keychain. AI agents get encrypted references, never raw values. The secret never enters the conversation.
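NoxKey's internals aren't shown here, but the reference-instead-of-value pattern it describes can be sketched in a few lines. This is a hypothetical illustration (the class, the `noxref://` scheme, and the in-memory store are all invented for the example; the real product uses the macOS Keychain and Touch ID): the agent only ever sees an opaque token, and resolution happens locally at the moment of use.

```python
import secrets

# Hypothetical sketch of the reference-instead-of-value pattern, not
# NoxKey's actual code. A local store holds raw secrets; agents only
# ever see an opaque reference, resolved locally at the moment of use.

class SecretStore:
    def __init__(self):
        self._secrets = {}   # stand-in for the macOS Keychain
        self._refs = {}      # opaque reference -> secret name

    def put(self, name, value):
        """Store a raw secret and hand back an opaque reference."""
        self._secrets[name] = value
        ref = f"noxref://{secrets.token_urlsafe(16)}"
        self._refs[ref] = name
        return ref

    def resolve(self, ref):
        """Resolve a reference locally; in the real flow, this is
        where a Touch ID prompt would gate access."""
        return self._secrets[self._refs[ref]]

store = SecretStore()
ref = store.put("OPENAI_API_KEY", "sk-real-value")

# The AI conversation only ever contains the reference...
prompt = f"Deploy using the key at {ref}"
assert "sk-real-value" not in prompt

# ...and the raw value is injected locally, just before the API call.
assert store.resolve(ref) == "sk-real-value"
```

The design point: the boundary between "things the model can read" and "things only the local machine can read" is enforced by never serializing the raw value into the conversation in the first place.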
"Which tab is Claude in?"
Running 4 terminal tabs. No idea which one has an AI agent active, idle, or stuck.
NoxTerm
A native macOS terminal built for AI development. Split panes with live status badges showing what each agent is doing. No more tab hunting.
"The button is broken"
Bug reports with zero context. No screenshot, no URL, no device info. Just a Slack message.
Blindspot
Embedded bug reporter. Users screenshot, annotate, and submit — right from the app. Full context captured automatically. The report is the fix brief.
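"Full context captured automatically" can be made concrete with a sketch of the kind of payload an embedded reporter assembles. This is not Blindspot's actual schema; the function name and fields are invented for illustration. The point is that everything a developer would otherwise have to ask for in Slack ships with the report.

```python
import json
import platform
from datetime import datetime, timezone

# Hypothetical sketch of an auto-captured bug report payload
# (not Blindspot's actual schema).

def build_report(description, screenshot_path, url):
    return {
        "description": description,      # what the user typed
        "screenshot": screenshot_path,   # annotated capture
        "url": url,                      # page where it happened
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "device": {                      # captured without asking
            "os": platform.system(),
            "os_version": platform.release(),
            "runtime": platform.python_version(),
        },
    }

report = build_report(
    "The button is broken",
    "capture-001.png",
    "https://app.example.com/settings",
)
print(json.dumps(report, indent=2))
```

Compare the two inputs: the user supplied one sentence and a screenshot; the environment supplied everything else.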
Why dogfooding builds better developer tools
Using the thing you build, every day, in real work. It sounds obvious. Most teams don't do it.
They have QA teams, beta testers, feedback forms. They hear about friction secondhand. We feel it firsthand. That changes what gets prioritized.
How often we actually use our own tools
Daily usage across the team
Average interactions per day, measured over 30 days.
NoxKey gets used 40+ times a day because credential access is constant during development. That volume reveals friction that testing never would. A 0.3-second delay that's invisible in a demo becomes infuriating after the 20th time in an hour.
What dogfooding gives you that user research can't
- Immediate context. We know exactly what we were doing when something felt wrong. No reproduction steps needed.
- Emotional weight. We don't just know a feature is slow — we feel it. That urgency drives prioritization better than any ticket.
- Edge cases. Power users find bugs that casual testing misses. We're the most extreme users of our own tools.
- Taste. "This works" and "this feels right" are different standards. You can only judge the second through sustained, daily use.
The deployment gate
We don't ship anything we wouldn't use ourselves. If it doesn't survive a week of real use internally, it doesn't go out. Sometimes that means features get cut. Sometimes deadlines slip. But the thing that ships is something we trust with our own work.
And if we're wrong? We'll feel it tomorrow morning. That's the point.