My AI dev stack — what I run, and what I deleted
The full tour of how I run Claude Code: Opus 4.8 drives, Sonnet 4.6 reviews, ten MCP servers, claude-mem for memory, a curated set of agents and skills, and the hooks that keep autonomy safe. The interesting part is what I deleted.

People keep asking what my "AI setup" actually looks like — which model, which tools, how the autonomy doesn't turn into chaos. So here's the whole thing, top to bottom: the models, the MCP servers, the agents, the memory layer, the skills, and the hooks that keep it honest.
The short version: Claude Code is the runtime, and everything else is plumbing I bolted on until the loop stopped fighting me. The interesting part isn't what I added — it's what I deleted.
The brains — which models actually run
Two Claude models do ~95% of the work, on purpose:
- Opus 4.8 (1M context) — the driver. Planning, multi-file edits, the hard reasoning, anything that touches architecture. The 1M window means I rarely have to babysit context on a big repo.
- Sonnet 4.6 — the reviewer. My pre-commit code-review hook runs on Sonnet, not Opus. It's fast, it's cheap, and for "did this diff just introduce a SQL injection" it's more than enough. No reason to burn Opus on a gate that fires on every commit.
On top of that I keep a model playground wired in through Qubu (more on that below), so I can fan a prompt across other models when I want a second opinion or need a non-Claude modality. But the daily loop is Opus-drives-Sonnet-reviews, and that split alone cut my cost with no quality drop that I could notice.
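A minimal sketch of a review gate along these lines, not the actual hook. It assumes the `claude` CLI is on PATH, that `-p` (print mode) reads the diff from stdin, and that `--model` accepts a `sonnet` alias; the prompt wording and the PASS/BLOCK convention are made up for illustration.

```python
import subprocess

# Assumption: "sonnet" is an alias the CLI's --model flag accepts.
REVIEW_MODEL = "sonnet"

# Hypothetical prompt; the PASS/BLOCK first-line convention is invented here.
PROMPT = (
    "Review the staged diff on stdin for security bugs such as SQL injection. "
    "Answer with PASS or BLOCK on the first line, then a short reason."
)


def staged_diff() -> str:
    """Return the diff of what is about to be committed."""
    result = subprocess.run(
        ["git", "diff", "--cached"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def build_review_command(model: str = REVIEW_MODEL) -> list[str]:
    """Build the non-interactive CLI call for the cheap reviewer model."""
    return ["claude", "-p", PROMPT, "--model", model]


def verdict_blocks_commit(review_output: str) -> bool:
    """Block only if the first non-empty line of the review starts with BLOCK."""
    for line in review_output.splitlines():
        if line.strip():
            return line.strip().upper().startswith("BLOCK")
    return False  # empty review: nothing to block on


def main() -> int:
    """Return the hook's exit code: 0 lets the commit through, 1 stops it."""
    diff = staged_diff()
    if not diff.strip():
        return 0  # nothing staged, nothing to review
    review = subprocess.run(
        build_review_command(), input=diff, capture_output=True, text=True,
    )
    if review.returncode != 0:
        # Design choice in this sketch: fail open if the reviewer itself breaks.
        print("review skipped: reviewer call failed")
        return 0
    if verdict_blocks_commit(review.stdout):
        print(review.stdout)
        return 1
    return 0
```

Saved as an executable `.git/hooks/pre-commit`, the script would finish with `raise SystemExit(main())`. Swapping the reviewer for the driver model is a one-argument change to `build_review_command`, which is the whole point of keeping the gate's model separate.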
The hands — MCP servers
MCP is where Claude stops being a chatbot and starts being able to do things. Ten servers run globally for me; the rest are per-project. The global set:
- qubu — my own AI platform. Models, datasets, notebooks, inference, a generation playground, even the forum/journal. This is also what I use to generate media.
- context7 — live, version-accurate library docs. Knowledge cutoffs lie about minor versions; this doesn't. I reach for it before trusting my own memory on any framework API.
- figma-desktop + figma-developer-mcp — design → code without the screenshot-and-guess dance. Real variables, real metadata, real layout.
- kie-ai — image/video generation when I need media inline.
- cloudflare-api — DNS, Workers, the whole CF surface across several accounts, from the terminal.
- mt-glitch-blog — yes, the server I'm publishing this post through. Create, update, upload covers — the blog is just another tool call.
- agent-browser + playwright — two browser drivers. Snapshot a page, click through a flow, scrape, or run a real E2E check.
- markdowner — turn any URL/document into clean Markdown so it's actually usable in context.
Per-project I layer in more: SSH MCPs into infra, Postgres/pgbouncer MCPs, Sentry, Vercel, a YouGile board MCP. The point isn't to have more servers; it's that each repo only sees the servers it needs.
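The global-plus-per-project split can be sketched as a small script, assuming project-scoped servers live in a `.mcp.json` file at the repo root under an `mcpServers` key. The global names are the ten listed above; the `sentry` and `postgres` entries and their `echo` commands are placeholders, not real launch commands.

```python
import json
import tempfile
from pathlib import Path

# The ten globally registered servers named in this post.
GLOBAL_SERVERS = {
    "qubu", "context7", "figma-desktop", "figma-developer-mcp", "kie-ai",
    "cloudflare-api", "mt-glitch-blog", "agent-browser", "playwright",
    "markdowner",
}


def project_servers(repo_root: Path) -> set[str]:
    """Names of the servers declared in this repo's .mcp.json, if any."""
    config = repo_root / ".mcp.json"
    if not config.exists():
        return set()
    data = json.loads(config.read_text())
    return set(data.get("mcpServers", {}))


def visible_servers(repo_root: Path) -> set[str]:
    """What a session in this repo sees: the global set plus the repo's own."""
    return GLOBAL_SERVERS | project_servers(repo_root)


# Demo against a throwaway repo with two placeholder project servers.
with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp)
    (repo / ".mcp.json").write_text(json.dumps({
        "mcpServers": {
            "sentry": {"command": "echo", "args": ["placeholder"]},
            "postgres": {"command": "echo", "args": ["placeholder"]},
        }
    }))
    extra = project_servers(repo)
    visible = visible_servers(repo)

print(sorted(extra))   # ['postgres', 'sentry']
print(len(visible))    # 12
```

A repo without a `.mcp.json` falls back to the global set alone, so adding a server to one project never leaks it into another.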