The Ship You Can't Dock: Architectural Debt in the AI Era
In the fast-moving AI space, architectural debt isn't just about cutting corners—it's about reasonable decisions being invalidated by a shifting environment.
Writing
Thoughts on AI systems, language models, and the craft of building production-grade software.
A deep dive into why application-level pooling fails for long-running AI workflows and how to implement PgBouncer with statement-level pooling to handle 30x the load with a tenth of the resources.
A practical guide to keeping your data private when using LLM APIs. Covers zero-retention endpoints, self-hosting, compliance requirements, and data protection patterns.
Gemma 4's 31B model is outscoring systems with 10x more parameters on Arena Elo. Here's the architectural reasoning behind why that's possible.
After three years of building LLM applications, I've learned that LLM memory is fundamentally different from human memory.