Daily Reading List – May 5, 2026 (#777)

I’ve been thinking more about how it’s premature to build any large custom AI engines inside companies. Use off-the-shelf products to get rolling. Too many things are changing right now, and you can get a lot done by stuffing context into available products. The Stripe item below reinforced that for me.

[article] Uber Shares What Happens When 1,500 AI Agents Hit Production. We’re going to see more of these lessons get published. At scale, you realize you need some centralized services and proxies.

[article] The Map of System Topologies. Some impressive analysis of common architectures that most tech systems end up falling into.

[blog] AI Slop & the Vulnerability Treadmill. Lengthy and important piece by Kate that’s a must-read for security teams AND executives. It’s time to rethink things.

[article] Local AI. I’ll admit that open models didn’t get me that fired up a year ago. Why run one yourself when you can use a SOTA one as a service? But token usage has skyrocketed, sovereignty needs are clearer now, and open models have continued to innovate. So, I get it now.

[article] Google Is A Full Stack AI Player, And Is Playing Well. A lot of long bets have paid off, or are showing signs of paying off. It’s cheaper and faster to get wins through partnerships (e.g. Microsoft), but you’re left exposed without owning more of your supply chain.

[blog] State of Routing in Model Serving. Netflix has a legit ML serving platform and needed to evolve their routing approach. Cool deep dive here.

[blog] Accelerating Gemma 4: faster inference with multi-token prediction drafters. I had to deep-dive into this a bit to understand what it meant. But speculative decoding makes sense as a fast-forward for AI conversations.
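Here's the toy version that made speculative decoding click for me. It's a minimal sketch of the generic greedy draft-and-verify idea, not Gemma's implementation or the multi-token prediction drafters from the post. Both "models" (`target_model`, `draft_model`) are made-up functions I invented for illustration, and real systems verify the whole draft in one batched pass and usually sample rather than pick greedily.

```python
from typing import Callable, List, Tuple

# A "model" here is just a function from a token prefix to the next token.
Model = Callable[[List[int]], int]


def target_model(prefix: List[int]) -> int:
    # Hypothetical stand-in for the big, slow model.
    return sum(prefix) % 7


def draft_model(prefix: List[int]) -> int:
    # Hypothetical stand-in for the small, fast drafter. It agrees with the
    # target except when the prefix length is a multiple of 4.
    guess = sum(prefix) % 7
    return (guess + 1) % 7 if len(prefix) % 4 == 0 else guess


def greedy_decode(model: Model, prompt: List[int], n_new: int) -> List[int]:
    # Baseline: one model call per generated token.
    tokens = list(prompt)
    for _ in range(n_new):
        tokens.append(model(tokens))
    return tokens


def speculative_decode(
    target: Model, draft: Model, prompt: List[int], n_new: int, k: int = 3
) -> Tuple[List[int], int]:
    tokens = list(prompt)
    goal = len(prompt) + n_new
    rounds = 0
    while len(tokens) < goal:
        rounds += 1
        # 1. The drafter cheaply proposes k tokens in a row.
        proposal: List[int] = []
        for _ in range(k):
            proposal.append(draft(tokens + proposal))
        # 2. The target checks each proposed position. This loop stands in
        #    for what would be a single batched forward pass.
        accepted: List[int] = []
        for tok in proposal:
            expected = target(tokens + accepted)
            if tok == expected:
                accepted.append(tok)
            else:
                # First mismatch: keep the target's token, drop the rest.
                accepted.append(expected)
                break
        else:
            # Every draft token matched, so the target adds one bonus token.
            accepted.append(target(tokens + accepted))
        tokens.extend(accepted)
    return tokens[:goal], rounds


if __name__ == "__main__":
    prompt = [1, 2, 3]
    out, rounds = speculative_decode(target_model, draft_model, prompt, n_new=12)
    print("same as target-only decoding:", out == greedy_decode(target_model, prompt, 12))
    print("verification rounds:", rounds)
```

The part that matters: every kept token is one the target would have picked anyway, so the output matches target-only decoding. The speedup comes from accepting several tokens per verification round whenever the drafter guesses right.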

[blog] Reading List #1. This “reading list” thing is catching on! Even if no one else read my notes every day, I’d still get a lot of value from the discipline of reading and writing it.

[article] This week on How I AI: The internal AI tool that’s transforming how Stripe designs products. Take away some useful lessons learned here. Get more people building, don’t obsess over “platform” right away, and spend time investing in context.

[blog] Gemini API File Search is now multimodal: build efficient, verifiable RAG. Here’s a terrific update. This is basically RAG as a service, and with multimodal sources.

[article] AI finds 20-year-old bugs in PostgreSQL and MariaDB. I don’t think we should have AI “slow down” because we can’t handle all the code issues it finds. We need better ways to quickly triage issues and incorporate fixes.
