I’m in Sunnyvale for my last trip of the year, and had an excellent day. I saw lots of colleagues and friends, and even got a couple of AI demos working during breaks. Also, I read a lot, which you’ll see below.
[blog] Deploy Gemma 2 LLM with Text Generation Inference (TGI) on Google Cloud GPU. If you’re using open LLMs, you’ll need some way to serve model predictions. The Hugging Face TGI toolkit is popular, and this post shows off how to use it.
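For a sense of what hitting a TGI server looks like once it's deployed: TGI exposes a `/generate` route that takes an `inputs` string plus generation `parameters`. This is a minimal sketch using only the standard library; the URL and parameter values are placeholders for your own deployment.

```python
import json
import urllib.request

TGI_URL = "http://localhost:8080/generate"  # placeholder; point at your TGI endpoint


def build_payload(prompt: str, max_new_tokens: int = 128) -> dict:
    """Build the JSON body TGI's /generate route expects."""
    return {
        "inputs": prompt,
        "parameters": {"max_new_tokens": max_new_tokens},
    }


def generate(prompt: str) -> str:
    """POST the prompt to TGI and return the generated text."""
    body = json.dumps(build_payload(prompt)).encode("utf-8")
    req = urllib.request.Request(
        TGI_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["generated_text"]
```

The post itself covers the Google Cloud deployment side; this just shows the client's view of the serving endpoint.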
[article] Optimizing Java Applications on Kubernetes: Beyond the Basics. My friend Bruno shares some advice in this video (and transcript) from a recent InfoQ event.
[article] How To Add Persistence and Long-Term Memory to AI Agents. You can fake “memory” by continuing to pass all the history into the LLM on each request, but having a persistence layer makes sense.
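A minimal sketch of what that persistence layer might look like, using SQLite as the store (the table schema and class here are my own illustration, not the article's implementation): persist each turn, then reload recent history to pass to the LLM instead of carrying the whole transcript in application memory.

```python
import sqlite3


class ConversationStore:
    """Persist chat turns so an agent can reload history across sessions."""

    def __init__(self, path: str = ":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS messages ("
            "  session_id TEXT, role TEXT, content TEXT)"
        )

    def append(self, session_id: str, role: str, content: str) -> None:
        self.db.execute(
            "INSERT INTO messages (session_id, role, content) VALUES (?, ?, ?)",
            (session_id, role, content),
        )
        self.db.commit()

    def history(self, session_id: str, limit: int = 50) -> list:
        """Return the most recent turns, oldest first, ready to send to the LLM."""
        rows = self.db.execute(
            "SELECT role, content FROM messages WHERE session_id = ? "
            "ORDER BY rowid DESC LIMIT ?",
            (session_id, limit),
        ).fetchall()
        return [{"role": r, "content": c} for r, c in reversed(rows)]
```

The `limit` keeps the prompt bounded; a real agent would layer summarization or retrieval on top of this for true long-term memory.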
[blog] Communication Structures in a Growing Organization. As teams, orgs, and companies grow, the communication pattern needs to evolve. Jessica offers helpful guidance here.
[youtube-video] Simon Willison: The Future of Open Source and AI. You’d expect that a conversation between Logan and Simon would be interesting. You would be right.
[blog] Heroku Open Sources the Twelve-Factor App Definition. I didn’t realize it was closed source, or even considered “owned” by anyone, even Heroku. But now it’s a community-owned document.
[blog] Unlocking the power of time-series data with multimodal models. Might you start using generative AI for legit data analysis? This blog from Google Research might spark some ideas for you.
[site] The State of Frontend, 2024. I don’t think I saw this before. Check out these survey results for a glimpse into what devs are using to build the frontend.
[blog] Comparative Analysis of OWASP Top 10 for LLM Applications (2023 vs. 2025). What are the latest risks to mitigate with LLMs? Here are the vulnerabilities to protect yourself against.
[article] Which IDEs do software engineers love, and why? IDEs are among the stickiest tools for developers. But are preferences changing before our eyes? It seems like it.
[blog] Tracing with Langtrace and Gemini. This is a solution to a problem you wouldn’t have had a couple of years ago. But now you may be figuring out latency in your LLM calls, and need some traceability.
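Langtrace ships its own SDK, but the core idea — wrap each LLM call in a timed span so you can see where latency goes — can be sketched generically (the span list and decorator below are illustrative, not Langtrace's actual API):

```python
import functools
import time

SPANS = []  # stand-in for a real trace exporter


def traced(name: str):
    """Decorator that records wall-clock latency of each call as a 'span'."""
    def wrap(fn):
        @functools.wraps(fn)
        def inner(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                SPANS.append({
                    "name": name,
                    "latency_ms": (time.perf_counter() - start) * 1000,
                })
        return inner
    return wrap


@traced("gemini.generate")
def call_llm(prompt: str) -> str:
    time.sleep(0.01)  # placeholder for the real model call
    return "echo: " + prompt
```

A real tracing setup would also capture token counts, nested spans for retrieval steps, and export to a backend, but the timing wrapper is the heart of it.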
[blog] Streamline Kubernetes cluster management with new Amazon EKS Auto Mode. I’ll be tracking what AWS is up to at re:Invent this week. In particular, which Google Cloud features they’re finally adopting! In this case, it’s a more automated Kubernetes. Also, hybrid cluster management. Both are things we’ve had for years.
[blog] PayPal’s Real-Time Revolution: Migrating to Google Cloud for Streaming Analytics. We need more stories of smart tech companies who swap self-managed for commercial products after realizing that homegrown solutions weren’t differentiated any longer.
Want to get this update sent to you every day? Subscribe to my RSS feed or subscribe via email below: