Today’s my Monday after spending the first two days of this week in Texas visiting my son’s future college. Great trip. Now digging out from a couple of days of email and unread blog posts. Expect some big reading lists this week!
[blog] How to think about agent frameworks. The gloves are off! LangChain folks took umbrage at OpenAI’s guidance for building agents and published a terrific piece on agentic systems and agent frameworks.
[blog] SaaS delivery made easy: Meet SaaS Runtime. Designing, deploying, and operating multi-tenant solutions is no joke. Make the wrong choice and you’ve got a management nightmare. This new managed offering looks like a nice way to reduce risk.
[blog] CI/CD Security Best Practices. I’d venture that most teams probably aren’t GREAT at securing their automated deployment pipelines, so advice like this is still helpful.
[article] Why CIOs should prioritize IT modernization. I don’t see how you have any chance to succeed with technology over the next decade if you don’t have some modern fundamentals in place.
Did you have a productive week? I did, and that makes it easier to take next Monday and Tuesday off to do a college visit with my son. Back here on Wednesday!
[article] The Cursor Mirage. I have no doubt that “real” engineering teams are using tools like Cursor. That’s cool, but you’ve got to be clear-eyed about what you’re introducing.
[blog] Vibe Coding is for PMs. Sort of related to the previous item. Maybe these vibe coding tools are best for those sharing product requirements in new ways, versus doing actual production-grade work.
[blog] Spring cleaning with FinOps Hub 2.0. I can’t find a lot of developers who are obsessed about the cost of apps, but plenty of other folks keep an eye on spend. This looks like a nice upgrade.
[blog] Image segmentation using Gemini 2.5. We’re not going to slow down our release cadence, but I’m glad folks are picking up “hidden” features that don’t get prime time coverage.
[article] Optimizing the 90%: Where Dev Time Really Gets Stuck. If devs are really only spending 10% of their time writing new code for new apps, there are a lot of other places besides the IDE to optimize things.
Today’s reading list is chock-full of advice. Whether you’re hiring people, leading teams, planning AI agents, or dealing with disappointment, there’s something for you.
[blog] Start building with Gemini 2.5 Flash. Big week for AI models—there were some excellent OpenAI releases—and this preview edition of Gemini 2.5 Flash has outstanding benchmark results.
[blog] Build AI Agents your way on Google Cloud. There’s more than one approach to building AI agents, and Karl iterates through some of the decision points along the way.
[article] Tech hiring: is this an inflection point? If you haven’t changed your hiring process in this AI era, you’ve made a big mistake. Gergely points out why remote hiring approaches can’t be trusted any longer.
Another good day as I wrote the blog post I teased yesterday, and had some productive meetings. Also, deployed my first AI agent using Agent Development Kit. Have you carved out time this week to go deep on something new? Still a couple days left!
[blog] What are AI Agents? why do they matter? Addy goes deep on this topic and helps us better understand what agents do, why they’re a difference-maker, and what the tech landscape looks like.
[blog] How Much Should I Be Spending On Observability? Fantastic post from Charity that looks at misperceptions and confusion about how much you should spend on observability data and tools.
[docs] Harness CI/CD pipeline for RAG applications. What’s a viable architecture to do continuous delivery of a RAG-style AI app? That’s the focus of this architecture guide. It includes a link to a repo with the corresponding Terraform.
I’m a believer in platform engineering as a concept. Bringing standardization and golden paths to developers so that they can ship software quickly and safely sounds awesome. And it is. But it’s also been a slog to land it. Measurement has been inconsistent, devs are wildly unhappy with the state of self-service, and the tech landscape is disjointed with tons of tools and a high cost of integration. Smart teams are finding success, but this should be easier. Maybe now it is.
Last week at Google Cloud Next ’25, we took the wraps off the concept of a Cloud Internal Developer Platform (IDP). Could we take the best parts of platform engineering—consistent config management, infrastructure orchestration, environment management, deployment services, and role-based access—and deliver them as a vertically-integrated experience? Can we shift down instead of putting so much responsibility on the developer? I think we can. We have to! Our goal at Google Cloud is to deliver a Cloud IDP that is complete, integrated, and application-centric. The cloud has typically been a pile of infrastructure services, loosely organized through tags or other flawed grouping mechanisms. We’re long overdue for an app-centric lens on the cloud.
Enough talking. Let me show you by walking through an end-to-end platform engineering scenario. I want to design and deploy an application using architecture templates, organize the deployed artifacts into an “application”, troubleshoot an issue, and then get visibility into the overall health of the application.
Design and deploy app architectures with Application Design Center
To make life difficult, IDP also stands for “internal developer portal.” That’s not confusing at all. Such a portal can serve as the front door for a dev team that’s interacting with the platform. Application Design Center (ADC) is now in public preview, and offers functionality for creating templates, storing templates in catalogs, sharing templates, and deploying instances of templates.
I can start with an existing ADC template or create a brand new one. Or, I can use the ever-present Cloud Assist chat to describe my desired architecture in natural language, iterate on it, and then create an ADC template from that. Super cool!
For the rest of this example, I’ll use an existing app template in ADC. This one consists of many different components. Notice that I’ve got Cloud Run (serverless) components, virtual machines, storage buckets, secrets, load balancers, and more. Kubernetes coming soon!
I can add to this architecture by dropping new assets onto the canvas and configuring them. I can also use natural language! From the Cloud Assist chat, I asked to “add a cache to the movie-frontend service” and you can see that I got a Redis cache added, along with the option to accept or reject the suggestion.
Worried that you’re just working in a graphical design surface? Everything on the canvas is represented as Terraform. Switching from “Design” to “Code” at the top reveals the clean Terraform generated by ADC. Use our managed Terraform service or whatever you want for your infrastructure orchestration workflow with Terraform.
When I’m done with the template and want to instantiate my architecture, I can turn this into a deployed app. Google Cloud takes care of all the provisioning, and the assets are held together in an application grouping.
ADC is powerful for a few reasons. It works across different runtimes and isn’t just a Kubernetes solution. ADC offers good template cataloging and sharing capabilities. Its support for natural language is going to be very useful. And its direct integration with other parts of the platform engineering journey is important. Let’s see that now.
Organize apps with App Hub
An “app” represents many components, as we just saw. They might even span “projects” in your cloud account. And an application should have clearly identified owners and criticality. Google Cloud App Hub is generally available, and acts as a real-time registry of resources and applications.
App Hub auto-discovers resources in your projects (a couple dozen types so far, many more to come) and lets you automatically (via ADC) or manually group them into applications.
For a given app, I can see key metadata like its criticality and environment. I can also see who the development, business, and operations owners are. And of course, I can see a list of all the resources that make up this application.
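To make that grouping concrete, here is a rough Python sketch of what an application record could look like. This is not the App Hub API. Every class, field, and value below (Resource, Application, the project names, the owner addresses) is made up for illustration, and only the kinds of metadata (criticality, environment, owners, resources spanning projects) come from the description above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Resource:
    """A deployed resource discovered in a project (hypothetical shape)."""
    name: str
    kind: str      # e.g. "cloud-run-service", "vm", "bucket"
    project: str   # resources in one app may live in different projects


@dataclass
class Application:
    """An app-centric grouping of resources plus ownership metadata."""
    name: str
    criticality: str   # e.g. "high", "medium", "low"
    environment: str   # e.g. "dev", "test", "prod"
    owners: dict[str, str] = field(default_factory=dict)  # role -> contact
    resources: list[Resource] = field(default_factory=list)

    def projects(self) -> set[str]:
        """Return every project that contributes a resource to this app."""
        return {r.project for r in self.resources}

    def resources_of_kind(self, kind: str) -> list[Resource]:
        """Return only the resources of one kind, in registration order."""
        return [r for r in self.resources if r.kind == kind]


# Illustrative data only; none of these names refer to real resources.
app = Application(
    name="movie-app",
    criticality="high",
    environment="prod",
    owners={
        "development": "dev-team@example.com",
        "business": "product@example.com",
        "operations": "sre@example.com",
    },
    resources=[
        Resource("movie-frontend", "cloud-run-service", "web-project"),
        Resource("movie-assets", "bucket", "web-project"),
        Resource("batch-worker", "vm", "data-project"),
    ],
)

print(sorted(app.projects()))
print([r.name for r in app.resources_of_kind("cloud-run-service")])
```

The point of the sketch is that the application, not the project, is the unit you ask questions about: which projects it touches, who owns it, and which pieces make it up.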
Instead of this being a static registry, App Hub maintains links to the physical resources you’ve deployed. Once I have an application, then what?
Observe app-centric metrics in Cloud Monitoring
It’s not easy to see how apps or app-related components are performing. Now it is. We just enabled the preview of Application Monitoring in our Cloud Monitoring service.
From here, I can see a list of all my App Hub apps, and the component performance of each.
When I drill into the “web server” resource, I get some terrific metrics and logs, all within whatever timeframe I specify. This is a high-density view, and I like the data points we surface here.
Again, we’re seeing a smart, integrated set of technologies here versus a series of independent stack pieces that aren’t deeply aware of each other.
Resolve issues using Cloud Assist Investigations
In that dashboard above, I’m seeing that container restarts are a real issue in this application. It’s time to troubleshoot!
Within this dashboard, I see embedded logs, and notice a warning about back-off restarting with my pods. I don’t love reading piles of JSON to try and figure out the problem, nor can I see all the ancillary content just by looking at this log entry. In private preview, we have this new Investigate button.
Clicking that button sparks a new Investigation. These are AI-fueled evaluations based on a given error, and a host of related application data points. It’s meant to be a holistic exploration.
Here’s where all that shared context is so valuable. In under a minute, I see the details of the Investigation. These details show the issue itself and then a series of “relevant observations.” An Investigation can be edited and re-run, downloaded, and more.
Most importantly, there’s a “Hypothesis” section that helps the app owner or SRE pinpoint the problem area to focus on. These seem well-described with clear recommendations.
I’m confident that this will be a supremely useful tool for those trying to quickly resolve application issues.
Manage the overall health of the application in Cloud Hub
What’s your “home page” for the applications you manage? That’s the idea behind the preview of the Cloud Hub. It offers app owners a starting point for the management, health, and optimization of the apps they care about.
I might start each day looking at any platform-wide incidents impacting my app, any deployment issues, service health, and more.
One private preview feature I’ll show you here is the “Optimization” view. I’m getting app-level cost and utilization summaries! It’s easy to view this for different time periods, and even drill into a specific product within the app. What a useful view for identifying the actual cost of a running application in dev, test, or prod.
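If you want a feel for what an app-level rollup involves, here is a small Python sketch that groups cost rows by application, with an optional environment filter and a per-product drill-down. It assumes a hypothetical export of (application, product, environment, cost) rows. The row format, function names, and numbers are all invented for illustration and say nothing about how Cloud Hub computes its view.

```python
from collections import defaultdict

# Hypothetical cost rows: (application, product, environment, cost in USD).
ROWS = [
    ("movie-app", "Cloud Run", "prod", 12.5),
    ("movie-app", "Cloud Storage", "prod", 3.25),
    ("movie-app", "Cloud Run", "dev", 7.5),
    ("billing-app", "Compute Engine", "prod", 40.0),
]


def cost_by_product(rows, application, environment=None):
    """Sum cost per product for one app, optionally for one environment."""
    totals = defaultdict(float)
    for app, product, env, cost in rows:
        if app != application:
            continue
        if environment is not None and env != environment:
            continue
        totals[product] += cost
    return dict(totals)


def total_cost(rows, application, environment=None):
    """Total cost of one app, optionally restricted to one environment."""
    return sum(cost_by_product(rows, application, environment).values())


print(cost_by_product(ROWS, "movie-app"))
print(total_cost(ROWS, "movie-app", environment="prod"))
```

The grouping key is what matters here. Once resources are tied to an application, the same rows can answer "what does this app cost in prod?" without anyone maintaining tags by hand.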
Summary
While platform engineering has been around a while, and cloud computing even longer, neither has been easy for people who just want to build and run apps. Google Cloud is uniquely set up to make this better, and this new Cloud IDP experience might be an important step forward. Try out some of the components yourself!
I forgot what it was like to have open blocks on my calendar. It’s been delightful this week while my peers recover from their Cloud Next hangover. Tomorrow, I may even write a blog post (besides this one). Stay tuned!
[article] Google Is Winning on Every AI Front. I don’t know a single person at work who’s acting cocky. Confident, maybe. But we have plenty of work to do. That said, it feels good to see the pieces coming together.
[blog] Optimizing Our E2E Pipeline. The engineering team at Slack realized their frontend builds were taking a long time, even when nothing had changed. Here’s how they optimized their pipeline.
[article] Google Cloud’s second chance at the enterprise. It’s a messy landscape with all sorts of vendors in the mix. But we’ve moved from an “edgy bet” to a “safe bet” really quickly.
I’m digging out of a reading backlog after last week’s full-time conference program. Today’s reading list has some remaining Cloud Next content, but also other interesting tidbits.
[blog] Announcing Genkit for Python and Go. This general purpose framework for building AI apps keeps expanding its list of supported languages.
[blog] What’s new in Firebase at Cloud Next 2025. Related, a huge week from Firebase. From relational databases to AI-driven dev experience, Firebase is reclaiming mindshare.
After eight days in Las Vegas (gulp!), I’m heading home. It’s been a memorable and exhausting week. I wouldn’t trade our cloud platform for anyone else’s, and our team here is second to none. What a privilege to get to work with these cats.
[blog] Vibe Coding: Revolution or Reckless Abandon? Another banger from Addy who explains the good and bad of vibe coding. Read this for some excellent advice on how to approach it.
[blog] How to Deploy ADK Agents onto Google Cloud Run. I’ve got a lot more confidence that I can use this agent framework after watching some pros walk us through it. Karl does a great job here.
Just finished my keynote (which you can watch here) and am looking forward to hanging out with my team tonight. There’s still a healthy amount of things to check out in the reading list today!
[blog] GitLab vs GitHub : Key Differences in 2025. I don’t think I’ve seen a good comparison in a while, so this was welcome. Both are great choices, and they aren’t offering the same thing.
Today is the start of Google Cloud Next ’25. Most of today’s reading list relates to it, as that’s most of what I read today! Even if you’re not (yet) a customer, it’s worth a spin through the content.
[blog] Welcome to Next ’25. Our CEO does a roundup of the major headlines and themes for our event this week. The customer stories of who is successful with AI *right now* are awesome.