Category: Cloud

  • Stop following tutorials and learn by building (with Antigravity) instead

    Don’t get me wrong, I like a good tutorial. It might be in a blog, book, video, or training platform. I’ve probably created a hundred (including in dozens of Pluralsight courses) and consumed a thousand. But lately? I don’t like being constrained by the author’s use case, and I wonder if all I’ve learned how to do is follow someone else’s specific instructions.

    This popped up for me twice in the past few days as I finally took some “should try” technologies off my backlog. Instead of hunting for a hello-world tutorial to show me a few attributes of Angular Signals, I simply built a demo app using Google Antigravity. No local infrastructure setup, no wrangling with libraries, no figuring out what machinery I needed to actually see the technology in action.

    I did it again a couple of days later! The Go version of the Agent Development Kit came out a bit ago. I’ve been meaning to try it. The walkthrough tutorials are fine, but I wanted something more. So, I just built a working solution right away.

    I still enjoy reading content about how something works. That doesn’t go away. And truly deep learning still requires more than vibe coding an app. But I’m not defaulting to tutorials any more. Instead, I can just feed them into the LLM and build something personalized for me. Here’s an example.

    Take the cases above. I jumped into Google AI Studio to get inspiration on interesting async agent use cases. I liked this one: create a feed where an agent picks up a news headline and then does some research into related stories before offering some analysis. It’ll read from a queue, and then drop its analysis into a Cloud Storage bucket.

    Prompting for agent ideas in Google AI Studio

    With my use case in hand, I jumped into Antigravity to sketch out a design. Notice that I just fed the tutorial link into Antigravity to ensure it’d get seeded with up-to-date info for this new library.

    Sparking an application build in Google Antigravity

    Antigravity started whirring away on creating implementation plans and a task list. Because I can comment on its plans and iterate on the ideas before building begins, I’m not stressed about making the first prompt perfect. Notice here that it flags a big assumption, so I provided a comment confirming that I want a JSON payload for this background worker.

    Google Antigravity provides a way to offer feedback on implementation plans

    After Antigravity started building, I noticed the generated code used a package the IDE flagged as deprecated. I popped into the chat (or I could have commented in the task list) and directed the AI tool to use the latest version and ensure the code still built successfully.

    Steering Google Antigravity to pick a newer version of the library it selected

    Throughout, I’m focused on the outcomes I’m after, not the syntax of agent building. It’s refreshing. When reviewing the code, I realized I wanted more data in the incoming payload. A directive later, and my code reflected it.

    Iterating on software with Antigravity

    This started with me wanting to learn ADK for Go. It was easy to review the generated agent code, ask Antigravity questions about it, and see “how” to do it all without typing it all out myself. Will it stick in my brain as much as if I wrote it myself? No. But that wasn’t my goal. I wanted to fit ADK for Go into a real use case.

    Code that sets up an agent in ADK for Go

    This solution should feel “real”, and not just be a vibe-and-go. How about using CI/CD? I never remember the syntax for Google Cloud Build, and getting my pipeline right can swallow up half my dev time. No problem.

    I express my intent for a Cloud Build pipeline, and moments later I have a fully valid YAML definition, along with a generated Dockerfile.

    Antigravity created a Google Cloud Build pipeline for me
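
    For reference, here’s the shape of the pipeline it produced. This is a simplified sketch rather than my exact output, and the Artifact Registry repository and image names are placeholders:

    steps:
      # Build the container image for the Go agent
      - name: 'gcr.io/cloud-builders/docker'
        args: ['build', '-t', 'us-central1-docker.pkg.dev/$PROJECT_ID/agents/news-agent:$SHORT_SHA', '.']
      # Push the image to Artifact Registry
      - name: 'gcr.io/cloud-builders/docker'
        args: ['push', 'us-central1-docker.pkg.dev/$PROJECT_ID/agents/news-agent:$SHORT_SHA']
    images:
      - 'us-central1-docker.pkg.dev/$PROJECT_ID/agents/news-agent:$SHORT_SHA'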

    Next I asked Antigravity to add a deployment step so that the container image is pushed to a Cloud Run worker pool after a successful build. I needed to point Antigravity to a tutorial for worker pools for it to know about this new feature.

    I’m using an API key in this solution, and didn’t want that stored as a regular environment variable or visible during deployment. Vibe coding doesn’t have to be insecure. I asked Antigravity to come up with a better way. It chose Google Cloud Secret Manager, gave me the commands to issue, and showed me what the Cloud Run deployment command would now look like.

    Getting a proper, security-conscious deployment command for Cloud Run worker pools
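
    The commands Antigravity handed me boiled down to standard Secret Manager setup. Here’s a sketch with placeholder secret and service account names:

    # Store the API key as a secret (value is read from stdin)
    echo -n "$GEMINI_API_KEY" | gcloud secrets create gemini-api-key --data-file=-

    # Let the agent's runtime service account read the secret
    gcloud secrets add-iam-policy-binding gemini-api-key \
      --member="serviceAccount:agent-sa@my-project.iam.gserviceaccount.com" \
      --role="roles/secretmanager.secretAccessor"

    The deployment then references the secret instead of a plaintext value, via a --set-secrets style flag (e.g. GEMINI_API_KEY=gemini-api-key:latest); check the current worker pools docs for the exact syntax.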

    I then told Antigravity to introduce this updated Cloud Run command to complete the build + deploy pipeline.

    A complete, AI-generated CI/CD pipeline for my agentic app

    Amazing! I wanted to test this out before putting an Angular frontend into the solution. Antigravity reminded me of the right way to format a Cloud Build command given the substitution variables, and I was off.

    Building and deploying this agent with Cloud Build
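
    If you’re curious, the trigger is just the standard gcloud builds submit with substitutions passed inline. A sketch, with hypothetical substitution variable names:

    gcloud builds submit . \
      --config=cloudbuild.yaml \
      --substitutions=_REGION=us-central1,_SERVICE_NAME=news-agent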

    Within a few minutes, I had a container image in Artifact Registry, and a Cloud Run worker pool listening for work.

    My running workload in Cloud Run worker pools

    To test it out, I needed to publish a message to Google Cloud Pub/Sub. Antigravity gave me a sample JSON message structure that the agent expected to receive. I went to Techmeme.com to grab a recent news headline as my source. Pub/Sub has a UI for manually sending a message into a topic, so I used that.

    Publishing a message to Pub/Sub to trigger my agent
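
    The same thing works from the command line if you prefer. A sketch, with a hypothetical topic name and a JSON shape standing in for whatever your agent expects:

    gcloud pubsub topics publish news-headlines \
      --message='{"headline": "Example tech headline from Techmeme", "url": "https://example.com/story"}'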

    After a moment, I saw a new JSON doc in my Cloud Storage bucket. Opening it up revealed a set of related news, and some interesting insights.

    News analysis performed by my ADK agent and dropped into a Cloud Storage bucket
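
    You can also peek at the output from the terminal. Bucket and object names here are placeholders:

    # List the analysis files and print one of them
    gcloud storage ls gs://my-news-analysis-bucket/
    gcloud storage cat gs://my-news-analysis-bucket/analysis-latest.json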

    I also wanted to see more of Angular Signals in action, so I started a new project and prompted Antigravity to build out a site where I could submit news stories to my Pub/Sub topic. Once again, I passed a reference guide into my prompt as context.

    Prompting Antigravity to create a frontend app using Angular Signals

    I asked Antigravity to show me how Angular Signals were used, and even asked it to sketch a diagram of the interaction. This is a much better way to learn a feature than hoping a static tutorial covers everything!

    The first build turned out ok, but I wanted better handling of the calls to Google Cloud Pub/Sub. Specifically, I wanted this executed server side. After I added a comment to the implementation plan, Antigravity came up with a backend-for-frontend pattern.

    Directing Antigravity to support a backend for calls to Google Cloud

    After a couple of iterations on look-and-feel, and one debugging session which revealed I was using the wrong Pub/Sub topic name, I had a fully working app.

    After starting the server side component and the frontend component, I viewed my app interface.

    The interface Antigravity built

    Grabbing another headline from Techmeme gave me a chance to try this out. The Angular Signals experience seems super smooth.

    Adding a headline and seeing a dynamic frontend in action

    Once again, my Cloud Storage bucket had some related links and analysis generated by my ADK agent sitting in a Cloud Run worker pool.

    News analysis initiated from my Angular app

    It took me longer to write this post than it did to build a fully working solution. How great is that?

    For me, tutorials are now LLM input only. They’re useful context for LLMs that teach me things or build apps under my direction. How about you?

  • Go from prompt to production using a set of AI tools, or just one (Google Antigravity)

    We’ve passed the first phase of AI dev tooling. When I first saw AI-assisted code completion and generation, I was wow-ed. Still am. Amazing stuff. Then agentic coding environments went a step further. We could generate entire apps with products like Replit or v0! Following that, we all got new types of agentic IDEs, CLIs, background coding agents, and more. With all these options, there isn’t just one way to work with AI in software engineering.

    I’m noticing that I’m using AI tools to command (perform actions on my environment or codebase), to code (write or review code), and to conduct (coordinate agents who work on my behalf). Whether these are done via separate tools or the same one, this seems like a paradigm that will persist for a while.

    Let’s see this in action. I’ll first do this with a set of popular tools—Google AI Studio, Gemini CLI, Gemini Code Assist, and Jules—and then do the same exercise with the new Google Antigravity agent-first development platform.

    Architecture diagram generated with nano-banana

    I’ve accepted that I’ll never be a professional baseball player. It’s just not in the cards. But can I use AI to help me pretend that I played? Let’s build an application that uses AI to take an uploaded picture and generate images of that person in various real-life baseball situations.

    Build with a set of AI tools

    Gemini 3 Pro is excellent at frontend code and Google AI Studio is a fantastic way to get started building my app. I went to the “Build” section where I could provide a natural language prompt to start vibe-coding my baseball app. Here’s an example of “commanding” with AI tools.

    Google AI Studio

    After a few seconds of thinking, I saw a stash of files created for my application. Then a preview popped up that I could actually interact with.

    Vibe coded app in Google AI Studio

    Jeez, only one prompt and I have an awesome AI app. How cool is that? The Nano Banana model is just remarkable.

    Now I wanted to do more with this app and bring it into my IDE to make some updates before deploying it. In the top right of the screen, there’s a GitHub icon. After I clicked that, I was asked to authenticate with my GitHub account. Next, I had to provide details about which repo to create for this new codebase.

    Create GitHub repo from Google AI Studio

    Then Google AI Studio showed me all the changes it made in the local repo. I get one last chance to review things before staging and committing the changes.

    Push changes to GitHub

    A moment later, I had a fully populated GitHub repo. This gave me the intermediate storage I needed to pick up and continue with my IDE and agentic CLI.

    Vibe coded app code in my GitHub repo

    I jumped into Visual Studio Code with the installed Gemini Code Assist plugin. I’ve also got the Gemini CLI integration set up, so everything is all in one place.

    Visual Studio Code with Gemini Code Assist and the Gemini CLI

    Here, I can command and code my way to a finished app. I could ask (command) for a summary of the application itself and how it’s put together. Even more useful, I issued a command asking how this app authenticates with the Gemini API.

    Gemini Code Assist helping me understand the codebase

    Very helpful! Notice that it found a config file that shows a mapping from GEMINI_API_KEY (which is the environment variable I need to set) to the API_KEY referred to in code. Good to know.
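
    So the only local setup I needed before running the app was something like this, where the key value is obviously a placeholder for your own AI Studio key:

    # Mapped to API_KEY in the generated code via the project's config file
    export GEMINI_API_KEY="your-ai-studio-api-key"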

    Here’s where I could continue to code my way through the app with AI assistance if there were specific changes I felt like making ahead of deploying it. I wrote a mix of code (and used the Gemini CLI) to add a Node server that serves this static content and reads the environment variable at runtime.

    Let’s do some conducting. I didn’t feel like writing up a whole README and wanted some help from AI. Here’s where Jules comes in, and its extension for the Gemini CLI. Notice that I have Gemini CLI extensions for Jules and Cloud Run already installed.

    Two MCP servers added to the Gemini CLI

    I can go ahead and ask Jules to create a better README, and then continue on my work. Agents working on my behalf!

    Using the Gemini CLI to trigger a background task in Jules

    After doing some other work, I came back and checked the status of the Jules job (/jules status) and saw that the task was done. The Jules extension asked me if I wanted a new branch, or to apply the changes locally. I chose the former option and reviewed the PR before merging.

    Reviewing a branch with a README updated by Jules

    Finally, I was ready to deploy this to Google Cloud Run. Here, I also used a command approach and instructed the Gemini CLI to deploy this app with the help of the extension for Cloud Run.

    Using a natural language request from me, the Gemini CLI crafted the correct gcloud CLI command to deploy my app.

    Doing a deployment to Cloud Run from the Gemini CLI
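
    The generated command was along these lines. Treat it as a sketch with a placeholder service name, since the exact flags depend on your app:

    gcloud run deploy baseball-imagination \
      --source=. \
      --region=us-central1 \
      --allow-unauthenticated \
      --set-env-vars=GEMINI_API_KEY="your-ai-studio-api-key"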

    That finished in a few seconds, and I had my vibe-coded app, with some additional changes, deployed and running in Google Cloud.

    App running on Google Cloud

    So we commanded Google AI Studio to build the fundamentals of the app, used Gemini Code Assist and the Gemini CLI to code and command towards deployment, and Jules to conduct background agents on our behalf. Not particularly difficult, and the handoffs via a Git repo worked well.

    This process works great if you have distinct roles with handoffs (designer -> developer -> deployment team) or want to use distinct products at each stage.

    Build with Google Antigravity

    Google Antigravity isn’t a code editor. It’s not an IDE. It’s something more. Yes, you can edit code and do classic IDE things. What’s different is that it’s agent-first, and supports a rich set of surfaces in a single experience. I can kick off a series of agents to do work, trigger Computer Use in a dedicated browser, and extend behavior through MCP servers. Basically, I can do everything I did above, but within a single experience.

    Starting point with Google Antigravity

    I fed it the same prompt I gave to Google AI Studio. Immediately, Google Antigravity got to work building an implementation plan.

    Giving a prompt to Antigravity to build out an application

    I love that I can review this implementation plan, and add comments to sections I want to update. This feels like a very natural way to iterate on this specification. Right away, I asked for a Node server to host this app, so it’s built that way from the start.

    Implementation Plan, with comments

    The AI agent recognizes my comments and refreshes its plans.

    Antigravity using the Implementation Plan to begin its work

    At this point, the agent is rolling. It built out the entire project structure, created all the code files, and plowed through its task list. Yes, it creates and maintains a task list so we can track what’s going on.

    Task List maintained by Antigravity

    The “Agent Manager” interface is wild. From here I can see my inbox of agent tasks, and monitor what my agents are currently doing. This one is running shell commands.

    Agent Manager view for triggering and managing agent work

    The little “drawer” at the bottom of the main chat window also keeps tabs on what’s going on across all the various agents. Here I could see what docs need my attention, which processes are running (e.g. web servers), and which artifacts are part of the current conversation.

    View of processes, documents, and conversation artifacts

    The whole app-building process finished in just a few minutes. It looked good! And because Google Antigravity has built-in support for Computer Use with a Chrome browser, it launched a browser instance and showed me how the app worked. I can also prompt Computer Use interactions any time via chat.

    Computer Use driving the finished application

    Antigravity saved the steps it followed into an artifact called a Walkthrough, including a screenshot!

    Generated walkthrough including screenshots

    How about fixing the README? In the previous example, I threw that to a background task in Jules. I could still do that here, but Antigravity is also adept at doing asynchronous work. I went into the Agent Manager and asked for a clean README with screenshots and diagrams. Then I closed Agent Manager and did some other things. Never breaking flow!

    Triggering a background agent to update the README

    Later, I noticed that the work was completed. The Agent Manager showed me what it did, and gave me a preview of the finished README. Nice job.

    Finished README with diagrams and screenshots

    I wanted to see the whole process through, so how about using Google Antigravity to deploy this final app to Google Cloud Run?

    This product also supports extension via MCP. During this product preview, it comes with a couple dozen MCP servers in the “MCP Store.” These include ones for Google products, Figma, GitHub, Stripe, Notion, Supabase, and more.

    MCP servers available out of the box

    We don’t yet include one for Cloud Run, but I can add that myself. The “manage MCP servers” screen is empty to start, but it shows you the format you need to add to the configuration file. I added the configuration for the local Cloud Run MCP server.

    Configuration for the Cloud Run MCP server
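
    My entry looked roughly like this. Treat it as a sketch: the mcpServers layout mirrors what many MCP clients use, and the npx invocation is a placeholder; use whatever launch command the Cloud Run MCP server’s own docs specify.

    {
      "mcpServers": {
        "cloud-run": {
          "command": "npx",
          "args": ["-y", "https://github.com/GoogleCloudPlatform/cloud-run-mcp"]
        }
      }
    }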

    After saving that configuration, I refreshed the “manage MCP servers” screen and saw all the tools at my disposal.

    Tools available from the Cloud Run MCP server

    Sweet! I went back to the chat window and asked Google Antigravity to deploy this app to Cloud Run.

    Antigravity deploying the app to Google Cloud Run

    The first time, the deployment failed, but Google Antigravity picked up the error and updated the app to start on the proper port and to tweak how it handled wildcard paths. It then redeployed, and everything worked.

    Chat transcript of attempt to deploy to Google Cloud Run

    Fantastic. Sure enough, browsing the URL showed my app running and working flawlessly. Without a doubt, this would have been hours or days of work for me. Especially on the frontend stuff since I’m terrible at it. Instead, the whole process took less than an hour.

    Finished application running in Google Cloud Run

    I’m very impressed! For at least the next few years, software engineering will likely include a mix of commanding, coding, and conducting. As I showed you here, you can do that with distinct tools that enable distinct stages and offer one or more of those paradigms. Products like Google Antigravity offer a fresh perspective, and make it possible to design, build, optimize, and deploy all from one product. And I can now seamlessly issue commands, write code, and conduct agents without ever breaking flow. Pretty awesome.

  • How to build and deploy a portable AI agent that uses a managed memory service

    I enjoy building with new frameworks and services. Do you? It’s fun to break new ground. That said, I’m often filled with regret as I navigate incomplete docs, non-existent search results, and a dearth of human experts to bother. Now add LLMs that try to help but accidentally set you back. Good times. But we persevere. My goal? Build an AI agent—it helps you plan a career change—that retains memory through long-running conversations, and is portable enough that it can run on most any host. Easy enough, yes?

    My weapons of choice were the Agent Development Kit (Python), the new fully-managed Vertex AI Memory Bank service, and runtime hosts including Google Cloud Run and Vertex AI Agent Engine. Most every sample I found for this tech combination was either PhD-level coding with excessive functionality, a hard-coded “hello world” that didn’t feel realistic, or a notebook-like flow that didn’t translate to an independent agent. I craved a simple, yet complete, example of what a real, hosted, and memory-infused agent looks like. I finally got it all working, it’s very cool, and I wanted to share the steps to reproduce it.

    Vertex AI Memory Bank showing memories from my AI agent

    Let’s go through this step by step, and I’ll explain the various gotchas and such that weren’t clear from the docs or existing samples. Note that I am NOT a Python developer, but I think I follow some decent practices here.

    First, I wanted a new Python virtual environment for the folder containing my app.

     python3 -m venv venv
    source venv/bin/activate
    

    I installed the latest version of the Google ADK.

    pip install google-adk
    

    My source code is here, so you can just download the requirements.txt file and install the local dependencies you need.

    pip install -r requirements.txt
    

    I’ve got an __init__.py file that simply contains:

    from . import agent
    

    Now the agent.py itself where all the logic lives. Let’s go step by step, but this all is from a single file.

    import os
    import sys
    from google.adk.agents import Agent
    from google.adk.tools import agent_tool
    from google.adk.tools import google_search
    
    from google import adk
    from google.adk.runners import Runner
    from google.adk.sessions import VertexAiSessionService
    from google.adk.memory import VertexAiMemoryBankService
    from google.api_core import exceptions
    

    Nothing earth-shattering here. But I use a mix of built-in tools including Google Search. And I’m using durable storage for sessions and memory (versus the default in-memory options) and importing those references.

    app_name = 'career_agent'
    
    # Retrieve the agent engine ID needed for the memory service
    agent_engine_id = os.environ.get("GOOGLE_CLOUD_AGENT_ENGINE_ID")
    

    Our agent app needs a name for the purpose of storing sessions and memory through ADK. And that agent_engine_id is important for environments where it’s not preloaded (e.g. outside of Vertex AI Agent Engine).

    # Create a durable session for our agent
    session_service = VertexAiSessionService()
    print("Vertex session service created")
    
    # Instantiate the long term memory service, needs agent_engine parameter from environment or doesn't work right
    memory_service = VertexAiMemoryBankService(
        agent_engine_id=agent_engine_id)
    print("Vertex memory service created")
    

    Here I create instances of the VertexAiSessionService and VertexAiMemoryBankService. These refer to fully managed, no ops needed, services that you can use standalone wherever your agent runs.

    # Use for callback to save the session info to memory
    async def auto_save_session_to_memory_callback(callback_context):
        try:
            await memory_service.add_session_to_memory(
                callback_context._invocation_context.session
            )
            print("\n****Triggered memory generation****\n")
        except exceptions.GoogleAPICallError as e:
            print(f"Error during memory generation: {e}")
    

    Now we’re getting somewhere. This function (thanks to my colleague Megan who I believe came up with it) will be invoked as a callback during session turns.

    # Agent that does Google search
    career_search_agent_memory = Agent(
        name="career_search_agent_memory",
        model="gemini-2.5-flash",
        description=(
            "Agent answers questions career options for a given city or country"
        ),
        instruction=(
            "You are an agent that helps people figure out what types of jobs they should consider based on where they want to live."
        ),
        tools=[google_search],
    )
    

    That’s agent number one. It’s a secondary agent that just does a real-time search to supplement the LLM’s knowledge with real data about a given job in a particular city.

    # Root agent that retrieves memories and saves them as part of career plan assistance
    root_agent = Agent(
        name="career_advisor_agent_memory",
        model="gemini-2.5-pro", # Using a more capable model for orchestration
        description=(
            "Agent to help someone come up with a career plan"
        ),
        instruction=(
            """
            **Persona:** You are a helpful and knowledgeable career advisor.
    
            **Goal:** Your primary goal is to provide personalized career recommendations to users based on their skills, interests, and desired geographical location.
    
            **Workflow:**
    
            1.  **Information Gathering:** Your first step is to interact with the user to gather essential information. You must ask about:
                *   Their skills and areas of expertise.
                *   Their interests and passions.
                *   The city or country where they want to work.
    
            2.  **Tool Utilization:** Once you have identified a potential career and a specific geographical location from the user, you **must** use the `career_search_agent_memory` tool to find up-to-date information about job prospects.
    
            3.  **Synthesize and Respond:** After obtaining the information from the `career_search_agent_memory` tool, you will combine that with the user's stated skills and interests to provide a comprehensive and helpful career plan.
    
            **Important:** Do not try to answer questions about career options in a specific city or country from your own knowledge. Always use the `career_search_agent_memory` tool for such queries to ensure the information is current and accurate.
            """
        ),
        tools=[adk.tools.preload_memory_tool.PreloadMemoryTool(), agent_tool.AgentTool(career_search_agent_memory), ],
        after_agent_callback=auto_save_session_to_memory_callback,
    )
    

    That’s the root agent. Let’s unpack it. I’ve got some fairly detailed instructions to help it use the tool correctly and give a good response. Also note the tools. I’m preloading memory so that it gets context about existing memories, even if they happened five sessions ago. It’s got a tool reference to that “search” agent I defined above. And then after the agent generates a response, we save the key memories to the Memory Bank.

    runner = Runner(
        agent=root_agent,
        app_name=app_name,
        session_service=session_service,
        memory_service=memory_service)
    

    Finally, I’ve got a Runner. I’m not positive this is even used when the agent runs on Vertex AI Agent Engine, but it plays a part when running elsewhere.

    That’s it. 87 lines in one file. Writing the code wasn’t the hard part; knowing what to do and how to shape the agent was where all the work happened.

    Let’s deploy, and test it all out with cURL commands. To deploy this to the fully-managed Vertex AI Agent Engine, it’s a single ADK command now. You need to provide it a Cloud Storage bucket name (for storing artifacts), but that’s about it.

    adk deploy agent_engine \
        --project=seroter-project-base \
        --region=us-central1 \
        --staging_bucket=gs://seroter-agent-memory-staging \
        --display_name="Career Agent with Memory" \
        --trace_to_cloud \
        career_agent_memory/
    

    When this finished, I saw a bucket loaded up with code and other artifacts.

    Files generated and stored by ADK for my deployed agent

    More importantly, I had an agent. Vertex AI Agent Engine has a bunch of pre-built observability dashboards, and an integrated view of sessions and memory.

    Vertex AI Agent Engine dashboard in the Google Cloud Console

    Let’s use this agent, and see if it does what it’s supposed to. I’m going to use cURL commands, so that it’s super clear as to what’s happening.

    curl \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    https://us-central1-aiplatform.googleapis.com/v1/projects/seroter-project-base/locations/us-central1/reasoningEngines/8479666769873600512:query \
    -d '{"class_method": "create_session", "input": {"user_id": "u_123"},}'
    

    This first command creates a new session for our agent chat. The authorization comes from injecting a Google Cloud token into the header. I plugged in the “resource name” of the Agent Engine instance into the URI and set a user ID. I get back something like this:

    {
      "output": {
        "userId": "u_123",
        "id": "5926526278264946688",
        "events": [],
        "appName": "8479666769873600512",
        "state": {},
        "lastUpdateTime": 1760395538.0874159
      }
    }
    

    That “id” value matches the session ID now visible in the Vertex AI Session list. This session is for the given user, u_123.

    A session created for the agent running in the Vertex AI Agent Engine

    Now I can chat with my career agent. Here’s the cURL request for submitting a query. This will trigger my root agent, call my secondary agent, and store the key memories of the interaction as a callback.

    curl \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    https://us-central1-aiplatform.googleapis.com/v1/projects/seroter-project-base/locations/us-central1/reasoningEngines/8479666769873600512:streamQuery?alt=sse \
    -d '{"class_method": "stream_query","input": {"user_id": "u_123","session_id": "5926526278264946688","message": "I am currently a beekeeper in New Mexico. I have been to college for economics, but that was a long time ago. I am thinking about moving to Los Angeles CA and get a technology job. What are my job prospects in that region and how should I start?",}}'
    

    Note that the engine ID is still in the URI, and the payload contains the user ID and session ID. What I got back was a giant answer with some usable advice on how I can take my lucrative career as a beekeeper and make my mark on the technology sector.

    What got automatically saved as a memory? Switching to the Memories view in Vertex AI, I see that a few key details about my context were durably stored.

    Memories automatically parsed and stored in the Vertex AI Memory Bank

    Now if I delete my session, come back tomorrow and start a new one, any memories for this user ID (and agent engine instance) will be preloaded into every agent request. Very cool!

    Let’s quickly prove it. I can destroy my session with this cURL command.

    curl \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    https://us-central1-aiplatform.googleapis.com/v1/projects/168267934565/locations/us-central1/reasoningEngines/8479666769873600512:query?alt=sse \
    -d '{"class_method": "delete_session","input": {"user_id": "u_123","session_id": "5926526278264946688",}}'
    

    No more session, but my Memories remain. I can then request another session (for the same user) using the earlier command:

    curl \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    https://us-central1-aiplatform.googleapis.com/v1/projects/seroter-project-base/locations/us-central1/reasoningEngines/8479666769873600512:query \
    -d '{"class_method": "create_session", "input": {"user_id": "u_123"},}'
    

    At this point, I could ask something like “what do you already know about me?” in my query to see if it retrieves the memories it stored before.

    curl \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    https://us-central1-aiplatform.googleapis.com/v1/projects/seroter-project-base/locations/us-central1/reasoningEngines/8479666769873600512:streamQuery?alt=sse \
    -d '{"class_method": "stream_query","input": {"user_id": "u_123","session_id": "3132042709481553920","message": "What do you already know about me?",}}'
    

    Here’s what I got back:

    {"content": {"parts": [{"thought_signature": "CrgEAR_M...twKw==", "text": "You have an economics degree and are currently a beekeeper in New Mexico. You're considering a move to Los Angeles for a job in the technology sector."}], "role": "model"}, "finish_reason": "STOP", "usage_metadata": {"candidates_token_count": 32, "candidates_tokens_details": [{"modality": "TEXT", "token_count": 32}], "prompt_token_count": 530, "prompt_tokens_details": [{"modality": "TEXT", "token_count": 530}], "thoughts_token_count": 127, "total_token_count": 689, "traffic_type": "ON_DEMAND"}, "avg_logprobs": -0.8719542026519775, "invocation_id": "e-53e94a44-ad6b-4e97-9297-51612f4e77a9", "author": "career_advisor_agent_memory", "actions": {"state_delta": {}, "artifact_delta": {}, "requested_auth_configs": {}, "requested_tool_confirmations": {}}, "id": "c9e484cd-e5f7-4e1e-94d7-7490a006137d", "timestamp": 1760396342.830469}
    

    Excellent! With this approach, I have zero database management to do, yet my agents can retain context for each turn over an extended period of time.

    Vertex AI Agent Engine is cool, but what if you want to serve up your agents on a different runtime? Maybe a VM, Kubernetes, or the best app platform available, Google Cloud Run. We can still take advantage of managed sessions and memory, even if our workload runs elsewhere.

    The docs don’t explain how to do this, but I figured out the first step. You need that Agent Engine ID. When deploying to Vertex AI Agent Engine, it happened automatically. But now I need to explicitly submit an HTTP request to get back an ID to use for my agent. Here’s the request:

    curl \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    https://aiplatform.googleapis.com/v1/projects/168267934565/locations/us-central1/reasoningEngines \
    -d '{"displayName": "memory-bank-for-cloud-run"}'
    

    I get back an ID value, and I see a new entry show up for me in Vertex AI Agent Engine.

    Memory Bank instance for an agent in Cloud Run

    The ADK also supports Google Cloud Run as a deployment target, so I’ll deploy this exact agent, no code changes, there too. First, I threw a few values into the shell’s environment variables to use for the CLI command.

    export GOOGLE_CLOUD_PROJECT=seroter-project-base
    export GOOGLE_CLOUD_LOCATION=us-central1 
    export GOOGLE_GENAI_USE_VERTEXAI=True
    

    Then I issued the single request to deploy the agent to Cloud Run. Notice some different things here. First, no Cloud Storage bucket. Cloud Run creates a container from the source code and uses that. Also, I explicitly set the --memory_service_uri and --session_service_uri flags to enable some of the pre-wiring to those services. It didn’t work without them, and the current docs don’t include the proper parameters. And I also figured out the (undocumented) way to add Cloud Run environment variables, since the Agent Engine ID was also needed there.

    adk deploy cloud_run \
    --project=$GOOGLE_CLOUD_PROJECT \
    --region=$GOOGLE_CLOUD_LOCATION \
    --service_name=career-agent \
    --app_name=career_agent \
    --port=8080 \
    --memory_service_uri=agentengine://8058017254761037824 \
    --session_service_uri=agentengine://8058017254761037824 \
    career_agent_memory/ \
    -- --set-env-vars "GOOGLE_CLOUD_AGENT_ENGINE_ID=8058017254761037824"
    

    In just a couple minutes, I ended up with an agent ready to serve on Cloud Run.

    Agent running in Cloud Run

    The URLs I use to interact with my agent are now different because we’re not calling the managed service endpoints of Vertex AI to invoke the agent. So if I want a new session to get going, I submit a cURL request like this:

    curl -X POST -H "Content-Type: application/json" -d '{}' \
        https://career-agent-168267934565.us-central1.run.app/apps/career_agent/users/u_456/sessions
    

    I’ve got no payload for this request, and specified the user name in the URL. I got back a session ID in a JSON payload like above. And I can see that session registered in my Agent Engine console.

    Session created based on web request

    Submitting queries to this agent is slightly different than when it was hosted in Vertex AI Agent Engine. For Cloud Run agents, the cURL request looks like this:

    curl -X POST \
        -H "Authorization: Bearer $(gcloud auth print-access-token)" \
        https://career-agent-168267934565.us-central1.run.app/run_sse \
        -H "Content-Type: application/json" \
        -d '{
        "app_name": "career_agent",
        "user_id": "u_456",
        "session_id": "311768995957047296",
        "new_message": {
            "role": "user",
            "parts": [{
            "text": "I am currently a cowboy in Las Vegas. I have been to college for political science, but that was a long time ago. I am thinking about moving to San Francisco CA and getting a technology job. What are my job prospects in that region and how should I start?"
            }]
        },
        "streaming": false
        }'
    

    After a moment, not only do I get a valid answer from my agent, but I also see that the callback fired and I’ve got durable memories in Vertex AI Memory Bank.

    Memories saved for the Cloud Run agent

    Just like before, I could end this session, start a new one, and the memories still apply. Very nice.

    Access to sessions and memories that scale as your agent does, or survive compute restarts, seems like a big deal. You can use your own database to store these, but I like having a fully managed option that handles every part of it for me. Once you figure out the correct code and configurations, it’s fairly easy to use. You can try this all yourself in Google Cloud with your existing account, or a new account with a bunch of free credits.

  • Vibing options for whoever you are

    My favorite movie is Tommy Boy. I apologize for nothing. You won’t find this flick listed on the top 100 movies of all time, and some people think it’s a dumb movie. That’s ok. Look, I don’t really get the Dave Matthews Band. They seem like nice chaps, but it’s not my thing. But they have a massive following of superfans. Everything isn’t for everyone. Not every product is built for you.

    I recently looked at which AI coding tool was the right fit for a given situation. But what about Google’s hefty portfolio of products for those who want to vibe code and let AI take the wheel? In that case, it’s not just about the given situation, but also the type of person. Not every product is for each type of user.

    Vibe coding is expanding the pool of people who build apps. That pool seems to consist of consumers: non-tech folks who want to bring ideas to life. There are tech-adjacent professionals who do “knowledge work” and might be business analysts, product managers, program leads, or executives. And then you have software developers who have a deep understanding of tech and want to quickly produce new software.

    What are we vibe-coding? To me, it seems like we’re building throwaway prototypes to just try something out quickly. We’re creating personal software that’s meant to improve our own productivity. And we’re delivering multi-user apps that are intended for legit use by others. So in my mind, it’s a grid like this, with my take on which Google tech fits where:

    Persona                     | Throw-away prototypes           | Personal software                  | Multi-user apps
    Consumers                   | n/a                             | Gemini Canvas, Gemini Gems         | Gemini Canvas
    Tech-adjacent professionals | Gemini Canvas, Google AI Studio | Opal, Google AI Studio, Agentspace | Gemini Canvas, Google AI Studio, Opal, Firebase Studio
    Software developers         | Google AI Studio, Gemini CLI    | Gemini CLI, Gemini Code Assist     | Gemini CLI, Gemini Code Assist, Jules

    Vibing as consumers

    I’m not sure consumers are interested in throwaway prototypes. My non-tech friends wouldn’t want to geek out on tech. They have some sort of goal to solve a problem.

    What about consumers building “personal software” that acts as a web app, agent, or tool? Sure. Gemini Canvas seems like a good choice for this, and for building simple apps to share with others. Gemini Gems are a tool for building personal AI assistants without needing to be a tech expert. Some people are creating fun consumer-grade demos with Google AI Studio, so I wouldn’t complain if you added that product to this row as well.

    Let’s look at Gemini Canvas. You activate this in Gemini when you choose to “build.”

    Let me provide it a simple prompt:

    Beautiful exercise tracking app that lets me record the exercises I did in a given day, and for how long. I can also view past days and see trends over time.

    When I enter that prompt, Gemini gets to work. It creates a single-file app where the code is not the focus. You can see the code, but it quickly switches the UI to a preview of the app. I can make changes via the chat and get a live look at the changes.

    This is a fun-to-use, simple interface that’s consumer-friendly. It’s easy to trigger, very fast at generating apps, has basic abilities to roll back changes, and offers sharing via a public link. It’s very opinionated about the tech stack, the code is all stuffed into a single artifact, and you don’t get many legit deployment options. Great for consumers who are building personal software or simple apps for a small group to use.

    Vibing as tech-adjacent professionals

    I think it’s awesome that anyone within a company can be a builder. It doesn’t matter if the HR person, executive assistant, program manager, or VP doesn’t know how to code. They can use Gemini Canvas as I showed above, along with other tools.

    Some who have some tech familiarity might jump to Google AI Studio. It’s free to use and fantastic for builders. From the “build” menu, you can trigger a vibe coding experience that keeps the focus on the outcome, not the code.

    I’ll use the same prompt as above, but you also get a few other configuration options, including the ability to choose between React or Angular.

    Once I submit the prompt, Google AI Studio gets to work thinking through a plan and building out the components. The interface is terrific here. What’s different from Gemini Canvas is that you get a professionally structured project with code arranged across files. It takes longer to get to a Preview because it’s doing more (responsible) work, but it’s still very fast.

    I like the suggestions offered for the app (above chat box), easy ability to download the app, GitHub integrations, and one-click deploys to Google Cloud Run. The code editor is basic, so I wouldn’t use this for sophisticated builds, but that’s not what it’s for.

    Opal is a new Google experiment for building “mini-AI apps” and isn’t a standard vibe coding tool. Think of it as a way to build apps that are focused on generating content with AI.

    My exercise tracking tool doesn’t make a ton of sense here. One use case for Opal could be to generate text for sharing content on each social media site.

    Opal lets you define what you need to collect from the user and which assets you have available (YouTube videos, documents, and more); it can perform web searches, generate all sorts of media, and aggregate results.

    It’s simple to preview and share these apps, and I like the concept. It’s not a traditional “vibe coding” tool, but I can see where non-developers would like using it to bring ideas to life.

    You could also consider Agentspace a vibing tool for office workers. Agentspace is a unique platform for those who want a more useful and effective internal experience for getting work done. Besides offering AI-enabled search and research, it also has an agent-building experience for those who want personal agents.

    The agent builder interface is entirely no-code, and lets you tap into public internet searches, along with private enterprise data sources and tools. This is super helpful for those who want to automate repeatable tasks or build personal productivity solutions.

    The final option I put into this row is Firebase Studio. This is a service that’s completely applicable to software developers, but also friendly to those who aren’t professionals in this space. Unlike most of the options I’ve listed so far, this isn’t only for front-end solutions. I can build backend Go or Java apps too. It also offers a vibing UI where you can start with a prompt and build the app. I’ll use the same prompt I did earlier.

    When you start vibe coding here, Firebase Studio shares an app blueprint and then gets to work. I wouldn’t give this to a consumer persona—there’s still technical know-how you’d need to have—but it’s approachable to those who aren’t full-on software developers.

    Vibing as software developers

    If you’re a developer, you can use any of the options above. You might love the simplicity of Gemini Canvas, or prefer the opinionated Google AI Studio environment. Sounds good, live your life.

    Many software people want to vibe code with tools already in their toolchain. Your best bets with Google are the Gemini CLI and Gemini Code Assist.

    The Gemini CLI is a command line interface that’s free to use. You can authenticate with your Google ID (as an individual or corporate user), bring a Google AI Studio API key, or use Google Cloud Vertex AI. It has built-in tools (Google Search, shell, reading files, etc), supports MCP, has configurable memory, and can run anywhere. We recently added a supporting GitHub Action so that you can use it in your code repo. And Zed just integrated it into their next-gen code editor.

    The Gemini CLI is a great vibe coding tool. I could use the same prompt above, but also ask for a technical spec first, define my language/framework preferences, and generally steer the build the way I want.
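
    For example, a spec-first flow from the terminal might look like this, assuming the CLI’s non-interactive -p flag and an illustrative prompt:

    # Generate a spec first, then iterate on the implementation interactively
    gemini -p "Write a technical spec for an exercise tracking app in Angular and save it to SPEC.md"
    gemini   # then drive the build conversationally: "Implement phase 1 of SPEC.md"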

    When I want maximum power during my vibe coding sessions (like I had yesterday), I use a combination of Gemini Code Assist with the Gemini CLI rolled in. Killer combo, as I get the conversational AI mode of Gemini Code Assist in my IDE/editor, plus the rich agentic power of the Gemini CLI with shared context. Yesterday I vibe coded a complete “travel app”: I used the CLI to generate a spec, incrementally implemented it with the CLI, and took over control in the editor when I needed to.

    Wrap up

    There’s no wrong answer here. Use what fits your situation, and the role you’re playing. If you’re goofing around and just want a frontend app, something like Google AI Studio works for anyone. If you only want a code-centric experience for every situation, stay with IDE-style tools and CLIs. But I love that it’s so much simpler for people of any skill level to realize their ideas through software, thanks to a range of vibe coding tools that suit each person’s taste.

  • Code was the least interesting part of my multi-agent app, and here’s what that means to me

    At least 80% of the code I’ve ever written could have been written by AI, probably at higher quality. I’ve been “in tech” for twenty-seven years and spent seven of those as a software developer. Even when I stopped getting paid for it, I never stopped coding. But little of it’s been truly novel; most of my code has been straightforward database access code, web APIs, presentation logic, and a handful of reasonably-complex systems. No doubt, many of you have done truly sophisticated things in code—compilers, performance-tuned algorithms, language frameworks—and AI isn’t replacing that any time soon. But I’d bet that much of the interesting tech work is moving away from raw code, and towards higher-order architecture.

    I wanted to build out an agentic solution, and I used AI to generate 90% of the code. That code isn’t where the unique value was; none of it was particularly noteworthy. You can find the whole app here. The most interesting work related to architectural decisions. Here are eight choices I had to make, and I suspect you’ll have fun wrestling with the same ones.

    Choice #1 – What am I trying to accomplish and do agents make sense?

    My goal was to build an app that could take in a customer’s roofing needs, create a service appointment, and generate a personalized invoice for the work. I’m cheating here, since this exercise started as “Richard wants to learn some agent tech.” So I did start with the end in mind. Judge me accordingly.

    But in every legit situation, we start by evaluating the user need. What functional requirements do I need to satisfy? What performance or quality attributes are necessary? Can I solve this with a simple service, or modular monolith? Is the user flow deterministic or variable?

    This scenario could certainly be solved by a simple data collection form and PDF generator. What requirements might make an agentic architecture the truly correct choice?

    • Data collection from the user requires image, video, and audio input to best scope the services and pricing we should offer.
    • The scheduling or invoicing process requires a dynamic workflow based on a variety of factors, and hard-coding all the conditions would be tricky.

    Either way, this is always a critical choice before you write a single line of code.

    Choice #2 – What data or services are available to work with?

    Before we build anything new, what do we already have at our disposal?

    In my case, let’s assume I already have an appointments web API for retrieving available appointment times and making new appointments. I’ve also got an existing database that stores promotional offers that I want to conditionally add to my customer invoice. And I’ve got an existing Cloud Storage bucket where I store customer invoice PDFs.

    It’s easy to just jump into the application build, but pause for a few moments and take stock of your existing inventory and what you can build around.

    Choice #3 – What (agent) framework should I use and why?

    So. Many. Choices.

    There are AI app frameworks like Genkit, LlamaIndex, and Spring AI. There are agent frameworks like LangChain, LangGraph, Autogen, CrewAI, and more. Google recently shipped the Agent Development Kit, available for Python and Java developers. An agent built with something like ADK is basically made up of three things: a model, instructions, and tools. ADK adds sweeteners on top of that trio that give you a lot of flexibility, and there’s plenty I like about it.

    And look, I like it because my employer invests in it. So, that’s a big factor. I also wanted to build agents in both Python and Java, and this made ADK a great choice.

    Don’t get married to any framework, but learn the fundamentals of tool use, memory management, and agent patterns.

    Choice #4 – How should I use tools in the appointment agent?

    I suspect that tool selection will be a fascinating area for many builders in the years ahead. In this scenario, I had some decisions to make.

    I don’t want to book any roof repairs on rainy days. But where can I get the weather forecast from? I chose the built-in Google Search tool instead of trying to find some weather API on the internet.

    weather_agent = Agent(
        name="weather_agent",
        model="gemini-2.0-flash",
        description=(
            "Agent answers questions about the current and future weather in any city"
        ),
        instruction=(
            "You are an agent for Seroter Roofing. You can answer user questions about the weather in their city right now or in the near future"
    ),
        tools=[google_search],
    )
    

    For interacting with my existing appointments API, what’s the right tool choice? Using the OpenAPI tool baked into the ADK, I can just hand the agent an OpenAPI spec and it’ll figure out the right functions to call. For retrieving open appointment times, that’s a straightforward choice.

    openapi_spec = openapi_spec_template.replace("{API_BASE_URL}", config.API_BASE_URL)
    
    toolset = OpenAPIToolset(spec_str=openapi_spec, spec_str_type="json")
    api_tool_get_appointments = toolset.get_tool("get_available_appointments")
    

    But what about booking appointments? While that’s also an API operation, I want to piggyback a successful booking with a message to Google Cloud Pub/Sub that downstream subscribers can read from. That’s not part of the appointments API (nor should it be). Instead, I think a function tool makes sense here, where I manually invoke the appointments API, and then make a subsequent call to Pub/Sub.

    def add_appointment(customer: str, slotid: str, address: str, services: List[str], tool_context: ToolContext) -> dict:
        """Adds a roofing appointment by calling the booking API and logs the conversation history.
    
        This function serves as a tool for the agent. It orchestrates the booking process by:
        1. Calling the internal `_book_appointment_api_call` function to make the actual API request.
        2. If the booking is successful, it retrieves the conversation history from the
           `tool_context` and logs it to a Pub/Sub topic via `_log_history_to_pubsub`.
    
        Args:
            customer: The name of the customer.
            slotid: The ID of the appointment slot to book.
            address: The full address for the appointment.
            services: A list of services to be booked for the appointment.
            tool_context: The context provided by the ADK, containing session information.
    
        Returns:
            A dictionary containing the booking confirmation details from the API,
            or an error dictionary if the booking failed.
        """
        booking_response = _book_appointment_api_call(customer, slotid, address, services)
    
        if "error" not in booking_response:
            history_list: List[Event] = tool_context._invocation_context.session.events # type: ignore
            _log_history_to_pubsub(history_list)
        
        return booking_response
    

    Choice #5 – When/how do I separate agent boundaries?

    There’s a good chance that an agentic app has more than one agent. Stuffing everything into a single agent with a complex prompt and a dozen tools seems … suboptimal.

    But multi-agent doesn’t have to mean you’re sliding into a distributed system. You can include multiple agents in the same process space and deployment artifact. The Sequential Agent pattern in the ADK makes it simple to define distinct agents that run one at a time. So it seems wise to think of service boundaries for your agents, and only make a hard split when the context changes.

    For me, that meant one set of agents handling all the appointment stuff, and another distinct set of agents that worked on invoices. These don’t depend on each other, and should run separately. Both sets of agents use the Sequential Agent pattern.

    The appointment agent has a sub-agent to look up the weather, and uses that agent as a tool within the primary root agent.

    The invoicing agent is more complex with sub-agents to build up HTML out of the chat history, another agent that looks up the best promotional offers to attach to the invoice, and a final agent that generates a PDF.

    private SequentialAgent createInvoiceAgent(
                PdfTool pdfTool,
                String mcpServerUrl,
                Resource htmlGeneratorPrompt,
                Resource bestOfferPrompt,
                Resource pdfWriterPrompt
        ) {
            String modelName = properties.getAgent().getModelName();
    
            LlmAgent htmlGeneratorAgent = LlmAgent.builder().model(modelName).name("htmlGeneratorAgent").description("Generates an HTML invoice from conversation data.").instruction(resourceToString(htmlGeneratorPrompt)).outputKey("invoicehtml").build();
    
            List<BaseTool> mcpTools = loadMcpTools(mcpServerUrl);
    
            LlmAgent bestOfferAgent = LlmAgent.builder().model(modelName).name("bestOfferAgent").description("Applies the best offers available to the invoice").instruction(resourceToString(bestOfferPrompt)).tools(mcpTools).outputKey("bestinvoicehtml").build();
    
            FunctionTool generatePdfTool = FunctionTool.create(PdfTool.class, "generatePdfFromHtml");
    
            LlmAgent pdfWriterAgent = LlmAgent.builder().model(modelName).name("pdfWriterAgent").description("Creates a PDF from HTML and saves it to cloud storage.").instruction(resourceToString(pdfWriterPrompt)).tools(List.of(generatePdfTool)).build();
    
            return SequentialAgent.builder().name(properties.getAgent().getAppName()).description("Execute the complete sequence to generate, improve, and publish an PDF invoice to Google Cloud Storage.").subAgents(htmlGeneratorAgent, bestOfferAgent, pdfWriterAgent).build();
        }
    

    How should I connect these agents? I didn’t want hard-coded links between the services, as they can operate asynchronously and independently. You could imagine other services being interested in a booking too. So I put Google Cloud Pub/Sub in the middle. I used a push subscription (delivering to the invoice agent’s HTTP endpoint), but I’ll probably refactor it into a pull subscription that listens for work.
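    For reference, the publish side is just a thin wrapper around the Pub/Sub client library. Here’s a minimal sketch of what that helper might look like (the project and topic names are hypothetical):

    import json

    from google.cloud import pubsub_v1

    publisher = pubsub_v1.PublisherClient()
    # Hypothetical project and topic names.
    topic_path = publisher.topic_path("my-project", "booking-events")

    def publish_booking_history(events: list[dict]) -> None:
        """Publish the chat history as a JSON message for downstream agents."""
        payload = json.dumps({"events": events}).encode("utf-8")
        future = publisher.publish(topic_path, data=payload)
        future.result()  # block until Pub/Sub acknowledges the message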

    Choice #6 – What’s needed in my agent instructions?

    I’m getting better at this. Still not great. But I’m using AI to help me, and learning more about what constraints and direction make the biggest impact.

    For the booking agent, my goal was to collect all the data needed, while factoring in constraints such as weather. My agent instructions here included core principles, operational steps, the must-have data to collect, which decisions to make, and how to use the available tools.

    root_agent = Agent(
        name="root_agent",
        model="gemini-2.5-flash",
        description="This is the starting agent for Seroter Roofing and customers who want to book a roofing appointment",
        instruction=(
            """
    You are an AI agent specialized in booking roofing appointments. Your primary goal is to find available appointments for roofing services, and preferably on days where the weather forecast predicts dry weather.
    
    ## Core Principles:
    
        *   **Information First:** You must gather the necessary information from the user *before* attempting to use any tools.
        *   **Logical Flow:** Follow the steps outlined below strictly.
        *   **Professional & Helpful:** Maintain a polite, professional, and helpful tone throughout the interaction.
    
    ## Operational Steps:
    
    1.  **Greeting:**
        *   Start by politely greeting the user and stating your purpose (booking roofing appointments).
        *   *Example:* "Hello! I can help you book a roofing appointment. What kind of service are you looking for today?"
    
    2.  **Information Gathering:**
        *   You need two key pieces of information from the user:
            *   **Type of Service:** What kind of roofing service is needed? (e.g., repair, replacement, inspection, estimate)
            *   **Service Location:** What city is the service required in?
        *   Ask for this information clearly if the user doesn't provide it upfront. You *cannot* proceed to tool usage until you have both the service type and the city.
        *   *Example follow-up:* "Great, and in which city is the property located?"
    
    3.  **Tool Usage - Step 1: Check Appointment Availability (Filtered):**
        *   Get information about available appointment times:
        *   **[Use Tool: Appointment availability]** for the specified city.
        *   **Crucially:** When processing the results from the appointment tool, **filter** the available appointments to show *only* those that fall on the specific dates without rain in the forecast. You should also consider the service type if the booking tool supports filtering by type.
    
    4.  **Tool Usage - Step 2: Check Weather Forecast:**
        *   Once you have the service type and city, your next action is to check the weather.
        *   **[Use Tool: 7-day weather forecast]** for the specified city.
        *   Analyze the forecast data returned by the tool. Identify which days within the next 7 days are predicted to be 'sunny' or at least dry. Be specific about what constitutes 'dry' based on the tool's output.
    
    5.  **Decision Point 1: Are there Appointments on Dry Days?**
        *   If the appointment availability tool returns available slots *specifically* on the identified dry days:
            *   Present these available options clearly to the user, including the date, time, and potentially the service type (if applicable).
            *   Explain that these options meet the dry weather preference.
            *   Prompt the user to choose an option to book.
            *   *Example:* "Great news! The forecast for [City] shows dry weather on [Date 1], [Date 2], etc. I've checked our schedule and found these available appointments on those days: [List appointments]."
    
        *   If the appointment availability tool returns slots, but *none* of them fall on the identified sunny days (or if the tool returns no slots at all):
            *   Inform the user that while there are dry days coming up, there are currently no appointments available on those specific dry dates within the next 7 days.
            *   Explain that your search was limited to the dry days based on the forecast.
            *   Suggest they might want to try a different service type (if relevant) or check back later as availability changes.
            *   *Example:* "While the forecast for [City] does show some dry days coming up, I wasn't able to find any available appointments specifically on those dates within the next week. Our schedule on sunny days is quite popular. Please try again in a few days, as availability changes, or let me know if you need a different type of service."
    
    6.  **Confirmation/Booking (If Applicable):**
        *   Be sure to get the full name and full address of the location for the appointment.
             
    **Tools**
        You have access to the following tools to assist you:
        `weather_agent`: use this tool to find the upcoming weather forecast and identify rainy days
        `api_tool_get_appointments -> json`: use this OpenAPI tool to answer any questions about available appointments
        `add_appointment(customer: str, slotid: str, address: str, services: List[str]) -> dict`: use this tool to add a new appointment
    """
        ),
        tools=[agent_tool.AgentTool(weather_agent), api_tool_get_appointments, tools.add_appointment],
    )
    

    The invoicing agent had a more complex prompt, as I wanted to shape the blob of chat history into structured JSON and then into valid HTML. Of course, I could have (should have?) structured the raw data before it left the original agent, but I wanted to try it this way. My agent instructions show an example of the preferred JSON, and also the valid HTML structure.

    **Role:** You are a specialized agent designed to generate an HTML invoice from a successful appointment booking history.
    
    **Task:** Process the entire user prompt, which contains conversation history in a JSON format. Your goal is to create a complete HTML invoice based on the details found in that JSON.
    
    [...]
    
    4.  **Invoice JSON Structure:** The JSON invoice you internally generate **must** strictly adhere to the format provided in the example below. Do not add extra fields or change field names. Ensure numbers are formatted correctly (e.g., 100.00, 0.00).
        ```json
        {
        "invoiceNumber": "INV-BOOKING-[Current Date YYYYMMDD]", // Generate based on date
        "issueDate": [YYYY, M, D], // Current Date
        "dueDate": [YYYY, M, D], // Current Date + 30 days
        "customerName": "[Extracted Customer Name]",
        "customerAddress": "[Extracted Customer Address]",
        "items": [
            {
            "description": "[Description of Booked Service]",
            "quantity": 1,
            "unitPrice": [Price of Service],
            "lineTotal": [Price of Service]
            }
        ],
        "subtotal": [Price of Service],
        "taxAmount": 0.00,
        "summary": "Invoice for booked [Service Name]",
        "totalAmount": [Price of Service]
        }
        ```
    
    [...]
    
    7.  **Create an HTML string based on the example structure here**
    ```html
    <!DOCTYPE html>
    <html>
    <head>
    	<meta charset="UTF-8" />
    	<title>Seroter Roofing Invoice</title>
    	<style type="text/css">
    		body { font-family: sans-serif; margin: 20px; }
    		h1 { color: navy; }
    		.header, .customer-info, .summary-block, .footer { margin-bottom: 20px; }
    		.invoice-details { margin-top: 20px; padding: 10px; border: 1px solid #ccc; }
    		.invoice-details p { margin: 5px 0; }
    		table { width: 100%; border-collapse: collapse; margin-top: 20px; }
    		.summary-block { padding: 10px; border: 1px dashed #eee; background-color: #f9f9f9; }
    		th, td { border: 1px solid #ddd; padding: 8px; text-align: left; }
    		th { background-color: #f2f2f2; }
    		.text-right { text-align: right; }
    	</style>
    </head>
    <body>
    	<h1>Invoice</h1>
    
    	<div class="header">
    		<p><strong>Invoice Number:</strong> INV-001</p>
    		<p><strong>Date Issued:</strong> January 01, 2024</p>
    		<p><strong>Date Due:</strong> January 15, 2024</p>
    	</div>
    
    	<div class="customer-info">
    		<h2>Bill To:</h2>
    		<p>Customer Name</p>
    		<p>123 Customer Street, Denver, CO 80012</p>
    	</div>
    
    	<div class="summary-block">
    		<h2>Summary</h2>
    		<p>Details about the appointment and order...</p>
    	</div>
    
    	<table>
    		<thead>
    			<tr>
    				<th>Description</th>
    				<th>Quantity</th>
    				<th>Unit Price</th>
    				<th>Line Total</th>
    			</tr>
    		</thead>
    		<tbody>
    			<tr >
    				<td>Sample Item</td>
    				<td class="text-right">1</td>
    				<td class="text-right">10.00</td>
    				<td class="text-right">10.00</td>
    			</tr>
    		</tbody>
    	</table>
    
    	<div class="invoice-details">
    		<p class="text-right"><strong>Subtotal:</strong> 0.00</p>
    		<p class="text-right"><strong>Tax:</strong> 0.00</p>
    		<p class="text-right"><strong>Total Amount:</strong> <strong>$123.45</strong></p>
    	</div>
    	<div class="footer">
    		<p>Thank you for your business!</p>
    	</div>
    </body>
    </html>
    ```
    

    Doing this “context engineering” well is important. Think through the instructions, data, and tools that you’re giving an agent to work with.

    Choice #7 – What’s the right approach to accessing Cloud services?

    My agent solution sent data to Pub/Sub (addressed above), but also relied on data sitting in a PostgreSQL database. And PDF blobs sitting in Cloud Storage.

    I had at least three implementation options here for PostgreSQL and Cloud Storage:

    • Function calling. Use functions that call the Cloud APIs directly, and leverage those functions as tools.
    • Model Context Protocol (MCP). Use MCP servers that act as API proxies for the LLM to use.
    • YOLO mode. Ask the LLM to figure out the right API call to make for the given service.

    The last option works (mostly), but would be an absurd choice to make in 99.98% of situations.
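    For contrast, the first option is nothing exotic: just a plain function that calls the Cloud API directly, which the agent can then invoke as a tool. A minimal Python sketch (with a hypothetical bucket name) looks something like this:

    from google.cloud import storage

    def save_report(report_text: str, object_name: str) -> dict:
        """Upload a text report to Cloud Storage and return its gs:// path."""
        client = storage.Client()
        bucket = client.bucket("my-agent-output-bucket")  # hypothetical bucket name
        blob = bucket.blob(object_name)
        blob.upload_from_string(report_text, content_type="text/plain")
        return {"status": "success", "file_path": f"gs://{bucket.name}/{object_name}"}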

    The appointment agent calls the Pub/Sub API directly by using that encompassing function as a tool. For the database access, I chose MCP. The MCP Toolbox for Databases is open source and fairly simple to use. It saves me from a lot of boilerplate database access code.

    private List<BaseTool> loadMcpTools(String mcpServerUrl) {
            try {
                SseServerParameters params = SseServerParameters.builder().url(mcpServerUrl).build();
                logger.info("Initializing MCP toolset with params: {}", params);
                McpToolset.McpToolsAndToolsetResult result = McpToolset.fromServer(params, new ObjectMapper()).get();
                if (result.getTools() != null && !result.getTools().isEmpty()) {
                    logger.info("MCP tools loaded: {}", result.getTools().size());
                    return result.getTools().stream().map(mcpTool -> (BaseTool) mcpTool).collect(Collectors.toList());
                }
            } catch (Exception e) {
                logger.error("Error initializing MCP toolset", e);
            }
            return new ArrayList<>();
        }
    

    When creating the PDF and adding it to Cloud Storage, I decided to use a robust function that I passed to the agent as a tool.

    private Map<String, Object> generatePdfFromHtmlInternal(String htmlContent) throws IOException {
            if (htmlContent == null || htmlContent.trim().isEmpty()) {
                throw new IllegalArgumentException("HTML content cannot be null or empty.");
            }
    
            try (ByteArrayOutputStream baos = new ByteArrayOutputStream()) {
                ITextRenderer renderer = new ITextRenderer();
                renderer.setDocumentFromString(htmlContent);
                renderer.layout();
                renderer.createPDF(baos);
    
                String timestamp = LocalDateTime.now().format(DateTimeFormatter.ofPattern("yyyyMMddHHmmssSSS"));
                String uniquePdfFilename = OUTPUT_PDF_FILENAME.replace(".pdf", "_" + timestamp + ".pdf");
                String bucketName = properties.getGcs().getBucketName();
    
                BlobId blobId = BlobId.of(bucketName, uniquePdfFilename);
                BlobInfo blobInfo = BlobInfo.newBuilder(blobId).setContentType("application/pdf").build();
    
                storage.create(blobInfo, baos.toByteArray());
    
                String gcsPath = "gs://" + bucketName + "/" + uniquePdfFilename;
                logger.info("Successfully generated PDF and uploaded to GCS: {}", gcsPath);
                return Map.of("status", "success", "file_path", gcsPath);
    
            } catch (DocumentException e) {
                logger.error("Error during PDF document generation", e);
                throw new IOException("Error during PDF document generation: " + e.getMessage(), e);
            } catch (Exception e) {
                logger.error("Error during PDF generation or GCS upload", e);
                throw new IOException("Error during PDF generation or GCS upload: " + e.getMessage(), e);
            }
        }
    

    Choice #8 – How do I package up and run the agents?

    This choice may depend on who the agent is for (internal or external audiences), who has to support the agent, and how often you expect to update the agent.

    I chose to containerize the components so that I had maximum flexibility. I could have easily used the ADK CLI to deploy directly to Vertex AI Agent Engine—which comes with convenient features like memory management—but wanted more control than that. So I have Dockerfiles for each agent, and deploy them to Google Cloud Run. Here I get easy scale, tons of optional configurations, and I don’t pay for anything when the agent is dormant.

    In this case, I’m just treating the agent like any other type of code. You might make a different choice based on your use case.

    The final solution in action

    Let’s run this thing through. All the source code is sitting in my GitHub repo.

    I start by opening the appointment agent hosted in Cloud Run. I’m using the built-in ADK web UI to have a conversational chat with the initial agent. I mention that I might have a leaky roof and want an inspection or repair. The agent then follows its instructions. After checking the weather in the city I’m in, it retrieves appointments via the API. On the left, there’s a handy set of tools to trace events, do evals, and more.

    At this point, I chose an available appointment, and the agent followed its next set of instructions. The appointment required two pieces of info (my name and address), and wouldn’t proceed until I provided them. Once it had the data, it called the right function to make an appointment and publish a message to Pub/Sub.

    That data flowed through Google Cloud Pub/Sub, and got pushed to another agent hosted in Cloud Run.

    That agent immediately loaded up its MCP tools by calling the MCP server also hosted in Cloud Run. That server retrieved the list of offers for the city in question.

    This agent runs unattended in the background, so there’s no chat interface or interactivity. Instead, I can track progress by reading the log stream.

    When this agent got done converting the chat blob to JSON, then creating an HTML template, and calling the MCP tools to attach offers, it wrote the final PDF to Cloud Storage.

    There you go. It’s not perfect and I have improvements I want to make. Heck, the example here has the wrong date in the invoice, which didn’t happen before. So I need better instructions there. I’d like to switch the second agent from a push to a pull. It’d be fun to add some video or audio intake to the initial agent.

    Nobody knows the future, but it looks like we’ll be building more agents, and fewer standalone apps. APIs matter more than ever, as do architectural decisions. Make good ones!

  • Quality-focused prompts for the vibe coding addict

    You’ve vibe coded an app? Congrats. That’s cool. Maybe you’re completely new to programming, or you’ve been doing it a while. It’s possible you’ve seen some of the horror stories online of folks vibe coding something, deploying it, and then watching it collapse as users flooded and hacked it. Yikes. It’s important to pay attention to quality attributes. Sometimes called non-functional requirements, these are key aspects of a well-built software system. They include traits like usability, maintainability, security, reliability, scalability, availability, and flexibility. While you might expect (hope?) that your vibe coding service created software that always adheres to these concepts, you shouldn’t assume so. And either way, these are important characteristics to understand if you want to mature as a software creator.

    You can get AI to help here, though. I used Google AI Studio and Gemini 2.5 Pro to generate some prompts you could apply after you’ve built an app using Bolt, Vercel’s v0, Lovable, or Gemini Canvas. Maybe you plug those generated prompts into the Gemini CLI after you’ve downloaded the code from your vibe coding session. Here’s my initial prompt, which you could take and adjust however you want:

    I’ve vibe coded an app, and now want to make sure it’s of high enough quality to ship to production. I’m not a professional developer. Suggest 3-5 comprehensive and understandable LLM prompts I should now offer my agentic CLI to check for software quality attributes like usability, reliability, performance, observability, security, repeatability, scalability, availability, flexibility, and maintainability.

    Below is the direct output from Gemini, including the proposed prompts that harden your software while making it more fault tolerant and scalable.


    Here are 4 comprehensive prompts designed to be fed to your agentic CLI. They are structured to cover all the quality attributes you mentioned in a logical, step-by-step manner.

    The Strategy: From Code to Cloud

    We’ll use a layered approach. Think of it like building a house:

    1. The Blueprint & Foundation: Is the code itself clean, organized, and easy to understand? (Maintainability, Flexibility)
    2. The Walls & Locks: Is the application secure and can it handle errors gracefully? (Security, Reliability)
    3. The Engine & Plumbing: Can the app perform under load and tell you when it’s sick? (Performance, Scalability, Observability)
    4. The Assembly Line: Can you build and deploy it consistently every single time? (Repeatability, Availability, Scalability)

    Prompt 1: The Code Quality & Maintainability Audit

    Goal: This prompt focuses on cleaning up the existing “vibe code.” It makes your app easier for you (and your AI agent) to understand, modify, and extend in the future.

    Attributes Covered: Maintainability, Flexibility, Repeatability (at the code level).

    Act as a senior software engineer and mentor conducting a thorough code review of my entire project. Your goal is to improve its long-term maintainability, flexibility, and readability. Analyze the whole codebase and provide the following:

    1. **Refactoring for Clarity:** Identify the top 5 functions or classes that are too long, complex, or have unclear names ("code smells"). For each, provide a refactored version that is simpler, follows the Single Responsibility Principle, and has clearer naming.
    2. **Configuration & Secrets:** Scan the code for any hardcoded configuration values (like API endpoints, database strings, or magic numbers). Suggest extracting them into a centralized configuration file (e.g., `config.js`, `.env`, or `settings.py`) and provide the template for this file. Flag any plain-text secrets immediately.
    3. **Dependency Review:** List all external libraries and dependencies. Point out any that are deprecated, have known major issues, or could be replaced by a more standard/modern alternative.
    4. **Automated Quality Gates:** Generate a configuration file for a standard linter and code formatter for my project's language (e.g., `.eslintrc.json` and `.prettierrc` for JavaScript/TypeScript, or `pyproject.toml` for Python with Black and Ruff). This ensures future code stays clean.
    5. **Documentation:** Generate a template for a `README.md` file that includes a project description, setup instructions for a new developer, and an explanation of the core project structure.

    Prompt 2: The Security & Reliability Hardening

    Goal: This prompt switches focus to making your app robust against attacks and resilient to failure. It assumes the code is now cleaner from Prompt 1.

    Attributes Covered: Security, Reliability, Availability.

    Act as a paranoid but helpful security and reliability engineer. Your mission is to identify and help me fix potential security vulnerabilities and sources of unreliability in my application. Analyze the entire codebase and provide a report with actionable code suggestions for the following:

    1. **Security Vulnerability Scan (OWASP Top 10):**
    * **Input Validation:** Find all points where the application accepts user input (API endpoints, forms, etc.). Check for potential injection vulnerabilities (SQL, NoSQL, Command).
    * **Cross-Site Scripting (XSS):** Check if output to the user is properly sanitized or escaped.
    * **Authentication/Authorization:** Review how users are authenticated and how their permissions are checked. Look for common flaws.
    * **Insecure Dependencies:** Scan my `package.json`, `requirements.txt`, etc., for dependencies with known security vulnerabilities (CVEs) and suggest updated, secure versions.

    2. **Error Handling & Reliability:**
    * Identify all critical code paths (e.g., database calls, external API requests, file I/O).
    * Pinpoint areas lacking proper error handling (e.g., missing `try...catch` blocks or unchecked errors).
    * For each area, suggest adding robust error handling that prevents the app from crashing and provides a clear error message or fallback.

    3. **Availability Checkpoint:**
    * Suggest creating a simple health check endpoint (e.g., `/healthz` or `/status`). This endpoint should return a `200 OK` status if the app is running and can connect to its essential services (like the database). Provide the code for this endpoint.

    Prompt 3: The Performance, Scalability & Observability Tune-Up

    Goal: Now that the app is clean and secure, let’s make it fast and ensure you can see what’s happening inside it when it’s running.

    Attributes Covered: Performance, Scalability, Observability.

    Act as a Site Reliability Engineer (SRE) focused on performance and observability. Your goal is to ensure my application can handle growth and that I can diagnose problems in production. Analyze the codebase and suggest improvements in these areas:

    1. **Performance Bottlenecks:**
    * **Database Queries:** Identify any database queries performed inside loops (N+1 query problem). Suggest how to optimize them into a single, more efficient query.
    * **Heavy Computations:** Find any computationally expensive operations or inefficient algorithms that could block the main thread or slow down responses. Suggest optimizations or asynchronous execution.
    * **Data Handling:** Look for places where the app loads very large amounts of data into memory at once. Suggest using pagination, streaming, or chunking.

    2. **Observability - Logging & Metrics:**
    * **Structured Logging:** Review my current logging (or lack thereof). Propose a structured logging strategy (e.g., JSON format). Refactor 3-5 key `console.log` or `print` statements to use this new structured logger, including important context like user ID or request ID.
    * **Key Metrics:** Identify the 3 most important metrics for my application's health (e.g., API request latency, error rate, number of active users). Show me where and how to instrument the code to capture these metrics, even if it's just with a logging statement for now.

    3. **Scalability Review:**
    * Identify anything that would prevent me from running multiple instances of this application (horizontal scaling). This usually involves checking for in-memory state that should be moved to a shared store like a database or Redis (e.g., session stores, caches, locks).

    Bonus Prompt 4: The Production-Ready Deployment Blueprint

    Goal: This prompt bridges the gap between your laptop and a real server. It makes your deployment process automatic, reliable, and repeatable.

    Attributes Covered: Repeatability, Availability, Scalability.

    Act as a DevOps specialist. My application is now clean, secure, and performant, but I need a professional and repeatable way to deploy it. Based on my project's language and framework, generate the necessary files and configurations to containerize it and set up a basic deployment pipeline.

    1. **Containerization:** Create a multi-stage `Dockerfile`. The first stage should build the application (installing dev dependencies and compiling assets), and the final stage should be a lean, production-optimized image containing only the necessary runtime files and dependencies. Add comments explaining each step.
    2. **Local Development Environment:** Create a `docker-compose.yml` file. This file should define my application's service and any other services it depends on (like a database or a cache). This will allow any developer to spin up the entire stack with a single command (`docker-compose up`).
    3. **Configuration Management:** Create a `.env.example` file that lists all the environment variables the application needs to run (from Prompt 1), with placeholder values. This file serves as a template for production, staging, and development environments.
    4. **Continuous Integration (CI):** Generate a basic CI pipeline configuration file for GitHub Actions (`.github/workflows/ci.yml`). This pipeline should trigger on every push to the main branch and automatically:
    * Install dependencies.
    * Run the linter and code formatter checks (from Prompt 1).
    * (If you have them) Run any automated tests.
    * Build the Docker image to ensure it works.

    By running your app through these four prompts, you’ll have systematically addressed the most critical software quality attributes, transforming your “vibe-coded” project into something you can be confident shipping to production.
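    To make the “availability checkpoint” in Prompt 2 concrete, here’s roughly the kind of health check endpoint that request should produce. This is a minimal Flask sketch, and the database check is a hypothetical placeholder for whatever essential service your app depends on:

    from flask import Flask, jsonify

    app = Flask(__name__)

    def database_is_reachable() -> bool:
        # Hypothetical placeholder: run a cheap query (e.g. SELECT 1) against your database.
        return True

    @app.route("/healthz")
    def healthz():
        if database_is_reachable():
            return jsonify(status="ok"), 200
        return jsonify(status="degraded"), 503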


    Is this fun to do? I dunno. Maybe not as much fun as watching an agentic service crafting software right before your eyes. But it’s the right thing to do.

  • Here’s what AI-native engineers are doing differently than you

    The “what” and the “how” in software engineering occasionally change at the same time. Often, one triggers the other. The introduction of mainframes ushered in batch practices that capitalized on the scarcity of computing power. As the Internet took off, developers needed to quickly update their apps and Agile took hold. Mobile computing and cloud computing happened, and DevOps emerged shortly thereafter. Our current moment seems different as the new “what” and “how” are happening simultaneously, but independently. The “what” that’s hot right now is AI-driven apps. Today’s fast-developing “how” is AI-native software engineering. I’m seeing all sorts of teams adopt AI to change how they work. What are they doing that you’re not?

    AI natives always start (or end) with AI. The team at Pulley says “the typical workflow involves giving the task to an AI model first (via Cursor or a CLI program) to see how it performs, with the understanding that plenty of tasks are still hit or miss.” Studying a domain or competitor? Start with Gemini Deep Research or another AI research service. Find yourself stuck in an endless debate over some aspect of design? While you argued, the AI natives built three prototypes with AI to prove out the idea. Googlers are using it to build slides, debug production incidents, and much more. You might say “but I used an LLM before and it hallucinated while generating code with errors in it.” Stop it, so do you. Update your toolchain! Anybody seriously coding with AI today is using agents. Hallucinations are mostly a solved problem with proper context engineering and agentic loops. This doesn’t mean we become intellectually lazy. Learn to code, be an expert, and stay in charge. But it’s about regularly bringing AI in at the right time to make an impact.

    AI natives switched to spec-driven development. It’s not about code-first. Heck, we’re practically hiding the code! Modern software engineers are creating (or asking AI for) implementation plans first. My GM at Google Keith Ballinger says he starts projects by “ask[ing] the tool to create a technical design (and save to a file like arch.md) and an implementation plan (saved to tasks.md).” Former Googler Brian Grant wrote a piece where he explained creating 8,000-character instructions that steered the agent towards the goal. Those folks at Pulley say that they find themselves “thinking less about writing code and more about writing specifications – translating the ideas in my head into clear, repeatable instructions for the AI.” These design specs have massive follow-on value. Maybe a spec gets used to generate the requirements doc. Or the first round of product documentation. It might produce the deployment manifest, marketing message, and training deck for the sales field. Today’s best engineers are great at documenting intent that, in turn, spawns the technical solution.

    AI natives have different engineer and team responsibilities. With AI agents, you orchestrate. You remain responsible for every commit into main, but focus more on defining and “assigning” the work to get there. Legitimate work is directed to background agents like Jules. Or give the Gemini CLI the task of chewing through an analysis or starting a code migration project. Either way, build lots of the right tools and empower your agents with them. Every engineer is a manager now. And the engineer needs to intentionally shape the codebase so that it’s easier for the AI to work with. That means rule files (e.g. GEMINI.md), good READMEs, and such. This puts the engineer into the role of supervisor, mentor, and validator. AI-first teams are smaller, able to accomplish more, capable of compressing steps of the SDLC and delivering better quality, faster. AI-native teams have “almost eliminated engineering effort as the current bottleneck to shipping product.”

    There are many implications for all this. Quality is still paramount. Don’t create slop. But achieving the throughput, breadth, and quality your customers demand requires a leap forward in your approach. AI is overhyped and under-hyped at the same time, and it’s foolish to see AI as the solution to everything. But it’s objectively valuable as part of a new approach. Many teams have already made the shift and have learned to continuously evaluate and incorporate new AI-first approaches. It’s awesome! If you’re ignoring AI entirely, you’re not some heroic code artisan; you’re just being unnecessarily stubborn and falling behind. Get uncomfortable, reassess how you work, and follow the lead of the AI-native pioneers blazing the trail.

  • The Gemini CLI might change how I work. Here are four prompts that prove it.

    The Gemini CLI might change how I work. Here are four prompts that prove it.

    Yesterday morning, we took the wraps off one of the most interesting Google releases of 2025. The Gemini CLI is here, giving you nearly unlimited access to Gemini from directly within the terminal. This is a new space, but there are other great solutions already out there. Why is this different? Yes, it’s good at multi-step reasoning, code generation, and creative tasks. Build apps, fix code, parse images, build slides, analyze content, or whatever. But what’s truly unique is that it’s fully open source, no cost to use, usable anywhere, and super extensible. Use Gemini 2.5 Pro’s massive context window (1M tokens), multimodality, and strong reasoning ability to do some amazing stuff.

    Requirements? Have Node installed, and a Google account. That’s it. You get lots of free queries against our best models. You get more by being a cloud customer if you need it. Let’s have a quick look around, and then I’ll show you four prompts that demonstrate what it can really do.

    The slash command shows me what’s available here. I can see and resume previous chats, configure the editor environment, leverage memory via context files like GEMINI.md, change the theme, and use tools. Choosing that option shows us the available tools such as reading files and folders, finding files and folders, performing Google searches, running Shell commands, and more.

    The Gemini CLI has many extensibility points, including use of MCP servers. I added the Cloud Run MCP server but you can add anything here.

    I’m only scratching the surface here, so don’t forget to check out the official repo, docs, and blog post announcement. But now, let’s walk through four prompts that you can repeat to experience the power of the Gemini CLI, and why each is a big deal.

    Prompt #1 – Do some research.

    Software engineering is more than coding. You spend time researching, planning, and thinking. I want to build a new app, but I’m not sure which frontend framework I should use. And I don’t want stale answers from an LLM that was trained a year ago.

    I’ve got a new research report on JavaScript frameworks, and also want to factor in web results. My prompt:

    What JavaScript framework should I use to build my frontend app? I want something simple, standards-friendly, and popular. Use @report.pdf for some context, but also do a web search. Summarize the results in a way that will help me decide.

    The Gemini CLI figured out which tools to use, pulled the file into the prompt, and started off on its work: searching the web and preparing results.

    The results were solid. I got tradeoffs and analysis for three viable options. The summary was helpful, and I could have continued going back and forth with clarifying questions. For architects, team leaders, and engineers, having a research partner in the terminal is powerful.

    Why was this a big deal? This prompt showed the use of live Google Search, local (binary) file processing, and in-context learning for devs. These tools are changing how I do quick research.

    Prompt #2 – Build an app.

    These tools will absolutely change how folks build, fix, change, and modernize software. Let’s build something new.

    I fed in this prompt, based on my new understanding of relevant JavaScript frameworks.

    Let’s build a calendar app for my family to plan a vacation together. It should let us vote on weeks that work best, and then nominate activities for each day. Use Vue.js for the JavaScript framework.

    Now to be sure, we didn’t build this to be excellent at one-shot results. Instead, it’s purposely built for an interactive back-and-forth with the software developer. You can start it with --yolo mode to have it automatically proceed without asking permission to do things, and even with --b to run it headless assuming no interactivity. But I want to stay in control here. So I’m not in YOLO mode.

    I quickly got back a plan, and was asked if I wanted to proceed.

    Gemini CLI also asks me about running Shell commands. I can allow it once, allow it always, or cancel. I like these options. It’s fun watching Gemini make decisions and narrate what it’s working on. Once it’s done building directories, writing code, and evaluating its results, the CLI even starts up a server so that I can test the application. The first draft was functional, but not attractive, so I asked for a cleanup.

    The next result was solid, and I could have continued iterating on new features along with look and feel.

    Why was this a big deal? This prompt showed iterative code development, important security (request permission) features, and more. We’ll also frequently offer to pop you into the IDE for further coding. This will change how I understand or bootstrap most of the code I work with.

    Prompt #3 – Do a quick deploy to the cloud.

    I’m terrible at remembering the syntax and flags for various CLI tools. The right git command or Google Cloud CLI request? Just hopeless. The Gemini CLI is my solution. I can ask for what I want, and the Gemini CLI figures out the right type of request to make.

    We added MCP as a first-class citizen, so I added the Cloud Run MCP server, as mentioned above. I also made this work without it, as the Gemini CLI figured out the right way to directly call the Google Cloud CLI (gcloud) to deploy my app. But, MCP servers provide more structure and ensure consistent implementation. Here’s the prompt I tried to get this app deployed. Vibe deployment, FTW.

    Ship this code to Cloud Run in us-west1 using my seroter-project-base project. Don’t create a Dockerfile or container, but just deploy the source files.

    The Gemini CLI immediately recognizes that a known MCP tool can help, and shows me the tool it chose.

    It got going, and shipped my code successfully to Cloud Run using the MCP server. But the app didn’t start correctly. The Gemini CLI noticed that by reading the service logs, and diagnosed the issue: the app hadn’t specified which port to listen on. No problem.

    It came up with a fix, made the code changes, and redeployed.

    Why was this a big deal? We saw the extensibility of MCP servers, and the ability to “forget” some details of exactly how other tools and CLIs work. Plus we observed that the Gemini CLI did some smart reasoning and resolved issues on its own. This is going to change how I deploy, and how much time I spend (waste?) deploying.

    Prompt #4 – Do responsible CI/CD to the cloud.

    The third prompt was cool and showed how you can quickly deploy to a cloud target, even without knowing the exact syntax to make it happen. I got it working with Kubernetes too. But can the Gemini CLI help me do proper CI/CD, even if I don’t know exactly how to do it? In this case I do know how to set up Google Cloud Build and Cloud Deploy, but let’s pretend I don’t. Here’s the prompt.

    Create a Cloud Build file that would build a container out of this app code and store it in Artifact Registry. Then create the necessary Cloud Deploy files that defines a dev and production environment in Cloud Run. Create the Cloud Deploy pipeline, and then reference it in the Cloud Build file so that the deploy happens when a build succeeds. And then go ahead trigger the Cloud Build. Pay very careful attention for how to create the correct files and syntax needed for targeting Cloud Run from Cloud Deploy.

    The Gemini CLI started by asking me for some info from my Google Cloud account (project name, target region) and then created YAML files for Cloud Build and Cloud Deploy. It also put together a CLI command to instantiate a Docker repo in Artifact Registry. Now, I know that the setup for Cloud Deploy working with Cloud Run has some specific syntax and formatting. Even with my above command, I can see that I didn’t get syntactically correct YAML in the skaffold file.

    I rejected the request of the Gemini CLI to do a deployment, since I knew it would fail. Then I gave it the docs URL for setting up Cloud Run with Cloud Deploy and asked it to make a correction.

    That Skaffold file doesn’t look correct. Take a look at the docs (https://cloud.google.com/deploy/docs/deploy-app-run), and follow its guidance for setting up the service YAML files, and referencing the right Skaffold version at the top. Show me the result before pushing a change to the Cloud Deploy pipeline.

    Fortunately, the Gemini CLI can do a web fetch and process the latest product documentation. I did a couple of turns and got what I wanted. Then I asked it to go ahead and update the pipeline and trigger Cloud Build.

    It failed at first because I didn’t have a Dockerfile, but after realizing that, it automatically created one and started the build again.

    It took a few iterations of failed builds for the Gemini CLI to land on the right syntax. But it kept dutifully trying, making changes, and redeploying until it got it right. Just like I would have if I were doing it myself!

    After that back and forth a few times, I had all the right files, syntax, container artifacts, and pipelines going.

    Some of my experiments went faster than others, but that’s the nature of these tools, and I still did this faster overall than I would have manually.

    Why was this a big deal? This showcased some sophisticated file creation, iterative improvements, and Gemini CLI’s direct usage of the Google Cloud CLI to package, deploy, and observe running systems in a production-like way. It’ll change how confident I am doing more complex operations.

    Background agents, orchestrated agents, conversational AI. All of these will play a part in how we design, build, deploy, and operate software. What does that mean to your team, your systems, and your expectations? We’re about to find out.

  • From code to cloud: Check out six new integrations that make it easier to host your apps and models on Cloud Run

    From code to cloud: Check out six new integrations that make it easier to host your apps and models on Cloud Run

    Where you decide to run your web app is often a late-binding choice. Once you’ve finished coding something you like and done some localhost testing, you seek out a reasonable place that gives you a public IP address. Developers have no shortage of runtime host options, including hyperscalers, rented VMs from cheap regional providers, or targeted services from the likes of Firebase, Cloudflare, Vercel, Netlify, Fly.io, and a dozen others. I’m an unapologetic fanboy of Google Cloud Run—host scale-to-zero apps, functions, and jobs that offer huge resource configurations, concurrent calls, GPUs, and durable volumes with a generous free tier and straightforward pricing—and we just took the wraps off a handful of new ways to take a pile of code and turn it into a cloud endpoint.

    Vibe-code a web app in Google AI Studio and one-click deploy to Cloud Run

    Google AI Studio is really remarkable. Build text prompts against our leading models, generate media with Gemini models, and even build apps. All at no cost. We just turned on the ability to do simple text-to-app scenarios, and added a button that deploys your app to Cloud Run.

    First, I went to the “Build” pane and added a text prompt for my new app. I wanted a motivational quote printed on top of an image of an AI generated dog.

    In one shot, I got the complete app including the correct backend AI calls to Gemini models for creating the motivational quote and generating a dog pic. So cool.

    Time to ship it. There’s a rocket ship icon on the top right. Assuming you’ve connected Google AI Studio to a Google Cloud account, you’re able to pick a project and one-click deploy.

    It takes just a few seconds, and you get back the URL and a deep link to the app in Google Cloud.

    Clicking that link shows that this is a standard Cloud Run instance, with the Gemini key helpfully added as an environment variable (versus hard coded!).

    And of course, viewing the associated link takes me to my app that gives me simple motivation and happy dogs.

    That’s such a simple development loop!

    Create a .NET app in tools like Cursor and deploy it using the Cloud Run MCP server

    Let’s say you’re using one of the MANY agentic development tools that make it simpler to code with AI assistance. Lots of you like Cursor. It supports MCP as a way to reach into other systems via tools.

    We just shipped a Cloud Run MCP server, so you can make tools like Cursor aware of Cloud Run and support straightforward deployments.

    I started in Cursor and asked it to build a simple REST API and picked Gemini 2.5 Pro as my preferred model. Cursor does most (all?) of the coding work for you if you want it to.

    It went through a few iterations to land on the right code. I tested it locally to ensure the app would run.

    Cursor has native support for MCP. I added a .cursor directory to my project and dropped an mcp.json file in there. Cursor picked up the MCP entry, validated it, and showed me the available tools.

    I asked Cursor to deploy my C# app. It explored the local folder and files to ensure it had what it needed.

    Cursor realized it had a tool that could help, and proposed the “deploy_local_folder” tool from the Cloud Run MCP server.

    After providing some requested values (location, etc), Cursor successfully deployed my .NET app.

    That was easy. And this Cloud Run MCP server will work with any of your tools that understand MCP.

    Push an open model from Google AI Studio directly to Cloud Run

    Want to deploy a model to Cloud Run? It’s the only serverless platform I know of that offers GPUs. You can use tools like Ollama to deploy any open model to Cloud Run, and I like that we made it even easier for Gemma fans. To see this integration, pick one of the Gemma 3 editions in Google AI Studio.

    Once you’ve done that, you’ll see a new icon that triggers a deployment directly to Cloud Run. Within minutes, you have an elastic endpoint providing inference.

    It’s not hard to deploy open models to Cloud Run. This option makes it that much easier.

    Deploy a Python agent built with the Agent Development Kit to Cloud Run with one command

    The Agent Development Kit is an open source framework and toolset that devs use to build robust AI agents. The Python version reached 1.0 yesterday, and we launched a new Java version too. Here, I started with a Python agent I built.

    Built into ADK are a few deployment options. It’s just code, so you can run it anywhere. But we’ve added shortcuts to services like Google Cloud’s Vertex AI Agent Engine and Cloud Run. Just one command puts my agent onto Cloud Run!

    We don’t yet have this CLI deployment option for the Java ADK. But it’s also simple to deploy a Java app or agent to Cloud Run with a single Google Cloud CLI command.

    Services like Cloud Run are ideal for your agents and AI apps. These built-in integrations for ADK help you get these agents online quickly.

    Use a Gradio instance in Cloud Run to experiment with prompts after one click from Vertex AI Studio

    How do you collaborate or share prompts with teammates? Maybe you’re using something like Google Cloud Vertex AI to iterate on a prompt yourself. Here, I wrote system instructions and a prompt for helping me prioritize my work items.

    Now, I can click “deploy an app” and get a Gradio instance for experimenting further with my app.

    This has public access by default, so I’ve got to give the ok.

    After a few moments, I have a running Cloud Run app! I’m shown this directly from Vertex AI and have a link to open the app.

    That link brings me to this Gradio instance that I can share with teammates.

    The scalable and accessible Cloud Run is ideal for spontaneous exploration of things like AI prompts. I like this integration!

    Ship your backend Java code to Cloud Run directly from Firebase Studio

    Our final example looks at Firebase Studio. Have you tried this yet? It’s a free-to-use, full-stack dev environment in the cloud for nearly any type of app. And it supports text-to-app scenarios if you don’t want to do much coding yourself. There are dozens of templates, including one for Java.

    I spun up a Java dev environment to build a web service.

    This IDE will look familiar. Bring in your favorite extensions, and we’ve also pre-loaded this with Gemini assistance, local testing tools, and more. See here that I used Gemini to add a new REST endpoint to my Java API.

    Here on the left is an option to deploy to Cloud Run!

    After authenticating to my cloud account and picking my cloud project, I could deploy. After a few moments, I had another running app in Cloud Run, and had a route to make continuous updates.

    Wow. That’s a lot of ways to go from code to cloud. Cloud Run is terrific for frontend or backend components, functions or apps, open source or commercial products. Try one of these integrations and tell me what you think!

  • Did we just make platform engineering much easier by shipping a cloud IDP?

    Did we just make platform engineering much easier by shipping a cloud IDP?

    I’m a believer in platform engineering as a concept. Bringing standardization and golden paths to developers so that they can ship software quickly and safely sounds awesome. And it is. But it’s also been a slog to land it. Measurement has been inconsistent, devs are wildly unhappy with the state of self-service, and the tech landscape is disjointed with tons of tools and a high cost of integration. Smart teams are finding success, but this should be easier. Maybe now it is.

    Last week at Google Cloud Next ’25, we took the wraps off the concept of a Cloud Internal Developer Platform (IDP). Could we take the best parts of platform engineering—consistent config management, infrastructure orchestration, environment management, deployment services, and role-based access—and deliver them as a vertically-integrated experience? Can we shift down instead of putting so much responsibility on the developer? I think we can. We have to! Our goal at Google Cloud is to deliver a Cloud IDP that is complete, integrated, and application-centric. The cloud has typically been a pile of infrastructure services, loosely organized through tags or other flawed grouping mechanisms. We’re long overdue for an app-centric lens on the cloud.

    Enough talking. Let me show you by walking through an end-to-end platform engineering scenario. I want to design and deploy an application using architecture templates, organize the deployed artifacts into an “application”, troubleshoot an issue, and then get visibility into the overall health of the application.

    Design and deploy app architectures with Application Design Center

    To make life difficult, IDP also stands for “internal developer portal.” That’s not confusing at all. Such a portal can serve as the front door for a dev team that’s interacting with the platform. Application Design Center (ADC) is now in public preview, and offers functionality for creating templates, storing templates in catalogs, sharing templates, and deploying instances of templates.

    I can start with an existing ADC template or create a brand new one. Or, I can use the ever-present Cloud Assist chat to describe my desired architecture in natural language, iterate on it, and then create an ADC template from that. Super cool!

    For the rest of this example, I’ll use an existing app template in ADC. This one consists of many different components. Notice that I’ve got Cloud Run (serverless) components, virtual machines, storage buckets, secrets, load balancers, and more. Kubernetes coming soon!

    I can add to this architecture by dropping and configuring new assets onto the canvas. I can also use natural language! From the Cloud Assist chat, I asked to “add a cache to the movie-frontend service” and you can see that I got a Redis cache added, along with the option to accept or reject the suggestion.

    Worried that you’re just working in a graphical design surface? Everything on the canvas is represented as Terraform. Switching from “Design” to “Code” at the top reveals the clean Terraform generated by ADC. Use our managed Terraform service or whatever you want for your infrastructure orchestration workflow with Terraform.

    When I’m done with the template and want to instantiate my architecture, I can turn this into a deployed app. Google Cloud takes care of all the provisioning, and the assets are held together in an application grouping.

    ADC is powerful for a few reasons. It works across different runtimes and isn’t just a Kubernetes solution. ADC offers good template cataloging and sharing capabilities. Its support for natural language is going to be very useful. And its direct integration with other parts of the platform engineering journey is important. Let’s see that now.

    Organize apps with App Hub

    An “app” represents many components, as we just saw. They might even span “projects” in your cloud account. And an application should have clearly identified owners and criticality. Google Cloud App Hub is generally available, and acts as a real-time registry of resources and applications.

    App Hub auto-discovers resources in your projects (a couple dozen types so far, many more to come) and lets you automatically (via ADC) or manually group them into applications.

    For a given app, I can see key metadata like its criticality and environment. I can also see who the development, business, and operations owners are. And of course, I can see a list of all the resources that make up this application.

    Instead of this being a static registry, App Hub maintains links to the physical resources you’ve deployed. Once I have an application, then what?

    Observe app-centric metrics in Cloud Monitoring

    It’s not easy to see how apps or app-related components are performing. Now it is. We just enabled the preview of Application Monitoring in our Cloud Monitoring service.

    From here, I can see a list of all my App Hub apps, and the component performance of each.

    When I drill into the “web server” resource, I get some terrific metrics and logs, all within whatever timeframe I specify. This is a high-density view, and I like the data points we surface here.

    Again, we’re seeing a smart, integrated set of technologies here versus a series of independent stack pieces that aren’t deeply aware of each other.

    Resolve issues using Cloud Assist Investigations

    In that dashboard above, I’m seeing that container restarts are a real issue in this application. It’s time to troubleshoot!

    Within this dashboard, I see embedded logs, and notice a warning about back-off restarts on my pods. I don’t love reading piles of JSON to try and figure out the problem, nor can I see all the ancillary context just by looking at this log entry. In private preview, we have this new Investigate button.

    Clicking that button sparks a new Investigation. These are AI-fueled evaluations based on a given error, and a host of related application data points. It’s meant to be a holistic exploration.

    Here’s where all that shared context is so valuable. In under a minute, I see the details of the Investigation. These details show the issue itself and then a series of “relevant observations.” An Investigation can be edited and re-run, downloaded, and more.

    Most importantly, there’s a “Hypothesis” section that helps the app owner or SRE pinpoint the problem area to focus on. These seem well-described with clear recommendations.

    I’m confident that this will be a supremely useful tool for those trying to quickly resolve application issues.

    Manage the overall health of the application in Cloud Hub

    What’s your “home page” for the applications you manage? That’s the idea behind the preview of the Cloud Hub. It offers app owners a starting point for the management, health, and optimization of the apps they care about.

    I might start each day looking at any platform-wide incidents impacting my app, any deployment issues, service health, and more.

    One private preview feature I’ll show you here is the “Optimization” view. I’m getting app-level cost and utilization summaries! It’s easy to view this for different time periods, and even drill into a specific product within the app. What a useful view for identifying the actual cost of a running application in dev, test, or prod.

    Summary

    While platform engineering has been around a while, and cloud computing even longer, neither has been easy for people who just want to build and run apps. Google Cloud is uniquely set up to make this better, and this new Cloud IDP experience might be an important step forward. Try out some of the components yourself!