Category: AI/ML

  • Stop following tutorials and learn by building (with Antigravity) instead

    Don’t get me wrong, I like a good tutorial. Might be in a blog, book, video, or training platform. I’ve probably created a hundred (including in dozens of Pluralsight courses) and consumed a thousand. But lately? I don’t like being constrained by the author’s use case, and I wonder if all I’ve learned how to do is follow someone else’s specific instructions.

    This popped for me twice in the past few days as I finally took some “should try” technologies off my backlog. Instead of hunting for a hello-world tutorial to show me a few attributes of Angular Signals, I simply built a demo app using Google Antigravity. No local infrastructure setup, wrangling with libraries, or figuring out what machinery I needed to actually see the technology in action.

    I did it again a couple of days later! The Go version of the Agent Development Kit came out a bit ago. I’ve been meaning to try it. The walkthrough tutorials are fine, but I wanted something more. So, I just built a working solution right away.

    I still enjoy reading content about how something works. That doesn’t go away. And truly deep learning still requires more than vibe coding an app. But I’m not defaulting to tutorials any more. Instead, I can just feed them into the LLM and build something personalized for me. Here’s an example.

    Take the cases above. I jumped into Google AI Studio to get inspiration on interesting async agent use cases. I liked this one: create a feed where an agent picks up a news headline and then does some research into related stories before offering some analysis. It’ll read from a queue, and then drop its analysis into a Cloud Storage bucket.

    Prompting for agent ideas in Google AI Studio
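
    Conceptually, the worker is simple: pull a headline message, let the agent research it, and write the analysis to a bucket. Here’s a rough sketch of that shape in Python; the real build uses ADK for Go, and the subscription, bucket, and analyze_headline names are placeholders.

    import json

    from google.cloud import pubsub_v1, storage

    PROJECT_ID = "my-project"          # placeholder
    SUBSCRIPTION_ID = "headline-sub"   # placeholder
    BUCKET_NAME = "headline-analysis"  # placeholder


    def analyze_headline(headline: str) -> dict:
        """Placeholder for the agent call that researches related stories."""
        raise NotImplementedError


    def handle_message(message: pubsub_v1.subscriber.message.Message) -> None:
        # The queue delivers a JSON payload with the headline to analyze.
        payload = json.loads(message.data.decode("utf-8"))
        analysis = analyze_headline(payload["headline"])
        # Drop the finished analysis into Cloud Storage as a JSON doc.
        blob = storage.Client().bucket(BUCKET_NAME).blob(f"analysis/{message.message_id}.json")
        blob.upload_from_string(json.dumps(analysis), content_type="application/json")
        message.ack()


    subscriber = pubsub_v1.SubscriberClient()
    subscription_path = subscriber.subscription_path(PROJECT_ID, SUBSCRIPTION_ID)
    streaming_future = subscriber.subscribe(subscription_path, callback=handle_message)
    streaming_future.result()  # block and keep processing messages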

    With my use case in hand, I jumped into Antigravity to sketch out a design. Notice that I just fed the tutorial link into Antigravity to ensure it’d get seeded with up-to-date info for this new library.

    Sparking an application build in Google Antigravity

    Antigravity started whirring away on creating implementation plans and a task list. Because I can comment on its plans and iterate on the ideas before building begins, I’m not stressed about making the first prompt perfect. Notice here that it flags a big assumption, so I provided a comment confirming that I want a JSON payload for this background worker.

    Google Antigravity provides a way to offer feedback on implementation plans

    After Antigravity started building, I noticed the generated code used a package the IDE flagged as deprecated. I popped into the chat (or I could have commented in the task list) and directed the AI tool to use the latest version and ensure the code still built successfully.

    Steering Google Antigravity to pick a newer version of the library it selected

    Throughout the process, I’m focused on the outcomes I’m after, not the syntax of agent building. It’s refreshing. When reviewing the code, I realized I wanted more data in the incoming payload. A directive later, and my code reflected it.

    Iterating on software with Antigravity

    This started with me wanting to learn ADK for Go. It was easy to review the generated agent code, ask Antigravity questions about it, and see “how” to do it all without typing it all out myself. Will it stick in my brain as much as if I wrote it myself? No. But that wasn’t my goal. I wanted to fit ADK for Go into a real use case.

    Code that sets up an agent in ADK for Go

    This solution should feel “real”, and not just be a vibe-and-go. How about using CI/CD? I never remember the syntax for Google Cloud Build, and getting my pipeline right can swallow up half my dev time. No problem.

    I express my intent for a Cloud Build pipeline, and moments later I have a fully valid YAML definition, along with a generated Dockerfile.

    Antigravity created a Google Cloud Build pipeline for me

    Next I asked Antigravity to add a deployment step so that the container image is pushed to a Cloud Run worker pool after a successful build. I needed to point Antigravity to a tutorial for worker pools for it to know about this new feature.

    I’m using an API key in this solution, and didn’t want that stored as a regular environment variable or visible during deployment. Vibe coding doesn’t have to be insecure. I asked Antigravity to come up with a better way. It chose Google Cloud Secret Manager, gave me the commands to issue, and showed me what the Cloud Run deployment command would now look like.

    Getting a proper, security-conscious deployment command for Cloud Run worker pools
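
    For the runtime side, here’s a hedged sketch of pulling the key from Secret Manager, shown in Python for illustration (the worker itself is Go, and the secret name is an assumption). On Cloud Run, the deploy command can also mount the secret directly as an environment variable, which is likely what the generated command did.

    from google.cloud import secretmanager


    def load_api_key(project_id: str, secret_id: str = "gemini-api-key") -> str:
        """Reads the latest version of the secret at runtime (the secret name is assumed)."""
        client = secretmanager.SecretManagerServiceClient()
        name = f"projects/{project_id}/secrets/{secret_id}/versions/latest"
        response = client.access_secret_version(request={"name": name})
        return response.payload.data.decode("utf-8")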

    I then told Antigravity to introduce this updated Cloud Run command to complete the build + deploy pipeline.

    A complete, AI-generated CI/CD pipeline for my agentic app

    Amazing! I wanted to test this out before putting an Angular frontend into the solution. Antigravity reminded me of the right way to format a Cloud Build command given the substitution variables, and I was off.

    Building and deploying this agent with Cloud Build

    Within a few minutes, I had a container image in Artifact Registry, and a Cloud Run worker pool listening for work.

    My running workload in Cloud Run worker pools

    To test it out, I needed to publish a message to Google Cloud Pub/Sub. Antigravity gave me a sample JSON message structure that the agent expected to receive. I went to Techmeme.com to grab a recent news headline as my source. Pub/Sub has a UI for manually sending a message into a Topic, so I used that.

    Publishing a message to Pub/Sub to trigger my agent
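
    If you’d rather script that test than click through the console, publishing is only a few lines. This is a hedged sketch; the topic name and payload fields are assumptions, since the post doesn’t show the exact JSON structure Antigravity suggested.

    import json

    from google.cloud import pubsub_v1

    publisher = pubsub_v1.PublisherClient()
    topic_path = publisher.topic_path("my-project", "news-headlines")  # assumed names

    payload = {
        "headline": "Some recent headline grabbed from Techmeme",
        "source_url": "https://example.com/story",
    }
    future = publisher.publish(topic_path, json.dumps(payload).encode("utf-8"))
    print(f"Published message {future.result()}")  # result() returns the message ID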

    After a moment, I saw a new JSON doc in my Cloud Storage bucket. Opening it up revealed a set of related news, and some interesting insights.

    News analysis performed by my ADK agent and dropped into a Cloud Storage bucket

    I also wanted to see more of Angular Signals in action, so I started a new project and prompted Antigravity to build out a site where I could submit news stories to my Pub/Sub topic. Once again, I passed a reference guide into my prompt as context.

    Prompting Antigravity to create a frontend app using Angular Signals

    I asked Antigravity to show me how Angular Signals were used, and even asked it to sketch a diagram of the interaction. This is a much better way to learn a feature than hoping a static tutorial covers everything!

    The first build turned out ok, but I wanted better handling of the calls to Google Cloud Pub/Sub. Specifically, I wanted this executed server side, and after adding a comment to the implementation plan, Antigravity came up with a backend-for-frontend pattern.

    Directing Antigravity to support a backend for calls to Google Cloud
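
    The idea behind that backend-for-frontend is simple: the browser never calls Pub/Sub directly; it calls a thin server endpoint that publishes with server-side credentials. Here’s a minimal sketch of such an endpoint in Python/Flask purely for illustration; the generated backend’s stack isn’t shown in the post, and the route and topic names are assumptions.

    import json

    from flask import Flask, jsonify, request
    from google.cloud import pubsub_v1

    app = Flask(__name__)
    publisher = pubsub_v1.PublisherClient()
    topic_path = publisher.topic_path("my-project", "news-headlines")  # assumed names


    @app.post("/api/headlines")
    def submit_headline():
        # The browser calls this endpoint; only the server talks to Pub/Sub,
        # using server-side credentials.
        body = request.get_json(force=True)
        future = publisher.publish(topic_path, json.dumps(body).encode("utf-8"))
        return jsonify({"messageId": future.result()}), 202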

    After a couple of iterations on look-and-feel, and one debugging session which revealed I was using the wrong Pub/Sub topic name, I had a fully working app.

    After starting the server side component and the frontend component, I viewed my app interface.

    The interface Antigravity built

    Grabbing another headline from Techmeme gave me a chance to try this out. Angular Signals seems super smooth.

    Adding a headline and seeing a dynamic frontend in action

    Once again, my Cloud Storage bucket had some related links and analysis generated by my ADK agent sitting in Cloud Run worker pools.

    News analysis initiated from my Angular app

    It took me longer to write this post than it did to build a fully working solution. How great is that?

    For me, tutorials are now LLM input only. They’re useful context for LLMs teaching me things or building apps with my direction. How about you?

  • Go from prompt to production using a set of AI tools, or just one (Google Antigravity)

    We’ve passed the first phase of AI dev tooling. When I first saw AI-assisted code completion and generation, I was wow-ed. Still am. Amazing stuff. Then agentic coding environments went a step further. We could generate entire apps with products like Replit or v0! Following that, we all got new types of agentic IDEs, CLIs, background coding agents, and more. With all these options, there isn’t just one way to work with AI in software engineering.

    I’m noticing that I’m using AI tools to command (perform actions on my environment or codebase), to code (write or review code), and to conduct (coordinate agents who work on my behalf). Whether these are done via separate tools or the same one, this seems like a paradigm that will persist for a while.

    Let’s see this in action. I’ll first do this with a set of popular tools—Google AI Studio, Gemini CLI, Gemini Code Assist, and Jules—and then do the same exercise with the new Google Antigravity agent-first development platform.

    Architecture diagram generated with nano-banana

    I’ve accepted that I’ll never be a professional baseball player. It’s just not in the cards. But can I use AI to help me pretend that I played? Let’s build an application that uses AI to take an uploaded picture and generate images of that person in various real-life baseball situations.

    Build with a set of AI tools

    Gemini 3 Pro is excellent at frontend code and Google AI Studio is a fantastic way to get started building my app. I went to the “Build” section where I could provide a natural language prompt to start vibe-coding my baseball app. Here’s an example of “commanding” with AI tools.

    Google AI Studio

    After a few seconds of thinking, I saw a stash of files created for my application. Then a preview popped up that I could actually interact with.

    Vibe coded app in Google AI Studio

    Jeez, only one prompt and I have an awesome AI app. How cool is that? The Nano Banana model is just remarkable.

    Now I wanted to do more with this app and bring it into my IDE to make some updates before deploying it. In the top right of the screen, there’s a GitHub icon. After I clicked that, I was asked to authenticate with my GitHub account. Next, I had to provide details about which repo to create for this new codebase.

    Create GitHub repo from Google AI Studio

    Then Google AI Studio showed me all the changes it made in the local repo. I get one last chance to review things before staging and committing the changes.

    Push changes to GitHub

    A moment later, I had a fully populated GitHub repo. This gave me the intermediate storage I needed to pick up and continue with my IDE and agentic CLI.

    Vibe coded app code in my GitHub repo

    I jumped into Visual Studio Code with the installed Gemini Code Assist plugin. I’ve also got the Gemini CLI integration set up, so everything is all in one place.

    Visual Studio Code with Gemini Code Assist and the Gemini CLI

    Here, I can command and code my way to a finished app. I could ask (command) for a summary of the application itself and how it’s put together. But even more useful, I issued a command asking for how this app was authenticating with the Gemini API.

    Gemini Code Assist helping me understand the codebase

    Very helpful! Notice that it found a config file that shows a mapping from GEMINI_API_KEY (which is the environment variable I need to set) to the API_KEY referred to in code. Good to know.

    Here’s where I could continue to code my way through the app with AI assistance if there were specific changes I felt like making ahead of deploying it. I wrote a mix of code (and used the Gemini CLI) to add a Node server to serve this static content and access the environment variable from the runtime.

    Let’s do some conducting. I didn’t feel like writing up a whole README and wanted some help from AI. Here’s where Jules comes in, along with its extension for the Gemini CLI. Notice that I already have Gemini CLI extensions for Jules and Cloud Run installed.

    Two MCP servers added to the Gemini CLI

    I can go ahead and ask Jules to create a better README, and then continue on my work. Agents working on my behalf!

    Using the Gemini CLI to trigger a background task in Jules

    After doing some other work, I came back and checked the status of the Jules job (/jules status) and saw that the task was done. The Jules extension asked me if I wanted a new branch, or to apply the changes locally. I chose the former option and reviewed the PR before merging.

    Reviewing a branch with a README updated by Jules

    Finally, I was ready to deploy this to Google Cloud Run. Here, I also used a command approach and instructed the Gemini CLI to deploy this app with the help of the extension for Cloud Run.

    Using a natural language request from me, the Gemini CLI crafted the correct gcloud CLI command to deploy my app.

    Doing a deployment to Cloud Run from the Gemini CLI

    That finished in a few seconds, and I had my vibe-coded app, with some additional changes, deployed and running in Google Cloud.

    App running on Google Cloud

    So we commanded Google AI Studio to build the fundamentals of the app, used Gemini Code Assist and the Gemini CLI to code and command towards deployment, and Jules to conduct background agents on our behalf. Not particularly difficult, and the handoffs via a Git repo worked well.

    This process works great if you have distinct roles with handoffs (designer –> developer –> deployment team) or want to use distinct products at each stage.

    Build with Google Antigravity

    Google Antigravity isn’t a code editor. It’s not an IDE. It’s something more. Yes, you can edit code and do classic IDE things. What’s different is that it’s agent-first, and supports a rich set of surfaces in a single experience. I can kick off a series of agents to do work, trigger Computer Use in a dedicated browser, and extend behavior through MCP servers. Basically, I can do everything I did above, but within a single experience.

    Starting point with Google Antigravity

    I fed it the same prompt I gave to Google AI Studio. Immediately, Google Antigravity got to work building an implementation plan.

    Giving a prompt to Antigravity to build out an application

    I love that I can review this implementation plan, and add comments to sections I want to update. This feels like a very natural way to iterate on this specification. Right away, I asked for a Node server to host this app, so I’m building it that way from the start.

    Implementation Plan, with comments

    The AI agent recognizes my comments and refreshes its plans.

    Antigravity using the Implementation Plan to begin its work

    At this point, the agent is rolling. It built out the entire project structure, created all the code files, and plowed through its task list. Yes, it creates and maintains a task list so we can track what’s going on.

    Task List maintained by Antigravity

    The “Agent Manager” interface is wild. From here I can see my inbox of agent tasks, and monitor what my agents are currently doing. This one is running shell commands.

    Agent Manager view for triggering and managing agent work

    The little “drawer” at the bottom of the main chat window also keeps tabs on what’s going on across all the various agents. Here I could see what docs need my attention, which processes are running (e.g. web servers), and which artifacts are part of the current conversation.

    View of processes, documents, and conversation artifacts

    The whole app-building process finished in just a few minutes. It looked good! And because Google Antigravity has built-in support for Computer Use with a Chrome browser, it launched a browser instance and showed me how the app worked. I can also prompt Computer Use interactions any time via chat.

    Computer Use driving the finished application

    Antigravity saved the steps it followed into an artifact called Walkthrough. Including a screenshot!

    Generated walkthrough including screenshots

    How about fixing the README? In the previous example, I threw that to a background task in Jules. I could still do that here, but Antigravity is also adept at doing asynchronous work. I went into the Agent Manager and asked for a clean README with screenshots and diagrams. Then I closed Agent Manager and did some other things. Never breaking flow!

    Triggering a background agent to update the README

    Later, I noticed that the work was completed. The Agent Manager showed me what it did, and gave me a preview of the finished README. Nice job.

    Finished README with diagrams and screenshots

    I wanted to see the whole process through, so how about using Google Antigravity to deploy this final app to Google Cloud Run?

    This product also supports extension via MCP. During this product preview, it comes with a couple dozen MCP servers in the “MCP Store.” These include ones for Google products, Figma, GitHub, Stripe, Notion, Supabase, and more.

    MCP servers available out of the box

    We don’t yet include one for Cloud Run, but I can add that myself. The “manage MCP servers” screen is empty to start, but it shows you the format you need to add to the configuration file. I added the configuration for the local Cloud Run MCP server.

    Configuration for the Cloud Run MCP server

    After saving that configuration, I refreshed the “manage MCP servers” screen and saw all the tools at my disposal.

    Tools available from the Cloud Run MCP server

    Sweet! I went back to the chat window and asked Google Antigravity to deploy this app to Cloud Run.

    Antigravity deploying the app to Google Cloud Run

    The first time, the deployment failed, but Google Antigravity picked up the error and updated the app to start on the proper port and tweak how it handled wildcard paths. It then redeployed, and everything worked.

    Chat transcript of attempt to deploy to Google Cloud Run

    Fantastic. Sure enough, browsing the URL showed my app running and working flawlessly. Without a doubt, this would have been hours or days of work for me. Especially on the frontend stuff since I’m terrible at it. Instead, the whole process took less than an hour.

    Finished application running in Google Cloud Run

    I’m very impressed! For at least the next few years, software engineering will likely include a mix of commands, coding, and conducting. As I showed you here, you can do that with distinct tools that enable distinct stages and offer one or more of those paradigms. Products like Google Antigravity offer a fresh perspective, and make it possible to design, build, optimize, and deploy all from one product. And I can now seamlessly issue commands, write code, and conduct agents without ever breaking flow. Pretty awesome.

  • Vibing options for whoever you are

    My favorite movie is Tommy Boy. I apologize for nothing. You won’t find this flick listed on the top 100 movies of all time, and some people think it’s a dumb movie. That’s ok. Look, I don’t really get the Dave Matthews Band. They seem like nice chaps, but it’s not my thing. But they have a massive following of superfans. Everything isn’t for everyone. Not every product is built for you.

    I recently looked at which AI coding tool was the right fit for a given situation. But what about Google’s hefty portfolio of products for those who want to vibe code and let AI take the wheel? In that case, it’s not just about the given situation, but also the type of person. Not every product is for each type of user.

    Vibe coding is expanding the pool of people who build apps. It seems to consist of consumers who are non-tech folks who want to bring ideas to life. There are tech-adjacent professionals who do “knowledge work” and might be business analysts, product managers, program leads, and executives. And then you have software developers who have deep understanding of tech, and want to quickly produce new software.

    What are we vibe-coding? To me, it seems like we’re building throwaway prototypes to just try something out quickly. We’re creating personal software that’s meant to improve our own productivity. And we’re delivering multi-user apps that are intended for legit use by others. So in my mind, it’s a grid like this, with my take on which Google tech fits where:

    • Consumers — Throwaway prototypes: n/a; Personal software: Gemini Canvas, Gemini Gems; Multi-user apps: Gemini Canvas
    • Tech-adjacent professionals — Throwaway prototypes: Gemini Canvas, Google AI Studio, Opal; Personal software: Google AI Studio, Agentspace, Gemini Canvas; Multi-user apps: Google AI Studio, Opal, Firebase Studio
    • Software developers — Throwaway prototypes: Google AI Studio, Gemini CLI; Personal software: Gemini CLI, Gemini Code Assist; Multi-user apps: Gemini CLI, Gemini Code Assist, Jules

    Vibing as consumers

    I’m not sure consumers are interested in throwaway prototypes. My non-tech friends wouldn’t want to geek out on tech. They have some sort of goal to solve a problem.

    What about consumers building “personal software” that acts as a web app, agent, or tool? Sure. Gemini Canvas seems like a good choice for this, and for building simple apps to share with others. Gemini Gems are a tool for building personal AI assistants without needing to be a tech expert. Some people are creating fun consumer-grade demos with Google AI Studio, so I wouldn’t complain if you added that product to this row as well.

    Let’s look at Gemini Canvas. You activate this in Gemini when you choose to “build.”

    Let me provide it a simple prompt:

    Beautiful exercise tracking app that lets me record the exercises I did in a given day, and for how long. I can also view past days and see trends over time.

    When I enter that prompt, Gemini gets to work. It creates a single-file app where the code is not the focus. You can see the code, but it quickly switches the UI to a preview of the app. I can make changes via the chat and get a live look at the changes.

    This is a fun-to-use, simple interface that’s consumer friendly. It’s easy to trigger, very fast at generating apps, has basic abilities to roll back changes, and offers sharing via a public link. It’s very opinionated on the tech stack, the code is all stuffed into a single artifact, and you don’t get many legit deployment options. Great for consumers who are building personal software or simple apps for a small group to use.

    Vibing as tech-adjacent professionals

    I think it’s awesome that anyone within a company can be a builder. It doesn’t matter if the HR person, executive assistant, program manager, or VP doesn’t know how to code. They can use Gemini Canvas as I showed above, along with other tools.

    Some who have some tech familiarity might jump to Google AI Studio. It’s free to use and fantastic for builders. From the “build” menu, you can trigger a vibe coding experience that keeps the focus on the outcome, not the code.

    I’ll use the same prompt as above, but you also get a few other configuration options, including the ability to choose between React or Angular.

    Once I submit the prompt, Google AI Studio gets to work thinking through a plan and building out the components. The interface is terrific here. What’s different from Gemini Canvas is that you get a professionally structured project with code arranged across files. It takes longer to get to a Preview because it’s doing more (responsible) work, but it’s still very fast.

    I like the suggestions offered for the app (above chat box), easy ability to download the app, GitHub integrations, and one-click deploys to Google Cloud Run. The code editor is basic, so I wouldn’t use this for sophisticated builds, but that’s not what it’s for.

    Opal is a new Google experiment for building “mini-AI apps” and isn’t a standard vibe coding tool. Think of it as a way to build apps that are focused on generating content with AI.

    My exercise tracking tool doesn’t make a ton of sense here. One use case for Opal could be to generate text for sharing content on each social media site.

    Opal lets you define what you need to collect from the user and which assets you have available (YouTube videos, documents, and more); it can perform web searches, generate all sorts of media, and aggregate the results.

    It’s simple to preview and share these apps, and I like the concept. It’s not a traditional “vibe coding” tool, but I can see where non-developers would like using it to bring ideas to life.

    You could also consider Agentspace a vibing tool for office workers. Agentspace is a unique platform for those who want a more useful and effective internal experience for getting work done. Besides offering AI-enabled search and research, it also has an agent-building experience for those who want personal agents.

    The agent builder interface is entirely no-code, and lets you tap into public internet searches, along with private enterprise data sources and tools. This is super helpful for those who want to automate repeatable tasks or build personal productivity solutions.

    The final option I put into this row is Firebase Studio. This is a service that’s completely applicable to software developers, but also friendly to those who aren’t professionals in this space. Unlike most of the options I’ve listed so far, this isn’t only for front-end solutions. I can build backend Go or Java apps too. It also offers a vibing UI where you can start with a prompt and build the app. I’ll use the same prompt I did earlier.

    When you start vibe coding here, Firebase Studio shares an app blueprint and then gets to work. I wouldn’t give this to a consumer persona—there’s still technical know-how you’d need to have—but it’s approachable to those who aren’t full-on software developers.

    Vibing as software developers

    If you’re a developer, you can use any of the options above. You might love the simplicity of Gemini Canvas, or prefer the opinionated Google AI Studio environment. Sounds good, live your life.

    Many software people want to vibe code with tools already in their toolchain. Your best bets with Google are the Gemini CLI and Gemini Code Assist.

    The Gemini CLI is a command line interface that’s free to use. You can authenticate with your Google ID (as an individual or corporate user), bring a Google AI Studio API key, or use Google Cloud Vertex AI. It has built-in tools (Google Search, shell, reading files, etc), supports MCP, has configurable memory, and can run anywhere. We recently added a supporting GitHub Action so that you can use it in your code repo. And Zed just integrated it into their next-gen code editor.

    The Gemini CLI is a great vibe coding tool. I could use the same prompt above, but also ask for a technical spec first, define my language/framework preferences, and generally steer the build the way I want.

    When I want maximum power during my vibe coding sessions (like I had yesterday), I use a combination of Gemini Code Assist with the Gemini CLI rolled in. Killer combo, as I get the conversational AI mode of Gemini Code Assist in my IDE/editor, plus the rich agentic power of the Gemini CLI with shared context. Yesterday I vibe coded a complete “travel app” by using the CLI to generate a spec, incrementally implementing it with the CLI, and taking over in the editor when I needed to.

    Wrap up

    There’s no wrong answer here. Use what fits your situation and the role you’re playing. If you’re goofing around and just want a frontend app, something like Google AI Studio works for anyone. If you only want a code-centric experience for every situation, stick with IDE-style tools and CLIs. But I love that it’s so much simpler for people of any skill level to realize their ideas through software, thanks to a range of vibe coding tools that suit each person’s taste.

  • Code was the least interesting part of my multi-agent app, and here’s what that means to me

    At least 80% of the code I’ve ever written could have been written by AI, probably at higher quality. I’ve been “in tech” for twenty seven years and spent seven of those as a software developer. Even when I stopped getting paid for it, I never stopped coding. But little of it’s been truly novel; most of my code has been straightforward database access code, web APIs, presentation logic, and a handful of reasonably-complex systems. No doubt, many of you have done truly sophisticated things in code—compilers, performance-tuned algorithms, language frameworks—and AI isn’t replacing that any time soon. But I’d bet that much of the interesting tech work is moving away from raw code, and towards higher-order architecture.

    I wanted to build out an agentic solution, and I used AI to generate 90% of the code. That code isn’t where the unique value was. None of it was particularly noteworthy. You can find the whole app here. The most interesting work related to the architectural decisions. Here are eight choices I had to make, and I suspect you’ll have fun wrestling with the same ones.

    Choice #1 – What am I trying to accomplish and do agents make sense?

    My goal was to build an app that could take in a customer’s roofing needs, create a service appointment, and generate a personalized invoice for the work. I’m cheating here, since this exercise started as “Richard wants to learn some agent tech.” So I did start with the end in mind. Judge me accordingly.

    But in every legit situation, we start by evaluating the user need. What functional requirements do I need to satisfy? What performance or quality attributes are necessary? Can I solve this with a simple service, or modular monolith? Is the user flow deterministic or variable?

    This scenario could certainly be solved by a simple data collection form and PDF generator. What requirements might make an agentic architecture the truly correct choice?

    • Data collection from the user requires image, video, and audio input to best scope the services and pricing we should offer.
    • The scheduling or invoicing process requires a dynamic workflow based on a variety of factors, and hard-coding all the conditions would be tricky.

    Either way, this is always a critical choice before you write a single line of code.

    Choice #2 – What data or services are available to work with?

    Before we build anything new, what do we already have at our disposal?

    In my case, let’s assume I already have an appointments web API for retrieving available appointment times and making new appointments. I’ve also got an existing database that stores promotional offers that I want to conditionally add to my customer invoice. And I’ve got an existing Cloud Storage bucket where I store customer invoice PDFs.

    It’s easy to just jump into the application build, but pause for a few moments and take stock of your existing inventory and what you can build around.

    Choice #3 – What (agent) framework should I use and why?

    So. Many. Choices.

    There are AI app frameworks like Genkit, LlamaIndex, and Spring AI. There are agent frameworks like LangChain, LangGraph, Autogen, CrewAI, and more. Google recently shipped the Agent Development Kit, available for Python and Java developers. An agent built with something like ADK is basically made up of three things: a model, instructions, and tools. ADK adds sweeteners on top that give you a lot of flexibility, and there’s plenty I like about it.

    And look, I like it because my employer invests in it. So, that’s a big factor. I also wanted to build agents in both Python and Java, and this made ADK a great choice.

    Don’t get married to any framework, but learn the fundamentals of tool use, memory management, and agent patterns.

    Choice #4 – How should I use tools in the appointment agent?

    I suspect that tool selection will be a fascinating area for many builders in the years ahead. In this scenario, I had some decisions to make.

    I don’t want to book any roof repairs on rainy days. But where can I get the weather forecast from? I chose the built-in Google Search tool instead of trying to find some weather API on the internet.

    weather_agent = Agent(
        name="weather_agent",
        model="gemini-2.0-flash",
        description=(
            "Agent answers questions about the current and future weather in any city"
        ),
        instruction=(
            "You are an agent for Seroter Roofing. You can answer user questions about the weather in their city right now or in the near future"
        ),
        tools=[google_search],
    )
    

    For interacting with my existing appointments API, what’s the right tool choice? Using the OpenAPI tool baked into the ADK, I can just hand the agent an OpenAPI spec and it’ll figure out the right functions to call. For retrieving open appointment times, that’s a straightforward choice.

    openapi_spec = openapi_spec_template.replace("{API_BASE_URL}", config.API_BASE_URL)
    
    toolset = OpenAPIToolset(spec_str=openapi_spec, spec_str_type="json")
    api_tool_get_appointments = toolset.get_tool("get_available_appointments")
    

    But what about booking appointments? While that’s also an API operation, I want to piggyback a successful booking with a message to Google Cloud Pub/Sub that downstream subscribers can read from. That’s not part of the appointments API (nor should it be). Instead, I think a function tool makes sense here, where I manually invoke the appointments API, and then make a subsequent call to Pub/Sub.

    def add_appointment(customer: str, slotid: str, address: str, services: List[str], tool_context: ToolContext) -> dict:
        """Adds a roofing appointment by calling the booking API and logs the conversation history.
    
        This function serves as a tool for the agent. It orchestrates the booking process by:
        1. Calling the internal `_book_appointment_api_call` function to make the actual API request.
        2. If the booking is successful, it retrieves the conversation history from the
           `tool_context` and logs it to a Pub/Sub topic via `_log_history_to_pubsub`.
    
        Args:
            customer: The name of the customer.
            slotid: The ID of the appointment slot to book.
            address: The full address for the appointment.
            services: A list of services to be booked for the appointment.
            tool_context: The context provided by the ADK, containing session information.
    
        Returns:
            A dictionary containing the booking confirmation details from the API,
            or an error dictionary if the booking failed.
        """
        booking_response = _book_appointment_api_call(customer, slotid, address, services)
    
        if "error" not in booking_response:
            history_list: List[Event] = tool_context._invocation_context.session.events # type: ignore
            _log_history_to_pubsub(history_list)
        
        return booking_response
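
    The two helpers that add_appointment calls aren’t shown above. Here’s a hedged sketch of what they might look like; the booking API URL, topic name, and event serialization are assumptions rather than the actual code.

    import json

    import requests
    from google.cloud import pubsub_v1

    BOOKING_API_URL = "https://appointments.example.com/api/appointments"  # assumed
    HISTORY_TOPIC = "projects/my-project/topics/booking-history"           # assumed


    def _book_appointment_api_call(customer: str, slotid: str, address: str, services: list) -> dict:
        """Calls the existing appointments API and returns its JSON response (or an error dict)."""
        try:
            response = requests.post(
                BOOKING_API_URL,
                json={"customer": customer, "slotId": slotid, "address": address, "services": services},
                timeout=30,
            )
            response.raise_for_status()
            return response.json()
        except requests.RequestException as exc:
            return {"error": str(exc)}


    def _log_history_to_pubsub(history_list) -> None:
        """Publishes the conversation history so downstream agents (like invoicing) can pick it up."""
        publisher = pubsub_v1.PublisherClient()
        # Assumes the ADK events can be serialized like this; adjust for the real Event type.
        data = json.dumps([str(event) for event in history_list]).encode("utf-8")
        publisher.publish(HISTORY_TOPIC, data).result()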
    

    Choice #5 – When/how do I separate agent boundaries?

    There’s a good chance that an agentic app has more than one agent. Stuffing everything into a single agent with a complex prompt and a dozen tools seems … suboptimal.

    But multi-agent doesn’t have to mean you’re sliding into a distributed system. You can include multiple agents in the same process space and deployment artifact. The Sequential Agent pattern in the ADK makes it simple to define distinct agents that run one at a time. So it seems wise to think of service boundaries for your agents, and only make a hard split when the context changes.

    For me, that meant one set of agents handling all the appointment stuff, and another distinct set of agents that worked on invoices. These don’t depend on each other, and should run separately. Both sets of agents use the Sequential Agent pattern.
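
    For reference, a sequential pipeline in ADK’s Python flavor looks roughly like this. It’s a hedged sketch with placeholder agents; double-check the current ADK docs for the exact parameter names.

    from google.adk.agents import Agent, SequentialAgent

    collect_agent = Agent(
        name="collect_agent",
        model="gemini-2.5-flash",
        instruction="Gather the booking details from the user.",
    )
    confirm_agent = Agent(
        name="confirm_agent",
        model="gemini-2.5-flash",
        instruction="Summarize and confirm the booking.",
    )

    # Both agents live in the same process and deployment artifact and run one
    # after the other: multi-agent without a distributed system.
    booking_pipeline = SequentialAgent(
        name="booking_pipeline",
        sub_agents=[collect_agent, confirm_agent],
    )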

    The appointment agent has a sub-agent to look up the weather, and uses that agent as a tool within the primary root agent.

    The invoicing agent is more complex with sub-agents to build up HTML out of the chat history, another agent that looks up the best promotional offers to attach to the invoice, and a final agent that generates a PDF.

    private SequentialAgent createInvoiceAgent(
                PdfTool pdfTool,
                String mcpServerUrl,
                Resource htmlGeneratorPrompt,
                Resource bestOfferPrompt,
                Resource pdfWriterPrompt
        ) {
            String modelName = properties.getAgent().getModelName();
    
            LlmAgent htmlGeneratorAgent = LlmAgent.builder().model(modelName).name("htmlGeneratorAgent").description("Generates an HTML invoice from conversation data.").instruction(resourceToString(htmlGeneratorPrompt)).outputKey("invoicehtml").build();
    
            List<BaseTool> mcpTools = loadMcpTools(mcpServerUrl);
    
            LlmAgent bestOfferAgent = LlmAgent.builder().model(modelName).name("bestOfferAgent").description("Applies the best offers available to the invoice").instruction(resourceToString(bestOfferPrompt)).tools(mcpTools).outputKey("bestinvoicehtml").build();
    
            FunctionTool generatePdfTool = FunctionTool.create(PdfTool.class, "generatePdfFromHtml");
    
            LlmAgent pdfWriterAgent = LlmAgent.builder().model(modelName).name("pdfWriterAgent").description("Creates a PDF from HTML and saves it to cloud storage.").instruction(resourceToString(pdfWriterPrompt)).tools(List.of(generatePdfTool)).build();
    
            return SequentialAgent.builder().name(properties.getAgent().getAppName()).description("Execute the complete sequence to generate, improve, and publish an PDF invoice to Google Cloud Storage.").subAgents(htmlGeneratorAgent, bestOfferAgent, pdfWriterAgent).build();
        }
    

    How should I connect these agents? I didn’t want hard-coded links between the services, as they can operate async and independently. You could imagine other services being interested in a booking too. So I put Google Cloud Pub/Sub in the middle. I used a push notification (to the invoice agent’s HTTP endpoint), but I’ll probably refactor it and make it a pull subscription that listens for work.
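
    Until that refactor happens, the push model just needs the invoice agent to expose an HTTP endpoint that unwraps the Pub/Sub push envelope. Here’s a minimal sketch in Python/Flask purely for illustration (the real invoice agent is Java).

    import base64
    import json

    from flask import Flask, request

    app = Flask(__name__)


    @app.post("/pubsub/push")
    def receive_push():
        # Pub/Sub push wraps the payload in an envelope with a base64-encoded "data" field.
        envelope = request.get_json(force=True)
        message = envelope.get("message", {})
        history = json.loads(base64.b64decode(message.get("data", "")).decode("utf-8"))
        # ... hand the conversation history to the invoicing agents here ...
        return ("", 204)  # any 2xx response acks the message so Pub/Sub doesn't redeliver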

    Choice #6 – What’s needed in my agent instructions?

    I’m getting better at this. Still not great. But I’m using AI to help me, and learning more about what constraints and direction make the biggest impact.

    For the booking agent, my goal was to collect all the data needed, while factoring in constraints such as weather. My agent instructions here included core principles, operational steps, the must-have data to collect, which decisions to make, and how to use the available tools.

    root_agent = Agent(
        name="root_agent",
        model="gemini-2.5-flash",
        description="This is the starting agent for Seroter Roofing and customers who want to book a roofing appointment",
        instruction=(
            """
    You are an AI agent specialized in booking roofing appointments. Your primary goal is to find available appointments for roofing services, and preferably on days where the weather forecast predicts dry weather.
    
    ## Core Principles:
    
        *   **Information First:** You must gather the necessary information from the user *before* attempting to use any tools.
        *   **Logical Flow:** Follow the steps outlined below strictly.
        *   **Professional & Helpful:** Maintain a polite, professional, and helpful tone throughout the interaction.
    
    ## Operational Steps:
    
    1.  **Greeting:**
        *   Start by politely greeting the user and stating your purpose (booking roofing appointments).
        *   *Example:* "Hello! I can help you book a roofing appointment. What kind of service are you looking for today?"
    
    2.  **Information Gathering:**
        *   You need two key pieces of information from the user:
            *   **Type of Service:** What kind of roofing service is needed? (e.g., repair, replacement, inspection, estimate)
            *   **Service Location:** What city is the service required in?
        *   Ask for this information clearly if the user doesn't provide it upfront. You *cannot* proceed to tool usage until you have both the service type and the city.
        *   *Example follow-up:* "Great, and in which city is the property located?"
    
    3.  **Tool Usage - Step 1: Check Appointment Availability (Filtered):**
        *   Get information about available appointment times:
        *   **[Use Tool: Appointment availability]** for the specified city.
        *   **Crucially:** When processing the results from the appointment tool, **filter** the available appointments to show *only* those that fall on the specific dates without rain in the forecast. You should also consider the service type if the booking tool supports filtering by type.
    
    4.  **Tool Usage - Step 2: Check Weather Forecast:**
        *   Once you have the service type and city, your next action is to check the weather.
        *   **[Use Tool: 7-day weather forecast]** for the specified city.
        *   Analyze the forecast data returned by the tool. Identify which days within the next 7 days are predicted to be 'sunny' or at least dry. Be specific about what constitutes 'dry' based on the tool's output.
    
    5.  **Decision Point 1: Are there Appointments on Dry Days?**
        *   If the appointment availability tool returns available slots *specifically* on the identified dry days:
            *   Present these available options clearly to the user, including the date, time, and potentially the service type (if applicable).
            *   Explain that these options meet the dry weather preference.
            *   Prompt the user to choose an option to book.
            *   *Example:* "Great news! The forecast for [City] shows dry weather on [Date 1], [Date 2], etc. I've checked our schedule and found these available appointments on those days: [List appointments]."
    
        *   If the appointment availability tool returns slots, but *none* of them fall on the identified sunny days (or if the tool returns no slots at all):
            *   Inform the user that while there are dry days coming up, there are currently no appointments available on those specific dry dates within the next 7 days.
            *   Explain that your search was limited to the dry days based on the forecast.
            *   Suggest they might want to try a different service type (if relevant) or check back later as availability changes.
            *   *Example:* "While the forecast for [City] does show some dry days coming up, I wasn't able to find any available appointments specifically on those dates within the next week. Our schedule on sunny days is quite popular. Please try again in a few days, as availability changes, or let me know if you need a different type of service."
    
    6.  **Confirmation/Booking (If Applicable):**
        *   Be sure to get the full name and full address of the location for the appointment.
             
    **Tools**
        You have access to the following tools to assist you:
        `weather_agent`: use this tool to find the upcoming weather forecast and identify rainy days
        `api_tool_get_appointments -> json`: use this OpenAPI tool to answer any questions about available appointments
        `add_appointment(customer: str, slotid: str, address: str, services: List[str]) -> dict`: use this tool to add a new appointment
    """
        ),
        tools=[agent_tool.AgentTool(weather_agent), api_tool_get_appointments, tools.add_appointment],
    )
    

    The invoicing agent had a more complex prompt, as I wanted to shape the blob of chat history into structured JSON and then into valid HTML. Of course, I could have (should have?) structured the raw data before it left the original agent, but I wanted to try it this way. My agent instructions show an example of the preferred JSON, and also the valid HTML structure.

    **Role:** You are a specialized agent designed to generate an HTML invoice from a successful appointment booking history.
    
    **Task:** Process the entire user prompt, which contains conversation history in a JSON format. Your goal is to create a complete HTML invoice based on the details found in that JSON.
    
    [...]
    
    4.  **Invoice JSON Structure:** The JSON invoice you internally generate **must** strictly adhere to the format provided in the example below. Do not add extra fields or change field names. Ensure numbers are formatted correctly (e.g., 100.00, 0.00).
        ```json
        {
        "invoiceNumber": "INV-BOOKING-[Current Date YYYYMMDD]", // Generate based on date
        "issueDate": [YYYY, M, D], // Current Date
        "dueDate": [YYYY, M, D], // Current Date + 30 days
        "customerName": "[Extracted Customer Name]",
        "customerAddress": "[Extracted Customer Address]",
        "items": [
            {
            "description": "[Description of Booked Service]",
            "quantity": 1,
            "unitPrice": [Price of Service],
            "lineTotal": [Price of Service]
            }
        ],
        "subtotal": [Price of Service],
        "taxAmount": 0.00,
        "summary": "Invoice for booked [Service Name]",
        "totalAmount": [Price of Service]
        }
        ```
    
    [...]
    
    7.  ** Create an HTML string based on the example structure here **
    ```html
    <!DOCTYPE html>
    <html>
    <head>
    	<meta charset="UTF-8" />
    	<title>Seroter Roofing Invoice</title>
    	<style type="text/css">
    		body { font-family: sans-serif; margin: 20px; }
    		h1 { color: navy; }
    		.header, .customer-info, .summary-block, .footer { margin-bottom: 20px; }
    		.invoice-details { margin-top: 20px; padding: 10px; border: 1px solid #ccc; }
    		.invoice-details p { margin: 5px 0; }
    		table { width: 100%; border-collapse: collapse; margin-top: 20px; }
    		.summary-block { padding: 10px; border: 1px dashed #eee; background-color: #f9f9f9; }
    		th, td { border: 1px solid #ddd; padding: 8px; text-align: left; }
    		th { background-color: #f2f2f2; }
    		.text-right { text-align: right; }
    	</style>
    </head>
    <body>
    	<h1>Invoice</h1>
    
    	<div class="header">
    		<p><strong>Invoice Number:</strong>INV-001</p>
    		<p><strong>Date Issued:</strong>January 01, 2024</p>
    		<p><strong>Date Due:</strong>January 15, 2024</p>
    	</div>
    
    	<div class="customer-info">
    		<h2>Bill To:</h2>
    		<p>Customer Name</p>
    		<p>123 Customer Street, Denver, CO 80012</p>
    	</div>
    
    	<div class="summary-block">
    		<h2>Summary</h2>
    		<p>Details about the appointment and order...</p>
    	</div>
    
    	<table>
    		<thead>
    			<tr>
    				<th>Description</th>
    				<th>Quantity</th>
    				<th>Unit Price</th>
    				<th>Line Total</th>
    			</tr>
    		</thead>
    		<tbody>
    			<tr >
    				<td>Sample Item</td>
    				<td class="text-right">1</td>
    				<td class="text-right">10.00</td>
    				<td class="text-right">10.00</td>
    			</tr>
    		</tbody>
    	</table>
    
    	<div class="invoice-details">
    		<p class="text-right"><strong>Subtotal:</strong>0.00</p>
    		<p class="text-right"><strong>Tax:</strong>0.00</p>
    		<p class="text-right"><strong>Total Amount:</strong> <strong>$123.45</strong></p>
    	</div>
    	<div class="footer">
    		<p>Thank you for your business!</p>
    	</div>
    </body>
    </html>
    ```
    

    Doing this “context engineering” well is important. Think through the instructions, data, and tools that you’re giving an agent to work with.

    Choice #7 – What’s the right approach to accessing Cloud services?

    My agent solution sent data to Pub/Sub (addressed above), but also relied on data sitting in a PostgreSQL database. And PDF blobs sitting in Cloud Storage.

    I had at least three implementation options here for PostgreSQL and Cloud Storage:

    • Function calling. Use functions that call the Cloud APIs directly, and leverage those functions as tools.
    • Model Context Protocol (MCP). Use MCP servers that act as API proxies for the LLM to use
    • YOLO mode. Ask the LLM to figure out the right API call to make for the given service.

    The last option works (mostly), but would be an absurd choice to make in 99.98% of situations.

    The appointment agent calls the Pub/Sub API directly by using that encompassing function as a tool. For the database access, I chose MCP. The MCP Toolbox for Databases is open source and fairly simple to use. It saves me from a lot of boilerplate database access code.

    private List<BaseTool> loadMcpTools(String mcpServerUrl) {
            try {
                SseServerParameters params = SseServerParameters.builder().url(mcpServerUrl).build();
                logger.info("Initializing MCP toolset with params: {}", params);
                McpToolset.McpToolsAndToolsetResult result = McpToolset.fromServer(params, new ObjectMapper()).get();
                if (result.getTools() != null && !result.getTools().isEmpty()) {
                    logger.info("MCP tools loaded: {}", result.getTools().size());
                    return result.getTools().stream().map(mcpTool -> (BaseTool) mcpTool).collect(Collectors.toList());
                }
            } catch (Exception e) {
                logger.error("Error initializing MCP toolset", e);
            }
            return new ArrayList<>();
        }
    

    When creating the PDF and adding it to Cloud Storage, I decided to use a robust function that I passed to the agent as a tool.

    private Map<String, Object> generatePdfFromHtmlInternal(String htmlContent) throws IOException {
            if (htmlContent == null || htmlContent.trim().isEmpty()) {
                throw new IllegalArgumentException("HTML content cannot be null or empty.");
            }
    
            try (ByteArrayOutputStream baos = new ByteArrayOutputStream()) {
                ITextRenderer renderer = new ITextRenderer();
                renderer.setDocumentFromString(htmlContent);
                renderer.layout();
                renderer.createPDF(baos);
    
                String timestamp = LocalDateTime.now().format(DateTimeFormatter.ofPattern("yyyyMMddHHmmssSSS"));
                String uniquePdfFilename = OUTPUT_PDF_FILENAME.replace(".pdf", "_" + timestamp + ".pdf");
                String bucketName = properties.getGcs().getBucketName();
    
                BlobId blobId = BlobId.of(bucketName, uniquePdfFilename);
                BlobInfo blobInfo = BlobInfo.newBuilder(blobId).setContentType("application/pdf").build();
    
                storage.create(blobInfo, baos.toByteArray());
    
                String gcsPath = "gs://" + bucketName + "/" + uniquePdfFilename;
                logger.info("Successfully generated PDF and uploaded to GCS: {}", gcsPath);
                return Map.of("status", "success", "file_path", gcsPath);
    
            } catch (DocumentException e) {
                logger.error("Error during PDF document generation", e);
                throw new IOException("Error during PDF document generation: " + e.getMessage(), e);
            } catch (Exception e) {
                logger.error("Error during PDF generation or GCS upload", e);
                throw new IOException("Error during PDF generation or GCS upload: " + e.getMessage(), e);
            }
        }
    

    Choice #8 – How do I package up and run the agents?

    This choice may depend on who the agent is for (internal or external audiences), who has to support the agent, and how often you expect to update the agent.

    I chose to containerize the components so that I had maximum flexibility. I could have easily used the ADK CLI to deploy directly to Vertex AI Agent Engine—which comes with convenient features like memory management—but wanted more control than that. So I have Dockerfiles for each agent, and deploy them to Google Cloud Run. Here I get easy scale, tons of optional configurations, and I don’t pay for anything when the agent is dormant.

    In this case, I’m just treating the agent like any other type of code. You might make a different choice based on your use case.

    The final solution in action

    Let’s run this thing through. All the source code is sitting in my GitHub repo.

    I start by opening the appointment agent hosted in Cloud Run. I’m using the built-in ADK web UI to have a conversational chat with the initial agent. I mention that I might have a leaky roof and want an inspection or repair. The agent then follows its instructions. After checking the weather in the city I’m in, it retrieves appointments via the API. On the left, there’s a handy set of tools to trace events, do evals, and more.

    At this point, I chose an available appointment, and the agent followed its next set of instructions. The appointment required two pieces of info (my name and address), and wouldn’t proceed until I provided them. Once it had the data, it called the right function to make an appointment and publish a message to Pub/Sub.

    That data flowed through Google Cloud Pub/Sub, and got pushed to another agent hosted in Cloud Run.

    That agent immediately loaded up its MCP tools by calling the MCP server also hosted in Cloud Run. That server retrieved the list of offers for the city in question.

    This agent runs unattended in the background, so there’s no chat interface or interactivity. Instead, I can track progress by reading the log stream.

    When this agent got done converting the chat blob to JSON, then creating an HTML template, and calling the MCP tools to attach offers, it wrote the final PDF to Cloud Storage.

    There you go. It’s not perfect and I have improvements I want to make. Heck, the example here has the wrong date in the invoice, which didn’t happen before. So I need better instructions there. I’d like to switch the second agent from a push to a pull. It’d be fun to add some video or audio intake to the initial agent.

    Nobody knows the future, but it looks like we’ll be building more agents and fewer standalone apps. APIs matter more than ever, as do architectural decisions. Make good ones!

  • Quality-focused prompts for the vibe coding addict

    You’ve vibe coded an app? Congrats. That’s cool. Maybe you’re completely new to programming, or you’ve been doing it awhile. It’s possible you’ve seen some of the horror stories online of folks vibe coding something, deploying it, and then watching it collapse as folks flooded and hacked it. Yikes. It’s important to pay attention to quality attributes. Sometimes called non-functional requirements, these are key aspects of a well-built software system. They include traits like usability, maintainability, security, reliability, scalability, availability, and flexibility. While you might expect (hope?) that your vibe coding service created software that always adheres to these concepts, you shouldn’t assume so. And either way, these are important characteristics to understand if you want to mature as a software creator.

    You can get AI to help here, though. I used Google AI Studio and Gemini 2.5 Pro to generate some prompts you could apply after you’ve built an app using Bolt, Vercel’s v0, Lovable, or Gemini Canvas. Maybe you plug those generated prompts into the Gemini CLI after you’ve downloaded the code from your vibe coding session. Here’s my initial prompt, which you could take and adjust however you want:

    I’ve vibe coded an app, and now want to make sure it’s of high enough quality to ship to production. I’m not a professional developer. Suggest 3-5 comprehensive and understandable LLM prompts I should now offer my agentic CLI to check for software quality attributes like usability, reliability, performance, observability, security, repeatability, scalability, availability, flexibility, and maintainability.

    Below is the direct output from Gemini, including the proposed prompts that harden your software while making it more fault-tolerant and scalable.


    Here are 4 comprehensive prompts designed to be fed to your agentic CLI. They are structured to cover all the quality attributes you mentioned in a logical, step-by-step manner.

    The Strategy: From Code to Cloud

    We’ll use a layered approach. Think of it like building a house:

    1. The Blueprint & Foundation: Is the code itself clean, organized, and easy to understand? (Maintainability, Flexibility)
    2. The Walls & Locks: Is the application secure and can it handle errors gracefully? (Security, Reliability)
    3. The Engine & Plumbing: Can the app perform under load and tell you when it’s sick? (Performance, Scalability, Observability)
    4. The Assembly Line: Can you build and deploy it consistently every single time? (Repeatability, Availability, Scalability)

    Prompt 1: The Code Quality & Maintainability Audit

    Goal: This prompt focuses on cleaning up the existing “vibe code.” It makes your app easier for you (and your AI agent) to understand, modify, and extend in the future.

    Attributes Covered: Maintainability, Flexibility, Repeatability (at the code level).

    Act as a senior software engineer and mentor conducting a thorough code review of my entire project. Your goal is to improve its long-term maintainability, flexibility, and readability. Analyze the whole codebase and provide the following:

    1. **Refactoring for Clarity:** Identify the top 5 functions or classes that are too long, complex, or have unclear names ("code smells"). For each, provide a refactored version that is simpler, follows the Single Responsibility Principle, and has clearer naming.
    2. **Configuration & Secrets:** Scan the code for any hardcoded configuration values (like API endpoints, database strings, or magic numbers). Suggest extracting them into a centralized configuration file (e.g., `config.js`, `.env`, or `settings.py`) and provide the template for this file. Flag any plain-text secrets immediately.
    3. **Dependency Review:** List all external libraries and dependencies. Point out any that are deprecated, have known major issues, or could be replaced by a more standard/modern alternative.
    4. **Automated Quality Gates:** Generate a configuration file for a standard linter and code formatter for my project's language (e.g., `.eslintrc.json` and `.prettierrc` for JavaScript/TypeScript, or `pyproject.toml` for Python with Black and Ruff). This ensures future code stays clean.
    5. **Documentation:** Generate a template for a `README.md` file that includes a project description, setup instructions for a new developer, and an explanation of the core project structure.
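
    To give a feel for what item 2 (configuration and secrets) typically produces, here’s a small illustrative Go snippet of my own, not part of the generated output, that swaps hardcoded values for environment variables. The variable names are just examples.

    package main
    
    import (
    	"log"
    	"os"
    )
    
    // Config centralizes values that were previously hardcoded around the app.
    type Config struct {
    	APIBaseURL  string
    	DatabaseURL string
    	Port        string
    }
    
    func loadConfig() Config {
    	return Config{
    		APIBaseURL:  getenv("API_BASE_URL", "https://api.example.com"),
    		DatabaseURL: mustGetenv("DATABASE_URL"), // secrets and connection strings never get a default
    		Port:        getenv("PORT", "8080"),
    	}
    }
    
    // getenv returns the environment variable if set, otherwise the fallback.
    func getenv(key, fallback string) string {
    	if v := os.Getenv(key); v != "" {
    		return v
    	}
    	return fallback
    }
    
    // mustGetenv stops the app at startup if a required variable is missing.
    func mustGetenv(key string) string {
    	v := os.Getenv(key)
    	if v == "" {
    		log.Fatalf("required environment variable %s is not set", key)
    	}
    	return v
    }
    
    func main() {
    	cfg := loadConfig()
    	log.Printf("starting on port %s against %s", cfg.Port, cfg.APIBaseURL)
    }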

    Prompt 2: The Security & Reliability Hardening

    Goal: This prompt switches focus to making your app robust against attacks and resilient to failure. It assumes the code is now cleaner from Prompt 1.

    Attributes Covered: Security, Reliability, Availability.

    Act as a paranoid but helpful security and reliability engineer. Your mission is to identify and help me fix potential security vulnerabilities and sources of unreliability in my application. Analyze the entire codebase and provide a report with actionable code suggestions for the following:

    1. **Security Vulnerability Scan (OWASP Top 10):**
    * **Input Validation:** Find all points where the application accepts user input (API endpoints, forms, etc.). Check for potential injection vulnerabilities (SQL, NoSQL, Command).
    * **Cross-Site Scripting (XSS):** Check if output to the user is properly sanitized or escaped.
    * **Authentication/Authorization:** Review how users are authenticated and how their permissions are checked. Look for common flaws.
    * **Insecure Dependencies:** Scan my `package.json`, `requirements.txt`, etc., for dependencies with known security vulnerabilities (CVEs) and suggest updated, secure versions.

    2. **Error Handling & Reliability:**
    * Identify all critical code paths (e.g., database calls, external API requests, file I/O).
    * Pinpoint areas lacking proper error handling (e.g., missing `try...catch` blocks or unchecked errors).
    * For each area, suggest adding robust error handling that prevents the app from crashing and provides a clear error message or fallback.

    3. **Availability Checkpoint:**
    * Suggest creating a simple health check endpoint (e.g., `/healthz` or `/status`). This endpoint should return a `200 OK` status if the app is running and can connect to its essential services (like the database). Provide the code for this endpoint.
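
    As an illustration of that last availability checkpoint (my sketch, not something Gemini produced here), a Go version of the health check could look like this, where any dependency exposing a PingContext method, such as *sql.DB, can be wired in:

    package main
    
    import (
    	"context"
    	"log"
    	"net/http"
    	"time"
    )
    
    // pinger is any essential dependency we can cheaply check; *sql.DB satisfies it.
    type pinger interface {
    	PingContext(ctx context.Context) error
    }
    
    // healthz returns 200 OK only when every registered dependency responds in time.
    func healthz(deps ...pinger) http.HandlerFunc {
    	return func(w http.ResponseWriter, r *http.Request) {
    		ctx, cancel := context.WithTimeout(r.Context(), 2*time.Second)
    		defer cancel()
    		for _, d := range deps {
    			if err := d.PingContext(ctx); err != nil {
    				http.Error(w, "dependency unavailable", http.StatusServiceUnavailable)
    				return
    			}
    		}
    		w.Write([]byte("ok"))
    	}
    }
    
    func main() {
    	// Pass real dependencies (database handle, cache client, etc.) here.
    	http.HandleFunc("/healthz", healthz())
    	log.Fatal(http.ListenAndServe(":8080", nil))
    }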

    Prompt 3: The Performance, Scalability & Observability Tune-Up

    Goal: Now that the app is clean and secure, let’s make it fast and ensure you can see what’s happening inside it when it’s running.

    Attributes Covered: Performance, Scalability, Observability.

    Act as a Site Reliability Engineer (SRE) focused on performance and observability. Your goal is to ensure my application can handle growth and that I can diagnose problems in production. Analyze the codebase and suggest improvements in these areas:

    1. **Performance Bottlenecks:**
    * **Database Queries:** Identify any database queries performed inside loops (N+1 query problem). Suggest how to optimize them into a single, more efficient query.
    * **Heavy Computations:** Find any computationally expensive operations or inefficient algorithms that could block the main thread or slow down responses. Suggest optimizations or asynchronous execution.
    * **Data Handling:** Look for places where the app loads very large amounts of data into memory at once. Suggest using pagination, streaming, or chunking.

    2. **Observability - Logging & Metrics:**
    * **Structured Logging:** Review my current logging (or lack thereof). Propose a structured logging strategy (e.g., JSON format). Refactor 3-5 key `console.log` or `print` statements to use this new structured logger, including important context like user ID or request ID.
    * **Key Metrics:** Identify the 3 most important metrics for my application's health (e.g., API request latency, error rate, number of active users). Show me where and how to instrument the code to capture these metrics, even if it's just with a logging statement for now.

    3. **Scalability Review:**
    * Identify anything that would prevent me from running multiple instances of this application (horizontal scaling). This usually involves checking for in-memory state that should be moved to a shared store like a database or Redis (e.g., session stores, caches, locks).
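
    And to ground the structured-logging suggestion from item 2 above (again, my example rather than Gemini’s output), here’s a tiny Go program that uses the standard library’s log/slog package to emit JSON log records with useful request context attached:

    package main
    
    import (
    	"log/slog"
    	"net/http"
    	"os"
    )
    
    func main() {
    	// The JSON handler emits one structured record per line, ready for a log aggregator.
    	logger := slog.New(slog.NewJSONHandler(os.Stdout, nil))
    
    	http.HandleFunc("/orders", func(w http.ResponseWriter, r *http.Request) {
    		// Instead of fmt.Println("got order"), attach the context an SRE would want.
    		logger.Info("order received",
    			"request_id", r.Header.Get("X-Request-Id"),
    			"path", r.URL.Path,
    			"user_agent", r.UserAgent(),
    		)
    		w.WriteHeader(http.StatusAccepted)
    	})
    
    	logger.Info("server starting", "port", 8080)
    	logger.Error("server stopped", "err", http.ListenAndServe(":8080", nil))
    }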

    Bonus Prompt 4: The Production-Ready Deployment Blueprint

    Goal: This prompt bridges the gap between your laptop and a real server. It makes your deployment process automatic, reliable, and repeatable.

    Attributes Covered: Repeatability, Availability, Scalability.

    Act as a DevOps specialist. My application is now clean, secure, and performant, but I need a professional and repeatable way to deploy it. Based on my project's language and framework, generate the necessary files and configurations to containerize it and set up a basic deployment pipeline.

    1. **Containerization:** Create a multi-stage `Dockerfile`. The first stage should build the application (installing dev dependencies and compiling assets), and the final stage should be a lean, production-optimized image containing only the necessary runtime files and dependencies. Add comments explaining each step.
    2. **Local Development Environment:** Create a `docker-compose.yml` file. This file should define my application's service and any other services it depends on (like a database or a cache). This will allow any developer to spin up the entire stack with a single command (`docker-compose up`).
    3. **Configuration Management:** Create a `.env.example` file that lists all the environment variables the application needs to run (from Prompt 1), with placeholder values. This file serves as a template for production, staging, and development environments.
    4. **Continuous Integration (CI):** Generate a basic CI pipeline configuration file for GitHub Actions (`.github/workflows/ci.yml`). This pipeline should trigger on every push to the main branch and automatically:
    * Install dependencies.
    * Run the linter and code formatter checks (from Prompt 1).
    * (If you have them) Run any automated tests.
    * Build the Docker image to ensure it works.

    By running your app through these four prompts, you’ll have systematically addressed the most critical software quality attributes, transforming your “vibe-coded” project into something you can be confident shipping to production.


    Is this fun to do? I dunno. Maybe not as much fun as watching an agentic service crafting software right before your eyes. But it’s the right thing to do.

  • Here’s what AI-native engineers are doing differently than you

    The “what” and the “how” in software engineering occasionally change at the same time. Often, one triggers the other. The introduction of mainframes ushered in batch practices that capitalized on the scarcity of computing power. As the Internet took off, developers needed to quickly update their apps and Agile took hold. Mobile computing and cloud computing happened, and DevOps emerged shortly thereafter. Our current moment seems different as the new “what” and “how” are happening simultaneously, but independently. The “what” that’s hot right now is AI-driven apps. Today’s fast-developing “how” is AI-native software engineering. I’m seeing all sorts of teams adopt AI to change how they work. What are they doing that you’re not?

    AI natives always start (or end) with AI. The team at Pulley says “the typical workflow involves giving the task to an AI model first (via Cursor or a CLI program) to see how it performs, with the understanding that plenty of tasks are still hit or miss.” Studying a domain or competitor? Start with Gemini Deep Research or another AI research service. Find yourself stuck in an endless debate over some aspect of design? While you argued, the AI natives built three prototypes with AI to prove out the idea. Googlers are using it to build slides, debug production incidents, and much more. You might say “but I used an LLM before and it hallucinated while generating code with errors in it.” Stop it, so do you. Update your toolchain! Anybody seriously coding with AI today is using agents. Hallucinations are mostly a solved problem with proper context engineering and agentic loops. This doesn’t mean we become intellectually lazy. Learn to code, be an expert, and stay in charge. But it’s about regularly bringing AI in at the right time to make an impact.

    AI natives switched to spec-driven development. It’s not about code-first. Heck, we’re practically hiding the code! Modern software engineers are creating (or asking AI for) implementation plans first. My GM at Google Keith Ballinger says he starts projects by “ask[ing] the tool to create a technical design (and save to a file like arch.md) and an implementation plan (saved to tasks.md).” Former Googler Brian Grant wrote a piece where he explained creating 8,000-character instructions that steered the agent towards the goal. Those folks at Pulley say that they find themselves “thinking less about writing code and more about writing specifications – translating the ideas in my head into clear, repeatable instructions for the AI.” These design specs have massive follow-on value. Maybe the spec is used to generate the requirements doc. Or the first round of product documentation. It might produce the deployment manifest, marketing message, and training deck for the sales field. Today’s best engineers are great at documenting intent that, in turn, spawns the technical solution.

    AI natives have different engineer and team responsibilities. With AI agents, you orchestrate. You remain responsible for every commit into main, but focus more on defining and “assigning” the work to get there. Legitimate work is directed to background agents like Jules. Or give the Gemini CLI the task of chewing through an analysis or starting a code migration project. Either way, build lots of the right tools and empower your agents with them. Every engineer is a manager now. And the engineer needs to intentionally shape the codebase so that it’s easier for the AI to work with. That means rule files (e.g. GEMINI.md), good READMEs, and such. This puts the engineer into the role of supervisor, mentor, and validator. AI-first teams are smaller, able to accomplish more, capable of compressing steps of the SDLC and delivering better quality, faster. AI-native teams have “almost eliminated engineering effort as the current bottleneck to shipping product.”

    There are many implications for all this. Quality is still paramount. Don’t create slop. But achieving the throughput, breadth, and quality your customers demand requires a leap forward in your approach. AI is overhyped and under-hyped at the same time, and it’s foolish to see AI as the solution to everything. But it’s an objectively valuable part of a new approach. Many teams have already made the shift and have learned to continuously evaluate and incorporate new AI-first approaches. It’s awesome! If you’re ignoring AI entirely, you’re not some heroic code artisan; you’re just being unnecessarily stubborn and falling behind. Get uncomfortable, reassess how you work, and follow the lead of some AI-native pioneers blazing the trail.

  • The Gemini CLI might change how I work. Here are four prompts that prove it.

    Yesterday morning, we took the wraps off one of the most interesting Google releases of 2025. The Gemini CLI is here, giving you nearly unlimited access to Gemini from directly within the terminal. This is a new space, but there are other great solutions already out there. Why is this different? Yes, it’s good at multi-step reasoning, code generation, and creative tasks. Build apps, fix code, parse images, build slides, analyze content, or whatever. But what’s truly unique is that it’s fully open source, no cost to use, usable anywhere, and super extensible. Use Gemini 2.5 Pro’s massive context window (1M tokens), multimodality, and strong reasoning ability to do some amazing stuff.

    Requirements? Have Node installed, and a Google account. That’s it. You get lots of free queries against our best models. You get more by being a cloud customer if you need it. Let’s have a quick look around, and then I’ll show you four prompts that demonstrate what it can really do.

    The slash command shows me what’s available here. I can see and resume previous chats, configure the editor environment, leverage memory via context files like GEMINI.md, change the theme, and use tools. Choosing that option shows us the available tools such as reading files and folders, finding files and folders, performing Google searches, running Shell commands, and more.

    The Gemini CLI has many extensibility points, including use of MCP servers. I added the Cloud Run MCP server but you can add anything here.

    I’m only scratching the surface here, so don’t forget to check out the official repo, docs, and blog post announcement. But now, let’s walk through four prompts that you can repeat to experience the power of the Gemini CLI, and why each is a big deal.

    Prompt #1 – Do some research.

    Software engineering is more than coding. You spend time researching, planning, and thinking. I want to build a new app, but I’m not sure which frontend framework I should use. And I don’t want stale answers from an LLM that was trained a year ago.

    I’ve got a new research report on JavaScript frameworks, and also want to factor in web results. My prompt:

    What JavaScript framework should I use to build my frontend app? I want something simple, standards-friendly, and popular. Use @report.pdf for some context, but also do a web search. Summarize the results in a way that will help me decide.

    The Gemini CLI figured out which tools to use, pulled the file into the prompt, and started off on its work searching the web and preparing results.

    The results were solid. I got tradeoffs and analysis for three viable options. The summary was helpful and I could have continued going back and forth on clarifying questions. For architects, team leaders, and engineers, having a research partner in the terminal is powerful.

    Why was this a big deal? This prompt showed the use of live Google Search, local (binary) file processing, and in-context learning for devs. These tools are changing how I do quick research.

    Prompt #2 – Build an app.

    These tools will absolutely change how folks build, fix, change, and modernize software. Let’s build something new.

    I fed in this prompt, based on my new understanding of relevant JavaScript frameworks.

    Let’s build a calendar app for my family to plan a vacation together. It should let us vote on weeks that work best, and then nominate activities for each day. Use Vue.js for the JavaScript framework.

    Now to be sure, we didn’t build this to be excellent at one-shot results. Instead, it’s purposely built for an interactive back-and-forth with the software developer. You can start it with --yolo mode to have it automatically proceed without asking permission to do things, and even with --b to run it headless assuming no interactivity. But I want to stay in control here. So I’m not in YOLO mode.

    I quickly got back a plan, and was asked if I wanted to proceed.

    Gemini CLI also asks me about running Shell commands. I can allow it once, allow it always, or cancel. I like these options. It’s fun watching Gemini make decisions and narrate what it’s working on. Once it’s done building directories, writing code, and evaluating its results, the CLI even starts up a server so that I can test the application. The first draft was functional, but not attractive, so I asked for a cleanup.

    The next result was solid, and I could have continued iterating on new features along with look and feel.

    Why was this a big deal? This prompt showed iterative code development, important security (request permission) features, and more. We’ll also frequently offer to pop you into the IDE for further coding. This will change how I understand or bootstrap most of the code I work with.

    Prompt #3 – Do a quick deploy to the cloud.

    I’m terrible at remembering the syntax and flags for various CLI tools. The right git command or Google Cloud CLI request? Just hopeless. The Gemini CLI is my solution. I can ask for what I want, and the Gemini CLI figures out the right type of request to make.

    We added MCP as a first-class citizen, so I added the Cloud Run MCP server, as mentioned above. I also made this work without it, as the Gemini CLI figured out the right way to directly call the Google Cloud CLI (gcloud) to deploy my app. But MCP servers provide more structure and ensure consistent implementation. Here’s the prompt I used to get this app deployed. Vibe deployment, FTW.

    Ship this code to Cloud Run in us-west1 using my seroter-project-base project. Don’t create a Dockerfile or container, but just deploy the source files.

    The Gemini CLI immediately recognizes that a known MCP tool can help, and shows me the tool it chose.

    It got going, and shipped my code successfully to Cloud Run using the MCP server. But the app didn’t start correctly. The Gemini CLI noticed that by reading the service logs, and diagnosed the issue: we hadn’t specified which port to listen on. No problem.

    It came up with a fix, made the code changes, and redeployed.

    Why was this a big deal? We saw the extensibility of MCP servers, and the ability to “forget” some details of exactly how other tools and CLIs work. Plus we observed that the Gemini CLI did some smart reasoning and resolved issues on its own. This is going to change how I deploy, and how much time I spend (waste?) deploying.

    Prompt #4 – Do responsible CI/CD to the cloud.

    The third prompt was cool and showed how you can quickly deploy to a cloud target, even without knowing the exact syntax to make it happen. I got it working with Kubernetes too. But can the Gemini CLI help me do proper CI/CD, even if I don’t know exactly how to do it? In this case I do know how to set up Google Cloud Build and Cloud Deploy, but let’s pretend I don’t. Here’s the prompt.

    Create a Cloud Build file that would build a container out of this app code and store it in Artifact Registry. Then create the necessary Cloud Deploy files that defines a dev and production environment in Cloud Run. Create the Cloud Deploy pipeline, and then reference it in the Cloud Build file so that the deploy happens when a build succeeds. And then go ahead trigger the Cloud Build. Pay very careful attention for how to create the correct files and syntax needed for targeting Cloud Run from Cloud Deploy.

    The Gemini CLI started by asking me for some info from my Google Cloud account (project name, target region) and then created YAML files for Cloud Build and Cloud Deploy. It also put together a CLI command to instantiate a Docker repo in Artifact Registry. Now, I know that the setup for Cloud Deploy working with Cloud Run has some specific syntax and formatting. Even with my above command, I can see that I didn’t get syntactically correct YAML in the skaffold file.

    I rejected the Gemini CLI’s request to do a deployment, since I knew it would fail. Then I gave it the docs URL for setting up Cloud Run with Cloud Deploy and asked it to make a correction.

    That Skaffold file doesn’t look correct. Take a look at the docs (https://cloud.google.com/deploy/docs/deploy-app-run), and follow its guidance for setting up the service YAML files, and referencing the right Skaffold version at the top. Show me the result before pushing a change to the Cloud Deploy pipeline.

    Fortunately, the Gemini CLI can do a web fetch and process the latest product documentation. I did a couple of turns and got what I wanted. Then I asked it to go ahead and update the pipeline and trigger Cloud Build.

    It failed at first because I didn’t have a Dockerfile, but after realizing that, it automatically created one and started the build again.

    It took a few iterations of failed builds for the Gemini CLI to land on the right syntax. But it kept dutifully trying, making changes, and redeploying until it got it right. Just like I would have if I were doing it myself!

    After that back and forth a few times, I had all the right files, syntax, container artifacts, and pipelines going.

    Some of my experiments went faster than others, but that’s the nature of these tools, and I still did this faster overall than I would have manually.

    Why was this a big deal? This showcased some sophisticated file creation, iterative improvements, and Gemini CLI’s direct usage of the Google Cloud CLI to package, deploy, and observe running systems in a production-like way. It’ll change how confident I am doing more complex operations.

    Background agents, orchestrated agents, conversational AI. All of these will play a part in how we design, build, deploy, and operate software. What does that mean to your team, your systems, and your expectations? We’re about to find out.

  • From code to cloud: Check out six new integrations that make it easier to host your apps and models on Cloud Run

    Where you decide to run your web app is often a late-binding choice. Once you’ve finished coding something you like and done some localhost testing, you seek out a reasonable place that gives you a public IP address. Developers have no shortage of runtime host options, including hyperscalers, rented VMs from cheap regional providers, or targeted services from the likes of Firebase, Cloudflare, Vercel, Netlify, Fly.io, and a dozen others. I’m an unapologetic fanboy of Google Cloud Run—host scale-to-zero apps, functions, and jobs that offer huge resource configurations, concurrent calls, GPUs, and durable volumes with a generous free tier and straightforward pricing—and we just took the wraps off a handful of new ways to take a pile of code and turn it into a cloud endpoint.

    Vibe-code a web app in Google AI Studio and one-click deploy to Cloud Run

    Google AI Studio is really remarkable. Build text prompts against our leading models, generate media with Gemini models, and even build apps. All at no cost. We just turned on the ability to do simple text-to-app scenarios, and added a button that deploys your app to Cloud Run.

    First, I went to the “Build” pane and added a text prompt for my new app. I wanted a motivational quote printed on top of an image of an AI generated dog.

    In one shot, I got the complete app including the correct backend AI calls to Gemini models for creating the motivational quote and generating a dog pic. So cool.

    Time to ship it. There’s a rocket ship icon in the top right. Assuming you’ve connected Google AI Studio to a Google Cloud account, you’re able to pick a project and one-click deploy.

    It takes just a few seconds, and you get back the URL and a deep link to the app in Google Cloud.

    Clicking that link shows that this is a standard Cloud Run instance, with the Gemini key helpfully added as an environment variable (versus hard coded!).

    And of course, viewing the associated link takes me to my app that gives me simple motivation and happy dogs.

    That’s such a simple development loop!

    Create a .NET app in tools like Cursor and deploy it using the Cloud Run MCP server

    Let’s say you’re using one of the MANY agentic development tools that make it simpler to code with AI assistance. Lots of you like Cursor. It supports MCP as a way to reach into other systems via tools.

    We just shipped a Cloud Run MCP server, so you can make tools like Cursor aware of Cloud Run and support straightforward deployments.

    I started in Cursor and asked it to build a simple REST API and picked Gemini 2.5 Pro as my preferred model. Cursor does most (all?) of the coding work for you if you want it to.

    It went through a few iterations to land on the right code. I tested it locally to ensure the app would run.

    Cursor has native support for MCP. I added a .cursor directory to my project and dropped an mcp.json file in there. Cursor picked up the MCP entry, validated it, and showed me the available tools.

    I asked Cursor to deploy my C# app. It explored the local folder and files to ensure it had what it needed.

    Cursor realized it had a tool that could help, and proposed the “deploy_local_folder” tool from the Cloud Run MCP server.

    After providing some requested values (location, etc), Cursor successfully deployed my .NET app.

    That was easy. And this Cloud Run MCP server will work with any of your tools that understand MCP.

    Push an open model from Google AI Studio directly to Cloud Run

    Want to deploy a model to Cloud Run? It’s the only serverless platform I know of that offers GPUs. You can use tools like Ollama to deploy any open model to Cloud Run, and I like that we made it even easier for Gemma fans. To see this integration, you pick from the various Gemma 3 editions in Google AI Studio.

    Once you’ve done that, you’ll see a new icon that triggers a deployment directly to Cloud Run. Within minutes, you have an elastic endpoint providing inference.

    It’s not hard to deploy open models to Cloud Run. This option makes it that much easier.

    Deploy a Python agent built with the Agent Development Kit to Cloud Run with one command

    The Agent Development Kit is an open source framework and toolset that devs use to build robust AI agents. The Python version reached 1.0 yesterday, and we launched a new Java version too. Here, I started with a Python agent I built.

    Built into ADK are a few deployment options. It’s just code, so you can run it anywhere. But we’ve added shortcuts to services like Google Cloud’s Vertex AI Agent Engine and Cloud Run. Just one command puts my agent onto Cloud Run!

    We don’t yet have this CLI deployment option for the Java ADK. But it’s also simple to deploy a Java app or agent to Cloud Run with a single Google Cloud CLI command.

    Services like Cloud Run are ideal for your agents and AI apps. These built-in integrations for ADK help you get these agents online quickly.

    Use a Gradio instance in Cloud Run to experiment with prompts after one click from Vertex AI Studio

    How do you collaborate or share prompts with teammates? Maybe you’re using something like Google Cloud Vertex AI to iterate on a prompt yourself. Here, I wrote system instructions and a prompt for helping me prioritize my work items.

    Now, I can click “deploy an app” and get a Gradio instance for experimenting further with my app.

    This has public access by default, so I’ve got to give the ok.

    After a few moments, I have a running Cloud Run app! I’m shown this directly from Vertex AI and have a link to open the app.

    That link brings me to this Gradio instance that I can share with teammates.

    The scalable and accessible Cloud Run is ideal for spontaneous exploration of things like AI prompts. I like this integration!

    Ship your backend Java code to Cloud Run directly from Firebase Studio

    Our final example looks at Firebase Studio. Have you tried this yet? It’s a free-to-use, full-stack dev environment in the cloud for nearly any type of app. And it supports text-to-app scenarios if you don’t want to do much coding yourself. There are dozens of templates, including one for Java.

    I spun up a Java dev environment to build a web service.

    This IDE will look familiar. Bring in your favorite extensions, and we’ve also pre-loaded this with Gemini assistance, local testing tools, and more. See here that I used Gemini to add a new REST endpoint to my Java API.

    Here on the left is an option to deploy to Cloud Run!

    After authenticating to my cloud account and picking my cloud project, I could deploy. After a few moments, I had another running app in Cloud Run, and had a route to make continuous updates.

    Wow. That’s a lot of ways to go from code to cloud. Cloud Run is terrific for frontend or backend components, functions or apps, open source or commercial products. Try one of these integrations and tell me what you think!

  • What does a modern, AI-assisted developer workflow built around Google Gemini look like? Let’s explore.

    Software is never going to be the same. Why would we go back to laborious research efforts, wasting time writing boilerplate code, and accepting so many interruptions to our flow state? Hard pass. It might not happen for you tomorrow, next month, or next year, but AI will absolutely improve your developer workflow.

    Your AI-powered workflow may make use of more than one LLM. Go for it. But we’ve done a good job of putting Gemini into nearly every stage of the new way of working. Let’s look at what you can do RIGHT NOW to build with Gemini.

    Build knowledge, plans, and prototypes with Gemini

    Are you still starting your learning efforts with a Google search? Amateurs 🙂 I mean, keep doing those so that we earn ad dollars. But you’ve got so many new ways to augment a basic search.

    Gemini Deep Research is pretty amazing. Part of Gemini Advanced, it takes your query, searches the web on your behalf, and gives you a summary in minutes. Here I asked for help understanding the landscape of PostgreSQL providers, and it recapped results found in 240+ relevant websites from vendors, Reddit, analysts, and more.

    Gemini Deep Research creating a report about the PostgreSQL landscape

    You’ve probably heard of NotebookLM. Built with Gemini 2.0, it takes all sorts of digital content and helps you make sense of it. Including those hyper-realistic podcasts (“Audio Overviews”).

    Planning your work or starting to flesh out a prototype? For free, Google AI Studio lets you interact with the latest Gemini models. Generate text, audio, or images from prompts. Produce complex codebases based on reference images or text prompts. Share your desktop and get live assistance on whatever task you’re doing. It’s pretty rad.

    Google AI Studio’s Live API makes it possible to interact live with the model

    Google Cloud customers can get knowledge from Gemini in a few ways. The chat for Gemini Cloud Assist gives me an ever-present agent that can help answer questions or help me explore options. Here, I asked for a summary of the options for running PostgreSQL in Google Cloud. It breaks the response down by fully-managed, self-managed, and options for migration.

    Chat for Gemini Code Assist teaches me about PostgreSQL options

    Gemini for Google Cloud blends AI-assistance into many different services. One way to use this is to understand existing SQL scripts, workflows, APIs, and more.

    Gemini in BigQuery explains an existing query and helps me learn about it

    Trying to plan out your next bit of work? Google AI Studio or Vertex AI Studio can assist here too. In either service, you can pass in your backlog of features and bugs, maybe an architecture diagram or two, and even some reference PDFs, and ask for help planning out the next sprint. Pretty good!

    Vertex AI Studio “thinking” through a sprint plan based on multi-modal input

    Build apps and agents with Gemini

    We can use Google AI Studio or Vertex AI Studio to learn things and craft plans, but now let’s look at how you’d actually build apps with Gemini.

    You can work with the raw Gemini API. There are SDK libraries for Python, Node, Go, Dart, Swift, and Android. If you’re working with Gemini 2.0 and beyond, there’s a new unified SDK that works with both the Developer API and Enterprise API (Vertex). It’s fairly easy to use. I wrote a Google Cloud Function that uses the unified Gemini API to generate dinner recipes for whatever ingredients you pass in.

    package function
    
    import (
    	"context"
    	"encoding/json"
    	"fmt"
    	"log"
    	"net/http"
    	"os"
    
    	"github.com/GoogleCloudPlatform/functions-framework-go/functions"
    	"google.golang.org/genai"
    )
    
    func init() {
    	functions.HTTP("GenerateRecipe", generateRecipe)
    }
    
    func generateRecipe(w http.ResponseWriter, r *http.Request) {
    	ctx := context.Background()
    	ingredients := r.URL.Query().Get("ingredients")
    
    	if ingredients == "" {
    		http.Error(w, "Please provide ingredients in the query string, like this: ?ingredients=pork, cheese, tortilla", http.StatusBadRequest)
    		return
    	}
    
    	projectID := os.Getenv("PROJECT_ID")
    	if projectID == "" {
    		projectID = "default" // Provide a default, but encourage configuration
    	}
    
    	location := os.Getenv("LOCATION")
    	if location == "" {
    		location = "us-central1" // Provide a default, but encourage configuration
    	}
    
    	client, err := genai.NewClient(ctx, &genai.ClientConfig{
    		Project:  projectID,
    		Location: location,
    		Backend:  genai.BackendVertexAI,
    	})
    	if err != nil {
    		log.Printf("error creating client: %v", err)
    		http.Error(w, "Failed to create Gemini client", http.StatusInternalServerError)
    		return
    	}
    
    	prompt := fmt.Sprintf("Given these ingredients: %s, generate a recipe.", ingredients)
    	result, err := client.Models.GenerateContent(ctx, "gemini-2.0-flash-exp", genai.Text(prompt), nil)
    	if err != nil {
    		log.Printf("error generating content: %v", err)
    		http.Error(w, "Failed to generate recipe", http.StatusServiceUnavailable)
    		return
    	}
    
    	if len(result.Candidates) == 0 {
    		http.Error(w, "No recipes found", http.StatusNotFound) // Or another appropriate status
    		return
    	}
    
    	recipe := result.Candidates[0].Content.Parts[0].Text // Extract the generated recipe text
    
    	response, err := json.Marshal(map[string]string{"recipe": recipe})
    	if err != nil {
    		log.Printf("error marshalling response: %v", err)
    		http.Error(w, "Failed to format response", http.StatusInternalServerError)
    		return
    	}
    
    	w.Header().Set("Content-Type", "application/json")
    	w.Write(response)
    }
    

    There are a lot of agent frameworks out there right now. A LOT. Many of them have good Gemini support. You can build agents with Gemini using LangChain, LangChain4J, LlamaIndex, Spring AI, Firebase Genkit, and the Vercel AI SDK.

    What coding tools can I use with Gemini? GitHub Copilot now supports Gemini models. Folks who love Cursor can choose Gemini as their underlying model. Same goes for fans of Sourcegraph Cody. Gemini Code Assist from Google Cloud puts AI-assisted tools into Visual Studio Code and the JetBrains IDEs. Get the power of Gemini’s long context, personalization on your own codebase, and now the use of tools to pull data from Atlassian, GitHub, and more. Use Gemini Code Assist within your local IDE, or in hosted environments like Cloud Workstations or Cloud Shell Editor.

    Gemini Code Assist brings AI assistance to your dev workspace, including the use of tools

    Project IDX is another Google-provided dev experience for building with Gemini. Use it for free, and build AI apps, with AI tools. It’s pretty great for frontend or backend apps.

    Project IDX lets you build AI apps with AI tools

    Maybe you’re building apps and agents with Gemini through low-code or declarative tools? There’s the Vertex AI Agent Builder. This Google Cloud service makes it fairly simple to create search agents, conversational agents, recommendation agents, and more. No coding needed!

    Conversational agents in the Vertex AI Agent Builder

    Another option for building with Gemini is the declarative Cloud Workflows service. I built a workflow that calls Gemini through Vertex AI and summarizes any provided document.

    # Summarize a doc with Gemini
    main:
      params: [args]
      steps:
      - init:
          assign:
            - doc_url: ${args.doc_url}
            - project_id: ${args.project_id}
            - location: ${args.location}
            - model: ${args.model_name}
            - desired_tone: ${args.desired_tone}
            - instructions: "" # set by the next step based on desired_tone
      - set_instructions:
          switch:
            - condition: ${desired_tone == ""}
              assign:
                - instructions: "Deliver a professional summary with simple language."
              next: call_gemini
            - condition: ${desired_tone == "terse"}
              assign:
                - instructions: "Deliver a short professional summary with the fewest words necessary."
              next: call_gemini
            - condition: ${desired_tone == "excited"}
              assign:
                - instructions: "Deliver a complete, enthusiastic summary of the document."
              next: call_gemini
      - call_gemini:
          call: googleapis.aiplatform.v1.projects.locations.endpoints.generateContent
          args:
            model: ${"projects/" + project_id + "/locations/" + location + "/publishers/google/models/" + model}
            region: ${location}
            body:
              contents:
                role: user
                parts:
                  - text: "summarize this document"
                  - fileData: 
                      fileUri: ${doc_url}
                      mimeType: "application/pdf"
              systemInstruction: 
                role: user
                parts:
                  - text: ${instructions}
              generation_config:  # optional
                temperature: 0.2
                maxOutputTokens: 2000
                topK: 10
                topP: 0.9
          result: gemini_response
      - returnStep:
          return: ${gemini_response.candidates[0].content.parts[0].text}
    

    Similarly, its sophisticated big-brother, Application Integration, can also interact with Gemini through drag-and-drop integration workflows. These sorts of workflow tools help you bake Gemini predictions into all sorts of existing processes.

    Google Cloud Application Integration calls Gemini models

    After you build apps and agents, you need a place to host them! In Google Cloud, you could run in a virtual machine (GCE), a Kubernetes cluster (GKE), or a serverless runtime (Cloud Run). There’s also the powerful Firebase App Hosting for these AI apps.

    There are also two other services to consider. For RAG apps, we now offer the Vertex AI RAG Engine. I like this because you get a fully managed experience for ingesting docs, storing in a vector database, and performing retrieval. Doing LangChain? LangChain on Vertex AI offers a handy managed environment for running agents and calling tools.

    Build AI and data systems with Gemini

    In addition to building straight-up agents or apps, you might build backend data or AI systems with Gemini.

    If you’re doing streaming analytics or real-time ETL with Dataflow, you can build ML pipelines, generate embeddings, and even invoke Gemini endpoints for inference. Maybe you’re doing data analytics with frameworks like Apache Spark, Hadoop, or Apache Flink. Dataproc is a great service that you can use within Vertex AI, or to run all sorts of data workflows. I’m fairly sure you know what Colab is, as millions of folks per month use it for building notebooks. Colab and Colab Enterprise offer two great ways to build data solutions with Gemini.

    Let’s talk about building with Gemini inside your database. From Google Cloud SQL, Cloud Spanner, and AlloyDB, you can create “remote models” that let you interact with Gemini from within your SQL queries. Very cool and useful. BigQuery also makes it possible to work directly with Gemini from my SQL query. Let me show you.

    I made a dataset from the public “release notes” dataset from Google Cloud. Then I made a reference to the Gemini 2.0 Flash model and asked Gemini for a summary of a product’s release notes from the past month.

    -- create the remote model
    CREATE OR REPLACE MODEL
    `[project].public_dataset.gemini_2_flash`
    REMOTE WITH CONNECTION `projects/[project]/locations/us/connections/gemini-connection`
    OPTIONS (ENDPOINT = 'gemini-2.0-flash-exp');
    
    -- query an aggregation of responses to get a monthly product summary
    SELECT * 
    FROM
     ML.GENERATE_TEXT(
        MODEL `[project].public_dataset.gemini_2_flash`,
        (
          SELECT CONCAT('Summarize this month of product announcements by rolling up the key info', monthly_summary) AS prompt
          FROM (
            SELECT STRING_AGG(description, '; ') AS monthly_summary
            FROM `bigquery-public-data`.`google_cloud_release_notes`.`release_notes` 
            WHERE product_name = 'AlloyDB' AND DATE(published_at) BETWEEN '2024-12-01' AND '2024-12-31'
          )
        ),
        STRUCT(
          .05 AS TEMPERATURE,
          TRUE AS flatten_json_output)
        )
    

    How wild is that? Love it.

    You can also build with Gemini in Looker. Build reports, visualizations, and use natural language to explore data. See here for more.

    And of course, Vertex AI helps you build with Gemini. Build prompts, fine-tune models, manage experiments, make batch predictions, and lots more. If you’re working with AI models like Gemini, you should give Vertex AI a look.

    Build a better day-2 experience with Gemini

    It’s not just about building software with Gemini. The AI-driven product workflow extends to post-release activities.

    Have to set up least-privilege permissions for service accounts? Build the right permission profile with Gemini.

    The “Help me choose roles” feature uses Gemini to figure out the right permissions

    Something goes wrong. You need to get back to good. You can build faster resolution plans with Gemini. Google Cloud Logging supports log summarization with Gemini.

    Google Cloud Logging supports log summarization with Gemini

    Ideally, you know when something goes wrong before your customers notice. Synthetic monitors are one way to solve that. We made it easy to build synthetic monitors with Gemini using natural language.

    “Help me code” option for creating synthetic monitors in Cloud Monitoring

    You don’t want to face security issues on day-2, but it happens. Gemini is part of Security Command Center where you can build search queries and summarize cases.

    Gemini can also help you build billing reports. I like this experience where I can use natural language to get answers about my spend in Cloud Billing.

    Gemini in Cloud Billing makes it easier to understand your spend

    Build supporting digital assets with Gemini

    The developer workflow isn’t just about code artifacts. Sometimes you create supporting assets for design docs, production runbooks, team presentations, and more.

    Use the Gemini app (or our other AI surfaces) to generate images. I do this all the time now!

    Image for use in a presentation is generated by Gemini

    Building slides? Writing docs? Creating spreadsheets? Gemini for Workspace gives you some help here. I use this on occasion to refine text, generate slides or images, and update tables.

    Gemini in Google Docs helps me write documents

    Maybe you’re getting bored with static image representations and want some more videos in your life? Veo 2 is frankly remarkable and might be a new tool for your presentation toolbox. Consider a case where you’re building a mobile app that helps people share cars. Maybe produce a quick video to embed in the design pitch.

    Veo 2 generating videos for use in a developer’s design pitch

    AI disrupts the traditional product development workflow. Good! Gemini is part of each stage of the new workflow, and it’s only going to get better. Consider introducing one or many of these experiences to your own way of working in 2025.

  • 8 ways AI will change how I work in 2025

    You don’t have to use generative AI. It’s possible to avoid it and continue doing whatever you’ve been doing, the way you’ve been doing it. I don’t believe that sentence will be true in twelve months. Not because you’ll have to use it—although in some cases it may be unavoidable—but because you’ll want to use it. I thought about how my work will change next year.

    #1. I’ll start most efforts by asking “can AI help with this?”

    Do I need to understand a new market or product area? Analyze a pile of data? Schedule a complex series of meetings? Quickly generate a sample app for a customer demo? Review a blog post a teammate wrote? In most cases, AI can give me an assist. I want to change my mental model to first figure out if there’s a smarter (AI-assisted) way to do something.

    That said, it’s about “can AI help me” versus “can AI do all my work.” I don’t want to end up in this situation.

    #2. I’m going to do much better research.

    Whether planning a strategy or a vacation, there’s a lot of time spent researching. That’s ok, as you often uncover intriguing new tangents while exploring the internet.

    AI can still improve the process. A lot. I find myself using the Gemini app, Google AI Studio, and NotebookLM to understand complex ideas. Gemini Deep Research is almost unbelievable. Give it a prompt, it scours the web for dozens or hundreds of sources, and then compiles a report.

    What an amazing way to start or validate research efforts. Have an existing pile of content—might be annual reports, whitepapers, design docs, or academic material—that you need to make sense of? NotebookLM is pretty amazing, and should change how all of us ask questions of research material.

    #3. I will learn new things faster.

    Many of us have jobs where we need to quickly get up to speed on a topic. I want help in context, so that I stay in a flow state.

    Back to NotebookLM, I might use this to get easier-to-digest audio overviews of complex new ideas.

    And then with coding assistance tools, I also am getting more and more comfortable staying in my IDE to get help on things I don’t yet know. Here, my Gemini Code Assist extension is helping me learn how to fix my poorly-secured Java code.

    Finally, I’m quite intrigued by how the new Gemini 2.0 Multimodal Live API will help me in the moment. By sharing my screen with the model, I can get realtime help into whatever I’m struggling with. Wow.

    #4. I’ll spend less time debating and more time coding.

    My day job is to lead a sizable team at Google Cloud and help everyone do their best work. I still like to code, though!

    It’s already happening, but next year I expect to code more than in years past. Why? Because AI is making it easier and more fun. Whether using an IDE assistant, or a completely different type of IDE like Cursor, it’s never been simpler to build legit software. We all can go from idea to reality so quickly now.

    Stop endlessly debating ideas, and just test them out quickly! Using low-code platforms or AI-assisted coding tools, you can get working prototypes in no time.

    #5. I will ask better questions.

    I’ve slowly learned that the best leaders simply ask better questions. AI can help us a few ways here. First, there are “thinking” models that show you a chain of thought that might inspire your own questions.

    LLMs are awesome at giving answers, but they’re also pretty great at crafting questions. Look at this. I uploaded a set of (fake) product bugs and asked the Gemini model to help me come up with clarifying questions to ask the engineers. Good list!

    And how about this. Google Cloud BigQuery has an excellent feature called Data Insights which generates a bunch of candidate questions for a given dataset (here, the Google Cloud Release Notes). What a great way to get some smart, starter questions to consider!

    #6. I want to identify where the manual struggle is actually the point.

    I don’t want AI to do everything for me. There are cases where the human struggle is where the enjoyment comes from. Learning how to do something. Fumbling with techniques. Building up knowledge or strength. I don’t want a shortcut. I want deep learning.

    I’m going to keep doing my daily reading list by hand. No automation allowed, as it forces me to really get a deeper grasp on what’s going on in our industry. I’m not using AI to write newsletters, as I want to keep working on the writing craft myself.

    This mass integration of AI into services and experiences is great. It also forces us to stop and decide where we intentionally want to avoid it!

    #7. I should create certain types of content much faster.

    There’s no excuse to labor over document templates or images in presentations anymore. No more scouring the web for the perfect picture.

    I use Gemini in Google Slides all the time now. This is the way I add visuals to presentations and it saves me hours of time.

    Generate code, docs, and images, sure. We’ve seen that, but the image generation tech is getting tremendous.

    But videos too? I’m only starting to consider how to use remarkable technology like Veo 2. I’m using it now, and it’s blowing my mind. It’ll likely impact what I produce next year.

    #8. I’m going to free up some valuable time.

    That’s what most of this is all about. I don’t want to do less work; I want to do better work. Even with all this AI and automation, I expect I’ll be working the same number of hours next year. But I’ll be happier with how I’m spending those hours: learning, talking to humans, investing in others. Less time writing boilerplate code, breaking flow state to get answers, or even executing mindlessly repetitive tasks in the browser.

    I don’t work for AI; AI works for me. And in 2025, I’m expecting to make it work hard!