Author: Richard Seroter

  • Daily Reading List – October 16, 2025 (#650)

    Good day here in Austin. I like talking to folks who have their own perspectives on what AI and the current tech landscape look like. It helps me pop my own ideological bubble!

    [article] Inside Google’s AI turnaround: The rise of AI Mode, strategy behind AI Overviews, and their vision for AI-powered search | Robby Stein (VP of Product, Google Search). It’s actually a podcast episode, but here’s the landing page. It’s long, but I listened to the whole thing today, and it was a terrific lesson on being user focused and scaling products.

    [article] Where do developers actually want AI to support their work? Good topic. Just because AI *can* help everywhere doesn’t mean developers want it to. Yet.

    [blog] Bringing AI to the next generation of fusion energy. AI shops are showing their focus areas right now. While we’re doing fun consumer AI with our models, our most meaningful work is happening in the sciences.

    [blog] The ROI Pendulum: Build Vs. Buy In The Age Of AI. Is it about taking back some control? Your AI strategy should be like most others with tech: buy commodity, build differentiation.

    [blog] 10 years of genomics research at Google. Related. This isn’t a side project here. It’s core to the work we do.

    [article] Rethinking operations in an agentic AI world. Fundamental concepts may remain the same, but the implementation is changing. James offers a fresh way to look at ops when dealing with agent workloads.

    [blog] Stop Guessing and Start Benchmarking Your AI Prompts. Writing prompts doesn’t have to be art. There can be a science to it where you have measurable proof as to what works and what doesn’t.

  • Daily Reading List – October 15, 2025 (#649)

    Flying to Austin right now for my last trip of this wacky run of five weeks or so with travel. I’ll be doing a keynote at Cognizant’s big annual conference and also slipping out to Waco to see my kiddo in college.

    [blog] Introducing Veo 3.1 and advanced capabilities in Flow. Each of these new capabilities could be transformative in the hands of a creative person. Or even someone like me who pretends to be creative.

    [blog] Introducing Beads: A coding agent memory system. Color me interested. Might this be a better way to keep persistent memory between coding sessions than dumping each session to Markdown files?

    [blog] Say hello to a new level of interactivity in Gemini CLI. Kick into other interactive terminal sessions—think opening a file with vim—without ever leaving your agentic CLI? Love it.

    [blog] Leveling up your deployment pipelines. Some insights into adding more features to your deployment pipelines.

    [blog] Chaos engineering on Google Cloud: Principles, practices, and getting started. Not a new idea, but still useful to re-learn some of the core ideas around rehearsing for the worst case scenario.

    [blog] Nano Banana is coming to Google Search, NotebookLM and Photos. This amazing model is showing up everywhere in Google products and services.

    [article] How to Be a Great Coach—Even When You’re Busy. Nobody is too busy to help others grow. I refuse to believe it. How can you coach well when you’ve got a fairly packed calendar? Here’s some advice.

    [blog] Gemini Code Assist brings enterprise-grade AI code reviews to GitHub. This GitHub app is used by many to do code reviews and more. But it’s been limited to github.com repos. Now it works with GitHub Enterprise Cloud and on-prem GitHub Enterprise Server repos.

    [blog] AI Agent Benchmark Compendium. Wow, here’s a breakdown of 50 modern benchmarks and where to use them.

    [blog] How to add MCP Servers to Gemini CLI with Docker MCP Toolkit. Handy integration for those that are invested in the Docker ecosystem.

    [article] Zoom dooms the developer’s afternoon. It’s not Zoom’s fault alone; the real problem is that it’s super easy to set up virtual meetings that constantly interrupt developers.

  • Daily Reading List – October 14, 2025 (#648)

    I published a blog post today, and also had a chance to read a lot of interesting content.

    [blog] Agents 2.0: From Shallow Loops to Deep Agents. Philipp talks about agents that don’t just react within a loop, but orchestrate more complex patterns.

    [blog] The new AI-driven SDLC. Some good thoughts here, and I suspect we’ll see more written about this in the coming months.

    [blog] 2024 Open Source Contributions: A Year in Review. Yes, these numbers are impressive. But what’s better is the display of the breadth of engagement needed for healthy open source ecosystems.

    [blog] Deploy Faster with Terraform: Your Guide to vLLM on GKE with Infrastructure-as-Code. Can you be agile with your infrastructure? Sure. Karl shows off the use of Terraform for ML engineering.

    [blog] Unpacking Cloudflare Workers CPU Performance Benchmarks. This recent independent test had Vercel coming out as the clear performance winner. I like that Cloudflare’s response wasn’t particularly defensive, and that the test triggered some improvements on their end. Well done.

    [blog] DevRel Activity Patterns… published. I had the chance to provide content reviews of this book, and imagine this material will be useful to many teams.

    [blog] Understanding Etsy’s Vast Inventory with LLMs. Can LLMs help you describe and understand millions of products? That’s what Etsy is doing.

    [blog] The Key Vibe Coding Practices. Here’s another book I reviewed a few months ago. This is an excerpt ahead of its release.

    [blog] Is the Vibe Coding Bubble Starting to Burst? Have we hit peak vibe coding? I’m not as doomer as this article, but the data does show that the honeymoon may be over.

    [blog] Building Bulletproof LLM Applications: A Guide to Applying SRE Best Practices. Lots of advice here for building in resilience and responding quickly when app issues arise.

    [blog] DevRel is -Unbelievably- Back. More open positions. I hope this isn’t for classic DevRel that had looser metrics and softer connection to company success. I doubt it!

  • How to build and deploy a portable AI agent that uses a managed memory service

    I enjoy building with new frameworks and services. Do you? It’s fun to break new ground. That said, I’m often filled with regret as I navigate incomplete docs, non-existent search results, and a dearth of human experts to bother. Now add LLMs that try to help but accidentally set you back. Good times. But we persevere. My goal? Build an AI agent—it helps you plan a career change—that retains memory through long-running conversations and is portable enough that it can run on most any host. Easy enough, yes?

    My weapons of choice were the Agent Development Kit (Python), the new fully managed Vertex AI Memory Bank service, and runtime hosts including Google Cloud Run and Vertex AI Agent Engine. Most every sample I found for this tech combination was either PhD-level coding with excessive functionality, a hard-coded “hello world” that didn’t feel realistic, or a notebook-like flow that didn’t translate to an independent agent. I craved a simple, yet complete, example of what a real, hosted, and memory-infused agent looks like. I finally got it all working, it’s very cool, and I wanted to share the steps to reproduce it.

    Vertex AI Memory Bank showing memories from my AI agent

    Let’s go through this step by step, and I’ll explain the various gotchas and such that weren’t clear from the docs or existing samples. Note that I am NOT a Python developer, but I think I follow some decent practices here.

    First, I wanted a new Python virtual environment for the folder containing my app.

    python3 -m venv venv
    source venv/bin/activate
    

    I installed the latest version of the Google ADK.

    pip install google-adk
    

    My source code is here, so you can just download the requirements.txt file and install the local dependencies you need.

    pip install -r requirements.txt
    

    I’ve got an __init__.py file that simply contains:

    from . import agent
    

    Now for agent.py itself, where all the logic lives. Let’s go step by step, but everything here comes from a single file.

    import os
    import sys
    from google.adk.agents import Agent
    from google.adk.tools import agent_tool
    from google.adk.tools import google_search
    
    from google import adk
    from google.adk.runners import Runner
    from google.adk.sessions import VertexAiSessionService
    from google.adk.memory import VertexAiMemoryBankService
    from google.api_core import exceptions
    

    Nothing earth-shattering here. But I use a mix of built-in tools including Google Search. And I’m using durable storage for sessions and memory (versus the default in-memory options) and importing those references.

    app_name = 'career_agent'
    
    # Retrieve the agent engine ID needed for the memory service
    agent_engine_id = os.environ.get("GOOGLE_CLOUD_AGENT_ENGINE_ID")
    

    Our agent app needs a name for the purpose of storing sessions and memory through ADK. And that agent_engine_id is important for environments where it’s not preloaded (e.g. outside of Vertex AI Agent Engine).
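
    If you want the agent to fail fast when that variable is missing, a small guard right after the lookup does the trick. This is a sketch of my own, not part of the original file, and it assumes you’d rather exit loudly than let the memory service fail later with a less obvious error.

    # Optional guard (not in the original file): exit with a clear message if the
    # engine ID is missing instead of letting VertexAiMemoryBankService fail later.
    if not agent_engine_id:
        sys.exit("GOOGLE_CLOUD_AGENT_ENGINE_ID is not set; the memory service needs it.")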

    # Create a durable session for our agent
    session_service = VertexAiSessionService()
    print("Vertex session service created")
    
    # Instantiate the long-term memory service; it needs the agent_engine_id from the environment or it won't work correctly
    memory_service = VertexAiMemoryBankService(
        agent_engine_id=agent_engine_id)
    print("Vertex memory service created")
    

    Here I create instances of the VertexAiSessionService and VertexAiMemoryBankService. These are fully managed services, with no ops needed, that you can use standalone wherever your agent runs.

    # Use for callback to save the session info to memory
    async def auto_save_session_to_memory_callback(callback_context):
        try:
            await memory_service.add_session_to_memory(
                callback_context._invocation_context.session
            )
            print("\n****Triggered memory generation****\n")
        except exceptions.GoogleAPICallError as e:
            print(f"Error during memory generation: {e}")
    

    Now we’re getting somewhere. This function (thanks to my colleague Megan who I believe came up with it) will be invoked as a callback during session turns.

    # Agent that does Google search
    career_search_agent_memory = Agent(
        name="career_search_agent_memory",
        model="gemini-2.5-flash",
        description=(
            "Agent answers questions about career options for a given city or country"
        ),
        instruction=(
            "You are an agent that helps people figure out what types of jobs they should consider based on where they want to live."
        ),
        tools=[google_search],
    )
    

    That’s agent number one. It’s a secondary agent that just does a real-time search to supplement the LLM’s knowledge with real data about a given job in a particular city.

    # Root agent that retrieves memories and saves them as part of career plan assistance
    root_agent = Agent(
        name="career_advisor_agent_memory",
        model="gemini-2.5-pro", # Using a more capable model for orchestration
        description=(
            "Agent to help someone come up with a career plan"
        ),
        instruction=(
            """
            **Persona:** You are a helpful and knowledgeable career advisor.
    
            **Goal:** Your primary goal is to provide personalized career recommendations to users based on their skills, interests, and desired geographical location.
    
            **Workflow:**
    
            1.  **Information Gathering:** Your first step is to interact with the user to gather essential information. You must ask about:
                *   Their skills and areas of expertise.
                *   Their interests and passions.
                *   The city or country where they want to work.
    
            2.  **Tool Utilization:** Once you have identified a potential career and a specific geographical location from the user, you **must** use the `career_search_agent_memory` tool to find up-to-date information about job prospects.
    
            3.  **Synthesize and Respond:** After obtaining the information from the `career_search_agent_memory` tool, you will combine that with the user's stated skills and interests to provide a comprehensive and helpful career plan.
    
            **Important:** Do not try to answer questions about career options in a specific city or country from your own knowledge. Always use the `career_search_agent_memory` tool for such queries to ensure the information is current and accurate.
            """
        ),
        tools=[adk.tools.preload_memory_tool.PreloadMemoryTool(), agent_tool.AgentTool(career_search_agent_memory), ],
        after_agent_callback=auto_save_session_to_memory_callback,
    )
    

    That’s the root agent. Let’s unpack it. I’ve got some fairly detailed instructions to help it use the tool correctly and give a good response. Also note the tools. I’m preloading memory so that it gets context about existing memories, even if they happened five sessions ago. It’s got a tool reference to that “search” agent I defined above. And then after the agent generates a response, we save the key memories to the Memory Bank.

    runner = Runner(
        agent=root_agent,
        app_name=app_name,
        session_service=session_service,
        memory_service=memory_service)
    

    Finally, I’ve got a Runner. I’m not positive this is even used when the agent runs on Vertex AI Agent Engine, but it plays a part when running elsewhere.
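
    To make that concrete, here is a rough sketch of how the same Runner could be exercised locally, outside of any managed host. This is my own test harness rather than anything from the ADK docs: it assumes the agent folder is importable as career_agent_memory (the name used in the deploy commands), that the usual Google Cloud environment variables (project, location, and the agent engine ID) are set, and that VertexAiSessionService keys sessions by the Agent Engine ID, which matches the appName value in the session JSON further down.

    # local_run.py -- a hypothetical local test harness, not part of the deployed agent.
    import asyncio
    import os

    from google.genai import types
    from google.adk.runners import Runner

    # Assumes the agent folder above is importable as career_agent_memory.
    from career_agent_memory.agent import root_agent, session_service, memory_service


    async def main():
        # VertexAiSessionService appears to key sessions by the Agent Engine ID
        # (note the appName value in the session JSON later in this post).
        engine_id = os.environ["GOOGLE_CLOUD_AGENT_ENGINE_ID"]

        local_runner = Runner(
            agent=root_agent,
            app_name=engine_id,
            session_service=session_service,
            memory_service=memory_service)

        # Create a durable session, then push one user turn through the runner.
        session = await session_service.create_session(
            app_name=engine_id, user_id="u_local")
        message = types.Content(
            role="user", parts=[types.Part(text="What do you already know about me?")])

        async for event in local_runner.run_async(
                user_id="u_local", session_id=session.id, new_message=message):
            if event.is_final_response() and event.content and event.content.parts:
                print(event.content.parts[0].text)


    asyncio.run(main())

    That harness is just for local poking around; none of the deployment paths below need it.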

    That’s it. 87 lines in one file. Writing the code wasn’t the hard part; knowing what to do and how to shape the agent was where all the work happened.

    Let’s deploy, and test it all out with cURL commands. To deploy this to the fully-managed Vertex AI Agent Engine, it’s a single ADK command now. You need to provide it a Cloud Storage bucket name (for storing artifacts), but that’s about it.

    adk deploy agent_engine \
        --project=seroter-project-base \
        --region=us-central1 \
        --staging_bucket=gs://seroter-agent-memory-staging \
        --display_name="Career Agent with Memory" \
        --trace_to_cloud \
        career_agent_memory/
    

    When this finished, I saw a bucket loaded up with code and other artifacts.

    Files generated and stored by ADK for my deployed agent

    More importantly, I had an agent. Vertex AI Agent Engine has a bunch of pre-built observability dashboards, and an integrated view of sessions and memory.

    Vertex AI Agent Engine dashboard in the Google Cloud Console

    Let’s use this agent and see if it does what it’s supposed to. I’m going to use cURL commands so that it’s super clear what’s happening.

    curl \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    https://us-central1-aiplatform.googleapis.com/v1/projects/seroter-project-base/locations/us-central1/reasoningEngines/8479666769873600512:query \
    -d '{"class_method": "create_session", "input": {"user_id": "u_123"}}'
    

    This first command creates a new session for our agent chat. The authorization comes from injecting a Google Cloud token into the header. I plugged the “resource name” of the Agent Engine instance into the URI and set a user ID. I get back something like this:

    {
      "output": {
        "userId": "u_123",
        "id": "5926526278264946688",
        "events": [],
        "appName": "8479666769873600512",
        "state": {},
        "lastUpdateTime": 1760395538.0874159
      }
    }
    

    That “id” value matches the session ID now visible in the Vertex AI Session list. This session is for the given user, u_123.

    A session created for the agent running in the Vertex AI Agent Engine

    Now I can chat with my career agent. Here’s the cURL request for submitting a query. This will trigger my root agent, call my secondary agent, and store the key memories of the interaction as a callback.

    curl \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    https://us-central1-aiplatform.googleapis.com/v1/projects/seroter-project-base/locations/us-central1/reasoningEngines/8479666769873600512:streamQuery?alt=sse \
    -d '{"class_method": "stream_query","input": {"user_id": "u_123","session_id": "5926526278264946688","message": "I am currently a beekeeper in New Mexico. I have been to college for economics, but that was a long time ago. I am thinking about moving to Los Angeles CA and get a technology job. What are my job prospects in that region and how should I start?"}}'
    

    Note that the engine ID is still in the URI, and the payload contains the user ID and session ID. What I got back was a giant answer with some usable advice on how I can take my lucrative career as a beekeeper and make my mark on the technology sector.

    What got automatically saved as a memory? Switching to the Memories view in Vertex AI, I see that a few key details about my context were durably stored.

    Memories automatically parsed and stored in the Vertex AI Memory Bank

    Now if I delete my session, come back tomorrow and start a new one, any memories for this user ID (and agent engine instance) will be preloaded into every agent request. Very cool!

    Let’s quickly prove it. I can destroy my session with this cURL command.

    curl \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    https://us-central1-aiplatform.googleapis.com/v1/projects/168267934565/locations/us-central1/reasoningEngines/8479666769873600512:query?alt=sse \
    -d '{"class_method": "delete_session","input": {"user_id": "u_123","session_id": "5926526278264946688"}}'
    

    No more session, but my Memories remain. I can then request another session (for the same user) using the earlier command:

    curl \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    https://us-central1-aiplatform.googleapis.com/v1/projects/seroter-project-base/locations/us-central1/reasoningEngines/8479666769873600512:query \
    -d '{"class_method": "create_session", "input": {"user_id": "u_123"}}'
    

    At this point, I could ask something like “what do you already know about me?” in my query to see if it retrieves the memories it stored before.

    curl \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    https://us-central1-aiplatform.googleapis.com/v1/projects/seroter-project-base/locations/us-central1/reasoningEngines/8479666769873600512:streamQuery?alt=sse \
    -d '{"class_method": "stream_query","input": {"user_id": "u_123","session_id": "3132042709481553920","message": "What do you already know about me?"}}'
    

    Here’s what I got back:

    {"content": {"parts": [{"thought_signature": "CrgEAR_M...twKw==", "text": "You have an economics degree and are currently a beekeeper in New Mexico. You're considering a move to Los Angeles for a job in the technology sector."}], "role": "model"}, "finish_reason": "STOP", "usage_metadata": {"candidates_token_count": 32, "candidates_tokens_details": [{"modality": "TEXT", "token_count": 32}], "prompt_token_count": 530, "prompt_tokens_details": [{"modality": "TEXT", "token_count": 530}], "thoughts_token_count": 127, "total_token_count": 689, "traffic_type": "ON_DEMAND"}, "avg_logprobs": -0.8719542026519775, "invocation_id": "e-53e94a44-ad6b-4e97-9297-51612f4e77a9", "author": "career_advisor_agent_memory", "actions": {"state_delta": {}, "artifact_delta": {}, "requested_auth_configs": {}, "requested_tool_confirmations": {}}, "id": "c9e484cd-e5f7-4e1e-94d7-7490a006137d", "timestamp": 1760396342.830469}
    

    Excellent! With this approach, I have zero database management to do, yet my agents can retain context for each turn over an extended period of time.

    Vertex AI Agent Engine is cool, but what if you want to serve up your agents on a different runtime? Maybe a VM, Kubernetes, or the best app platform available, Google Cloud Run. We can still take advantage of managed sessions and memory, even if our workload runs elsewhere.

    The docs don’t explain how to do this, but I figured out the first step. You need that Agent Engine ID. When I deployed to Vertex AI Agent Engine, that ID was created automatically. But now I need to explicitly submit an HTTP request to get back an ID to use for my agent. Here’s the request:

    curl \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    https://aiplatform.googleapis.com/v1/projects/168267934565/locations/us-central1/reasoningEngines \
    -d '{"displayName": "memory-bank-for-cloud-run"}'
    

    I get back an ID value, and I see a new entry show up for me in Vertex AI Agent Engine.

    Memory Bank instance for an agent in Cloud Run

    The ADK also supports Google Cloud Run as a deployment target, so I’ll deploy this exact agent, no code changes, there too. First, I threw a few values into the shell’s environment variables to use for the CLI command.

    export GOOGLE_CLOUD_PROJECT=seroter-project-base
    export GOOGLE_CLOUD_LOCATION=us-central1 
    export GOOGLE_GENAI_USE_VERTEXAI=True
    

    Then I issued the single request to deploy the agent to Cloud Run. Notice some different things here. First, no Cloud Storage bucket. Cloud Run creates a container from the source code and uses that. Also, I explicitly set the --memory_service_uri and --session_service_uri flags to enable some of the pre-wiring to those services. It didn’t work without them, and the current docs don’t include the proper parameters. I also figured out how to add Cloud Run environment variables (also undocumented), since the Agent Engine ID was needed there as well.

    adk deploy cloud_run \
    --project=$GOOGLE_CLOUD_PROJECT \
    --region=$GOOGLE_CLOUD_LOCATION \
    --service_name=career-agent \
    --app_name=career_agent \
    --port=8080 \
    --memory_service_uri=agentengine://8058017254761037824 \
    --session_service_uri=agentengine://8058017254761037824 \
    career_agent_memory/ \
    -- --set-env-vars "GOOGLE_CLOUD_AGENT_ENGINE_ID=8058017254761037824"
    

    In just a couple minutes, I ended up with an agent ready to serve on Cloud Run.

    Agent running in Cloud Run

    The URLs I use to interact with my agent are now different because we’re not calling the managed service endpoints of Vertex AI to invoke the agent. So if I want a new session to get going, I submit a cURL request like this:

    curl -X POST -H "Content-Type: application/json" -d '{}' \
        https://career-agent-168267934565.us-central1.run.app/apps/career_agent/users/u_456/sessions
    

    The payload for this request is just an empty JSON object, and the user ID goes in the URL. I got back a session ID in a JSON payload like the one above. And I can see that session registered in my Agent Engine console.

    Session created based on web request

    Submitting queries to this agent is slightly different than when it was hosted in Vertex AI Agent Engine. For Cloud Run agents, the cURL request looks like this:

    curl -X POST \
        -H "Authorization: Bearer $(gcloud auth print-access-token)" \
        https://career-agent-168267934565.us-central1.run.app/run_sse \
        -H "Content-Type: application/json" \
        -d '{
        "app_name": "career_agent",
        "user_id": "u_456",
        "session_id": "311768995957047296",
        "new_message": {
            "role": "user",
            "parts": [{
            "text": "I am currently a cowboy in Las Vegas. I have been to college for political science, but that was a long time ago. I am thinking about moving to San Francisco CA and getting a technology job. What are my job prospects in that region and how should I start?"
            }]
        },
        "streaming": false
        }'
    

    After a moment, not only do I get a valid answer from my agent, but I also see that the callback fired and I’ve got durable memories in Vertex AI Memory Bank.

    Memories saved for the Cloud Run agent

    Just like before, I could end this session, start a new one, and the memories still apply. Very nice.

    Access to sessions and memories that scale as your agent does, or survive compute restarts, seems like a big deal. You can use your own database to store these, but I like having a fully managed option that handles every part of it for me. Once you figure out the correct code and configurations, it’s fairly easy to use. You can try this all yourself in Google Cloud with your existing account, or a new account with a bunch of free credits.

  • Daily Reading List – October 13, 2025 (#647)

    I never seem to get my demo apps working on the first pass, but it always turns out to be a (painful) blessing in disguise. Instead of taking a couple of hours to build an agent demo, it took a couple of weeks. But I was forced to read source code, experiment, and learn so much more than if it worked the first time. I’ll post my experiences tomorrow.

    [blog] F*ck it and Let it Rip. Try your hardest and have fun. A performance approach mindset is the way to go.

    [article] Becoming an AI-first business requires a total organizational mindshift. It’s true and many won’t make it. Not because they’re not smart, but because it takes a level of acceptable recklessness to institute the change.

    [blog] I’m in Vibe Coding Hell. I liked the points here. There’s a new challenge for self-learners who used to be dependent on the tutorial to get work done; now they’re dependent on their AI tool.

    [blog] Predictions 2026: Tech Leadership Will Be Wild — Bring Your Surfboard, Your Calculator, And Maybe A Clone. Yah, I can’t imagine being a team or organization leader in tech next year. Wait a minute.

    [article] Salesforce bets on AI ‘agents’ to fix what it calls a $7 billion problem in enterprise software. The unique circus of Dreamforce is going on this week, so expect all sorts of announcements. Here’s one about the new AgentForce 360.

    [blog] Quantum computing 101 for developers. My boss is deep into this, but I’ve only stayed peripherally aware. Still, I thought this was a good article for bringing folks up to speed.

    [article] Java or Python for building agents? A silly question twelve months ago, not so much today.

    [blog] Agents That Prove, Not Guess: A Multi-Agent Code Review System. It’s tempting to just dump a single prompt or pile of context into an agent and want something good back. But Ayo shows a better approach if you care about repeatability and transparency.

    [paper] Astute RAG: Overcoming Imperfect Retrieval Augmentation and Knowledge Conflicts for Large Language Models. Nonstop research and experimentation into making AI models more trustworthy and useful.

    [blog] What’s 🔥 in Enterprise IT/VC #467. This is one of my favorite weekend reads. Doesn’t hurt that I showed up in this one.

    [blog] The Architect’s Dilemma. Good look at how you’d decide between using tools in your agent architecture or having agents talk to agents.

  • Daily Reading List – October 10, 2025 (#646)

    Whew. A frantic day until about 3pm, and then I got a chance to work on an AI agent I’ve been messing with. I’m building it for use in a new blog post, but timeboxing how much more effort I put in. Hopefully posting next week.

    [blog] 150 of the latest AI use cases from leading startups and digital natives. Get inspired by what startups are doing with AI right now.

    [blog] 1,001 real-world gen AI use cases from the world’s leading organizations. We might have gone too far here. This started with an innocent 101 use cases, and now we’re drunk on stories and sharing 1,001 of them.

    [blog] Embracing the parallel coding agent lifestyle. Is every developer a type of “manager” now? There’s a new workstyle that involves coordinating a series of agents doing various bits of work for you.

    [blog] Vibe engineering. Also from Simon, this builds on the previous post. Now we’re talking broader engineering practices, not just using AI for a snippet of code.

    [article] Control Codegen Spend. Unless you have an all-you-can-eat license (which seems rare), there are cost and consumption considerations with AI-assisted coding tools.

    [blog] Gemini Computer Use: Giving LLMs Hands and Eyes. Our new computer use model definitely landed on my “try this out” list. Heiko gives it a run through here.

    [blog] The Weak Point in MCP Nobody’s Talking About: API Versioning. Seems like something to consider, but I also don’t know how versioning might be different when we’re talking agents and tools.

  • Daily Reading List – October 9, 2025 (#645)

    Another day, another chance to learn something new. Today’s reading list had some useful data points, fresh ideas, and new products.

    [blog] Introducing Gemini Enterprise. The AI platform era is here. It’s not just about a collection of random products. It’s about intentionally connecting people, systems, and knowledge bases so that we can get better work done. If you’re a Google shop, Microsoft shop, IBM shop or whatever, Gemini Enterprise is a major upgrade. More, from Sundar.

    [blog] 4 ways Gemini Enterprise makes work easier for everyone. It’s useful to see actual examples of this platform in action.

    [blog] How I Learned to Stop Worrying and Trust AI Coding Agents. Yes, we can learn some cool things from watching these agents at work.

    [blog] Platform Shifts Redefine Apps. Important concepts here. What an “app” is changes with each tech platform evolution. Are you working with a modern definition?

    [blog] Googler Michel Devoret awarded the Nobel Prize in Physics. It’s hard to have a big head at work when you’re surrounded by so many brilliant folks.

    [guide] Choose a design pattern for your agentic AI system. This is fantastic, vendor-neutral architecture guidance for someone who wants to learn agent design patterns and when to pick each one.

    [article] What the 2025 DORA Report means for your AI strategy. Extremely good takeaways in this post if you’re wondering how you’re supposed to actually land and scale AI at your company.

    [blog] Predictions 2026: AI Moves From Hype To Hard Hat Work. Prediction season is upon us! See what various analysts and thought leaders are guessing.

    [article] When You’re the Executive Everyone Relies On—and You’re Burning Out. Good advice here that resonates with me and how I’m feeling right now.

    [blog] Give me AI slop over human sludge any day. He’s not wrong. Why are we automatically assuming AI-created stuff is worse or less useful than human-created stuff?

    [article] Survey: Engineers Want To Code, But Spend All Day on Tech Debt. Oof, these numbers are rough. Coding is rarely the bottleneck in your teams. It’s the 84% of the time that you’re not able to do what you love doing.

    [blog] How to Build AI Security Agents with Gemini CLI. Good rationale here for when to build a tool versus when to leave the action to the agent to figure out.

    [blog] LoRA Explained: Faster, More Efficient Fine-Tuning with Docker. If you kinda understood this approach to updating a model without a full training run, you’ll learn more from this post.

  • Daily Reading List – October 8, 2025 (#644)

    Flying home after a great day with friends at Comcast and talking about AI assisted engineering. Everyone is looking for the playbook for effectively landing AI in their org. Maybe I’ll write a book about it 🙂

    [blog] Five Best Practices for Using AI Coding Assistants. A few months ago, our CEO asked me to run a few long-running engineering experiments with my team. Here’s what we learned from using the full spectrum of today’s AI coding tools to get work done.

    [blog] Stitching with the new Jules API. I don’t think we’re just going to improve the existing toolchain. Instead, a new toolchain is emerging where you simply work differently.

    [blog] The State of CI/CD in 2025: Key Insights from the Latest JetBrains Survey. What do folks use? Are they using AI in CI/CD (answer: no)? And why do folks use more than one tool? Answers here.

    [blog] Now open for building: Introducing Gemini CLI extensions. This is a huge capability that brings all sorts of data and functionality into your agentic CLI.

    [article] To scale agentic AI, Notion tore down its tech stack and started fresh. Bravo. Sometimes you need to reset, not refactor.

    [article] Kubernetes for agentic apps: A platform engineering perspective. While I’m extremely bullish on lightweight platforms for hosting agentic apps, there’s absolutely a place for Kubernetes in the mix. Abdel makes a good case.

    [article] How to write nonfunctional requirements for AI agents. I haven’t thought much about this, but sure, there are going to be some adjustments to what requirements you gather for AI apps.

    [blog] Expanding access to Opal, our no-code AI mini-app builder. When I tried this tool, it wasn’t what I expected. But it’s an interesting way to build a new style of app.

    [blog] Not Another Workflow Builder. LangChain isn’t interested in adding to the pile of visual workflow builders. We’re back to arguing workflows versus agents again too!

  • Daily Reading List – October 7, 2025 (#643)

    Flew to Philadelphia today to do a keynote at a customer’s internal developer conference tomorrow. Should be fun, although I’m definitely getting a little burnt out by all the recent travel!

    [blog] Introducing the Gemini 2.5 Computer Use model. What can you build with agents that can navigate user interfaces? Try out our new Computer Use model.

    [article] From Autocomplete to Agents: Mapping the Design Space of AI Coding Assistants. Great analysis. These ten dimensions, and six personas, are a useful way to understand the landscape.

    [blog] Why Google Cloud Run is the most underrated container platform – Part 1. I’d go so far as to say that it’s one of the best (compute) services on the Internet.

    [blog] MCP Development with Rust, Gemini CLI, and Google Cloud Run. Here’s one example why Cloud Run is great. You have virtually no limit on programming language or scenario.

    [article] Beyond the Hype: Architecting Systems with Agentic AI. Long conversation, but some interesting points about some of the bigger lifecycle considerations for AI agents.

    [blog] Databases on K8s — Really? (part 8). My colleague at Google has been sharing a series of thoughts about his journey to appreciating containers and Kubernetes as a viable host for databases.

    [blog] Spec-driven development: Using Markdown as a programming language when building with AI. You can do spec-driven development in many ways. GitHub created a cool framework that they’re demonstrating here.

    [blog] Connect Spark data pipelines to Gemini and other AI models with Dataproc ML library. Integrating apps or data pipelines with AI models is getting more and more seamless. This looks like a very handy way to bring Gemini models to Spark data.

    [blog] Accelerate AI with Agents: Event Series for Developers in EMEA. This roadshow has been overbooked in every city it’s visited. We’re rolling across EMEA, so see if your hometown is in the mix.

    [blog] Ask a Techspert: What is vibe coding? We’re all builders now. What we build, and how production-ready it is, depends on the builder and the circumstances.

  • Daily Reading List – October 6, 2025 (#642)

    Shorter reading list today, but some great depth. I liked the thinking and challenge that many of these pieces offered.

    [blog] Introducing CodeMender: an AI agent for code security. This AI is both proactive and reactive in helping secure your code from vulnerabilities. This looks like the future to me.

    [blog] The RAG Obituary: Killed by Agents, Buried by Context Windows. I’ve seen a few obituaries for RAG, but maybe the agentic workflows really do resolve some of the issues that made RAG necessary in the first place.

    [article] Microsoft retires AutoGen and debuts Agent Framework to unify and govern enterprise AI agents. There are too many frameworks to choose from, so it’s a good thing when a new one explicitly replaces an old one.

    [article] OpenAI launches AgentKit to help developers build and ship AI agents. I’m wrong. Apparently we need more frameworks and agent-building tools. OpenAI announced many interesting things today during their Dev Day. Also updated models.

    [article] Top executives jump on AI upskilling. Great to see. The right type of executive upskilling will shrink the expectation gap between leaders and their employees.

    [blog] vercel vs cloudflare: two philosophies of building for developers. A good rivalry forces participants to keep upping their game. This one is fascinating. I like Guillermo and the Vercel team, but either way, developers win.

    [blog] More choice, more control: self-deploy proprietary models in your VPC with Vertex AI. Great. Better control and security when running these leading models in the cloud.

    [article] How to Drive Digital Innovation Without Wasting Resources. Some of this goes against conventional wisdom, which is why I liked it.

    [blog] Scaling Engineering Teams: Lessons from Google, Facebook, and Netflix. You can’t copy culture and practices from one team and assume they work on the next. But we can still learn from what others do.
