Richard Seroter's Architecture Musings

Author: Richard Seroter

Daily Reading List – June 7, 2024 (#335)

It’s the end of another wild work week. Are you taking time to decompress and give your brain time to make new connections? I hope so. See you Monday.

[article] Rote automation is so last year: AI pushes more intelligence into software development. Good analysis that looks at trends, risks, and benefits of applying AI to a wider set of use cases than just coding.

[blog] The Bulkhead Pattern: How To Make Your System Fault-tolerant. How do you isolate components so that a failure in one doesn’t cascade further and take down the whole system? Here’s some guidance.

[blog] Tutorial: Vertex AI Agent Builder for Developers. You don’t have to be a coder to build AI agents nowadays. Coding can be part of it, but not required. Here’s one walkthrough.

[blog] Vertex AI Agent Builder Demo from I/O. Aja shares some lessons learned building a demo for our joint presentation at Google I/O.

[article] At Kubernetes 10th Anniversary in Mountain View: History Remembered. I like articles such as this. It’s a reminder that OSS is full of great people doing their best to make a difference.

[blog] Ambient and the SPOF Myth. Does the architecture of the proxy-less version of the Istio mesh introduce a single point of failure? John says no.

[blog] The Struggle Makes the Reward. If AI or other tech is taking away some of the challenging parts of your day, make sure you replace that with something else that gives you a deep sense of accomplishment.

[article] Take Your First Steps for Building on LLMs With Google Gemini. This only uses our model, not any of our other tech—app is built in Node and deployed to Heroku—and that’s fine with me. Just use great models.

[blog] 5 more myths about platform engineering: how it’s built, what it does, and what it doesn’t. There’s some useful, candid advice in here. Take a look if you’re building or optimizing a platform engineering team.

[blog] How to integrate Gemini and Sheets with BigQuery. I wouldn’t have thought to mash together these products this way, but that makes this all the more interesting to me.

[article] Rust Growing Fastest, But JavaScript Reigns Supreme. 43 million devs out there, according to SlashData. This article looks at which languages are the most popular.

##

Want to get this update sent to you every day? Subscribe to my RSS feed or subscribe via email below:

Type your email…

June 7, 2024
Daily Reading List – June 6, 2024 (#334)

I had a great final day at this offsite, and am thankful to work with such talented and kind folks. Today’s reading list is full of product info, general purpose insights, and a couple freebies.

[blog] Does Crossplane Replace Terraform? Part I: the Theory. Ian performs a useful deep dive into APIs, cloud services, and control planes to figure out if the open source Crossplane project is a valid choice for modern teams.

[blog] Google is a Leader in The Forrester Wave™: AI Foundation Models for Language, Q2 2024. I haven’t seen an analyst assessment like this in a while. The distance between the leader (Google Cloud) and the rest is unusual. Read their analysis.

[article] LangChain and Google Gemini API for AI Apps: A Quickstart Guide. Here’s a quick demo that anyone at home can follow along with.

[blog] AI Assistants are Now Organizational Accelerants. Siloed assistants might be the future, but most of us are betting on something that is connected from end-to-end.

[blog] TBM 291: Why Your Product Transformation Will Fail. This reads like something that would come out of a productive pre-mortem where you consider all the ways an upcoming effort will go sideways.

[blog] How to use feature flags. I haven’t heard many of these perspectives on feature flags, and found this an educational read.

[blog] All Google Cloud courses and labs are now available at no cost through Innovators. Free, high quality training? What’s the catch? Only that you get an email newsletter from me every week, which is admittedly a burden for you.

[blog] Develop Kubernetes Operators in Java without Breaking a Sweat. This walkthrough from Docker shows how to test those custom Kubernetes operators written in Java.

[blog] NotebookLM goes global with Slides support and better ways to fact-check. Now available in 200 countries and based on our latest models. Try out this research and writing assistant.

[blog] BigQuery adds first-party support for Delta Lake. We’ve got some nice new integrations here for those that like open lakehouses.

[blog] 10 Years of Kubernetes. Happy birthday Kubernetes! It’s been ten years since the first public commit was pushed to GitHub. Lots of history and links in this post.

June 6, 2024
Daily Reading List – June 5, 2024 (#333)

Finished up another day of this offsite in Seattle, and am spending my evening building demos for a customer presentation tomorrow at 6:30am. It’s a glamorous life!

[article] How to Talk About What You Do (without Being Boring). Are you good at this? Do you have a crisp, conversation-starting way to describe your job?

[repo] Gemini UI to Code Streamlit App. Here’s a cool little Python app that converts UI designs into usable code.

[blog] Getting started with retrieval augmented generation on BigQuery with LangChain. Very cool writeup that shows off a few key technologies all at once. There’s also a link to a sample notebook to try yourself.

[article] GoFr: A Go Framework To Power Scalable and Observable Apps. Getting a web server running in Go isn’t very difficult. But frameworks like this (which is new to me) do more than that for devs building data-driven apps.

[blog] Building a Smart Retail Shopping Assistant PART 1. Abi wrote up a good piece that walks through a complete solution scenario.

[blog] 25 AI prompts to make product managers’ lives easier. I’d like to see more of this type of thing that shows prompt ideas or techniques that best serve different roles.

[blog] Empowered development: GitLab on Google Cloud for streamlined delivery and enhanced security. We announced this a month or so back, and now have more details about using an integrated GitLab on Google Cloud.

[article] 9 command-line jewels for your developer toolkit. I’d advise that you know at least the very basics of bash scripting and how to navigate around, edit files, and such. These are additional commands that are good to be aware of.

[blog] Phishing for Gold: Cyber Threats Facing the 2024 Paris Olympics. I found this to be a sobering look at the breadth and depth of attack faced by organizations.

June 5, 2024
Daily Reading List – June 4, 2024 (#332)

Wrapped up day one of an offsite here in Seattle. Now off to a team building event at a cooking school. Which means I’ll likely be eating a second dinner after politely pushing the one we make around my plate.

[article] 5 Signs Your One-on-Ones Aren’t Working. There was chatter on Twitter/X a couple weeks back about high performers not needing 1:1s with their managers. I strongly disagree, but these meetings also need to be useful. This article has good advice for getting on track.

[article] What We Learned from a Year of Building with LLMs (Part II). I shared “part 1” of this series in last week’s reading list, and part 2 is excellent too. This one is chock full of strategic advice for working with LLMs.

[blog] Let’s make Gemini Groovy! Can you get Gemini 1.5 Flash to execute a Groovy script as part of a “function calling” exercise? Apparently you can.

[blog] Google Cloud Artifact Registry Goes Limitless with Generic Format Support. I like Artifact Registry as a service for storing operating system and application packages. But no service supports every possible package type. Offering a “generic” type is smart here.

[blog] Anomaly detection with few labeled samples under distribution mismatch. I like that this is open source and now widely available. Use it for anomaly detection in data sets.

[blog] The Kubernetes ecosystem is a candy store. Kubernetes is 10 years old, and has a healthy ecosystem that the community should be proud of.

[article] Everyone Wants to Ditch the Middleman. Or Do They? This article walks through some new research into whether consumers are going direct for services or still using an intermediary.

[blog] Reading Google Sheets from a Go program. I remember the old days of trying to parse Microsoft Office objects in code. Not fun. These SaaS platforms are so much easier to interact with.

[blog] Creating a Bespoke Platform as a Service: History Doesn’t Repeat, but It Often Rhymes. Most everyone has SOME sort of platform that underpins their work. Daniel looks at what folks consider using, and why a new crop of tooling looks promising.

[blog] Introducing the Google Developer Program: Unlock New Opportunities. Lots of perks, and no cost to this program. Take a look.

June 4, 2024
Daily Reading List – June 3, 2024 (#331)

I traveled up to Seattle today for an offsite, so I got limited reading done. But, still some good ones in there!

[article] Unexpected Anti-Patterns for Engineering Leaders — Lessons From Stripe, Uber & Carta. Wow, this is great. These are legit “anti-patterns” that Will says you should actually embrace. All of these resonated with me.

[blog] Cloud CISO Perspectives: What the past year tells us about our cybersecurity future. I often see doom and gloom about security—recent hacks at Ticketmaster and Hugging Face come to mind—but a sub-heading here of “Attackers innovate, but defenders get better, too” gives me optimism.

[blog] Developer Experience: What not to do. Fair list. There’s logic behind doing each of these things, but companies need to recognize who in their audience those things are for. Often, not developers.

[blog] Gemini 1.5 Flash Outperforms Much More Expensive Models. A lot of the discussion on which model to choose focuses on benchmarks, which is reasonable. But I like this additional focus on cost as well.

[blog] How to Evaluate Video Performance in Developer Relations. Are you publishing videos yourself or through your company? How do you measure success? This post takes a look.

[blog] Vertical Slice Architecture: Structuring Vertical Slices. This pattern has been around a while, but I feel like I’m seeing more about it lately. I’m a fan of slicing through all the application layers and delivering value, versus building up stacks layer by layer.

[blog] AI Overviews: About last week. I thought this was a good response to the flare-up last week around goofy AI generated answers in Google search results.

June 3, 2024
Daily Reading List – May 31, 2024 (#330)

It was a short week (thanks to the Monday holiday here in the States), but a full one. I’ve got a trip to Seattle coming up next week, so the fun continues.

[article] What to Know About Starting Your Career Remotely. Those of you who work fully remotely, hats off. I’ve done it for extended periods, but there are things I missed out on. This is good guidance for those starting off.

[blog] Shipping Fast with FastAPI and Cloud Run. If you’re building Python APIs, you might be using FastAPI. This is a complete walkthrough of an end to end scenario.

[blog] Cloud Run: the fastest way to get your AI applications to production. Speaking of Cloud Run, this is a great look at why this serverless product is a strong fit for AI apps.

[article] Deno adds support for private NPM registries. This JavaScript runtime keeps chugging along, adding useful features. Its new Node compatibility features should speed adoption.

[blog] Looking into Agent Builder on Vertex AI and Reasoning Engine for building Generative AI Agent. You have many options when it comes to building AI agents, and this post looks at a couple of good ones.

[blog] Data Platform Explained Part II. More from the Spotify team about how they think about data collection and processing, along with the cultural aspects around a data platform.

[blog] How DZ BANK improved developer productivity with Cloud Workstations. There’s a surprising amount of detail in this story about setting up cloud-based dev environments.

[blog] What’s new for the Google Cloud global front end for web delivery and protection. There aren’t many (any?) infrastructure platforms like Google, and Cloud customers can get unique protections by leveraging our global front end. We’ve made some useful updates that you can read about here.

[blog] Why you shouldn’t use AI to write your tests. This writer looks at why we write tests, and recommends using AI for higher order tests.

[blog] 5 myths about platform engineering: what it is and what it isn’t. There’s a lot of noise out there about platform engineering, but what’s the real scoop? These are good myths to bust.

May 31, 2024
Store prompts in source control and use AI to generate the app code in the build pipeline? Sounds weird. Let’s try it!
I can’t remember who mentioned this idea to me. It might have been a customer, colleague, internet rando, or voice in my head. But the idea was whether you could use source control for the prompts, and leverage an LLM to dynamically generate all the app code each time you run a build. That seems bonkers for all sorts of reasons, but I wanted to see if it was technically feasible.

Should you do this for real apps? No, definitely not yet. The non-deterministic nature of LLMs means you’d likely experience hard-to-find bugs, unexpected changes on each build, and get yelled at by regulators when you couldn’t prove reproducibility in your codebase. When would you use something like this? I’m personally going to use this to generate stub apps to test an API or database, build demo apps for workshops or customer demos, or to create a component for a broader architecture I’m trying out.

tl;dr I built an AI-based generator that takes a JSON file of prompts like this and creates all the code. I call this generator from a CI pipeline which means that I can check in (only) the prompts to GitHub, and end up with a running app in the cloud.
```
{
  "folder": "generated-web",
  "prompts": [
    {
      "fileName": "employee.json",
      "prompt": "Generate a JSON structure for an object with fields for id, full name, state date, and office location. Populate it with sample data. Only return the JSON content and nothing else."
    },
    {
      "fileName": "index.js",
      "prompt": "Create a node.js program. It instantiates an employee object that looks like the employee.json structure. Start up a web server on port 8080 and expose a route at /employee return the employee object defined earlier."
    },
    {
      "fileName": "package.json",
      "prompt": "Create a valid package.json for this node.js application. Do not include any comments in the JSON."
    },
    {
      "fileName": "Dockerfile",
      "prompt": "Create a Dockerfile for this node.js application that uses a minimal base image and exposes the app on port 8080."
    }
  ]
}
```
In this post, I’ll walk through the steps of what a software delivery workflow such as this might look like, and how I set up each stage. To be sure, you’d probably make different design choices, write better code, and pick different technologies. That’s cool; this was mostly an excuse for me to build something fun.

Before explaining this workflow, let me first show you the generator itself and how it works.

Building an AI code generator

There are many ways to build this. An AI framework makes it easier, and I chose Spring AI because I wanted to learn how to use it. Even though this is a Java app, it generates code in any programming language.

I began at Josh Long’s second favorite place on the Internet, start.spring.io. Here I started my app using Java 21, Maven, and the Vertex AI Gemini starter, which pulls in Spring AI.

My application properties point at my Google Cloud project and I chose to use the impressive new Gemini 1.5 Flash model for my LLM.
```
spring.application.name=demo
spring.ai.vertex.ai.gemini.projectId=seroter-project-base
spring.ai.vertex.ai.gemini.location=us-central1
spring.ai.vertex.ai.gemini.chat.options.model=gemini-1.5-flash-001
```
My main class implements the CommandLineRunner interface and expects a single parameter, which is a pointer to a JSON file containing the prompts. I also have a couple of classes that define the structure of the prompt data. But the main generator class is where I want to spend some time.
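
To make the shape of that input concrete, here is a minimal sketch of the prompt data and the command-line flag handling in plain Java, with no Spring involved. The class, record, and method names are hypothetical (they are not taken from the actual repo), and the prompt set is built in memory rather than parsed from JSON, which a library such as Jackson would normally handle.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical names throughout; the real classes live in the author's repo.
class PromptFileSketch {

    // One entry in the "prompts" array: the target file and the prompt that generates it
    record PromptEntry(String fileName, String prompt) {}

    // The whole prompt file: the output folder plus the ordered list of prompts
    record PromptSet(String folder, List<PromptEntry> prompts) {}

    // Pull the value of --prompt-file=<path> out of the command-line arguments
    static Optional<String> promptFileArg(String[] args) {
        String flag = "--prompt-file=";
        for (String arg : args) {
            if (arg.startsWith(flag) && arg.length() > flag.length()) {
                return Optional.of(arg.substring(flag.length()));
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Optional<String> path = promptFileArg(args);
        if (path.isEmpty()) {
            System.err.println("Usage: java -jar <generator>.jar --prompt-file=<prompts.json>");
            return;
        }
        // In-memory stand-in for what a JSON library would produce from the file
        PromptSet set = new PromptSet("generated-web", List.of(
            new PromptEntry("employee.json", "prompt text here"),
            new PromptEntry("index.js", "prompt text here")));
        System.out.println("Would read " + path.get() + " and write "
            + set.prompts().size() + " files into " + set.folder());
    }
}
```

Keeping the prompts in an ordered list matters because later prompts depend on files produced by earlier ones.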

Basically, for each prompt provided to the app, I look for any local files to provide as multimodal context into the request (so that the LLM can factor in any existing code as context when it processes the prompt), call the LLM, extract the resulting code from the Markdown wrapper, and write the file to disk.

Here are those steps in code. First I look for local files:
```
//load code from any existing files in the folder
private Optional<List<Media>> getLocalCode() {
    String directoryPath = appFolder;
    File directory = new File(directoryPath);

    if (!directory.exists()) {
        System.out.println("Directory does not exist: " + directoryPath);
        return Optional.empty();
    }

    try {
        return Optional.of(Arrays.stream(directory.listFiles())
            .filter(File::isFile)
            .map(file -> {
                try {
                    byte[] codeContent = Files.readAllLines(file.toPath())
                        .stream()
                        .collect(Collectors.joining("\n"))
                        .getBytes();
                    return new Media(MimeTypeUtils.TEXT_PLAIN, codeContent);
                } catch (IOException e) {
                    System.out.println("Error reading file: " + file.getName());
                    return null;
                }
            })
            .filter(Objects::nonNull)
            .collect(Collectors.toList()));
    } catch (Exception e) {
        System.out.println("Error getting local code");
        return Optional.empty();
    }
}
```
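If you want to poke at the file-gathering step without Spring on the classpath, a rough java.nio variant could look like the sketch below. It is an assumption-laden alternative, not the code from the repo: the class and method names are made up, and it returns a plain map of file name to content instead of Spring AI Media objects.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Stream;

// Hypothetical helper, shown only to illustrate the idea
class LocalCodeSketch {

    // Read every regular file directly under 'folder' into a fileName -> content map.
    // A missing folder yields an empty map, which matches the "first prompt" case.
    static Map<String, String> readLocalCode(Path folder) {
        Map<String, String> files = new TreeMap<>();
        if (!Files.isDirectory(folder)) {
            return files;
        }
        // Files.list holds a directory handle, so close it with try-with-resources
        try (Stream<Path> entries = Files.list(folder)) {
            List<Path> regularFiles = entries.filter(Files::isRegularFile).toList();
            for (Path file : regularFiles) {
                files.put(file.getFileName().toString(), Files.readString(file));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return files;
    }

    public static void main(String[] args) throws IOException {
        // Write two small files to a temp folder and read them back
        Path folder = Files.createTempDirectory("generated-web");
        Files.writeString(folder.resolve("employee.json"), "{\"id\": 1}");
        Files.writeString(folder.resolve("index.js"), "console.log('hi');");
        readLocalCode(folder).forEach((name, content) ->
            System.out.println(name + " -> " + content.length() + " chars"));
    }
}
```

Keeping the file name next to the content is a design choice in this sketch; it would let a prompt refer to a file such as employee.json by name.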
I call the LLM using Spring AI, choosing one of two methods depending on whether there’s any local code or not. There won’t be any code when the first prompt is executed!
```
//call the LLM and pass in existing code
private String callLlmWithLocalCode(String prompt, List<Media> localCode) {
    System.out.println("calling LLM with local code");
    var userMessage = new UserMessage(prompt, localCode);
    var response = chatClient.call(new Prompt(List.of(userMessage)));
    return extractCodeContent(response.toString());
}

//call the LLM when there's no local code
private String callLlmWithoutLocalCode(String prompt) {
    System.out.println("calling LLM withOUT local code");
    var response = chatClient.call(prompt);
    return extractCodeContent(response.toString());
}
```
You see there that I’m extracting the code itself from the response string with this operation:
```
//method that extracts code from the LLM response
public static String extractCodeContent(String markdown) {

    System.out.println("Markdown: " + markdown);

    String regex = "```(\\w+)?\\n([\\s\\S]*?)```";
    Pattern pattern = Pattern.compile(regex);
    Matcher matcher = pattern.matcher(markdown);

    if (matcher.find()) {
        String codeContent = matcher.group(2); // Extract group 2 (code content)
        return codeContent;
    } else {
        //System.out.println("No code fence found.");
        return markdown;
    }
}
```
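To see how that fence-stripping behaves on its own, here is a small standalone sketch. It assumes the model wraps its answer in a standard three-backtick Markdown fence with an optional language tag; the class name is hypothetical, and the fence marker is built with repeat() only so the example can sit inside a fenced block itself.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical standalone version of the extraction step
class FenceExtractSketch {

    // Three backticks, built programmatically so this example can live inside a fence
    static final String FENCE = "`".repeat(3);

    // Optional language tag, a newline, then (lazily) everything up to the closing fence
    static final Pattern CODE_FENCE =
        Pattern.compile(FENCE + "(\\w+)?\\n([\\s\\S]*?)" + FENCE);

    // Return the body of the first fenced block, or the input unchanged if there is none
    static String extractCodeContent(String markdown) {
        Matcher matcher = CODE_FENCE.matcher(markdown);
        return matcher.find() ? matcher.group(2) : markdown;
    }

    public static void main(String[] args) {
        String reply = "Here is the file:\n" + FENCE + "json\n{\"id\": 1}\n" + FENCE + "\nEnjoy!";
        System.out.print(extractCodeContent(reply));          // body of the fenced block
        System.out.println(extractCodeContent("plain text")); // no fence, returned as-is
    }
}
```

Only the first fenced block is returned, so a reply containing several blocks would lose everything after the first one. That is one reason the prompts ask for a single file at a time.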
And finally, I write the resulting code to disk:
```
//write the final code to the target file path
private void writeCodeToFile(String filePath, String codeContent) {
    // try-with-resources closes the writer even if write() throws;
    // FileWriter creates the file when it does not exist yet
    try (FileWriter writer = new FileWriter(filePath)) {
        writer.write(codeContent);
        System.out.println("Content written to file: " + filePath);
    } catch (IOException e) {
        e.printStackTrace();
    }
}
```
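Tying those steps together, the overall loop might look roughly like the sketch below. This is a simplified illustration rather than the generator from the repo: the class name is hypothetical, the model call is replaced by a stand-in function, and earlier output is carried forward in memory instead of being re-read from disk.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiFunction;

// Hypothetical orchestration sketch; 'llm' stands in for the real model call
class GeneratorLoopSketch {

    // Run each prompt in order. The stand-in model receives the prompt plus the text
    // generated so far, and its answer becomes context for the prompts that follow.
    static Map<String, String> generateAll(Map<String, String> promptsByFile,
                                           BiFunction<String, String, String> llm) {
        Map<String, String> generated = new LinkedHashMap<>();
        StringBuilder context = new StringBuilder();
        for (Map.Entry<String, String> entry : promptsByFile.entrySet()) {
            String code = llm.apply(entry.getValue(), context.toString());
            generated.put(entry.getKey(), code);
            context.append(code).append("\n");
        }
        return generated;
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> prompts = new LinkedHashMap<>();
        prompts.put("employee.json", "Generate sample employee JSON");
        prompts.put("index.js", "Serve the employee object on /employee");

        // Fake model: reports how much context it saw instead of writing real code
        BiFunction<String, String, String> fakeLlm =
            (prompt, ctx) -> "/* prompt: " + prompt + ", context chars: " + ctx.length() + " */";

        // Write each generated file into a scratch folder
        Path folder = Files.createTempDirectory("generated-web");
        for (Map.Entry<String, String> file : generateAll(prompts, fakeLlm).entrySet()) {
            Files.writeString(folder.resolve(file.getKey()), file.getValue());
        }
        System.out.println("Wrote files to " + folder);
    }
}
```

Passing the model in as a function makes it easy to swap in a fake for local testing, which is handy when the real call is slow or non-deterministic.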
There’s some more ancillary stuff that you can check out in the complete GitHub repo with this app in it. I was happy to be using Gemini Code Assist while building this. This AI assistant helped me understand some Java concepts, complete some functions, and fix some of my subpar coding choices.

That’s it. Once I had this component, I built a JAR file and could now use it locally or in a continuous integration pipeline to produce my code. I uploaded the JAR file to Google Cloud Storage so that I could use it later in my CI pipelines. Now, onto the day-to-day workflow that would use this generator!

Workflow step: Set up repo and pipeline

Like with most software projects, I’d start with the supporting machinery. In this case, I needed a source repo to hold the prompt JSON files. Done.

And I’d also consider setting up the path to production (or test environment, or whatever) to build the app as it takes shape. I’m using Google Cloud Build for a fully-managed CI service. It’s a good service with a free tier. Cloud Build uses declarative manifests for pipelines, and this pipeline starts off the same for any type of app.
```
steps:
  # Print the contents of the current directory
  - name: 'bash'
    id: 'Show source files'
    script: |
      #!/usr/bin/env bash
      ls -l

  # Copy the JAR file from Cloud Storage
  - name: 'gcr.io/cloud-builders/gsutil'
    id: 'Copy AI generator from Cloud Storage'
    args: ['cp', 'gs://seroter-llm-demo-tools/demo-0.0.1-SNAPSHOT.jar', 'demo-0.0.1-SNAPSHOT.jar']

  # Print the contents of the current directory
  - name: 'bash'
    id: 'Show source files and builder tool'
    script: |
      #!/usr/bin/env bash
      ls -l
```
Not much to it so far. I just print out the source contents seen in the pipeline, download the AI code generator from the above-mentioned Cloud Storage bucket, and prove that it’s on the scratch disk in Cloud Build.

Ok, my dev environment was ready.

Workflow step: Write prompts

In this workflow, I don’t write code, I write prompts that generate code. I might use something like Google AI Studio or even Vertex AI to experiment with prompts and iterate until I like the response I get.

Within AI Studio, I chose Gemini 1.5 Flash because I like nice things. Here, I’d work through the various prompts I would need to generate a working app. This means I still need to understand programming languages, frameworks, Dockerfiles, etc. But I’m asking the LLM to do all the coding.

Once I’m happy with all my prompts, I add them to the JSON file. Note that each prompt entry has a corresponding file name that I want the generator to use when writing to disk.

At this point, I was done “coding” the Node.js app. You could imagine having a dozen or so templates of common app types and just grabbing one and customizing it quickly for what you need!

Workflow step: Test locally

To test this, I put the generator in a local folder with a prompt JSON file and ran this command from the shell:
```
rseroter$ java -jar  demo-0.0.1-SNAPSHOT.jar --prompt-file=app-prompts-web.json
```
After just a few seconds, I had four files on disk.

This is just a regular Node.js app. After npm install and npm start commands, I ran the app and successfully pinged the exposed API endpoint.
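
If you would rather script that check than hit the endpoint by hand, a tiny smoke test with the JDK's built-in HTTP client is one option. This is only a sketch: the class name is made up, and the default base URL assumes the generated app is listening locally on port 8080, as the prompt requested.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical smoke test for the generated /employee route
class EmployeePingSketch {

    // Build a GET request for the /employee route under the given base URL
    static HttpRequest employeeRequest(String baseUrl) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/employee")).GET().build();
    }

    public static void main(String[] args) {
        // Assumed default; pass a different base URL as the first argument
        String baseUrl = args.length > 0 ? args[0] : "http://localhost:8080";
        try {
            HttpResponse<String> response = HttpClient.newHttpClient()
                .send(employeeRequest(baseUrl), HttpResponse.BodyHandlers.ofString());
            System.out.println(response.statusCode() + " " + response.body());
        } catch (IOException e) {
            // Typically means the app is not running yet
            System.out.println("Could not reach " + baseUrl + ": " + e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```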

Can we do more sophisticated things? I haven’t tried a ton of scenarios, but I wanted to see if I could get a database interaction generated successfully.

I went into the Google Cloud console and spun up a (free tier) instance of Cloud Firestore, our NoSQL database. I then created a “collection” called “Employees” and added a single document to start it off.

Then I built a new prompts file with directions to retrieve records from Firestore. I messed around with variations that encouraged the use of certain libraries and versions. Here’s a version that worked for me.
```
{
  "folder": "generated-web-firestore",
  "prompts": [
    {
      "fileName": "employee.json",
      "prompt": "Generate a JSON structure for an object with fields for id, full name, state date, and office location. Populate it with sample data. Only return the JSON content and nothing else."
    },
    {
      "fileName": "index.js",
      "prompt": "Create a node.js program. Start up a web server on port 8080 and expose a route at /employee. Initializes a firestore database using objects from the @google-cloud/firestore package, referencing Google Cloud project 'seroter-project-base' and leveraging Application Default credentials. Return all the documents from the Employees collection."
    },
    {
      "fileName": "package.json",
      "prompt": "Create a valid package.json for this node.js application using version 7.7.0 for @google-cloud/firestore dependency. Do not include any comments in the JSON."
    },
    {
      "fileName": "Dockerfile",
      "prompt": "Create a Dockerfile for this node.js application that uses a minimal base image and exposes the app on port 8080."
    }
  ]
}
```
After running the prompts through the generator app again, I got four new files, this time with code to interact with Firestore!

Another round of npm install and npm start launched the app and served up the document sitting in Firestore. Very nice.

Finally, how about a Python app? I want a background job that actually populates the Firestore database with some initial records. I experimented with some prompts, and these gave me a Python app that I could use with Cloud Run Jobs.
```
{
  "folder": "generated-job-firestore",
  "prompts": [
    {
      "fileName": "main.py",
      "prompt": "Create a Python app with a main function that initializes a firestore database object with project seroter-project-base and Application Default credentials. Add two documents to the Employees collection. Generate random id, fullname, startdate, and location data for each document. Have the start script try to call that main function and if there's an exception, prints the error."
    },
    {
      "fileName": "requirements.txt",
      "prompt": "Create a requirements.txt file for the packages used by this app"
    },
    {
      "fileName": "Procfile",
      "prompt": "Create a Procfile for python3 that starts up main.py"
    },
    {
      "fileName": "Dockerfile",
      "prompt": "Create a Dockerfile for this Python batch application that uses a minimal base image and doesn't expose any ports"
    }
  ]
}
```
Running this prompt set through the AI generator gave me the valid files I wanted. All my prompt files are here.

At this stage, I was happy with the local tests and ready to automate the path from source control to cloud runtime.

Workflow step: Generate app in pipeline

Above, I had started the Cloud Build manifest with the step of yanking down the AI generator JAR file from Cloud Storage.

The next step is different for each app we’re building. I could use substitution variables in Cloud Build and have a single manifest for all of them, but for demonstration purposes, I wanted one manifest per prompt set.

I added this step to what I already had above. It executes the same command in Cloud Build that I had run locally to test the generator. First I do an apt-get on the “ubuntu” base image to get the Java command I need, and then invoke my JAR, passing in the name of the prompt file.
```
...

# Run the JAR file
  - name: 'ubuntu'
    id: 'Run AI generator to create code from prompts'
    script: |
      #!/usr/bin/env bash
      apt-get update && apt-get install -y openjdk-21-jdk
      java -jar  demo-0.0.1-SNAPSHOT.jar --prompt-file=app-prompts-web.json

  # Print the contents of the generated directory
  - name: 'bash'
    id: 'Show generated files'
    script: |
      #!/usr/bin/env bash
      ls ./generated-web -l
```
I updated my Cloud Build pipeline that’s connected to my GitHub repo with an updated YAML manifest.

Running the pipeline at this point showed that the generator worked correctly and added the expected files to the scratch volume in the pipeline. Awesome.

At this point, I had an app generated from prompts found in GitHub.

Workflow step: Upload artifact

Next up? Getting this code into a deployable artifact. There are plenty of options, but I want to use a container-based runtime, and need a container image. Cloud Build makes that easy.

I added another section to my existing Cloud Build manifest to containerize with Docker and upload to Artifact Registry.
```
 # Containerize the code and upload to Artifact Registry
  - name: 'gcr.io/cloud-builders/docker'
    id: 'Containerize generated code'
    args: ['build', '-t', 'us-west1-docker.pkg.dev/seroter-project-base/ai-generated-images/generated-web:latest', './generated-web']
  - name: 'gcr.io/cloud-builders/docker'
    id: 'Push container to Artifact Registry'
    args: ['push', 'us-west1-docker.pkg.dev/seroter-project-base/ai-generated-images/generated-web']
```
It used the Dockerfile our AI generator created, and after this step ran, I saw a new container image.

Workflow step: Deploy and run app

The final step, running the workload! I could use our continuous deployment service Cloud Deploy but I took a shortcut and deployed directly from Cloud Build. This step in the Cloud Build manifest does the job.
```
  # Deploy container image to Cloud Run
  - name: 'gcr.io/google.com/cloudsdktool/cloud-sdk'
    id: 'Deploy container to Cloud Run'
    entrypoint: gcloud
    args: ['run', 'deploy', 'generated-web', '--image', 'us-west1-docker.pkg.dev/seroter-project-base/ai-generated-images/generated-web', '--region', 'us-west1', '--allow-unauthenticated']
```
After saving this update to Cloud Build and running it again, I saw all the steps complete successfully.

Most importantly, I had an active service in Cloud Run that served up a default record from the API endpoint.

I went ahead and ran a Cloud Build pipeline for the “Firestore” version of the web app, and then the background job that deploys to Cloud Run Jobs. I ended up with two Cloud Run services (web apps), and one Cloud Run Job.

I executed the job, and saw two new Firestore records in the collection!

To prove that, I executed the Firestore version of the web app. Sure enough, the records returned include the two new records.

Wrap up

What we saw here was a fairly straightforward way to generate complete applications from nothing more than a series of prompts fed to the Gemini model. Nothing prevents you from using a different LLM, or using other source control, continuous integration, and hosting services. Just do some find-and-replace!

Again, I would NOT use this for “real” workloads, but this sort of pattern could be a powerful way to quickly create supporting apps and components for testing or learning purposes.

You can find the whole project here on GitHub.

What do you think? Completely terrible idea? Possibly useful?
May 31, 2024
Daily Reading List – May 30, 2024 (#329)

Today was a productive day, and I’m hoping to write up a fun blog post this evening about an app I’ve been working on. Stay tuned!

[blog] Disentangling the three languages: customers, product, and the business. Are you watching teams talk past each other and use local language that doesn’t translate to other contexts? Jason offers up a great post on how to translate.

[blog] Gemini 1.5 Pro and 1.5 Flash GA, 1.5 Flash tuning support, higher rate limits, and more API updates. These models are terrific, and now generally available. You can also enable billing to get a higher rate limit.

[article] 10 big devops mistakes and how to avoid them. We’re not breaking any new ground here, but these are still useful points to keep in mind when starting or tuning your DevOps-style work.

[blog] Versioning with Git Tags and Conventional Commits. For you source control geeks out there, you’ll like this SEI post which explores semantic versioning with git tags.

[blog] Meet 24 startups advancing healthcare with AI. A common thread through this list is the use of AI to personalize experiences for patients and users.

[blog] Don’t DRY Your Code Prematurely. It’s not unreasonable to quickly try and consolidate code that appears redundant, but this post advises you to not rush. I built something recently where I just let the duplication sit for a while, and used AI tools to eventually de-dupe.

[article] Top 5 Cutting-Edge JavaScript Techniques. There are plenty of timeless techniques in any programming language, but it’s also easy to go stale and miss new approaches. This article looks at some JavaScript techniques folks should consider using.

[blog] Query-Defined Infrastructure with Firebase Data Connect. This takes the idea of “fully managed” in a fresh and exciting direction. Your data model triggers a host of auto-generated infrastructure and SDKs to support it.

[blog] Do you know about Quality of Service in Kubernetes?? It’s a quick post, but a good reminder of what it means to specify (or not specify) infrastructure reservations for Kubernetes workloads.

[blog] Vertex AI’s Grounding with Google Search: how to use it and why. Incorporating Google search results into LLM responses is a truly useful way to get timely, trusted answers.


May 30, 2024
Daily Reading List – May 29, 2024 (#328)

I found lots of good advice in today’s reading list, and I hope you do too.

[article] Reducing Code Review Time at Google. This article looks at a recent paper from us that covers how we use a code review assistant to help us improve productivity.

[article] What We Learned from a Year of Building with LLMs (Part I). There’s a whole lot of advice in this big, useful post up on O’Reilly. Dig in for legit guidance on prompting, information retrieval, tuning, and more.

[article] Bertrand Russell: On Avoiding Foolish Opinions. A spicy take, but a warranted one for many of us who want to be better thinkers.

[blog] App Hosting vs. the original Hosting: Which one do I use? Firebase is firing on all cylinders right now. This is a well-done post that explains their new App Hosting service, and when to choose it.

[blog] What’s New in Angular 18? I’m still not going to become a frontend guy, but I do like staying aware of what’s new and relevant in this space.

[blog] Continuous delivery without a CI server. Do you need a build system? Not for every app or every team. This post looks at a case where it wasn’t needed.

[blog] Adding Context to Retrieval-Augmented Generation with Gemini Function Calling and MongoDB Atlas. Here’s a deep walkthrough of a scenario where you look up supplementary info in MongoDB to back your LLM queries.

[blog] A Tale of Two Functions : Function calling in Gemini. Check out this related post for even more about function calling. This is a key LLM pattern, and it’s worth understanding the fundamentals.

[article] 3 Ways to Clearly Communicate Your Company’s Strategy. I liked this. A lot can go into developing a strategy, and it may be hard to succinctly summarize it. These are good options.

[blog] I made a new cartoon thing for you to try. Forrest creates good cartoons, and now you can access high quality versions for a fee.

[blog] Solving the Dual-Write Problem: Effective Strategies for Atomic Updates Across Systems. What patterns are at your disposal when you need an all-or-nothing write to two systems? This Confluent post explores your options.


May 29, 2024
Daily Reading List – May 28, 2024 (#327)

Whew, what a Tuesday. I had an outstanding 3-day weekend with sunshine, friends, and baseball. I also (mostly) completed a fun coding project that I’ll blog about later this week. Today was a blur, but a lot got done. I think.

[article] A Great Sales Pitch Hinges on the Right Story. It doesn’t matter if you’re in sales, engineering, program management, or most any other role. Get good at storytelling!

[blog] Effective large language model adaptation for improved grounding. Some cool work from Google Research that looks at a new framework for adapting a base LLM to self-ground responses.

[blog] “The Business” is BS. If you’re in IT, or a tech consultant, don’t refer to a set of people as “the business.” It creates an unnecessary separation, and treats tech as a far-off service provider.

[blog] The Boring Product Manifesto. Making products shouldn’t be so dramatic. John says that we need more of the “good kind” of boring.

[blog] Using LLMs to Learn From YouTube. This seems like a complicated architecture, but it gets the job done.

[blog] Lazy Work, Good Work. Massively important point. Our most creative work, and the moments where we connect the dots, don’t happen in meetings. Get more thinking time.

[blog] Don’t Get Lost in the Metrics Maze: A Practical Guide to SLOs, SLIs, Error Budgets, and Toil. Here’s a brief, helpful take on some of the core ideas behind Site Reliability Engineering and focusing on the right dimensions when keeping a system online and healthy.

[blog] Grounding Gemini with Web Search results in LangChain4j. Read this for an excellent example of how to call an LLM and ground the results in a trustworthy source.

[blog] The future of foundation models is closed-source. Those building “open” models aren’t doing charity work. There are other motives, and John encourages thinking about which models you’re betting on.

[blog] What if…slower wasn’t safer? Instead of “getting it right” by slowing down, maybe it’s smarter to make the inevitable process of making mistakes cheaper and faster? That’s the argument here.

[blog] The Future is Now: TuringBots Will Collapse the Software Development Life Cycle Siloes. I like the work that Forrester has done on AI dev assistants. Diego talks here about the changing SDLC.


May 28, 2024