Category: Cloud

  • 4 ways to pay down tech debt by ruthlessly removing stuff from your architecture

    What advice do you get if you’re lugging around a lot of financial debt? Many folks will tell you to start purging expenses. Stop eating out at restaurants, go down to one family car, cancel streaming subscriptions, and sell unnecessary luxuries. For some reason, I don’t see the same aggressive advice when it comes to technical debt. I hear soft language around “optimization” or “management” versus assertive stances that take a meat cleaver to your architectural excesses.

    What is architectural debt? I’m thinking about bloated software portfolios where you’re carrying eight products in every category. Brittle automation that only partially works and still requires manual workarounds and black magic. Unique customizations to packaged software that are now keeping you from upgrading to modern versions. And half-finished “ivory tower” designs where the complex distributed system isn’t fully in place, and may never be. You might have too much coupling, too little coupling, unsupported frameworks, and all sorts of things that make deployments slow, maintenance expensive, and wholesale improvements impossible.

    This stuff matters. The latest Stack Overflow developer survey shows that the most common frustration is the “amount of technical debt.” It’s wasting up to eight hours a week for each developer! Numbers two and three relate to stack complexity. Your code and architectural tech debt are slowing down your release velocity, driving attrition among your best employees, and limiting how much you can invest in new tech areas. It’s well past time to simplify by purging architecture components that have built up (and calcified) over time. Let’s write bigger checks to pay down this debt faster.

    Explore these four areas, all focused on simplification. There are obviously tradeoffs and costs with each suggestion, but you’re not going to make meaningful progress by being timid. Note that there are other dimensions to fixing tech debt besides simplification, but it’s the one I see discussed least often. I’ll use Google Cloud to offer some examples of how you might tackle each one, given we’re the best cloud for those making a firm shift away from legacy tech debt.

    1. Stop moving so much data around.

    If you zoom out on your architecture, how many components do you have that get data from point A to point B? I’d bet that you have lots of ETL pipelines to consolidate data into a warehouse or data lake, messaging and event processing solutions to shunt data around, and even API calls that suck data from one system into another. That’s a lot of machinery you have to create, update, and manage every day.

    Can you get rid of some of this? Can you access more of the data where it rests, versus copying it all over the place? Or use software that acts on data in different ways without forcing you to migrate it for further processing? I think so.

    Let’s see some examples.

    Perform analytical queries against data sitting in different places? Google Cloud supports that with BigQuery Omni. We run BigQuery in AWS and Azure so that you can access data at rest, and not be forced to consolidate it in a single data lake. Here, I have an Excel file sitting in an Azure blob storage account. I could copy that data over to Google Cloud, but that’s more components for me to create and manage.

    Rather, I can set up a pointer to Azure from within BigQuery, and treat it like any other table. The data is processed in Azure, and only summary info travels across the wire.

    You might say “that’s cool, but I have related data in another cloud, so I’d have to move it anyway to do joins and such.” You’d think so. But we also offer cross-cloud joins with BigQuery Omni. Check this out. I’ve got that employee data in Azure, but timesheet data in Google Cloud.

    With a single SQL statement, I’m joining data across clouds. No data movement required. Less debt.
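    For illustration, here’s a minimal sketch of what that could look like through the BigQuery Node.js client. The project, dataset, and table names are made up; the point is that one standard SQL statement references an Omni (Azure) table and a native BigQuery table together.

    const { BigQuery } = require('@google-cloud/bigquery');
    const bigquery = new BigQuery();

    async function crossCloudJoin() {
      // "employees" lives in an Azure-backed Omni dataset; "timesheets" is native to BigQuery.
      const query = `
        SELECT e.full_name, SUM(t.hours) AS total_hours
        FROM \`my-project.us_dataset.timesheets\` AS t
        JOIN \`my-project.azure_dataset.employees\` AS e
          ON t.employee_id = e.employee_id
        GROUP BY e.full_name`;
      const [rows] = await bigquery.query({ query });
      rows.forEach((row) => console.log(row));
    }

    crossCloudJoin().catch(console.error);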

    Enrich data in analytical queries from outside databases? You might have ETL jobs in place to bring reference data into your data warehouse to supplement what’s already there. That may be unnecessary.

    With BigQuery’s Federated Queries, I can reach live into PostgreSQL, MySQL, Cloud Spanner, and even SAP Datasphere sources. Access data where it rests. Here, I’m using the EXTERNAL_QUERY function to retrieve data from a Cloud SQL database instance.

    I could use that syntax to perform joins, and do all sorts of things without ever moving data around.
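    Here’s a hedged sketch of that pattern, again via the Node.js client. The connection resource and remote table are placeholders; EXTERNAL_QUERY pushes the inner SQL down to the Cloud SQL instance, and the result can be joined with a native BigQuery table.

    const { BigQuery } = require('@google-cloud/bigquery');
    const bigquery = new BigQuery();

    async function federatedJoin() {
      const query = `
        SELECT o.order_id, o.amount, c.customer_name
        FROM \`my-project.sales.orders\` AS o
        JOIN EXTERNAL_QUERY(
          'my-project.us.my-cloudsql-connection',
          'SELECT customer_id, customer_name FROM customers;') AS c
          ON o.customer_id = c.customer_id`;
      const [rows] = await bigquery.query({ query });
      console.log(rows);
    }

    federatedJoin().catch(console.error);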

    Perform complex SQL analytics against log data? Does your architecture have data copying jobs for operational data? Maybe to get it into a system where you can perform SQL queries against logs? There’s a better way.

    Google Cloud Log Analytics lets you query, view, and analyze log data without moving it anywhere.

    You can’t avoid moving data around. It’s often required. But I’m fairly sure that through smart product selection and some redesign of the architecture, you could eliminate a lot of unnecessary traffic.

    2. Compress the stack by removing duplicative components.

    Break out the chainsaw. Do you have multiple products for each software category? Or too many fine-grained categories full of best-of-breed technology? It’s time to trim.

    My former colleague Josh McKenty used to say something along the lines of “if it’s emerging, buy a few; if it’s mature, no more than two.”

    You don’t need a dozen project management software products. Or more than two relational database platforms. In many cases, you can use multi-purpose services and embrace “good enough.”

    There should be a fifteen-day cooling-off period before you buy a specialized vector database. Just use PostgreSQL. Or any number of existing databases that now support vector capabilities. Maybe you can even skip RAG-based solutions (and infrastructure) altogether for certain use cases and just use Gemini with its long context.
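    If you go the “just use PostgreSQL” route, the pgvector extension is usually all you need. Here’s a minimal sketch with the Node.js pg driver; the table layout, vector size, and connection string are assumptions for illustration.

    const { Client } = require('pg');

    async function findSimilarDocuments(queryEmbedding) {
      const client = new Client({ connectionString: process.env.DATABASE_URL });
      await client.connect();

      // One-time setup: enable pgvector and create a table with a vector column.
      await client.query('CREATE EXTENSION IF NOT EXISTS vector');
      await client.query(
        'CREATE TABLE IF NOT EXISTS documents (id bigserial PRIMARY KEY, content text, embedding vector(768))');

      // Nearest-neighbor search with the <-> distance operator.
      const { rows } = await client.query(
        'SELECT id, content FROM documents ORDER BY embedding <-> $1 LIMIT 5',
        [JSON.stringify(queryEmbedding)]);

      await client.end();
      return rows;
    }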

    Do you have a half-dozen different event buses and stream processors? Maybe you don’t need all that. Composite services like Google Cloud Pub/Sub can act as a publish/subscribe message broker, apply a log-like approach with a replayable stream, and handle push-based notifications.
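    As a rough illustration of that consolidation, here’s one topic handling both publishing and pull-style consumption with the Node.js client; the topic and subscription names are hypothetical, and the same topic could also back push subscriptions or be replayed by seeking a subscription to an earlier point.

    const { PubSub } = require('@google-cloud/pubsub');
    const pubsub = new PubSub();

    // Publish an event.
    async function publishOrderEvent(order) {
      await pubsub.topic('orders').publishMessage({ data: Buffer.from(JSON.stringify(order)) });
    }

    // Consume events from a pull subscription.
    function listenForOrders() {
      const subscription = pubsub.subscription('orders-processor');
      subscription.on('message', (message) => {
        console.log(`Received: ${message.data.toString()}`);
        message.ack();
      });
    }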

    You could use Spanner Graph instead of a dedicated graph database, or Artifact Registry as a single place for OS and application packages.

    I’m keen on the new continuous queries for BigQuery where you can do stream analytics and processing as data comes into the warehouse. Enrich data, call AI models, and more. Instead of a separate service or component, it’s just part of the BigQuery engine. Turn off some stuff?

    I suspect that this one is among the hardest for folks to act upon. We often hold onto technology because it’s familiar, or even because of misplaced loyalty. But be bold. Simplify your stack by getting rid of technology that’s no longer differentiated. Make a goal of having 30% fewer software products or platforms in your architecture in 2025.

    3. Replace hyper-customized software and automation with managed services and vanilla infrastructure.

    Hear me out. You’re not that unique. There are a handful of things that your company does which are the “secret sauce” for your success, and the rest is the same as everyone else.

    More often than not, you should be fitting your team to the software, not your software to the team. I’ve personally configured and extended packaged software to a point that it was unrecognizable. For what? Because we thought our customer service intake process was SO MUCH different than anyone else’s? It wasn’t. So much tech debt happens because we want to shape technology to our existing requirements, or we want to avoid “lock-in” by committing to a vendor’s way of doing things. I think both are misguided.

    I read a lot of annual reports from public companies. I’ve never seen “we slayed at Kubernetes this year” called out. Nobody cares. A cleverly scripted, hyper-customized setup that looks like the CNCF landscape diagram is more boat anchor than accelerator. Consider switching to a fully automated managed cluster like GKE Autopilot. Pay per pod, and get automatic upgrades, secure-by-default configurations, and a host of GKE Enterprise features to create sameness across clusters.

    Or thank-and-retire that customized or legacy workflow engine (code framework, or software product) that only four people actually understand. Use a nicely API-enabled managed product with useful control-flow actions, or a full-fledged cloud-hosted integration engine.

    You probably don’t need a customized database, caching solution, or even CI/CD stack. These are all super mature solution spaces, where whatever is provided out of the box is likely suitable for what you really need.

    4. Tone it down on the microservices and distributed systems.

    Look, I get excited about technology and want to use all the latest things. But it’s often overkill, especially in the early (or late) stages of a product.

    You simply don’t need a couple dozen serverless functions to serve a static web app. Simmer down. Or a big complex JavaScript framework when your site has a pair of pages. So much technical debt comes from over-engineering systems to use the latest patterns and technology, when the classic ones will do.

    Smash most of your serverless functions back into an “app” hosted in Cloud Run. Fewer moving parts, and all the agility you want. Use vanilla JavaScript where you can. Use small, geo-located databases until you MUST do cross-region or global replication. Don’t build “developer platforms” and IDPs until you actually need them.

    I’m not going all DHH on you, but most folks would be better off defaulting to more monolithic systems running on a server or two. We’ve all over-distributed too many services and created unnecessarily complex architectures that are now brittle or impossible to understand. If you need the scale and resilience of distributed systems RIGHT NOW then go build one. But most of us have gotten burned from premature optimization because we assumed that our system had to handle 100x user growth overnight.

    Wrap Up

    Every company has tech debt, whether the business is 100 years old or started last week. Google has it, big banks have it, governments have it, and YC companies have it. And “managing it” is probably a responsible thing to do. But sometimes, when you need to make a step-function improvement in how you work, incremental changes aren’t good enough. Simplify by removing the cruft, and take big cuts out of your architecture to do it!

  • Three Ways to Run Apache Kafka in the Public Cloud

    Yes, people are doing things besides generative AI. You’ve still got other problems to solve, systems to connect, and data to analyze. Apache Kafka remains a very popular product for event and data processing, and I was thinking about how someone might use it in the cloud right now. I think there are three major options, and one of them (built-in managed service) is now offered by Google Cloud. So we’ll take that for a spin.

    Option 1: Run it yourself on (managed) infrastructure

    Many companies choose to run Apache Kafka themselves on bare metal, virtual machines, or Kubernetes clusters. It’s easy to find stories about companies like Netflix, Pinterest, and Cloudflare running their own Apache Kafka instances. Same goes for big (and small) enterprises that choose to set up and operate dedicated Apache Kafka environments.

    Why do this? It’s the usual reasons why people decide to manage their own infrastructure! Kafka has a lot of configurability, and experienced folks may like the flexibility and cost profile of running Apache Kafka themselves. Pick your infrastructure, tune every setting, and upgrade on your timetable. On the downside, self-managed Apache Kafka can result in a higher total cost of ownership, requires specialized skills in-house, and could distract you from other high-priority work.

    If you want to go that route, I see a few choices.

    There’s no shame in going this route! It’s actually very useful to know how to run software like Apache Kafka yourself, even if you decide to switch to a managed service later.

    Option 2: Use a built-in managed service

    You might want Apache Kafka, but not want to run Apache Kafka. I’m with you. Many folks, including those at big web companies and classic enterprises, depend on managed services instead of running the software themselves.

    Why do this? You’d sign up for this option when you want the API, but not the ops. It may be more elastic and cost-effective than self-managed hosting. Or, it might cost more from a licensing perspective, but provide more flexibility on total cost of ownership. On the downside, you might not have full access to every raw configuration option, and may pay for features or vendor-dictated architecture choices you wouldn’t have made yourself.

    AWS offers an Amazon Managed Streaming for Apache Kafka product. Microsoft doesn’t offer a managed Kafka product, but does provide a subset of the Apache Kafka API in front of their Azure Event Hubs product. Oracle Cloud offers self-managed infrastructure with a provisioning assist, but also appears to have a compatible interface on their Streaming service.

    Google Cloud didn’t offer any native service until just a couple of months ago. The Apache Kafka for BigQuery product is now in preview and looks pretty interesting. It’s available in a global set of regions, and provides a fully-managed set of brokers that run in a VPC within a tenant project. Let’s try it out.

    Set up prerequisites

    First, I needed to enable the API within Google Cloud. This gave me the ability to use the service. Note that this is NOT FREE while in preview, so recognize that you’ll incur charges.

    Next, I wanted a dedicated service account for accessing the Kafka service from client applications. The service supports OAuth and SASL_PLAIN with service account keys. The latter is appropriate for testing, so I chose that.

    I created a new service account named seroter-bq-kafka and gave it the roles/managedkafka.client role. I also created a JSON private key and saved it to my local machine.
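    If you’d rather script those prerequisites, the equivalent gcloud commands look roughly like this (the project ID is mine and the key file name is arbitrary; swap in your own):

    gcloud iam service-accounts create seroter-bq-kafka

    gcloud projects add-iam-policy-binding seroter-project-base \
      --member="serviceAccount:seroter-bq-kafka@seroter-project-base.iam.gserviceaccount.com" \
      --role="roles/managedkafka.client"

    gcloud iam service-accounts keys create kafka-key.json \
      --iam-account=seroter-bq-kafka@seroter-project-base.iam.gserviceaccount.com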

    That’s it. Now I was ready to get going with the cluster.

    Provision the cluster and topic

    I went into the Apache Kafka for BigQuery dashboard in the Google Cloud console—I could have also used the CLI, which has the full set of control plane commands—to spin up a new cluster. I get very few choices, and that’s not a bad thing. You give the CPU and RAM capacity for the cluster, and Google Cloud picks the right shape for the brokers and creates a highly available architecture. You’ll also see that I chose the VPC for the cluster, but that’s about it. Pretty nice!

    In about twenty minutes, my cluster was ready. Using the console or CLI, I could see the details of my cluster.

    Topics are a core part of Apache Kafka and represent the resource you publish and subscribe to. I could create a topic via the UI or CLI. I created a topic called “topic1”.

    Build the producer and consumer apps

    I wanted two client apps: one to publish new messages to Apache Kafka, and another to consume messages. I chose Node.js (JavaScript) for both apps. There are a handful of libraries for interacting with Apache Kafka, and I chose the mature kafkajs.

    Let’s start with the consuming app. I need (a) the cluster’s bootstrap server URL and (b) the encoded client credentials. We access the cluster through the bootstrap URL, which is visible via the CLI or the cluster details (see above). The client credentials for SASL_PLAIN authentication consist of the base64-encoded service account key JSON file.
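    If you’re wondering how to produce that encoded value, it’s just the key file run through base64. On Linux that’s something like the following (macOS users can use base64 -i instead):

    base64 -w 0 kafka-key.json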

    My index.js file defines a Kafka object with the client ID (which identifies our consumer), the bootstrap server URL, and SASL credentials. Then I define a consumer with a consumer group ID and subscribe to the “topic1” we created earlier. I process and log each message before appending to an array variable. There’s an HTTP GET endpoint that returns the array. See the whole index.js below, and the GitHub repo here.

    const express = require('express');
    const { Kafka, logLevel } = require('kafkajs');
    const app = express();
    const port = 8080;
    
    const kafka = new Kafka({
      clientId: 'seroter-consumer',
      brokers: ['bootstrap.seroter-kafka.us-west1.managedkafka.seroter-project-base.cloud.goog:9092'],
      ssl: {
        rejectUnauthorized: false
      },
      logLevel: logLevel.DEBUG,
      sasl: {
        mechanism: 'plain', // scram-sha-256 or scram-sha-512
        username: 'seroter-bq-kafka@seroter-project-base.iam.gserviceaccount.com',
        password: 'tybgIC ... pp4Fg=='
      },
    });
    
    const consumer = kafka.consumer({ groupId: 'message-retrieval-group' });
    
    //create variable that holds an array of "messages" that are strings
    let messages = [];
    
    async function run() {
      await consumer.connect();
      //provide topic name when subscribing
      await consumer.subscribe({ topic: 'topic1', fromBeginning: true }); 
    
      await consumer.run({
        eachMessage: async ({ topic, partition, message }) => {
          console.log(`################# Received message: ${message.value.toString()} from topic: ${topic}`);
          //add message to local array
          messages.push(message.value.toString());
        },
      });
    }
    
    app.get('/consume', (req, res) => {
        //return the array of messages consumed thus far
        res.send(messages);
    });
    
    run().catch(console.error);
    
    app.listen(port, () => {
      console.log(`App listening at http://localhost:${port}`);
    });
    

    Now we switch gears and go through the producer app that publishes to Apache Kafka.

    This app starts off almost identically to the consumer app. There’s a Kafka object with a client ID (different for the producer) and the same pointer to the bootstrap server URL and credentials. I’ve got an HTTP GET endpoint that takes the querystring parameters and publishes their key and value content to the topic. The code is below, and the GitHub repo is here.

    const express = require('express');
    const { Kafka, logLevel } = require('kafkajs');
    const app = express();
    const port = 8080; // Use a different port than the consumer app
    
    const kafka = new Kafka({
        clientId: 'seroter-publisher',
        brokers: ['bootstrap.seroter-kafka.us-west1.managedkafka.seroter-project-base.cloud.goog:9092'],
        ssl: {
          rejectUnauthorized: false
        },
        logLevel: logLevel.DEBUG,
        sasl: {
          mechanism: 'plain', // scram-sha-256 or scram-sha-512
          username: 'seroter-bq-kafka@seroter-project-base.iam.gserviceaccount.com',
          password: 'tybgIC ... pp4Fg=='
        },
      });
    
    const producer = kafka.producer();
    
    app.get('/publish', async (req, res) => {
      try {
        await producer.connect();
    
        const _key = req.query.key; // Extract key from querystring
        console.log('key is ' + _key);
        const _value = req.query.value // Extract value from querystring
        console.log('value is ' + _value);
    
        const message = {
          key: _key, // Optional key for partitioning
          value: _value
        };
    
        await producer.send({
          topic: 'topic1', // Replace with your topic name
          messages: [message]
        });
    
        res.status(200).json({ message: 'Message sent successfully' });
    
      } catch (error) {
        console.error('Error sending message:', error);
        res.status(500).json({ error: 'Failed to send message' });
      }
    });
    
    app.listen(port, () => {
      console.log(`Producer listening at http://localhost:${port}`);
    });
    
    

    Next up, containerizing both apps so that I could deploy to a runtime.

    I used Google Cloud Artifact Registry as my container store, and created a Docker image from source code using Cloud Native buildpacks. It took one command for each app:

    gcloud builds submit --pack image=gcr.io/seroter-project-base/seroter-kafka-consumer
    gcloud builds submit --pack image=gcr.io/seroter-project-base/seroter-kafka-publisher

    Now we had everything needed to deploy and test our client apps.

    Deploy apps to Cloud Run and test it out

    I chose Google Cloud Run because I like nice things. It’s still one of the best two or three ways to host apps in the cloud. We also make it much easier now to connect to a VPC, which is what I need. Instead of creating some tunnel out of my cluster, I’d rather access it more securely.

    Here’s how I configured the consuming app. I first picked my container image and a target location.

    Then I chose to use always-on CPU for the consumer, as I had connection issues when I had a purely ephemeral container.

    The last setting was the VPC egress that made it possible for this instance to talk to the Apache Kafka cluster.
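    For reference, a roughly equivalent CLI deployment with always-on CPU and direct VPC egress looks something like this; the network and subnet names are placeholders for whatever your cluster’s VPC uses.

    gcloud run deploy seroter-kafka-consumer \
      --image=gcr.io/seroter-project-base/seroter-kafka-consumer \
      --region=us-west1 \
      --no-cpu-throttling \
      --network=default \
      --subnet=default \
      --vpc-egress=all-traffic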

    About three seconds later, I had a running Cloud Run instance ready to consume.

    I ran through a similar deployment process for the publisher app, except I kept the true “scale to zero” setting turned on since it doesn’t matter if the publisher app comes and goes.

    With all apps deployed, I fired up the browser and issued a pair of requests to the “publish” endpoint.

    I checked the consumer app’s logs and saw that messages were successfully retrieved.

    Sending a request to the GET endpoint on the consumer app returns the pair of messages I sent from the publisher app.

    Sweet! We proved that we could send messages to the Apache Kafka cluster, and retrieve them. I get all the benefits of Apache Kafka, integrated into Google Cloud, with none of the operational toil.

    Read more in the docs about this preview service.

    Option 3: Use a managed provider on your cloud(s) of choice

    The final way you might choose to run Apache Kafka in the cloud is to use a SaaS product designed to work on different infrastructures.

    The team at Confluent does much of the work on open source Apache Kafka and offers a managed product via Confluent Cloud. It’s performant, feature-rich, and runs in AWS, Azure, and Google Cloud. Another option is Redpanda, who offer a managed cloud service that they operate on their infrastructure in AWS or Google Cloud.

    Why do this? Choosing a “best of breed” type of managed service is going to give you excellent feature coverage and operational benefits. These platforms are typically operated by experts and finely tuned for performance and scale. Are there any downsides? These platforms aren’t free, and don’t always have all the native integrations into their target cloud (logging, data services, identity, etc.) that a built-in service does. And you won’t have all the configurability or infrastructure choice that you’d have running it yourself.

    Wrap up

    It’s a great time to run Apache Kafka in the cloud. You can go full DIY or take advantage of managed services. As always, there are tradeoffs with each. You might even use a mix of products and approaches for different stages (dev/test/prod) and departments within your company. Are there any options I missed? Let me know!

  • Store prompts in source control and use AI to generate the app code in the build pipeline? Sounds weird. Let’s try it!

    I can’t remember who mentioned this idea to me. It might have been a customer, colleague, internet rando, or voice in my head. But the idea was whether you could use source control for the prompts, and leverage an LLM to dynamically generate all the app code each time you run a build. That seems bonkers for all sorts of reasons, but I wanted to see if it was technically feasible.

    Should you do this for real apps? No, definitely not yet. The non-deterministic nature of LLMs means you’d likely experience hard-to-find bugs, unexpected changes on each build, and get yelled at by regulators when you couldn’t prove reproducibility in your codebase. When would you use something like this? I’m personally going to use this to generate stub apps to test an API or database, build demo apps for workshops or customer demos, or to create a component for a broader architecture I’m trying out.

    tl;dr I built an AI-based generator that takes a JSON file of prompts like this and creates all the code. I call this generator from a CI pipeline which means that I can check in (only) the prompts to GitHub, and end up with a running app in the cloud.

    {
      "folder": "generated-web",
      "prompts": [
        {
          "fileName": "employee.json",
          "prompt": "Generate a JSON structure for an object with fields for id, full name, state date, and office location. Populate it with sample data. Only return the JSON content and nothing else."
        },
        {
          "fileName": "index.js",
          "prompt": "Create a node.js program. It instantiates an employee object that looks like the employee.json structure. Start up a web server on port 8080 and expose a route at /employee return the employee object defined earlier."
        },
        {
          "fileName": "package.json",
          "prompt": "Create a valid package.json for this node.js application. Do not include any comments in the JSON."
        },
        {
          "fileName": "Dockerfile",
          "prompt": "Create a Dockerfile for this node.js application that uses a minimal base image and exposes the app on port 8080."
        }
      ]
    }
    

    In this post, I’ll walk through the steps of what a software delivery workflow such as this might look like, and how I set up each stage. To be sure, you’d probably make different design choices, write better code, and pick different technologies. That’s cool; this was mostly an excuse for me to build something fun.

    Before explaining this workflow, let me first show you the generator itself and how it works.

    Building an AI code generator

    There are many ways to build this. An AI framework makes it easier, and I chose Spring AI because I wanted to learn how to use it. Even though this is a Java app, it generates code in any programming language.

    I began at Josh Long’s second favorite place on the Internet, start.spring.io. Here I started my app using Java 21, Maven, and the Vertex AI Gemini starter, which pulls in Spring AI.

    My application properties point at my Google Cloud project and I chose to use the impressive new Gemini 1.5 Flash model for my LLM.

    spring.application.name=demo
    spring.ai.vertex.ai.gemini.projectId=seroter-project-base
    spring.ai.vertex.ai.gemini.location=us-central1
    spring.ai.vertex.ai.gemini.chat.options.model=gemini-1.5-flash-001
    

    My main class implements the CommandLineRunner interface and expects a single parameter, which is a pointer to a JSON file containing the prompts. I also have a couple of classes that define the structure of the prompt data. But the main generator class is where I want to spend some time.

    Basically, for each prompt provided to the app, I look for any local files to provide as multimodal context into the request (so that the LLM can factor in any existing code as context when it processes the prompt), call the LLM, extract the resulting code from the Markdown wrapper, and write the file to disk.

    Here are those steps in code. First I look for local files:

    //load code from any existing files in the folder
    private Optional<List<Media>> getLocalCode() {
        String directoryPath = appFolder;
        File directory = new File(directoryPath);
    
        if (!directory.exists()) {
            System.out.println("Directory does not exist: " + directoryPath);
            return Optional.empty();
        }
    
        try {
            return Optional.of(Arrays.stream(directory.listFiles())
                .filter(File::isFile)
                .map(file -> {
                    try {
                        byte[] codeContent = Files.readAllLines(file.toPath())
                            .stream()
                            .collect(Collectors.joining("\n"))
                            .getBytes();
                        return new Media(MimeTypeUtils.TEXT_PLAIN, codeContent);
                    } catch (IOException e) {
                        System.out.println("Error reading file: " + file.getName());
                        return null;
                    }
                })
                .filter(Objects::nonNull)
                .collect(Collectors.toList()));
        } catch (Exception e) {
            System.out.println("Error getting local code");
            return Optional.empty();
        }
    }
    

    I call the LLM using Spring AI, choosing one of two methods depending on whether there’s any local code or not. There won’t be any code when the first prompt is executed!

    //call the LLM and pass in existing code
    private String callLlmWithLocalCode(String prompt, List<Media> localCode) {
        System.out.println("calling LLM with local code");
        var userMessage = new UserMessage(prompt, localCode);
        var response = chatClient.call(new Prompt(List.of(userMessage)));
        return extractCodeContent(response.toString());
    }
    
    //call the LLM when there's no local code
    private String callLlmWithoutLocalCode(String prompt) {
        System.out.println("calling LLM withOUT local code");
        var response = chatClient.call(prompt);
        return extractCodeContent(response.toString());
    }
    

    You see there that I’m extracting the code itself from the response string with this operation:

    //method that extracts code from the LLM response
    public static String extractCodeContent(String markdown) {
    
        System.out.println("Markdown: " + markdown);
    
        String regex = "```(\\w+)?\\n([\\s\\S]*?)```";
        Pattern pattern = Pattern.compile(regex);
        Matcher matcher = pattern.matcher(markdown);
    
        if (matcher.find()) {
            String codeContent = matcher.group(2); // Extract group 2 (code content)
            return codeContent;
        } else {
            //System.out.println("No code fence found.");
            return markdown;
        }
    }
    

    And finally, I write the resulting code to disk:

    //write the final code to the target file path
    private void writeCodeToFile(String filePath, String codeContent) {
        try {
            File file = new File(filePath);
            if (!file.exists()) {
                file.createNewFile();
            }
    
            FileWriter writer = new FileWriter(file);
            writer.write(codeContent);
            writer.close();
    
            System.out.println("Content written to file: " + filePath);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
    

    There’s some more ancillary stuff that you can check out in the complete GitHub repo with this app in it. I was happy to be using Gemini Code Assist while building this. This AI assistant helped me understand some Java concepts, complete some functions, and fix some of my subpar coding choices.

    That’s it. Once I had this component, I built a JAR file and could now use it locally or in a continuous integration pipeline to produce my code. I uploaded the JAR file to Google Cloud Storage so that I could use it later in my CI pipelines (the commands are below). Now, onto the day-to-day workflow that would use this generator!
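    For reference, producing and uploading that artifact was just a couple of commands, roughly like this (assuming the standard Maven wrapper from start.spring.io):

    ./mvnw -DskipTests package
    gsutil cp target/demo-0.0.1-SNAPSHOT.jar gs://seroter-llm-demo-tools/demo-0.0.1-SNAPSHOT.jar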

    Workflow step: Set up repo and pipeline

    Like with most software projects, I’d start with the supporting machinery. In this case, I needed a source repo to hold the prompt JSON files. Done.

    And I’d also consider setting up the path to production (or test environment, or whatever) to build the app as it takes shape. I’m using Google Cloud Build for a fully-managed CI service. It’s a good service with a free tier. Cloud Build uses declarative manifests for pipelines, and this pipeline starts off the same for any type of app.

    steps:
      # Print the contents of the current directory
      - name: 'bash'
        id: 'Show source files'
        script: |
          #!/usr/bin/env bash
          ls -l
    
      # Copy the JAR file from Cloud Storage
      - name: 'gcr.io/cloud-builders/gsutil'
        id: 'Copy AI generator from Cloud Storage'
        args: ['cp', 'gs://seroter-llm-demo-tools/demo-0.0.1-SNAPSHOT.jar', 'demo-0.0.1-SNAPSHOT.jar']
    
      # Print the contents of the current directory
      - name: 'bash'
        id: 'Show source files and builder tool'
        script: |
          #!/usr/bin/env bash
          ls -l
    

    Not much to it so far. I just print out the source contents seen in the pipeline, download the AI code generator from the above-mentioned Cloud Storage bucket, and prove that it’s on the scratch disk in Cloud Build.

    Ok, my dev environment was ready.

    Workflow step: Write prompts

    In this workflow, I don’t write code, I write prompts that generate code. I might use something like Google AI Studio or even Vertex AI to experiment with prompts and iterate until I like the response I get.

    Within AI Studio, I chose Gemini 1.5 Flash because I like nice things. Here, I’d work through the various prompts I would need to generate a working app. This means I still need to understand programming languages, frameworks, Dockerfiles, etc. But I’m asking the LLM to do all the coding.

    Once I’m happy with all my prompts, I add them to the JSON file. Note that each prompt entry has a corresponding file name that I want the generator to use when writing to disk.

    At this point, I was done “coding” the Node.js app. You could imagine having a dozen or so templates of common app types and just grabbing one and customizing it quickly for what you need!

    Workflow step: Test locally

    To test this, I put the generator in a local folder with a prompt JSON file and ran this command from the shell:

    rseroter$ java -jar  demo-0.0.1-SNAPSHOT.jar --prompt-file=app-prompts-web.json
    

    After just a few seconds, I had four files on disk.

    This is just a regular Node.js app. After npm install and npm start commands, I ran the app and successfully pinged the exposed API endpoint.

    Can we do more sophisticated things? I haven’t tried a ton of scenarios, but I wanted to see if I could get a database interaction generated successfully.

    I went into the Google Cloud console and spun up a (free tier) instance of Cloud Firestore, our NoSQL database. I then created a “collection” called “Employees” and added a single document to start it off.

    Then I built a new prompts file with directions to retrieve records from Firestore. I messed around with variations that encouraged the use of certain libraries and versions. Here’s a version that worked for me.

    {
      "folder": "generated-web-firestore",
      "prompts": [
        {
          "fileName": "employee.json",
          "prompt": "Generate a JSON structure for an object with fields for id, full name, state date, and office location. Populate it with sample data. Only return the JSON content and nothing else."
        },
        {
          "fileName": "index.js",
          "prompt": "Create a node.js program. Start up a web server on port 8080 and expose a route at /employee. Initializes a firestore database using objects from the @google-cloud/firestore package, referencing Google Cloud project 'seroter-project-base' and leveraging Application Default credentials. Return all the documents from the Employees collection."
        },
        {
          "fileName": "package.json",
          "prompt": "Create a valid package.json for this node.js application using version 7.7.0 for @google-cloud/firestore dependency. Do not include any comments in the JSON."
        },
        {
          "fileName": "Dockerfile",
          "prompt": "Create a Dockerfile for this node.js application that uses a minimal base image and exposes the app on port 8080."
        }
      ]
    }
    
    

    After running the prompts through the generator app again, I got four new files, this time with code to interact with Firestore!

    Another npm install and npm start command set started the app and served up the document sitting in Firestore. Very nice.

    Finally, how about a Python app? I want a background job that actually populates the Firestore database with some initial records. I experimented with some prompts, and these gave me a Python app that I could use with Cloud Run Jobs.

    {
      "folder": "generated-job-firestore",
      "prompts": [
        {
          "fileName": "main.py",
          "prompt": "Create a Python app with a main function that initializes a firestore database object with project seroter-project-base and Application Default credentials. Add two documents to the Employees collection. Generate random id, fullname, startdate, and location data for each document. Have the start script try to call that main function and if there's an exception, prints the error."
        },
        {
          "fileName": "requirements.txt",
          "prompt": "Create a requirements.txt file for the packages used by this app"
        },
        {
          "fileName": "Procfile",
          "prompt": "Create a Procfile for python3 that starts up main.py"
        },
        {
          "fileName": "Dockerfile",
          "prompt": "Create a Dockerfile for this Python batch application that uses a minimal base image and doesn't expose any ports"
        }
      ]
    }
    

    Running this prompt set through the AI generator gave me the valid files I wanted. All my prompt files are here.

    At this stage, I was happy with the local tests and ready to automate the path from source control to cloud runtime.

    Workflow step: Generate app in pipeline

    Above, I had started the Cloud Build manifest with the step of yanking down the AI generator JAR file from Cloud Storage.

    The next step is different for each app we’re building. I could use substitution variables in Cloud Build and have a single manifest for all of them, but for demonstration purposes, I wanted one manifest per prompt set.

    I added this step to what I already had above. It executes the same command in Cloud Build that I had run locally to test the generator. First I do an apt-get on the “ubuntu” base image to get the Java command I need, and then invoke my JAR, passing in the name of the prompt file.

    ...
    
    # Run the JAR file
      - name: 'ubuntu'
        id: 'Run AI generator to create code from prompts'
        script: |
          #!/usr/bin/env bash
          apt-get update && apt-get install -y openjdk-21-jdk
          java -jar  demo-0.0.1-SNAPSHOT.jar --prompt-file=app-prompts-web.json
    
      # Print the contents of the generated directory
      - name: 'bash'
        id: 'Show generated files'
        script: |
          #!/usr/bin/env bash
          ls ./generated-web -l
    

    I updated the Cloud Build pipeline that’s connected to my GitHub repo with this new YAML manifest.

    Running the pipeline at this point showed that the generator worked correctly and added the expected files to the scratch volume in the pipeline. Awesome.

    At this point, I had an app generated from prompts found in GitHub.

    Workflow step: Upload artifact

    Next up? Getting this code into a deployable artifact. There are plenty of options, but I want to use a container-based runtime, and need a container image. Cloud Build makes that easy.

    I added another section to my existing Cloud Build manifest to containerize with Docker and upload to Artifact Registry.

     # Containerize the code and upload to Artifact Registry
      - name: 'gcr.io/cloud-builders/docker'
        id: 'Containerize generated code'
        args: ['build', '-t', 'us-west1-docker.pkg.dev/seroter-project-base/ai-generated-images/generated-web:latest', './generated-web']
      - name: 'gcr.io/cloud-builders/docker'
        id: 'Push container to Artifact Registry'
        args: ['push', 'us-west1-docker.pkg.dev/seroter-project-base/ai-generated-images/generated-web']
    

    It used the Dockerfile our AI generator created, and after this step ran, I saw a new container image.

    Workflow step: Deploy and run app

    The final step, running the workload! I could use our continuous deployment service Cloud Deploy but I took a shortcut and deployed directly from Cloud Build. This step in the Cloud Build manifest does the job.

      # Deploy container image to Cloud Run
      - name: 'gcr.io/google.com/cloudsdktool/cloud-sdk'
        id: 'Deploy container to Cloud Run'
        entrypoint: gcloud
        args: ['run', 'deploy', 'generated-web', '--image', 'us-west1-docker.pkg.dev/seroter-project-base/ai-generated-images/generated-web', '--region', 'us-west1', '--allow-unauthenticated']
    

    After saving this update to Cloud Build and running it again, I saw all the steps complete successfully.

    Most importantly, I had an active service in Cloud Run that served up a default record from the API endpoint.

    I went ahead and ran a Cloud Build pipeline for the “Firestore” version of the web app, and then the background job that deploys to Cloud Run Jobs. I ended up with two Cloud Run services (web apps), and one Cloud Run Job.

    I executed the job, and saw two new Firestore records in the collection!

    To prove that, I executed the Firestore version of the web app. Sure enough, the records returned include the two new records.

    Wrap up

    What we saw here was a fairly straightforward way to generate complete applications from nothing more than a series of prompts fed to the Gemini model. Nothing prevents you from using a different LLM, or using other source control, continuous integration, and hosting services. Just do some find-and-replace!

    Again, I would NOT use this for “real” workloads, but this sort of pattern could be a powerful way to quickly create supporting apps and components for testing or learning purposes.

    You can find the whole project here on GitHub.

    What do you think? Completely terrible idea? Possibly useful?

  • Here’s what I’d use to build a generative AI application in 2024

    What exactly is a “generative AI app”? Do you think of chatbots, image creation tools, or music makers? What about document analysis services, text summarization capabilities, or widgets that “fix” your writing? These all seem to apply in one way or another! I see a lot written about tools and techniques for training, fine-tuning, and serving models, but what about us app builders? How do we actually build generative AI apps without obsessing over the models? Here’s what I’d consider using in 2024. And note that there’s much more to cover besides just building—think designing, testing, deploying, operating—but I’m just focusing on the builder tools today.

    Find a sandbox for experimenting with prompts

    A successful generative AI app depends on a useful model, good data, and quality prompts. Before going too deep on the app itself, it’s good to have a sandbox to play in.

    You can definitely start with chat tools like Gemini and ChatGPT. That’s not a bad way to get your hands dirty. There’s also a set of developer-centric surfaces such as Google Colab or Google AI Studio. Once you sign in with a Google ID, you get free access to environments to experiment.

    Let’s look at Google AI Studio. Once you’re in this UI, you have the ability to simulate a back-and-forth chat, create freeform prompts that include uploaded media, or even structured prompts for more complex interactions.

    If you find yourself staring at an empty console wondering what to try, check out this prompt gallery that shows off a lot of unique scenarios.

    Once you’re doing more “serious” work, you might upgrade to a proper cloud service that offers a sandbox along with SLAs and prompt lifecycle capabilities. Google Cloud Vertex AI is one example. Here, I created a named prompt.

    With my language prompts, I can also jump into a nice “compare” experience where I can try out variations of my prompt and see if the results are graded as better or worse. I can even set one as “ground truth” used as a baseline for all comparisons.

    Whatever sandbox tools you use, make sure they help you iterate quickly, while also matching the enterprise-y needs of the use case or company you work for.

    Consume native APIs when working with specific models or platforms

    At this point, you might be ready to start building your generative AI app. There seems to be a new, interesting foundation model up on Hugging Face every couple of days. You might have a lot of affection for a specific model family, or not. If you care about the model, you might choose the APIs for that specific model or provider.

    For example, let’s say you were making good choices and anchored your app to the Gemini model. I’d go straight to the Vertex AI SDK for Python, Node, Java, or Go. I might even jump to the raw REST API and build my app with that.

    If I were baking a chat-like API call into my Node.js app, the quickest way to get the code I need is to go into Vertex AI, create a sample prompt, and click the “get code” button.

    I took that code, ran it in a Cloud Shell instance, and it worked perfectly. I could easily tweak it for my specific needs from here. Drop this code into a serverless function, Kubernetes pod, or VM and you’ve got a working generative AI app.
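    The generated sample varies with your model and settings, but the core of a Node.js call through the Vertex AI SDK looks something like this; the project ID and model name here are placeholders.

    const { VertexAI } = require('@google-cloud/vertexai');

    const vertexAI = new VertexAI({ project: 'my-project', location: 'us-central1' });
    const model = vertexAI.getGenerativeModel({ model: 'gemini-1.0-pro' });

    async function run() {
      // Send a single prompt and print the first candidate's text.
      const result = await model.generateContent('Write a haiku about event-driven architecture.');
      console.log(result.response.candidates[0].content.parts[0].text);
    }

    run().catch(console.error);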

    You could follow this same direct API approach when building out more sophisticated retrieval augmented generation (RAG) apps. In a Google Cloud world, you might use the Vertex AI APIs to get text embeddings. Or you could choose something more general purpose and interact with a PostgreSQL database to generate, store, and query embeddings. This is an excellent example of this approach.

    If you have a specific model preference, you might choose to use the API for Gemini, Llama, Mistral, or whatever. And you might choose to directly interact with database or function APIs to augment the input to those models. That’s cool, and is the right choice for many scenarios.

    Use meta-frameworks for consistent experiences across models and providers

    As expected, the AI builder space is now full of higher-order frameworks that help developers incorporate generative AI into their apps. These frameworks help you call LLMs, work with embeddings and vector databases, and even support actions like function calling.

    LangChain is a big one. You don’t need to be bothered with many model details, and you can chain together tasks to get results. It’s for Python devs, so your choice is either to use Python or to embrace one of the many offshoots. There’s LangChain4j for Java devs, LangChain Go for Go devs, and LangChain.js for JavaScript devs.

    You have other choices if LangChain-style frameworks aren’t your jam. There’s Spring AI, which has a fairly straightforward set of objects and methods for interacting with models. I tried it out with the Gemini model, and almost found it easier to use than our native API! It takes one update to my POM file:

    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-vertex-ai-gemini-spring-boot-starter</artifactId>
    </dependency>
    

    One set of application properties:

    spring.application.name=demo
    spring.ai.vertex.ai.gemini.projectId=seroter-dev
    spring.ai.vertex.ai.gemini.location=us-central1
    spring.ai.vertex.ai.gemini.chat.options.model=gemini-pro-vision
    

    And then an autowired chat object that I call from anywhere, like in this REST endpoint.

    @RestController
    @SpringBootApplication
    public class DemoApplication {
    
    	public static void main(String[] args) {
    		SpringApplication.run(DemoApplication.class, args);
    	}
    
    	private final VertexAiGeminiChatClient chatClient;
    
    	@Autowired
        public DemoApplication(VertexAiGeminiChatClient chatClient) {
            this.chatClient = chatClient;
        }
    
    	@GetMapping("/")
    	public String getGeneratedText() {
    		String generatedResponse = chatClient.call("Tell me a joke");
    		return generatedResponse;
    	}
    }
    

    Super easy. There are other frameworks too. Use something like AI.JSX for building JavaScript apps and components. BotSharp is a framework for .NET devs building conversational apps with LLMs. Hugging Face has frameworks that help you abstract the LLM, including Transformers.js and agents.js.

    There’s no shortage of these types of frameworks. If you’re iterating through LLMs and want consistent code regardless of which model you use, these are good choices.

    Create with low-code tools when available

    If I had an idea for a generative AI app, I’d want to figure out how much I actually had to build myself. There are a LOT of tools for building entire apps, components, or widgets, and many require very little coding.

    Everyone’s in this game. Zapier has some cool integration flows. Gradio lets you expose models and APIs as web pages. Langflow got snapped up by DataStax, but still offers a way to create AI apps without much required coding. Flowise offers some nice tooling for orchestration or AI agents. Microsoft’s Power Platform is useful for low-code AI app builders. AWS is in the game now with Amazon Bedrock Agents. ServiceNow is baking generative AI into their builder tools, Salesforce is doing their thing, and basically every traditional low-code app vendor is playing along. See OutSystems, Mendix, and everyone else.

    As you would imagine, Google does a fair bit here as well. The Vertex AI Agent Builder offers four different app types that you basically build through point-and-click. These include personalized search engines, chat apps, recommendation engines, and connected agents.

    Search apps can tap into a variety of data sources including crawled websites, data warehouses, relational databases, and more.

    What’s fairly new is the “agent app” so let’s try building one of those. Specifically, let’s say I run a baseball clinic (sigh, someday) and help people tune their swing in our batting cages. I might want a chat experience for those looking for help with swing mechanics, and then also offer the ability to book time in the batting cage. I need data, but also interactivity.

    Before building the AI app, I need a Cloud Function that returns available times for the batting cage.

    This Node.js function returns an array of book-able timeslots. I’ve hard-coded the data, but you get the idea.
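    Here’s a hedged sketch of what that function could look like using the Functions Framework; the sample slots are invented, but the field names match the schema the agent expects later.

    const functions = require('@google-cloud/functions-framework');

    // Hard-coded open cage times; a real version would query a booking system.
    const cageTimes = [
      { cageNumber: 1, openSlot: '2024-06-01T10:00:00Z', cageType: 'fastball machine' },
      { cageNumber: 2, openSlot: '2024-06-01T11:00:00Z', cageType: 'curveball machine' },
    ];

    functions.http('function-get-cage-times', (req, res) => {
      res.json(cageTimes);
    });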

    I also jumped into the Google Cloud IAM interface to ensure that the Dialogflow service account (which the AI agent operates as) has permission to invoke the serverless function.

    Let’s build the agent. Back in the Vertex AI Agent Builder interface, I choose “new app” and pick “agent.”

    Now I’m dropped into the agent builder interface. On the left, I have navigation for agents, tools, test cases, and more. In the next column, I set the goal of the agent, the instructions, and any tools I want to use with the agent. On the right, I preview my agent.

    I set a goal of “Answer questions about baseball and let people book time in the batting cage” and then get to the instructions. There’s a “sample” set of instructions that are useful for getting started. I used those, but removed references to other agents or tools, as we don’t have that yet.

    But now I want to add a tool, as I need a way to show available booking times if the user asks. I have a choice of adding a data store—this is useful if you want to source Q&A from a BigQuery table, crawl a website, or get data from an API. I clicked the “manage all tools” button and chose to add a new tool. Here I give the tool a name, and very importantly, a description. This description is used by the AI agent to figure out when to invoke it.

    Because I chose OpenAPI as the tool type, I need to provide an OpenAPI spec for my Cloud Function. There’s a sample provided, and I used that to put together my spec. Note that the URL is the function’s base URL, and the path contains the specific function name.

    {
        "openapi": "3.0.0",
        "info": {
            "title": "Cage API",
            "version": "1.0.0"
        },
        "servers": [
            {
                "url": "https://us-central1-seroter-anthos.cloudfunctions.net"
            }
        ],
        "paths": {
            "/function-get-cage-times": {
                "get": {
                    "summary": "List all open cage times",
                    "operationId": "getCageTimes",
                    "responses": {
                        "200": {
                            "description": "An array of cage times",
                            "content": {
                                "application/json": {
                                    "schema": {
                                        "type": "array",
                                        "items": {
                                            "$ref": "#/components/schemas/CageTimes"
                                        }
                                    }
                                }
                            }
                        }
                    }
                }
            }
        },
        "components": {
            "schemas": {
                "CageTimes": {
                    "type": "object",
                    "required": [
                        "cageNumber",
                        "openSlot",
                        "cageType"
                    ],
                    "properties": {
                        "cageNumber": {
                            "type": "integer",
                            "format": "int64"
                        },
                        "openSlot": {
                            "type": "string"
                        },
                        "cageType": {
                            "type": "string"
                        }
                    }
                }
            }
        }
    }
    

    Finally, in this “tool setup” I define the authentication to that API. I chose “service agent token” and because I’m calling a specific instance of a service (versus the platform APIs), I picked “ID token.”

    After saving the tool, I go back to the agent definition and want to update the instructions to invoke the tool. I reference the tool using the expected syntax, and appreciated the auto-completion help.

    Let’s see if it works. I went to the right-hand preview pane and asked it a generic baseball question. Good. Then I asked it for open times in the batting cage. Look at that! It didn’t just return a blob of JSON; it parsed the result and worded it well.

    Very cool. There are some quirks with this tool, but it’s early, and I like where it’s going. This was MUCH simpler than me building a RAG-style or function-calling solution by hand.

    Summary

    The AI assistance and model building products get a lot of attention, but some of the most interesting work is happening in the tools for AI app builders. Whether you’re experimenting with prompts, coding up a solution, or assembling an app out of pre-built components, it’s a fun time to be a developer. What products, tools, or frameworks did I miss from my assessment?

  • Google Cloud Next ’24 is better than last year’s event in every way but one

    Conferences aren’t cheap to attend. Forget the financial commitment—although that’s far from trivial—it’s expensive with regards to your time. You’re likely traveling out of town and spend time commuting. Then there’s the event itself, which takes you away from work and life for at least a few days. All of this in the hope of getting equal or greater value than what you spent. Risky bet? No doubt. I’ve been attending and organizing conferences for many years, and I’ll honestly say that this year’s Google Cloud Next ’24 is one of the surer bets I’ve seen. Even if you’re not a Google Cloud user (yet), I’m confident that you’d get a lot out of attending.

    The last edition of the event was terrific, but this one is better; except for one aspect, which I’ll mention at the end. This might be your best 2024 investment to learn about AI, modern app architectures and development, best practices for data access and analysis, and operations at scale. But why do I think it’s better than last year?

    There’s much more technical content

    We had too much introductory material in our breakout sessions last year. Level 100 content is super valuable, but you can get that anywhere. Many of us attend events to hear stories and go deeper than we can someplace else. This year, well over half of the breakouts are Level 200 or 300 content, and there’s a proper mix of introductory and in-depth material.

    There are breakouts for everybody. If you want to learn about AI, this is maybe the best event of the year. Go deep on GPUs and TPUs, learn about AI and serverless, study ML and streaming, build LLM apps with a RAG architecture, build AI apps with Go, create gen AI apps with LangChain, use Gemini through Vertex AI, understand vector search, and pick from 175+ more sessions.

    Are you a database enthusiast? Learn about high availability for relational databases, picking the right cloud database, non-relational database design patterns, how Yahoo! uses Cloud Spanner, managing databases with AI, and more.

    This is a terrific event for data scientists with dozens of breakouts. Learn about natural language analytics queries, continuous queries, using LLMs in BigQuery, vector search and multimodal embeddings, and lots more.

    Ops folks get a ton of content this year. Whether you’re building an internal developer platform on GKE, managing edge retail experiences at scale, embracing observability, setting up continuous deployment of AI models, migrating legacy workloads, securing multi-tenant Kubernetes, building a global service mesh, or advancing your logging infrastructure, you’ll leave the event smarter.

    And don’t forget about developers! We didn’t. With over 100 breakouts, we amped up the deep developer content. Learn about Java on serverless platforms, deploying apps to the cloud, testing apps with testcontainers, building apps from scratch using AI assistance, pushing JavaScript apps to the cloud, app troubleshooting, and tons more.

    Notice a better focus on developers and onsite learning

    Historically, Cloud Next was focused heavily on cloud services, but we also wanted to expand our usefulness for folks who are actually coding!

    For breakouts, we’ve got content for Android developers, those building Firebase apps, devs using Flutter, game developers, devs building with Angular, builders extending Workspace through APIs, and even those running training for Llama2!

    Our Innovator’s Hive is where you have tens of thousands of square feet worth of demo stations featuring creative and educational examples of technology. And our first-time Community Hub offers education on Google-sponsored open tech like Android, Flutter, and more.

    Also, come for the dedicated tech training and certification options. This is more of a developer-centric program than I’ve ever seen from us.

    See more “now” technology to accompany “next” technology

    Last year’s event had lots of exciting previews, but much of the tech wasn’t ready yet. We showed off AI developer assistance, previewed some new AI models, and talked about many things that were coming up soon.

    That’s all good, but now we have a better mix of “now” and “next.” You’ll continue seeing cutting edge tech that’s coming in the future, but you also will see more products, services, and frameworks that you can use RIGHT NOW.

    Hear from more industry expert voices

    Our developer keynote last year was so much fun, and we heard from awesome Google Champion Innovators. I loved it.

    We thought we’d mix it up this year and invite folks to our main stage that aren’t directly associated with Google. Our developer keynote features Guillermo Rauch, the CEO of Vercel; Josh Long, Spring advocate at Broadcom; and Charity Majors, co-founder at Honeycomb. I’m a fan of all three people, which is why I’m amped that they accepted my invitation to join us on stage.

    And the breakouts themselves feature an absolute ton of customers and independent experts. A quick scan through our program gave me a list of speakers from companies like AMD, ANZ Bank, ASML, Accenture, Alaska Airlines, American Express, Anthropic, Anyscale, Apple, BMW AG, Bayer, Bayer Corporation, Belk, Bombardier, Boston Consulting Group, Box, CME Group, Carrefour, Charles Schwab, Chicago CTA, Citadel, Commerzbank AG, Core Logic, Covered California, Cox Communication, DZ Bank, Databricks, Deloitte, Deutsche Telekom, Devoteam G Cloud, Dialpad, DoIT International, Docker, Fiserv, Ford Motor Company, GitLab, Glean, Globe Telecom, Goldman Sachs, GrowthLoop, HCA Healthcare, HCL, HSBC, Harness, Hashicorp, Illinois Department of Human Services, Intel, KPMG, Lloyds Banking Group, Logitech, L’Oreal, Macquarie Bank, Major League Baseball, Mayo Clinic, Mercado Libre, Moloco, MongoDB, NJ Cybersecurity and Communications Cell, Nuro, Onix, OpenText, Palo Alto Networks, Paramount, Paypal, Pfizer, PwC, Quantiphi, Red Hat, Reddit, Rent the Runway, Roche, Roku, SADA, Sabre, Salesforce, Scotia Bank, Shopify, Snap, Spotify, Stability AI, Stagwell, Stanford, Symphony, Synk, TD Securities, Telus corporation, TransUnion, Trendyol, Typeface, UC Riverside, UPS, US News and World Report, Uber, Ubie, Inc, Unilever, Unity Technologies, Verizon, Walmart, Wayfair, Weights & Biases, Wells Fargo, Yahoo, and apree health. So many folks to learn from!

    Enjoy a bigger overall event

    This version of Next is going to be significantly larger than the last one, and that’s a good thing. I don’t want the conference to ever be festival-sized like Dreamforce or re:Invent, but having tens of thousands of folks in one place means a bigger breakout program, more learning opportunities, more serendipitous meetups, and a unique energy for attendees.

    We don’t have any musical numbers 😦

    The one thing that’s not better than last year? We couldn’t top our last keynote intro, and I didn’t try. There’s no musical tune featuring a sousaphone. That said, I genuinely think our developer keynote itself is even better overall this time, and the whole event should be memorable.

    There’s still time to register, and I’d love to bump into you if you attend. Let me know if you’ll be there!

  • No cloud account, no problem. Try out change streams in Cloud Spanner locally with a dozen-ish shell commands.

    No cloud account, no problem. Try out change streams in Cloud Spanner locally with a dozen-ish shell commands.

    If you have a choice, you should test software against the real thing. The second best option is to use a “fake” that implements the target service’s API. In the cloud, it’s straightforward to spin up a real instance of a service for testing. But there are reasons (e.g. cost or speed) or times (e.g. within a CI pipeline, or rapid testing on your local machine) when an emulator is a better bet.

    Let’s say that you wanted to try out Google Cloud Spanner, and its useful change streams functionality. Consider creating a real instance and experimenting, but you have an alternative option. The local emulator just added support for change streams, and you can test the whole thing out from the comfort of your own machine. Or, to make life even easier, test it out from a free cloud machine.

    With just a Google account (which most everyone has?), you can use a free cloud-based shell and code editor. Just go to shell.cloud.google.com. We’ve loaded this environment up with language CLIs for Java, .NET, Go, and others. It’s got the Docker daemon running. And it’s got our gcloud CLI pre-loaded and ready to go. It’s pretty cool. From here, we can install the Spanner emulator, and run just a few shell commands to see the entire thing in action.

    Let’s begin by installing the emulator for Cloud Spanner. It takes just one command.

    sudo apt-get install google-cloud-sdk-spanner-emulator
    

    Then we start up the emulator itself with this command:

    gcloud emulators spanner start 
    

    After a couple of seconds, I see the emulator running, and listening on two ports.

    Great. I want to leave that running while having the freedom to run more commands. It’s easy to spin up new tabs in the Cloud Shell Editor, so I created a new one.

    In this new tab, I ran a set of commands that configured the gcloud CLI to work locally with the emulator. The CLI supports the concept of multiple configurations, so we create one that is emulator friendly. Also note that Google Cloud has the idea of “projects.” But if you don’t have a Google Cloud account, you’re ok here. For the emulators, you can use a non-existent value for “project” as I have here.

    gcloud config configurations create emulator
    gcloud config set auth/disable_credentials true
    gcloud config set project local-project
    gcloud config set api_endpoint_overrides/spanner http://localhost:9020/
    

    It’s time to create a (local) Spanner instance. The first command below does that, and it’s super fast, which makes it great for CI pipeline scenarios. The second command sets the default instance name so that we don’t have to provide an instance value in subsequent commands.

    gcloud spanner instances create test-instance \
       --config=emulator-config --description="Test Instance" --nodes=1
    gcloud config set spanner/instance test-instance
    

    Now, we need a database in this instance. Spanner supports multiple “dialects”, including PostgreSQL. Here’s how I create a new database.

    gcloud spanner databases create example-db --database-dialect=POSTGRESQL
    

    Let’s throw a couple of tables into this database. We’ve got one for Singers, and one for Albums.

    gcloud spanner databases ddl update example-db \
    --ddl='CREATE TABLE Singers ( SingerId bigint NOT NULL, FirstName varchar(1024), LastName varchar(1024), SingerInfo bytea, PRIMARY KEY (SingerId) )'
    gcloud spanner databases ddl update example-db \
    --ddl='CREATE TABLE Albums ( SingerId bigint NOT NULL, AlbumId bigint NOT NULL, AlbumTitle varchar, PRIMARY KEY (SingerId, AlbumId) ) INTERLEAVE IN PARENT Singers ON DELETE CASCADE'
    

    Now we’ll insert a handful of rows into each table.

    gcloud spanner databases execute-sql example-db \
      --sql="INSERT INTO Singers (SingerId, FirstName, LastName) VALUES (1, 'Marc', 'Richards')"
    gcloud spanner databases execute-sql example-db \
      --sql="INSERT INTO Singers (SingerId, FirstName, LastName) VALUES (2, 'Catalina', 'Smith')"
    gcloud spanner databases execute-sql example-db   \
      --sql="INSERT INTO Singers (SingerId, FirstName, LastName) VALUES (3, 'Alice', 'Trentor')"
    gcloud spanner databases execute-sql example-db   \
      --sql="INSERT INTO Albums (SingerId, AlbumId, AlbumTitle) VALUES (1, 1, 'Total Junk')"
    gcloud spanner databases execute-sql example-db   \
      --sql="INSERT INTO Albums (SingerId, AlbumId, AlbumTitle) VALUES (2, 1, 'Green')"
    

    If you want to prove this works (thus far), you can execute regular queries against the new tables. Here’s an example of retrieving the albums.

    gcloud spanner databases execute-sql example-db \
        --sql='SELECT SingerId, AlbumId, AlbumTitle FROM Albums'
    

    It’s time to turn on change streams, and this takes an extra step. It doesn’t look like I can smuggle utility commands through the “execute-sql” operation, so we need to run a DDL statement instead. Note that you can create change streams that listen to specific tables or columns. This one listens to anything changing in any table.

    gcloud spanner databases ddl update example-db \
    --ddl='CREATE CHANGE STREAM EverythingStream FOR ALL'
    

    If you want to prove everything is in place, you can run this command to see all the database objects.

    gcloud spanner databases ddl describe example-db --instance=test-instance
    

    I’m now going to open a third tab in the Cloud Shell Editor. This is so that we can continuously tail the change stream results. We’ve created this nice little sample project that lets you tail the change stream. Install the app by running this command in the third tab.

    go install github.com/cloudspannerecosystem/spanner-change-streams-tail@latest
    

    Then, in this same tab, we want the Go SDK (which this app uses) to look at the local emulator’s gRPC port instead of the public cloud. Set the environment variable that overrides the default behavior.

    export SPANNER_EMULATOR_HOST=localhost:9010
    

    Awesome. Now we start up the change stream app with a single command. You should see it start up and hold waiting for data.

    spanner-change-streams-tail -p local-project -i test-instance -d example-db -s everythingstream
    

    Back in the second tab (the first should still be running the emulator, the third is running the change stream tail), let’s add a new record to the Spanner database table. What SHOULD happen is that we see a change record pop up in the third tab.

    gcloud spanner databases execute-sql example-db   \
      --sql="INSERT INTO Albums (SingerId, AlbumId, AlbumTitle) VALUES (2, 2, 'Go, Go, Go')"
    

    Sure enough, I see a record pop into the third tab showing the before and after values of the row.

    You can mess around with updating records, deleting records, and so on. A change stream is powerful for event sourcing scenarios, or simply feeding data changes to downstream systems.
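
    If you’d rather drive the emulator from your own code than from gcloud, the Spanner Go client honors the same SPANNER_EMULATOR_HOST variable. Here’s a minimal sketch (the album row is just illustrative data I made up) that inserts a record and reads it back; since our change stream watches everything, the write should also show up in the tail.

    package main
    
    import (
    	"context"
    	"fmt"
    	"log"
    
    	"cloud.google.com/go/spanner"
    	"google.golang.org/api/iterator"
    )
    
    func main() {
    	// With SPANNER_EMULATOR_HOST=localhost:9010 set, this client talks to the
    	// local emulator instead of the real Spanner service.
    	ctx := context.Background()
    	db := "projects/local-project/instances/test-instance/databases/example-db"
    
    	client, err := spanner.NewClient(ctx, db)
    	if err != nil {
    		log.Fatal(err)
    	}
    	defer client.Close()
    
    	// Insert a row with DML; the change stream should pick this up too.
    	_, err = client.ReadWriteTransaction(ctx, func(ctx context.Context, txn *spanner.ReadWriteTransaction) error {
    		stmt := spanner.Statement{SQL: "INSERT INTO Albums (SingerId, AlbumId, AlbumTitle) VALUES (3, 1, 'Terrified')"}
    		_, err := txn.Update(ctx, stmt)
    		return err
    	})
    	if err != nil {
    		log.Fatal(err)
    	}
    
    	// Read the album titles back to confirm the write landed.
    	iter := client.Single().Query(ctx, spanner.Statement{SQL: "SELECT AlbumTitle FROM Albums"})
    	defer iter.Stop()
    	for {
    		row, err := iter.Next()
    		if err == iterator.Done {
    			break
    		}
    		if err != nil {
    			log.Fatal(err)
    		}
    		var title string
    		if err := row.Columns(&title); err != nil {
    			log.Fatal(err)
    		}
    		fmt.Println(title)
    	}
    }
    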

    In this short walkthrough, we tried out the Cloud Shell Editor, spun up the Spanner emulator, and experimented with database change streams. All without needing a Google Cloud account, or installing a lick of software on our own device. Not bad!

  • How I’d use generative AI to modernize an app

    How I’d use generative AI to modernize an app

    I’m skeptical of anything that claims to make difficult things “easy.” Easy is relative. What’s simple for you might draw blood from me. And in my experience, when a product claims to make something “easy”, it’s talking about simplifying a subset of the broader, more complicated job-to-be-done.

    So I won’t sit here and tell you that generative AI makes app modernization easy. Nothing does. It’s hard work and is as much about technology as it is psychology and archeology. But AI can make it easier. We’ll take any help we can get, right? I count at least five ways I’d use generative AI to make smarter progress on my modernization journey.

    #1 Understand the codebase

    Have you been handed a pile of code and scripts before? Told to make sense of it and introduce some sort of feature enhancement? You might spend hours, days, or weeks figuring out the relationships between components and side effects of any changes.

    Generative AI is fairly helpful here. Especially now that things like Gemini 1.5 (with its 1 million token input) exist.

    I might use something like Gemini (or ChatGPT, or whatever) to ask questions about the code base and get ideas for how something might be used. This is where the “generative” part is handy. When I use Duet AI assistance to explain SQL in BigQuery, I get back a creative answer about possible uses for the resulting data.

    In your IDE, you might use Duet AI (or Copilot, Replit, Tabnine) to give detailed explanations of individual code files, shell scripts, YAML, or Dockerfiles. Even if you don’t decide to use any generative AI tools to write code, consider using them to explain it.

    #2 Incorporate new language/framework features

    Languages themselves modernize at a fairly rapid pace. Does your codebase rely on a pattern that was rad back in 2011? It happens. I’ve seen that generative AI is a handy way to modernize the code itself while teaching us how to apply the latest language features.

    For instance, Go generics are fairly new. If your Go app is more than two years old, it probably isn’t using them. I could go into my Go app and ask my generative AI chat tool for advice on how to introduce generics to my existing code.
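
    As a purely hypothetical example of the kind of refactor an assistant might suggest: collapsing a pair of near-duplicate helpers into a single generic function.

    package mathx
    
    import "cmp"
    
    // Before: the sort of duplication that piles up in a pre-generics codebase.
    func MaxInt(a, b int) int {
    	if a > b {
    		return a
    	}
    	return b
    }
    
    func MaxFloat64(a, b float64) float64 {
    	if a > b {
    		return a
    	}
    	return b
    }
    
    // After: one generic function covers both. cmp.Ordered is in the standard
    // library as of Go 1.21.
    func Max[T cmp.Ordered](a, b T) T {
    	if a > b {
    		return a
    	}
    	return b
    }
    

    A decent assistant would probably also point out that Go 1.21 added built-in min and max functions, which might remove the helper entirely.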

    Usefully, the Duet AI tooling also explains what it did, and why it matters.

    I might use the same types of tools to convert an old ASP.NET MVC app to the newer Minimal APIs structure. Or replace deprecated features from Spring Boot 3.0 with more modern alternatives. Look at generative AI tools as a way to bring your codebase into the current era of language features.

    #3 Improve code quality

    Part of modernizing an app may involve adding real test coverage. You’ll never continuously deploy an app if you can’t get reliable builds. And you won’t get reliable builds without good tests and a CI system.

    AI-assisted developer tools make it easier to add integration tests to your code. I can go into my Spring Boot app and get testing scaffolding for my existing functions.

    Consider using generative AI tools to help with broader tasks like defining an app-wide test suite. You can use these AI interfaces to brainstorm ideas, get testing templates, or even generate test data.
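
    For instance, asking one of these tools for a testing template in a Go project typically yields table-driven scaffolding along these lines (a hypothetical sketch; the function and values are mine, not from any real codebase):

    package pricing
    
    import "testing"
    
    // ApplyDiscount is a stand-in function under test.
    func ApplyDiscount(price, percent float64) float64 {
    	return price - (price * percent / 100)
    }
    
    func TestApplyDiscount(t *testing.T) {
    	cases := []struct {
    		name    string
    		price   float64
    		percent float64
    		want    float64
    	}{
    		{"no discount", 100, 0, 100},
    		{"ten percent off", 100, 10, 90},
    		{"full discount", 50, 100, 0},
    	}
    
    	for _, tc := range cases {
    		t.Run(tc.name, func(t *testing.T) {
    			got := ApplyDiscount(tc.price, tc.percent)
    			if got != tc.want {
    				t.Errorf("ApplyDiscount(%v, %v) = %v, want %v", tc.price, tc.percent, got, tc.want)
    			}
    		})
    	}
    }
    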

    In addition to test-related activities, you can use generative AI to check for security issues. These tools don’t care about your feelings; here, the tool is calling out my terrible practices.

    Fortunately, I can also ask the tool to “fix” the code. You might find a few ways to use generative AI to help you refactor and improve the resilience and quality of the codebase.

    #4 Swap out old or unsupported components

    A big part of modernization is ensuring that a system is running fully supported components. Maybe that database, plugin, library, or entire framework is now retired, or people don’t want to work with it. AI tools can help with this conversion.

    For instance, maybe it’s time to swap out JavaScript frameworks. That app you built in 2014 with Backbone.js or jQuery is feeling creaky. You want to bring in React or Angular instead. I’ve had some luck coaxing generative AI tools into giving me working versions of just that. Even if you use AI chat tools to walk you through the steps (versus converting all the code), it’s a time-saver.

    The same may apply to upgrades from Java 8 to Java 21, or going from classic .NET Framework to modern .NET. Heck, you can even have some luck switching from COBOL to Go. I wouldn’t blindly trust these tools to convert code; audit aggressively and ensure you understand the new codebase. But these tools may jump start your work and cut out some of the toil.

    #5 Upgrade the architecture

    Sometimes an app modernization requires some open-heart surgery. It’s not about light refactoring or swapping a frontend framework. No, there are times where you’re yanking out major pieces or making material changes.

    I’ve had some positive experiences asking generative AI tools to help me upgrade a SOAP service to REST. Or REST to gRPC. You might use these tools to switch from a stored procedure-heavy system to one that puts the logic into code components instead. Speaking of databases, you could change from MySQL to Cloud Spanner, or even change a non-relational database dependency back to a relational one. Will generative AI do all the work? Probably not, but much of what it produces is pretty good.

    This might be a time to make bigger changes like swapping from one cloud to another, or adding a major layer of infrastructure-as-code templates to your system. I’ve seen good results from generative AI tools here too. In some cases, a modernization project is your chance to introduce real, lasting changes to an architecture. Don’t waste the opportunity!

    Wrap Up

    Generative AI won’t eliminate the work of modernizing an app. There’s lots of work to do to understand, transform, document, and roll out code. AI tools can make a big difference, though, and you’re tying a hand behind your back if you ignore them! What other uses for app modernization come to mind?

  • Make Any Catalog-Driven App More Personalized to Your Users: How I used Generative AI Coding Tools to Improve a Go App With Gemini.

    Make Any Catalog-Driven App More Personalized to Your Users: How I used Generative AI Coding Tools to Improve a Go App With Gemini.

    How many chatbots do we really need? While chatbots are a terrific example app for generative AI use cases, I’ve been thinking about how developers may roll generative AI into existing “boring” apps and make them better.

    As I finished all my Christmas shopping—much of it online—I thought about all the digital storefronts and how they provide recommended items based on my buying patterns, but serve up the same static item descriptions, regardless of who I am. We see the same situation with real estate listings, online restaurant menus, travel packages, or most any catalog of items! What if generative AI could create a personalized story for each item instead? Wouldn’t that create such a different shopping experience?

    Maybe this is actually a terrible idea, but during the Christmas break, I wanted to code an app from scratch using nothing but Google Cloud’s Duet AI while trying out our terrific Gemini LLM, and this seemed like a fun use case.

    The final app (and codebase)

    The app shows three types of catalogs and offers two different personas with different interests. Everything here is written in Go and uses local files for “databases” so that it’s completely self-contained. And all the images are AI-generated from Google’s Imagen2 model.

    When the user clicks on a particular catalog entry, they go to a “details” page where the generic product summary from the overview page is sent along with a description of the user’s preferences to the Google Gemini model to get a personalized, AI-powered product summary.

    That’s all there is to it, but I think it demonstrates the idea.

    How it works

    Let’s look at what we’ve got here. Here’s the basic flow of the AI-augmented catalog request.

    How did I build the app itself (GitHub repo here)? My goal was to only use LLM-based guidance either within the IDE using Duet AI in Google Cloud, or burst out to Bard where needed. No internet searches, no docs allowed.

    I started at the very beginning with a basic prompt.

    What are the CLI commands to create a new Go project locally?

    The answer offered the correct steps for getting the project rolling.

    The next commands are where AI assistance made a huge difference for me. With this series of natural language prompts in the Duet AI chat within VS Code, I got the foundation of this app set up in about five minutes. This would have easily taken me 5 or 10x longer if I did it manually.

    Give me a main.go file that responds to a GET request by reading records from a local JSON file called property.json and passes the results to an existing html/template named home.html. The record should be defined in a struct with fields for ID, Name, Description, and ImageUrl.
    Create an html/template for my Go app that uses Bootstrap for styling, and loops through records. For each loop, create a box with a thin border, an image at the top, and text below that. The first piece of text is "title" and is a header. Below that is a short description of the item. Ensure that there's room for four boxes in a single row.
    Give me an example data.json that works with this struct
    Add a second function to the class that responds to HTML requests for details for a given record. Accept a record id in the querystring and retrieve just that record from the array before sending to a different html/template

    With these few prompts, I had 75% of my app completed. Wild! I took this baseline, and extended it. The final result has folders for data, personas, images, a couple HTML files, and a single main.go file.

    Let’s look at the main.go file, and I’ll highlight a handful of noteworthy bits.

    package main
    
    import (
    	"context"
    	"encoding/json"
    	"fmt"
    	"html/template"
    	"log"
    	"net/http"
    	"os"
    	"strconv"
    
    	"github.com/google/generative-ai-go/genai"
    	"google.golang.org/api/option"
    )
    
    // Define a struct to hold the data from your JSON file
    type Record struct {
    	ID          int
    	Name        string
    	Description string
    	ImageURL    string
    }
    
    type UserPref struct {
    	Name        string
    	Preferences string
    }
    
    func main() {
    
    	// Parse the HTML templates
    	tmpl := template.Must(template.ParseFiles("home.html", "details.html"))
    
    	//return the home page
    	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
    
    		var recordType string
    		var recordDataFile string
    		var personId string
    
    		//if a post-back from a change in record type or persona
    		if r.Method == "POST" {
    			// Handle POST request:
    			err := r.ParseForm()
    			if err != nil {
    				http.Error(w, "Error parsing form data", http.StatusInternalServerError)
    				return
    			}
    
    			// Extract values from POST data
    			recordType = r.FormValue("recordtype")
    			recordDataFile = "data/" + recordType + ".json"
    			personId = r.FormValue("person")
    
    		} else {
    			// Handle GET request (or other methods):
    			// Load default values
    			recordType = "property"
    			recordDataFile = "data/property.json"
    			personId = "person1" // Or any other default person
    		}
    
    		// Parse the JSON file
    		data, err := os.ReadFile(recordDataFile)
    		if err != nil {
    			fmt.Println("Error reading JSON file:", err)
    			return
    		}
    
    		var records []Record
    		err = json.Unmarshal(data, &records)
    		if err != nil {
    			fmt.Println("Error unmarshaling JSON:", err)
    			return
    		}
    
    		// Execute the template and send the results to the browser
    		err = tmpl.ExecuteTemplate(w, "home.html", struct {
    			RecordType string
    			Records    []Record
    			Person     string
    		}{
    			RecordType: recordType,
    			Records:    records,
    			Person:     personId,
    		})
    		if err != nil {
    			fmt.Println("Error executing template:", err)
    		}
    	})
    
    	//returns the details page using AI assistance
    	http.HandleFunc("/details", func(w http.ResponseWriter, r *http.Request) {
    
    		id, err := strconv.Atoi(r.URL.Query().Get("id"))
    		if err != nil {
    			fmt.Println("Error parsing ID:", err)
    			// Handle the error appropriately (e.g., redirect to error page)
    			return
    		}
    
    		// Extract values from querystring data
    		recordType := r.URL.Query().Get("recordtype")
    		recordDataFile := "data/" + recordType + ".json"
    
    		//declare recordtype map and extract selected entry
    		typeMap := make(map[string]string)
    		typeMap["property"] = "Create an improved home listing description that's seven sentences long and oriented towards a person with these preferences:"
    		typeMap["store"] = "Create an updated paragraph-long summary of this store item that's colored by these preferences:"
    		typeMap["restaurant"] = "Create a two sentence summary for this menu item that factors in one or two of these preferences:"
    		//get the preamble for the chosen record type
    		aiPreamble := typeMap[recordType]
    
    		// Parse the JSON file
    		data, err := os.ReadFile(recordDataFile)
    		if err != nil {
    			fmt.Println("Error reading JSON file:", err)
    			return
    		}
    
    		var records []Record
    		err = json.Unmarshal(data, &records)
    		if err != nil {
    			fmt.Println("Error unmarshaling JSON:", err)
    			return
    		}
    
    		// Find the record with the matching ID
    		var record Record
    		for _, rec := range records {
    			if rec.ID == id { // Assuming your struct has an "ID" field
    				record = rec
    				break
    			}
    		}
    
    		if record.ID == 0 { // Record not found
    			// Handle the error appropriately (e.g., redirect to error page)
    			return
    		}
    
    		//get a reference to the persona
    		person := "personas/" + (r.URL.Query().Get("person") + ".json")
    
    		//retrieve preference data from file name matching person variable value
    		preferenceData, err := os.ReadFile(person)
    		if err != nil {
    			fmt.Println("Error reading JSON file:", err)
    			return
    		}
    		//unmarshal the preferenceData response into an UserPref struct
    		var userpref UserPref
    		err = json.Unmarshal(preferenceData, &userpref)
    		if err != nil {
    			fmt.Println("Error unmarshaling JSON:", err)
    			return
    		}
    
    		//improve the message using Gemini
    		ctx := context.Background()
    		// Access your API key as an environment variable (see "Set up your API key" above)
    		client, err := genai.NewClient(ctx, option.WithAPIKey(os.Getenv("GEMINI_API_KEY")))
    		if err != nil {
    			log.Fatal(err)
    		}
    		defer client.Close()
    
    		// For text-only input, use the gemini-pro model
    		model := client.GenerativeModel("gemini-pro")
    		resp, err := model.GenerateContent(ctx, genai.Text(aiPreamble+" "+userpref.Preferences+". "+record.Description))
    		if err != nil {
    			log.Fatal(err)
    		}
    
    		//parse the response from Gemini
    		bs, _ := json.Marshal(resp.Candidates[0].Content.Parts[0])
    		record.Description = string(bs)
    
    		//execute the template, and pass in the record
    		err = tmpl.ExecuteTemplate(w, "details.html", record)
    		if err != nil {
    			fmt.Println("Error executing template:", err)
    		}
    	})
    
    	fmt.Println("Server listening on port 8080")
    	fs := http.FileServer(http.Dir("./images"))
    	http.Handle("/images/", http.StripPrefix("/images/", fs))
    	log.Fatal(http.ListenAndServe(":8080", nil))
    }
    

    I do not write great Go code, but it compiles, which is good enough for me!

    On line 13, see that I refer to the Go package for interacting with the Gemini model. All you need is an API key, and we have a generous free tier.

    On line 53, notice that I’m loading the data file based on the type of record picked on the HTML template.

    On line 79, I’m executing the HTML template and sending the type of record (e.g. property, restaurant, store), the records themselves, and the persona.

    On lines 108-113, I’m storing a map of prompt values to use for each type of record. These aren’t terrific, and could be written better to get smarter results, but they’ll do.

    Notice on line 147 that I’m grabbing the user preferences we use for customization.

    On line 163, I create a Gemini client so that I can interact with the LLM.

    On line 171, see that I’m generating AI content based on the record-specific preamble, the record details, and the user preference data.

    On line 177, notice that I’m extracting the payload from Gemini’s response.

    Finally, on line 181 I’m executing the “details” template and passing in the AI-augmented record.

    None of this is rocket science, and you can check out the whole project on GitHub.

    What an “enterprise” version might look like

    What I have here is a local example app. How would I make this more production grade?

    • Store catalog images in an object storage service. All my product images shouldn’t be local, of course. They belong in something like Google Cloud Storage.
    • Add catalog items and user preferences to a database. Likewise, JSON files aren’t a great database. The various items should all be in a relational database.
    • Write better prompts for the LLM. My prompts into Gemini are meh. You can run this yourself and see that I get some silly responses, like personalizing the message for a pillow by mentioning sporting events. In reality, I’d write smarter prompts that ensured the responding personalized item summary was entirely relevant.
    • Use Vertex AI APIs for accessing Gemini. Google AI Studio is terrific. For production scenarios, I’d use the Gemini models hosted in a full-fledged MLOps platform like Vertex AI (see the sketch after this list).
    • Run the app in a proper cloud service. If I were really building this app, I’d host it in something like Google Cloud Run, or maybe GKE if it were part of a more complex set of components.
    • Explore whether pre-generating AI-augmented results and caching them would be more performant. It’s probably not realistic to call LLM endpoints on each “details” page. Maybe I’d pre-warm certain responses, or come up with other ways to not do everything on the fly.
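
    To show what that Vertex AI swap might look like, here’s a rough Go sketch assuming the cloud.google.com/go/vertexai/genai package; the project and region values are placeholders. The call shape stays close to what’s in main.go, but the requests flow through Vertex AI and your project’s IAM and quotas.

    package main
    
    import (
    	"context"
    	"fmt"
    	"log"
    
    	"cloud.google.com/go/vertexai/genai"
    )
    
    func main() {
    	ctx := context.Background()
    
    	// Placeholders: your project ID and a supported region.
    	client, err := genai.NewClient(ctx, "my-project", "us-central1")
    	if err != nil {
    		log.Fatal(err)
    	}
    	defer client.Close()
    
    	// Same GenerateContent pattern as before, now served by Vertex AI.
    	model := client.GenerativeModel("gemini-pro")
    	resp, err := model.GenerateContent(ctx,
    		genai.Text("Create a two sentence summary for this menu item that factors in these preferences: ..."))
    	if err != nil {
    		log.Fatal(err)
    	}
    	fmt.Println(resp.Candidates[0].Content.Parts[0])
    }
    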

    This exercise helped me see the value of AI-assisted developer tooling firsthand. And, it feels like there’s something useful about LLM summarization being applied to a variety of “boring” app scenarios. What do you think?

  • Building an event-driven architecture in the cloud? These are three approaches for generating events.

    Building an event-driven architecture in the cloud? These are three approaches for generating events.

    When my son was 3 years old, he would often get out of bed WAAAAY too early and want to play with me. I’d send him back to bed, and inevitably he’d check in again just a few minutes later. Eventually, we got him a clock with a timed light on it, so there was a clear trigger that it was time to get up.

    Originally, my son was like a polling component that keeps asking “is it time yet?” I’ve built many of those myself in software. It’s a simple way to produce an event (“time to get up”, or “new order received”) when it’s the proper moment. But these pull-based approaches are remarkably inefficient and often return empty results until the time is right. Getting my son a clock that turned green when it was time to get out of bed is more like a push-based approach where the system tells you when something happened.

    In software, there are legit reasons to do pull-based activities—maybe you intentionally want to batch the data retrieval and process it once a day—but it’s more common nowadays to see architects and developers embrace a push-driven event-based architecture that can operate in near real-time. Cloud platforms make this much easier to set up than it used to be with on-premises software!

    I see three ways to activate events in your cloud architecture. Let’s look at examples of each.

    Events automatically generated by service changes

    This is all about creating an event when something happens to the cloud service. Did someone create a new IAM role? Build a Kubernetes cluster? Delete a database backup? Update a machine learning model?

    The major hyperscale cloud providers offer managed services that capture and route these events. AWS offers Amazon EventBridge, Microsoft gives you Azure Event Grid, and Google Cloud serves up Eventarc. Instead of creating your own polling component, retry logic, data schemas, observability system, and hosting infrastructure, you can use a fully managed end-to-end option in the cloud. Yes, please. Let’s look at doing this with Eventarc.

    I can create triggers for most Google Cloud services, then choose among all the possible events for each service, provide any filters for what I’m looking for, and then choose a destination. Supported destinations for the routed event include serverless functions (Cloud Functions), serverless containers (Cloud Run), declarative workflow (Cloud Workflows), a Kubernetes service (GKE), or a random internal HTTP endpoint.

    Starting here assumes I have an event destination pre-configured to receive the CloudEvents-encoded event. Let’s assume I don’t have anything in place to “catch” the event and need to create a new Cloud Function.

    When I create a new Cloud Function, I have a choice of picking a non-HTTP trigger. This flies open an Eventarc pane where I follow the same steps as above. Here, I chose to catch the “enable service account” event for IAM.

    Then I get function code that shows me how to read the key data from the CloudEvent payload. Handy!
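
    The generated code varies by runtime; in Go it looks roughly like this sketch, which uses the Functions Framework and the CloudEvents SDK (the function name is mine):

    package function
    
    import (
    	"context"
    	"log"
    
    	"github.com/GoogleCloudPlatform/functions-framework-go/functions"
    	"github.com/cloudevents/sdk-go/v2/event"
    )
    
    func init() {
    	// Register the handler with the Functions Framework.
    	functions.CloudEvent("handleIamEvent", handleIamEvent)
    }
    
    // handleIamEvent logs the key fields from the incoming CloudEvent.
    func handleIamEvent(ctx context.Context, e event.Event) error {
    	log.Printf("event type: %s", e.Type())
    	log.Printf("subject: %s", e.Subject())
    	log.Printf("data: %s", string(e.Data()))
    	return nil
    }
    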

    Use these sorts of services to build loosely-coupled solutions that react to what’s going on in your cloud environment.

    Events automatically generated by data changes

    This is the category most of us are familiar with. Here, it’s about change data capture (CDC) that triggers an event based on new, updated, or deleted data in some data source.

    Databases

    Again, in most hyperscale clouds, you’ll find databases with CDC interfaces built in. I found three within Google Cloud: Cloud Spanner, Bigtable, and Firestore.

    Cloud Spanner, our cloud-native relational database, offers change streams. You can “watch” an entire database, or narrow it down to specific tables or columns. Each data change record has the name of the affected table, the before-and-after data values, and a timestamp. We can read these change streams with our Dataflow product, by calling the Spanner API, or by using the Kafka connector. Learn more here.

    Bigtable, our key-value database service, also supports change streams. Every data change record contains a bunch of relevant metadata, but does not contain the “old” value of the database record. Similar to Spanner, you can read Bigtable change streams using Dataflow or the Java client library. Learn more here.

    Firestore is our NoSQL cloud database that’s often associated with the Firebase platform. This database has a feature to create listeners on a particular document or document collection. It’s different from the previous options, and looks like it’s mostly something you’d call from code. Learn more here.
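
    For example, a Go listener on a collection might look something like this sketch (the project and collection names are hypothetical):

    package main
    
    import (
    	"context"
    	"log"
    
    	"cloud.google.com/go/firestore"
    )
    
    func main() {
    	ctx := context.Background()
    
    	// Hypothetical project and collection names.
    	client, err := firestore.NewClient(ctx, "my-project")
    	if err != nil {
    		log.Fatal(err)
    	}
    	defer client.Close()
    
    	// Listen for changes to every document in the "orders" collection.
    	it := client.Collection("orders").Snapshots(ctx)
    	defer it.Stop()
    	for {
    		snap, err := it.Next()
    		if err != nil {
    			log.Fatal(err)
    		}
    		for _, change := range snap.Changes {
    			switch change.Kind {
    			case firestore.DocumentAdded:
    				log.Printf("added: %v", change.Doc.Data())
    			case firestore.DocumentModified:
    				log.Printf("modified: %v", change.Doc.Data())
    			case firestore.DocumentRemoved:
    				log.Printf("removed: %s", change.Doc.Ref.ID)
    			}
    		}
    	}
    }
    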

    Some of our other databases like Cloud SQL support CDC using their native database engine (e.g. SQL Server), or can leverage our managed change data capture service called Datastream. Datastream pulls from PostgreSQL, MySQL, and Oracle data sources and publishes real-time changes to storage or analytical destinations.

    “Other” services

    There is plenty of “data” in systems that aren’t “databases.” What if you want events from those? I looked through Google Cloud services and saw many others that can automatically send change events to Google Cloud Pub/Sub (our message broker) that you can then subscribe to. Some of these look like a mix of the first category (notifications about a service) and this category (notifications about data in the service):

    • Cloud Storage. When objects change in Cloud Storage, you can send notifications to Pub/Sub. The payload contains info about the type of event, the bucket ID, and the name of the object itself.
    • Cloud Build. Whenever your build state changes in Cloud Build (our CI engine), you can have a message sent to Pub/Sub. These events go to a fixed topic called “cloud-builds” and the event message holds a JSON version of your build resource. You can configure either push or pull subscriptions for these messages.
    • Artifact Registry. Want to set up an event for changes to Docker repositories? You can get messages for image uploads, new tags, or image deletions. Here’s how to set it up.
    • Artifact Analysis. This package scanning tool looks for vulnerabilities, and you can send notifications to Pub/Sub when vulnerabilities are discovered. The simple payloads tell you what happened, and when.
    • Cloud Deploy. Our continuous deployment tool also offers notifications about changes to resources (rollouts, pipelines), when approvals are needed, or when a pipeline is advancing phases. It can be handy to use these notifications to kick off further stages in your workflows.
    • GKE. Our managed Kubernetes service also offers automatic notifications. These apply at the cluster level versus events about individual workloads. But you can get events about security bulletins for the cluster, new GKE versions, and more.
    • Cloud Monitoring Alerts. Our built-in monitoring service can send alerts to all sorts of notification channels including email, PagerDuty, SMS, Slack, Google Chat, and yes, Pub/Sub. It’s useful to have metric alert events routing through your messaging system, and you can see how to configure that here.
    • Healthcare API. This capability isn’t just for general-purpose cloud services. We offer a rich API for ingesting, storing, analyzing, and integrating healthcare data. You can set up automatic events for FHIR, HL7 resources, and more. You get metadata attributes and an identifier for the data record.

    And there are likely other services I missed! Many cloud services have built-in triggers that route events to downstream components in your architecture.
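
    Consuming any of these notifications looks the same on the other end: you attach a subscriber to the topic. Here’s a minimal Go pull subscriber as a sketch; the project and subscription names are hypothetical, and the attributes shown are the ones Cloud Storage notifications populate.

    package main
    
    import (
    	"context"
    	"log"
    
    	"cloud.google.com/go/pubsub"
    )
    
    func main() {
    	ctx := context.Background()
    
    	// Hypothetical project and subscription names.
    	client, err := pubsub.NewClient(ctx, "my-project")
    	if err != nil {
    		log.Fatal(err)
    	}
    	defer client.Close()
    
    	sub := client.Subscription("gcs-object-changes")
    	err = sub.Receive(ctx, func(ctx context.Context, m *pubsub.Message) {
    		// Cloud Storage notifications put the event details in message attributes.
    		log.Printf("event=%s bucket=%s object=%s",
    			m.Attributes["eventType"], m.Attributes["bucketId"], m.Attributes["objectId"])
    		m.Ack()
    	})
    	if err != nil {
    		log.Fatal(err)
    	}
    }
    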

    Events manually generated by code or DIY orchestration

    Sometimes you need fine-grained control for generating events. You might use code or services to generate and publish events.

    First, you may wire up managed services to do your work. Maybe you use Azure Logic Apps or Google Cloud App Integration to schedule a database poll every hour, and then route any relevant database records as individual events. Or you use a data processing engine like Google Cloud Dataflow to generate batch or real-time messages from data sources into Pub/Sub or another data destination. And of course, you may use a third-party integration platform that retrieves data from services and generates events.

    Secondly, you may hand-craft an event in your code. Your app could generate events when specific things happen in your business logic. Every cloud offers a managed messaging service, and you can always send events from your code to best-of-breed products like RabbitMQ, Apache Kafka, or NATS.

    In this short example, I’m generating an event from within a Google Cloud Function and sending it to Pub/Sub. BTW, since Cloud Functions and Pub/Sub both have generous free tiers, you can follow along at no cost.

    I created a brand new function and chose Node.js 20 as my language/framework. I added a single reference to the package.json file:

    "@google-cloud/pubsub": "4.0.7"
    

    Then I updated the default index.js code with a reference to the pubsub package, and added code to publish the incoming querystring value as an event to Pub/Sub.

    const functions = require('@google-cloud/functions-framework');
    const {PubSub} = require('@google-cloud/pubsub');
    
    functions.http('helloHttp', async (req, res) => {
    
      var projectId = 'seroter-project-base'; 
      var topicNameOrId = 'custom-event-router';
    
      // Instantiates a client
      const pubsub = new PubSub({projectId});
      const topic = pubsub.topic(topicNameOrId);
    
      // Send a message to the topic and wait for the publish to finish,
      // so the function doesn't terminate before the message goes out
      await topic.publishMessage({data: Buffer.from('Test message from ' + req.query.name)});
    
      // return result
      res.send(`Hello ${req.query.name || req.body.name || 'World'}!`);
    });
    

    That’s it. Once I deployed the function and called the endpoint with a querystring, I saw all the messages show up in Pub/Sub, ready to be consumed.

    Wrap

    Creating and processing events using managed services in the cloud is powerful. It can both simplify and complicate your architecture. It can make it simpler by getting rid of all the machinery to poll and process data from your data sources. Events make your architecture more dynamic and reactive. And that’s where it can get more complicated if you’re not careful. Instead of a clumsy but predictable set of code that pulls data and processes it inline, now you might have a handful of loosely-coupled components that are lightly orchestrated. Do what makes sense, versus what sounds exciting!

  • Would generative AI have made me a better software architect? Probably.

    Would generative AI have made me a better software architect? Probably.

    Much has been written—some by me—about how generative AI and large language models help developers. While that’s true, there are plenty of tech roles that stand to get a boost from AI assistance. I sometimes describe myself as a “recovering architect” when referring back to my six years in enterprise IT as a solutions/functional architect. It’s not easy being an architect. You lead with influence, not authority; you’re often part of small architecture teams and working solo on projects; and tech teams can be skeptical of the value you add. When I look at what’s possible with generative AI today, I think about how I would have used it to be better at the architecture function. As an architect, I’d have used it in the following ways:

    Help stay up-to-date on technology trends

    It’s not hard for architects to get stale on their technical knowledge. Plenty of other responsibilities take architects away from hands-on learning. I once worked with a smart architect who was years removed from coding. He was flabbergasted that our project team was doing client-side JavaScript and was certain that server-side logic was the only way to go. He missed the JavaScript revolution and as a result, the team was skeptical of his future recommendations.

    If you have an Internet-connected generative AI experience, you can start with that to explore modern trends in tech. I say “internet-connected” because if you’re using a model trained and frozen at a point in time, it won’t “know” about anything that happened after its training period.

    For example, I might ask a service like Google Bard for help understanding the current landscape for server-side JavaScript.

    I could imagine regularly using generative AI to do research, or engaging in back-and-forth discussion to upgrade my dated knowledge about a topic.

    Assess weaknesses in my architectures

    Architects are famous (infamous?) for their focus on the non-functional requirements of a system. You know, the “-ilities” like scalability, usability, reliability, extensibility, operability, and dozens of others.

    While no substitute for your own experience and knowledge, an LLM can offer a perspective on the quality attributes of your architecture.

    For example, I could take one of the architectures from the Google Cloud Jump Start Solutions. These are high-quality reference apps that you deploy to Google Cloud with a single click. Let’s look at the 3-tier web app, for example.

    It’s a very solid architecture. I can take this diagram, send it to Google Bard, and ask how it measures up against core quality attributes I care about.

    What came back from Bard were sections for each quality attribute, and a handful of recommendations. With better prompting, I could get even more useful data back! Whether you’re a new architect or an experienced one, I’d bet that this offers some fresh perspectives that would validate or challenge your own assumptions.

    Validate architectures against corporate specifications

    Through fine-tuning, retrieval augmented generation, or simply good prompting, you can give LLMs context about your specific environment. As an architect, I’d want to factor my architecture standards into any evaluation.

    In this example, I give Bard some more context about corporate standards when assessing the above architecture diagram.

    In my experience, architecture is local. Each company has different standards, choices of foundational technologies, and strategic goals. Asking LLMs for generic architecture advice is helpful, but not sufficient. Feeding your context into a model is critical.

    Build prototypes to hand over to engineers

    Good architects regularly escape their ivory tower and stay close to the builders. And ideally, you’re bringing new ideas, and maybe even working code, to the teams you support.

    Services like Bard help me create frontend web pages without any work on my part. And I can quickly prototype with cloud services or open source software thanks to AI-assisted coding tools. Instead of handing over whiteboard sketches or UML diagrams, we can hand over rudimentary working apps.

    Help me write sections of my architecture or design specs

    Don’t outsource any of the serious thinking that goes into your design docs or architecture specs. But that doesn’t mean you can’t get help on boilerplate content. What if I have various sections for “background info” in my docs, and want to include tech assessments?

    I used the new “help me write” feature in Google Docs to summarize the current state of Java and call out popular web frameworks. This might be good for bolstering an architecture decision to choose a particular framework.

    Quickly generating templates or content blocks may prove a very useful job for generative AI.

    Bootstrap new architectural standards

    In addition to helping you write design docs, generative AI may help you lay a foundation for new architecture standards. Plenty of architects write SOPs or usage standards, and I would have used LLMs to make my life easier.

    Here, I once again asked the “help me write” capability in Google Docs to give me the baseline of a new spec for database selection in the enterprise. I get back a useful foundation to build upon.

    Summarize docs or notes to pull out key decisions

    Architects can tend to be … verbose. That’s ok. The new Duet AI in Workspace does a good job summarizing long docs or extracting insights. I would have loved to use this on the 30-50 page architecture specs or design docs I used to work with! Readers could have quickly gotten the gist of the doc, or found the handful of decisions that mattered most. Architects will get plenty of value from this.

    A good architect is worth their weight in gold right now. Software systems have never been more powerful, complicated, and important. Good architecture can accelerate a company or sink it. But the role of the architect is evolving, and generative AI can give architects new ways to create, assess, and communicate. Start experimenting now!