Author: Richard Seroter

  • 2022 in Review: Reading and Writing Highlights

    I had a good year. My immediate family stayed healthy, I got to spend quality in-person time with my extended family, traveled internationally a bit, enjoyed my job at Google (including a change in role), spoke at some events, taught a couple of Pluralsight courses, and moved to beautiful San Diego. I also did fewer things overall, and found a reasonable balance between go-go-go and rest. For the first time, I kept a daily log and the constant reflection really slowed me down, in a good way.

    For the last fifteen years (yeesh!) I’ve been recapping the previous year, and highlighting the best books I read. I finished 49 books in 2022—my daily log shows that I read on 93% of the days last year—and for the first time, read one book twice. But first, let’s look at a few of my favorite written pieces.

    Things I Wrote (or Said)

    Multicloud’s moment: Everybody’s doing it, but are you doing it right? Here’s eight dos and don’ts. This ended up being a fairly popular post on the Google Cloud blog and I enjoyed sharing some mildly edgy perspectives.

    Google Cloud Next ’22 Developer Keynote: Top 10 Cloud Technology Predictions. I had the pleasure of speaking at Google Cloud’s flagship conference last year, and for some reason they brought me back again. This was a fun talk. More people comment to me about the walk-up music than about the content itself.

    How easily can you process events in AWS Lambda, Azure Functions, and Google Cloud Functions? Let’s try it out. I use each hyperscaler on almost a weekly basis, and enjoy comparing and contrasting the experiences.

    Running serverless web, batch, and worker apps with Google Cloud Run and Cloud Spanner. One reason that I write things is so that others can take my ideas and make them better. Multiple people took this post and demo and improved on it, and I was happy to see it.

    This might be the cleanest way I’ve seen parallel processing done in a (serverless) workflow engine. For better or worse, I’m getting older and have seen a lot. So, it’s fun to look at new approaches to classic problems.

    Loading data directly into a warehouse via your messaging engine? Here’s how this handy new feature works in Google Cloud. I’m slowly expanding my horizons from being so app centric to becoming more data centric. BigQuery is awesome, and I spent a fair amount of time in 2022 using it.

    Continuously deploy your apps AND data? Let’s try to use Liquibase for BigQuery changes. Hands down, the most complex tech demo I built this year. The result is straightforward, but I had to learn a lot and almost gave up three times; I’m proud of myself for figuring it out. And, I think it solves a useful problem!

    Things I Read

    I read 49 books on an assortment of topics. Here are some of the standouts:

    West with the Night by Beryl Markham. Beautifully written book from Beryl Markham, who is known for being the first woman to fly solo, non-stop across the Atlantic from east to west. But this story is about so much more. Her life growing up in Africa is fascinating, as is her adventurous adulthood. Vivid language, exquisite phrasing, and a compelling story.

    Project Hail Mary: A Novel by Andy Weir. I liked this book so much I read it twice! From the author of “The Martian”, it’s another fun, surprising, and educational (!) space adventure.

    Measure What Matters: How Google, Bono, and the Gates Foundation Rock the World with OKRs by John Doerr. Google uses Objectives and Key Results (OKRs) as a way to articulate goals and align teams. This is a good book that uses lots of examples to explain what OKRs are, why they matter, and how to do them well (and poorly).

    140 Days to Hiroshima: The Story of Japan’s Last Chance to Avert Armageddon by David Dean Barrett. I didn’t know much about the Japanese perspective regarding WWII. This well-written book explains how the military leadership of Japan led the country towards a catastrophic result.

    Serpico by Peter Maas. Such a great book. It’s the true story of New York cop Frank Serpico who resisted widespread corruption and eventually helped bring some accountability to the department.

    The Big Short: Inside the Doomsday Machine by Michael Lewis. Wow, I saw the movie years ago, but the book was tremendous. This is a maddening, insightful, and engaging book about the financial meltdown earlier this century and the irresponsible industries that made it possible.

    Person of Interest: Why Jesus Still Matters in a World that Rejects the Bible by J. Warner Wallace. When a homicide detective investigates the historicity of Jesus and applies his considerable investigation skills, you know it’ll be interesting. It was also insightful to learn the various techniques a detective uses to crack a case.

    Florence Nightingale: A Life Inspired by Lynn M. Hamilton. We owe so much to Nightingale! She was not a nurse by trade, but her smarts and relentless effort made a massive impact on the mortality rate of soldiers and hospital design. Linking infection to unclean environments was a game changer.

    The Baseball 100 by Joe Posnanski. I loved this. Baseball is my favorite sport, and Posnanski offers up a vignette of the top hundred players of all time. Many of these stories emphasize the father-son relationship, the ability to rise above adversity, and the hard work necessary to realize sustained success over many years.

    Amp It Up: Leading for Hypergrowth by Raising Expectations, Increasing Urgency, and Elevating Intensity by Frank Slootman. This approach won’t resonate with everyone, but it definitely spoke to me. It’s all about leaders focusing on awareness, urgency, decisiveness, and playing to win. Great business book.

    Marco Polo by Laurence Bergreen. I admittedly didn’t know much about Marco Polo besides saying his name in swimming pool games. I really liked this book, which told the story of the Polos and how their partnership with the Mongols had massive implications for cultural transmission between East and West.

    Product Management in Practice by Matt LeMay. Whether you’re new to product management or an experienced product manager, you’ll like this book. It’s full of advice for every dimension of the role.

    Courage Is Calling: Fortune Favors the Brave by Ryan Holiday. This book sent me down a rabbit-hole of Stoic content in 2022. Holiday explores “courage” and how to replace “fear,” using tons of examples to prove his point. Motivational and enjoyable read.

    I Was Right On Time by Buck O’Neil. If I made a list of ten all-time Americans to have lunch with, O’Neil would be on the list. This is the story of a wonderful man who played a key part in the Negro Leagues of baseball and remained an ambassador of the (eventually-integrated) sport for decades afterward.

    Better Decisions, Fewer Regrets: 5 Questions to Help You Determine Your Next Move by Andy Stanley. We all make hundreds (thousands?) of decisions every day, and Pastor Stanley offers up some truly compelling questions to help you make good choices. The one about “what story do you want to tell?” helped me change how I handled the difficult process of selling my house this year.

    Lives of the Stoics: The Art of Living from Zeno to Marcus Aurelius by Ryan Holiday and Stephen Hanselman. Read this book to learn about the heroes of the Stoic approach. Along the way, you’ll absorb a lot of pragmatic advice for living a satisfied, impactful life.

    Only the Paranoid Survive: How to Exploit the Crisis Points That Challenge Every Company by Andy Grove. Exceptional business book that I got around to reading this year. It’s a masterclass of decisiveness, ownership, continuous attention, and building things that last.

    Burn Rate: Launching a Startup and Losing My Mind by Andy Dunn. Brave story from Dunn who seemingly had it all while founding and succeeding with Bonobos. But his struggles with mental illness blew all that up, and forced him to get help. Important story.

    Build: An Unorthodox Guide to Making Things Worth Making by Tony Fadell. If you’re building a business, building a team, or building products, you’ll love Fadell’s book. It’s chock-full of useful advice and stories from his time creating some of the most iconic products of the last couple decades.

    Oscar Charleston: The Life and Legend of Baseball’s Greatest Forgotten Player by Jeremy Beer. Was Charleston the greatest baseball player of all time? It’s possible. He starred on some all-time Negro League teams before the sport was integrated, yet never seemed to harbor bitterness about that. He stayed in the sport for years afterward and died relatively young. I wish I could have seen him play.

    Unstoppable: Siggi B. Wilzig’s Astonishing Journey from Auschwitz Survivor and Penniless Immigrant to Wall Street Legend by Joshua M. Greene. Wilzig had no advantages in life, especially after coming to America following years in Auschwitz. Through talent and determination he built companies and changed industries. All while ensuring that Holocaust memories wouldn’t be forgotten.

    Is Atheism Dead? by Eric Metaxas. Nowadays, it can take more faith to be an atheist than to believe in God. Metaxas wrote a compelling book that outlines the scientific, archeological, and philosophical case for belief.

    Warfighting: The US Marine Corps Book of Strategy by A.M. Gray. I enjoy a good strategy book, and this one offered useful perspective on dealing with uncertainty, strategies for confronting the “enemy”, and creating clarity around your intent.

    Continuous Discovery Habits: Discover Products that Create Customer Value and Business Value by Teresa Torres. A good product team is never done learning. Torres wrote an excellent book that can help product managers and organizational leaders continuously discover what the customer needs and how a product should evolve for maximum benefit.

    Developer Marketing Does Not Exist: The Authentic Guide to Reach a Technical Audience by Adam DuVander. Nobody likes being on the receiving end of “bad” marketing, but developers in particular are quick to tune out anything inauthentic. DuVander does a great job laying out what sincere, impactful marketing messaging can look like for technical teams.

    Developer Relations: How to Build and Grow a Successful Developer Program by Caroline Lewko and James Parton. Before I took on my role leading Developer Relations at Google Cloud, I picked up this book. I’m glad I did. I learned about team structures, metrics, and mission along with how to help DevRel positively impact the business itself.

    Gates of Fire: An Epic Novel of the Battle of Thermopylae by Steven Pressfield. I could barely put this book down. It’s historical fiction about the 300 Spartan warriors who somehow resisted a massive Persian invasion and changed the course of history. Staggering courage.

    The Big Sky by A.B. Guthrie Jr. This is a fictional Western story that offers up a captivating look at life during westward expansion and what it meant to survive and thrive.

    Jesus and the Eyewitnesses: The Gospels as Eyewitness Testimony by Richard Bauckham. This book represents the first effort to treat the Gospels as testimony by eyewitnesses. Fascinating stuff. Bauckham does a thorough analysis of the names of Palestinian Jews in this time period, how oral preservation happened in ancient cultures, how memory works, and the standalone value of eyewitness testimony.

    In the Kingdom of Ice: The Grand and Terrible Polar Voyage of the USS Jeannette by Hampton Sides. Another book about courage! The 30+ folks who left to find the North Pole weren’t reckless; they meticulously researched and prepared. But they were doomed from the start. Between two years stuck in the ice, and an improbable journey after their ship sank, it’s almost amazing that 13 survived.

    Investments Unlimited: A Novel About DevOps, Security, Audit Compliance, and Thriving in the Digital Age by many. Can a book about enterprise security practices be engaging? Yes, when written as fiction. Good story that imparted helpful lessons for those trying to go fast while staying safe.

    What does 2023 hold? I have absolutely no idea. Hopefully for all of us, it offers more learning opportunities, more laughter, and less of taking ourselves too seriously. I’m going to continue keeping a short daily journal, keep up the daily recaps of my favorite blog posts that I just started sharing, and continue enjoying my family while tackling new challenges at work. Let’s stay connected this year!

  • Daily Wrap Up – January 6, 2023 (#004)

    Happy Friday! Check out a handful of the best things I read today.

    [blog] Build Containers Without a Dockerfile. I’m no good at writing Dockerfiles—maybe I should just use large language model AI for it?—so I like tools that make the container image generation invisible. If you’re a .NET dev, you can skip Dockerfile creation now. Caveats apply!

    [blog] Confluent + Immerok: Cloud Native Kafka Meets Cloud Native Flink. The Confluent gang—they’re the stewards of Apache Kafka—are betting on Flink as a stream processing project. Looks like they’re going big.

    [article] What Trouble Awaits Cloud Native Security in 2023? “Security” is such a broad topic that it’s almost meaningless to say you’re focusing on “security in 2023.” Access control? Data encryption? Physical security? The list goes on. This piece calls out a few areas of focus.

    [blog] API Management is About More Than Technology. I’ll admit that I’ve never been particularly excited about API management, but some people are, and want tech that solves their problems. This post calls out a few things that a good API solution does for you.

    [article] C++ wins programming language of the year award. I’m including this because it reminded me that the buzziest, most talked-about tech is often not the thing everyone is actually using.

    [article] Blue-Green Deployment From the Trenches. Every good solution creates new problems, right? So if you’ve gotten better at shipping software without downtime by doing things like blue/green deployments, now you have problems around handling breaking changes. Good article here.

    [article] Where Is Tech Going in 2023? I buy these predictions; they seem reasonable to me, even if some of them arrive later than 2023.

    ##

    Want to get this update sent to you every day? Subscribe to my RSS feed or subscribe via email below:

  • Daily Wrap Up – January 5, 2023 (#003)

    Good mix of content today, ranging from an insightful video interview, to lessons learned from Southwest’s holiday meltdown. Enjoy!

    [article] WebAssembly: 5 Predictions for 2023. For years, it’s felt like WASM has been poised for a breakout. Does it replace containers or VMs or whatever else we like to pit against each other? Probably not. But it’s going to matter in a bigger way soon.

    Kelsey Hightower On Kubernetes & Cloud Computing | The Engineering Room Ep. 13. This is a discussion between two folks that I respect a lot. I’ve got Dave Farley’s “Continuous Delivery” book on my bookshelf, and I’ve got Kelsey in my Chat list at work.

    [article] CircleCI warns customers to rotate ‘any and all secrets’ after hack. Yowza. These types of things will keep happening, which is why the ability to update your infrastructure automatically might be the most important investment you make in 2023. Check out this very good related piece as well.

    [docs] Deploy GPU workloads in Autopilot. New GA functionality in Google Kubernetes Engine. Autopilot clusters are fully managed—we provision, scale, update, and repair the cluster for you—and now you can request GPUs for your workload with a simple pod spec annotation.

    [blog] Every Company Needs a Developer Relations Team. A provocative, and good, argument. But in this economy, every role—especially one that can have fuzzy success metrics—requires good business justification.

    [article] Southwest’s Crisis Is A Lesson For Innovators. Some good analysis from Forrester Research.

    [blog] Southwest’s Christmas gift to enterprise tech vendors. Every breach, IT meltdown, or negative news story gets used by companies to sell their solution that fixes the world’s ills.

    [article] How to Motivate a Top Performer — When You Can’t Promote Them. This is useful advice if you’re a manager who can’t pull off a promotion for a staff member right now, or an employee who is thinking of what to ask for if your promotion feels far off.

    [blog] The Home Depot orchestrates self-service cloud solutions with Workflows. I like our fully managed stateful workflow engine, and more companies are using this sort of solution to orchestrate their processes.

    [docs] Vertex AI Jupyter Notebook tutorials. I have a goal to get more hands on with ML tech in 2023, and found this list of available notebooks very handy. Looks like a good way to try out a couple dozen scenarios.

    ##

    Want to get this update sent to you every day? Subscribe to my RSS feed or subscribe via email below:

  • Daily Wrap Up – January 4, 2023 (#002)

    The vendor-driven content machines haven’t woken up yet in the new year, so I read a lot of practitioner and analyst content today. Check out some good pieces that range from very specific (ML use case) to general purpose (evaluating new tech).

    [article] Why Are We So Bad at Mean Time to Repair (MTTR)? Good article that explains MTTR, and what it takes to get our systems back online faster.

    [blog] What Is Cloud-Native Architecture? Not a bad writeup, but I don’t agree with chunks of it. Specifically, “cloud native” isn’t about containers or meshes. I like to say that “Cloud-native software is built for scale, built for continuous change, built to tolerate failure, and built for manageability.” And I think it’s very difficult to truly do those things without cloud-based managed services.

    [blog] 6 Open-Source API Gateways. Do you use an API gateway? If you want to use one, you’ve got lots of commercial choices, as well as open source ones.

    [article] Predictions for gaming in 2023. I sponsor a gaming account as part of my day job at Google Cloud, so I’ve paid more attention to the space lately. This is a good set of predictions for 2023, and even if you’re not in the gaming industry, you may get insight into how emerging tech ideas are being used.

    [article] Kubernetes 1.26 Released with Image Registry Changes, Enhanced Resource Allocation, and Metrics. Saw data last month that said most Kubernetes users were 18 months behind on the latest release. Here’s another release to fall behind on! But remember, you prioritize regular upgrades for the security fixes, not just the new and shiny features.

    [article] Why and How to Evaluate Emerging Technologies. We can feel overwhelmed trying to keep track of all the new things going on. Don’t stress it, but also don’t completely ignore what’s happening in our industry.

    [article] How to Grow Your Top Line in a Down Market. This will likely be a year when many folks retreat a bit and try to weather the economic storm. Others, though, won’t just focus on fixing the bottom line; they’ll work on the top line too.

    [blog] Streamlining gcloud with a custom CLI for serverless JavaScript developers. Nice little extension of our Google Cloud CLI tool that makes it even easier to deploy an app to the cloud.

    [article] Data in 2023: Rethink the Modern Data Stack. We can process more data, and do more with it, than ever before. That’s awesome. It also means the experience has gotten richer, and more complex. This post calls out a need for simplification.

    [blog] Selecting the Best Image for Each Merchant Using Exploration and Machine Learning. I liked this look at doing A/B testing and using ML to find the right image to promote a restaurant.

    ##

    Want to get this update sent to you every day? Subscribe to my RSS feed or subscribe via email below:

  • Daily Wrap Up – January 3, 2023 (#001)

    Let’s start something new in 2023! Every day, I consume a lot of media created by smart people. Media such as blog posts, videos, articles, and podcasts. I share a subset of those via links in places like Twitter and LinkedIn. I’ll continue doing that, but also want to centralize and “own” a bit more of where I share. So, at the end of each work day (Pacific Time), I’ll publish a new “daily wrap up” post which includes more of the best things I consumed that day. If you want to automatically receive it, you can subscribe via RSS or email (below).

    On to today’s links …

    [article] Google Publishes Technique for AI Language Model Self-Improvement. Lots of AI hype out there, but pay attention to some of the fundamental improvements happening underneath it.

    [blog] You Can Use Both Features and Benefits in Your Developer Marketing. My friend Adam does a good job explaining what good developer marketing can look like.

    [blog] Data augmentation with BigQuery and Google Knowledge Graph. I thought I knew most (all?) of the Google Cloud services, but I just came across this new one last month. It has billions of entries that you can search.

    [blog] “Supercloud” RIP. I’m not a believer in the idea that you’re better off creating some sort of cloud-agnostic franken-stack that somehow lets you use the best of each cloud. Some have done it, and even done it well, but for most, it’s a waste of time and a distraction.

    [blog] You Want Modules, Not Microservices. Pragmatic look at what really matters when architecting your system.

    [blog] Accelerating Model Deployment using Transfer Learning and Vertex AI. Detailed example that touches on Tensorflow, Vertex, and more.

    [blog] 17 Compelling Reasons To Start Ditching TypeScript Now. I rarely come across people griping about TypeScript—every OTHER language? sure!—so this post stood out to me.

    [article] 2023 could be the year of public cloud repatriation. I just don’t see it. Some workloads? Sure. But major re-investments in on-premises data centers and self-managed software? Not by most who are looking to lead their industries.

    ##

    Want to get this update sent to you every day? Subscribe to my RSS feed or subscribe via email below:

  • How do you back up and restore your (stateful) Kubernetes workloads? Here’s a new cloud-native option.

    The idea of “backup and restore” in a complex distributed system is a bit weird. Is it really even possible? Can you snapshot all the components of an entire system at a single point in time, inclusive of all the side effects in downstream systems? I dunno. But you need to at least have a good recovery story for each of your major stateful components! While Kubernetes started out as a terrific orchestrator for stateless containers, it’s also matured as a runtime for stateful workloads. Lots of folks are now using Kubernetes to run databases, event processors, ML models, and even “legacy” apps that maintain local state. Until now, public cloud users have only had DIY or third-party options for backing up their Kubernetes clusters, but not anymore. Google Cloud just shipped a new built-in Backup for Google Kubernetes Engine (GKE) feature, and I wanted to try it out.

    What Backup for GKE does

    Basically, it captures the resources—at the cluster or namespace level—and persistent volumes within a given cluster at a specific point in time. It does not back up cluster configurations themselves (e.g. node pool size, machine types, enabled cluster features). For that, you’d likely have an infrastructure-as-code approach for stamping out clusters (using something like Terraform), and use Backup for GKE to restore the state of your running app. This diagram from the official docs shows the architecture:

    Architecture of Backup for GKE

    A Kubernetes cluster backup comes from a “backup plan” that defines the scope of a given backup. With these, you choose a cluster to back up, which namespaces you want backed up, and a schedule (if any). To restore a backup into an existing cluster, you execute a pre-defined “restore plan.” All of this is part of a fully managed Google Cloud service, so you’re not stuck operating any of the backup machinery.

    Setting up Backup for GKE on a new cluster

    Backup for GKE works with existing clusters (see Appendix A below), but I wanted to try it out on a fresh cluster first.

    I started with a GKE standard cluster. First, I made sure to choose a Kubernetes version that supported the Backup feature. Right now, that’s Kubernetes 1.24 or higher.

    I also turned on two features at the cluster-level. The first was Workload Identity. This security feature enforces more granular, workload-specific permissions to access other Google Cloud services.

    The second and final feature to enable is Backup for GKE. This injects the agent into the cluster and connects it to the control plane.
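
    If you prefer scripting your setup, a gcloud command along these lines should produce a comparable cluster. The cluster name here is mine, and you should double-check the flags and available 1.24+ versions against the current docs:

    # depending on the feature's release stage, you may need the "gcloud beta" surface
    gcloud container clusters create backup-demo-cluster \
      --zone us-central1-c \
      --release-channel rapid \
      --workload-pool=seroter-project-base.svc.id.goog \
      --addons=BackupRestore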

    Deploying a stateful app to Kubernetes

    Once my cluster was up and running, I wanted to deploy a simple web application to it. What’s the app? I created a poorly-written Go app that has a web form to collect support tickets. After you submit a ticket, I route it to Google Cloud Pub/Sub, write an entry into a directory, and then take the result of the cloud request and jam the identifier into a file in another directory. What does this app prove? Two things. First, it should flex Workload Identity by successfully publishing to Pub/Sub. And second, I wanted to observe how stateful backups worked, so I’m writing files to two directories: one backed by a persistent volume, and one backed by a local (node) volume.

    I built and containerized the app automatically by using a Cloud Buildpack within a Cloud Build manifest, and invoking a single command:

    gcloud builds submit --config cloudbuild.yaml
    

    I then logged into my just-created GKE cluster and created a new namespace to hold my application and specific permissions.

    kubectl create ns demos
    

    To light up Workload Identity, you create a local service account in a namespace and map it to an existing Google Cloud IAM account that has the permissions the application should have. I created a Kubernetes service account:

    kubectl create serviceaccount webapp-sa --namespace demos
    

    And then I annotated the service account with the mapping to an IAM account (demo-container-app-user) which triggers the impersonation at runtime:

    kubectl annotate serviceaccount webapp-sa --namespace demos iam.gke.io/gcp-service-account=demo-container-app-user@seroter-project-base.iam.gserviceaccount.com
    
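    One related step that’s easy to miss, if you’re following along: the IAM service account itself has to allow impersonation from that Kubernetes service account. Per the Workload Identity docs, that one-time binding looks like this:

    gcloud iam service-accounts add-iam-policy-binding \
      demo-container-app-user@seroter-project-base.iam.gserviceaccount.com \
      --role roles/iam.workloadIdentityUser \
      --member "serviceAccount:seroter-project-base.svc.id.goog[demos/webapp-sa]"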

    Sweet. Finally, there’s the Kubernetes deployment YAML that points to my app container, service account, and the two volumes used by my app. At the top is my definition of the persistent volume, and then the deployment itself.

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: pvc-output
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 5Gi
      storageClassName: standard-rwo
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: go-pubsub-publisher-deployment
    spec:
      selector:
        matchLabels:
          app: go-pubsub-publisher
      template:
        metadata:
          labels:
            app: go-pubsub-publisher
        spec:
          containers:
          - name: go-pubsub-publisher
            image: gcr.io/seroter-project-base/go-pubsub-publisher:34749b85-afbb-4b59-98cc-4d5d790eb325
            volumeMounts:
              - mountPath: /logs
                name: log-volume
              - mountPath: /acks
                name: pvc-output-volume
            resources:
              requests:
                memory: "64Mi"
                cpu: "300m"
              limits:
                memory: "128Mi"
                cpu: "500m"
            ports:
            - containerPort: 8080
          serviceAccountName: webapp-sa
          securityContext:
            runAsUser: 1000
            runAsGroup: 3000
            fsGroup: 2000
          volumes:
            - name: log-volume
              emptyDir: {}
            - name: pvc-output-volume
              persistentVolumeClaim:
                claimName: pvc-output
    

    I applied the above manifest (and a services definition) to my GKE cluster with the following command:

    kubectl apply -f k8s/. -n demos
    

    A moment afterwards, I saw a deployment and service. The deployment showed two associated volumes, including the auto-created persistent disk based on my declarative request.

    Let’s triple check that. I got the name of the pod and got a shell into the running container. See below that both directories show up, and my app isn’t aware of which one is from a persistent volume and which is not.
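
    If you want to replicate that check, it’s a couple of kubectl commands (your pod name will differ):

    kubectl get pods -n demos
    kubectl exec -it <pod-name> -n demos -- /bin/sh
    # inside the container, both directories are visible
    ls /logs /acks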

    I pulled up the web page for the app, and entered a few new “support tickets” into the system. The Pub/Sub UI lets me pull messages from a topic subscription, and we see my submitted tickets there.

    The next thing to check is the container’s volumes. Sure enough, I saw the contents of each message written to the local directory (/logs) and the message IDs written to the persistent directory (/acks).

    Running a backup and restore

    Let’s back that thing up.

    Backup plans are tied to a cluster. You can see here that my primary cluster (with our deployed app) and new secondary cluster (empty) have zero plans.

    I clicked the “create a backup plan” button at the top of this page, and got asked for some initial plan details.

    That all seemed straightforward. Then it got real. My next options included the ability to back up ALL the namespaces of the cluster, specific ones, or “protected” (more customized) configs. I just chose our “demos” namespace for backup. Also note that I could choose to back up persistent volume data and control encryption.

    Next, I was asked to choose the frequency of backups. This is defined in the form of a CRON expression. I could back up every few minutes, once a month, or every year. If I leave this “schedule” empty, this becomes an on-demand backup plan.

    After reviewing all my settings, I saved the backup plan. Then I manually kicked off a backup by providing the name and retention period for the backup.
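
    As an aside, everything I clicked through should be scriptable too. Assuming the current gcloud surface for Backup for GKE (names here are placeholders), the plan and an on-demand backup look roughly like this:

    gcloud beta container backup-restore backup-plans create demos-backup-plan \
      --project seroter-project-base \
      --location us-central1 \
      --cluster projects/seroter-project-base/locations/us-central1-c/clusters/<primary-cluster> \
      --selected-namespaces demos \
      --include-volume-data

    gcloud beta container backup-restore backups create demos-backup-1 \
      --project seroter-project-base \
      --location us-central1 \
      --backup-plan demos-backup-plan \
      --retain-days 3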

    To do anything with this backup, I need a “restore plan.” I clicked the button to create a new restore plan, and was asked to connect it to a backup plan, and a target cluster.

    Next, I had the choice of restoring some, or all, namespaces. In real life, you might back up everything, and then selectively restore. I like that you’re asked about conflict handling, which determines what happens if the target cluster already has the specified namespace in there. There are also a handful of flexible options for restoring volume data, ranging from creating new volumes, to re-using existing, to not restoring anything.

    After that, I was asked about cluster-scoped resources. It pre-loaded a few API groups and Object kinds to restore, and offered me the option to overwrite any existing resources.

    Finally, I got asked for any substitution rules to swap backed up values for different ones. With that, I finished my restore plan and had everything I needed to test my backup.

    I set up a restore, which basically just involved choosing a restore plan (which is connected to a backup, and target cluster). In just a few moments, I saw a “succeeded” message and it looked like it worked.

    When I checked out the GKE “workloads” view, I saw both the original and “restored” deployment running.

    I logged into the “secondary” GKE cluster and saw my custom namespace and workload. I also checked, and saw that my custom service account (and Workload Identity-ready annotation) came over in the restore action.

    Next, I grabbed a shell into the container to check my stateful data. What did I find? The “local” volume from the original container (“logs”) was empty. Which makes sense. That wasn’t backed by a persistent disk. The “acks” directory, on the other hand, was backed up, and shows up intact as part of the restore.

    To test out my “restored” app instance, I submitted a new ticket, saw it show up in Pub/Sub (it just worked, as Workload Identity was in place), and also saw the new log file, and updated “ids.txt” file.

    Pretty cool! With Backup for GKE, you don’t deal with the installation, patching, or management of your backup infrastructure, and get a fairly sophisticated mechanism for resilience in your distributed system.

    To learn more about this, check out the useful documentation, and these two videos: Introduction to Backup for GKE, and How to enable GKE Backup.

    Appendix A: Setting up Backup for GKE on an existing cluster

    Backup for GKE doesn’t only work with new clusters. You can add it to most existing GKE clusters. And these clusters can act as either sources or targets!

    First, let’s talk about GKE Autopilot clusters. These are basically hyper-automated GKE standard clusters that incorporate all of Google’s security and scalability best practices. An Autopilot cluster doesn’t yet expose the Backup for GKE feature at creation time, but you can apply it after the fact. You also need to ensure you’re on Kubernetes 1.24 or higher. Workload Identity is enabled by default, so there’s nothing you need to do there.

    But let’s talk about an existing GKE standard cluster. If you provision one from scratch, the default security option is to use a service account for the node pool identity. What this means is that any workloads in the cluster will have the same permissions as that account.

    If I provision a cluster (cluster #1) like so, the app from above does not work. Why? The “default compute service account” doesn’t have permission to write to a Pub/Sub topic. A second security option is to use a specific service account with the minimum set of permissions needed for the node’s workloads. If I provision cluster #2 and choose a service account with rights to publish to Pub/Sub, my app does work.

    The third security option relates to the access scopes for the cluster. This is a legacy method for authorization. The default setting is “allow default access” which offers a limited set of OAuth-based permissions. If I build a GKE cluster (cluster #3) with a default service account and “allow full access to all cloud APIs,” then my app above does work, because its workloads get wide-ranging access to all the cloud APIs.

    For a GKE standard cluster configured in any of the three ways above, I cannot install Backup for GKE. Why? I have to first enable Workload Identity. Once I edited the three clusters’ settings to enable Workload Identity, my app behaved the same way (not work, work, work)! That surprised me. I expected it to stop using the cluster credentials and require a Workload Identity assignment. What went wrong? For an existing cluster, turning on Workload Identity alone doesn’t trigger the necessary changes for existing node pools. Any new node pools would have everything enabled, but you have to explicitly turn on the GKE Metadata Server for any existing node pools.

    This GKE Metadata Server is automatically turned on for any new node pools when you enable Workload Identity, and if you choose to install Workload Identity on a new cluster, it’s also automatically enabled for the first node pool. I didn’t totally understand all this until I tried out a few scenarios!

    Once you’re running a supported version of Kubernetes and have Workload Identity enabled on a cluster, you can enroll it in Backup for GKE.
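
    To summarize that sequence for an existing standard cluster, here’s roughly what it looks like from the CLI (cluster and node pool names are placeholders):

    # enable Workload Identity on the cluster
    gcloud container clusters update <my-cluster> --zone us-central1-c \
      --workload-pool=seroter-project-base.svc.id.goog

    # existing node pools need the GKE Metadata Server turned on explicitly
    gcloud container node-pools update <my-node-pool> --cluster <my-cluster> --zone us-central1-c \
      --workload-metadata=GKE_METADATA

    # then enable the Backup for GKE agent
    gcloud container clusters update <my-cluster> --zone us-central1-c \
      --update-addons=BackupRestore=ENABLED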

  • Continuously deploy your apps AND data? Let’s try to use Liquibase for BigQuery changes.

    Want to constantly deploy updates to your web app through automation? Not everyone does it, but it’s a mostly solved problem with mature patterns and tools that make it possible. Automated deployments of databases, app services, and data warehouses? Also possible, but not something I personally see done as often. Let’s change that!

    Last month, I was tweeting about Liquibase, and their CTO and co-founder pointed out to me that Google Cloud contributed a BigQuery extension. Given that Liquibase is a well-known tool for automating database changes, I figured it was time to dig in and see how it worked, especially for a fully managed data warehouse like BigQuery. Specifically, I wanted to prove out four things:

    1. Use the Liquibase CLI locally to add columns to a BigQuery table. This is an easy way to get started!
    2. Use the Liquibase Docker image to add columns to a BigQuery table. See how to deploy changes through a Docker container, which makes later automation easier.
    3. Use the Liquibase Docker image within Cloud Build to automate deployment of a BigQuery table change. Bring in continuous integration (and general automation service) Google Cloud Build to invoke the Liquibase container to push BigQuery changes.
    4. Use Cloud Build and Cloud Deploy to automate the build and deployment of the app to GKE along with a BigQuery table change. This feels like the ideal state, where Cloud Build does app packaging, and then hands off to Cloud Deploy to push BigQuery changes (using the Docker image) and the web app through dev/test/prod.

    I learned a lot of new things by performing this exercise! I’ll share all my code and lessons learned about Docker, Kubernetes, init containers, and Liquibase throughout this post.

    Scenario #1 – Use Liquibase CLI

    The concepts behind Liquibase are fairly straightforward: define a connection string to your data source, and create a configuration file that represents the desired change to your database. A Liquibase-driven change isn’t oriented toward adding data itself to a database (although it can), but toward making structural changes like adding tables, creating views, and adding foreign key constraints. Liquibase also does things like change tracking, change locks, and assistance with rollbacks.

    While it directly integrates with Java platforms like Spring Boot, you can also use it standalone via a CLI or Docker image.

    I downloaded the CLI installer for my Mac, which added the bits to a local directory. And then I checked to see if I could access the liquibase CLI from the console.

    Next, I downloaded the BigQuery JDBC driver, which is what Liquibase uses to connect to BigQuery. The downloaded package includes the JDBC driver along with a “lib” folder containing a bunch of dependencies.

    I added *all* of those files—the GoogleBigQueryJDBC42.jar file and everything in the “lib” folder—to the “lib” folder included in the liquibase install directory.

    Next, I grabbed the latest BigQuery extension for Liquibase and installed that single JAR file into the same “lib” folder in the local liquibase directory. That’s it for getting the CLI properly loaded.

    What about BigQuery itself? Anything to do there? Not really. When experimenting, I got “dataset not found” errors from Liquibase when using a specific region like “us-west1”, so I created a dataset in the wider “US” multi-region and everything worked fine.

    I added a simple table to this dataset and started it off with two columns.
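
    For reference, recreating that starting point from the bq CLI looks something like this (I’m assuming the two starter columns that my demo app in scenario #4 writes to):

    bq --location=US mk --dataset seroter-project-base:employee_dataset
    bq mk --table seroter-project-base:employee_dataset.names_1 empid:STRING,fullname:STRING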

    Now I was ready to trigger some BigQuery changes! I had a local folder (doesn’t need to be where the CLI was installed) with two files: liquibase.properties, and changelog.yaml. The properties file (details here) includes the database connection string, among other key attributes. I turned on verbose logging, which was very helpful in finding obscure issues with my setup! Also, I want to use environmental credentials (saved locally, or available within a cloud instance by default) versus entering creds in the file, so the OAuthType is set to “3”.

    #point to where the file is containing the changelog to execute
    changelogFile: changelog.yaml
    #identify which driver to use for connectivity
    driver: com.simba.googlebigquery.jdbc.Driver
    #set the connection string for bigquery
    url: jdbc:bigquery://https://googleapis.com/bigquery/v2:443;ProjectId=seroter-project-base;DefaultDataset=employee_dataset;OAuthType=3;
    #log all the things
    logLevel: 0
    #if not using the "hub" features
    liquibase.hub.mode=off
    

    Next I created the actual change log. There are lots of things you can do here, and change files can be authored in JSON, XML, SQL, or YAML. I chose YAML, because I know how to have a good time. The BigQuery driver supports most of the Liquibase commands, and I chose the one to add a new column to my table.

    databaseChangeLog:
      - changeSet:
          id: addColumn-example1
          author: rseroter
          changes:
            - addColumn:
                tableName: names_1
                columns:
                - column:
                    name: location
                    type: STRING
    

    Once you get all the setup in place, the actual Liquibase stuff is fairly simple! To execute this change, I jumped into the CLI, navigated to the folder holding the properties file and change log, and issued a single command.

    liquibase --changeLogFile=changelog.yaml update

    Assuming you have all the authentication and authorization settings correct and files defined and formatted in the right way, the command should complete successfully. In BigQuery, I saw that my table had a new column.

    Note that this command is idempotent. I can execute it again and again with no errors or side effects. After I executed the command, I saw two new tables added to my dataset. If I had set the “liquibaseSchemaName” property in the properties file, I could have put these tables into a different dataset of my choosing. What are they for? The DATABASECHANGELOGLOCK table is used to create a “lock” on the database change so that only one process at a time can make updates. The DATABASECHANGELOG table stores details of what was done, when. Be aware that each changeset itself is unique, so if I tried to run a new change (add a different column) with the same changeset id (above, set to “addColumn-example1”), I’d get an error.
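
    If you’re curious, you can inspect that audit history with an ordinary query. The column names below come from the standard Liquibase changelog schema:

    bq query --use_legacy_sql=false \
      'SELECT id, author, dateexecuted FROM employee_dataset.DATABASECHANGELOG ORDER BY dateexecuted'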

    That’s it for the CLI example. Not too bad!

    Scenario #2 – Use Liquibase Docker image

    The CLI is cool, but maybe you want an even more portable way to trigger a database change? Liquibase offers a Docker image that has the CLI and necessary bits loaded up for you.

    To test this out, I fired up an instance of the Google Cloud Shell—this is a dev environment that you can access within our Console or standalone. From here, I created a local directory (lq) and added folders for “changelog” and “lib.” I uploaded all the BigQuery JDBC JAR files, as well as the Liquibase BigQuery driver JAR file.

    I also uploaded the liquibase.properties file and changelog.yaml file to the “changelog” folder in my Cloud Shell. I opened the changelog.yaml file in the editor, and updated the changeset identifier and set a new column name.

    All that’s left is to start the Docker container. Note that you might find it easier to create a new Docker image based on the base Liquibase image with all the extra JAR files embedded within it instead of schlepping the JARs all over the place. In my case here, I wanted to keep it all separate. To ensure that the Liquibase Docker container “sees” all my config files and JAR files, I needed to mount volumes when I started the container. The first volume mount maps from my local “changelog” directory to the “/liquibase/changelog” directory in the container. The second maps from the local “lib” directory to the right spot in the container. And by mounting all those JARs into the container’s “lib” directory—while also setting the “--include-system-classpath” flag to ensure it loads everything it finds there—the container has everything it needs. Here’s the whole Docker command:

    docker run --rm -v /home/richard/lq/changelog:/liquibase/changelog -v /home/richard/lq/lib:/liquibase/lib liquibase/liquibase --include-system-classpath=true --changeLogFile=changelog/changelog.yaml --defaultsFile=/liquibase/changelog/liquibase.properties update
    

    After 30 seconds or so, I saw the new column added to my BigQuery table.

    To be honest, this doesn’t feel like it’s that much simpler than just using the CLI, but, by learning how to use the container mechanism, I could now embed this database change process into a container-native cloud build tool.
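
    And if you do go the custom-image route I mentioned above, it’s only a couple of lines of Dockerfile, something like:

    FROM liquibase/liquibase
    # bake in the BigQuery JDBC driver, its dependencies, and the Liquibase BigQuery extension
    COPY lib/*.jar /liquibase/lib/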

    Scenario #3 – Automate using Cloud Build

    Those first two scenarios are helpful for learning how to do declarative changes to your database. Now it’s time to do something more automated and sustainable. In this scenario, I tried using Google Cloud Build to automate the deployment of my database changes.

    Cloud Build runs each “step” of the build process in a container. These steps can do all sorts of things: compiling your code, running tests, pushing to artifact storage, or deploying a workload. Since it can honestly run any container, we could also use the Liquibase container image as a “step” of the build. Let’s see how it works.

    My first challenge related to getting all those JDBC and driver JAR files into Cloud Build! How could the Docker container “see” them? To start, I put all the JAR files and config files (updated with a new column named “title”) into Google Cloud Storage buckets. This gave me easy, anywhere access to the files.

    Then, I decided to take advantage of Cloud Build’s built-in volume for sharing data between the independent build steps. This way, I could retrieve the files, store them, and then the Liquibase container could see them on the shared volume. In real life, you’d probably grab the config files from a Git repo, and the JAR files from a bucket. We’ll do that in the next scenario! Be aware that there’s also a project out there for mounting Cloud Storage buckets as volumes, but I didn’t feel like trying to do that. Here’s my complete Cloud Build manifest:

    steps: 
    - id: "Get Liquibase Jar files"
      name: 'gcr.io/cloud-builders/gsutil'
      dir: 'lib'
      args: ['cp', 'gs://liquibase-jars/*.jar', '/workspace/lib']
    - id: "Get Liquibase config files"
      name: 'gcr.io/cloud-builders/gsutil'
      dir: 'changelog'
      args: ['cp', 'gs://liquibase-configs/*.*', '/workspace/changelog']
    - id: "Update BQ"
      name: 'gcr.io/cloud-builders/docker'
      args: [ "run", "--network=cloudbuild", "--rm", "--volume", "/workspace/changelog:/liquibase/changelog", "--volume", "/workspace/lib:/liquibase/lib", "liquibase/liquibase", "--include-system-classpath=true", "--changeLogFile=changelog/changelog.yaml", "--defaultsFile=/liquibase/changelog/liquibase.properties", "update" ]
    

    The first “step” uses a container that’s pre-loaded with the Cloud Storage CLI. I executed the “copy” command and put all the JAR files into the built-in “workspace” volume. The second step does something similar by grabbing all the “config” files and dropping them into another folder within the “workspace” volume.

    Then the “big” step executed a virtually identical Docker “run” command as in scenario #2. I pointed to the “workspace” directories for the mounted volumes. Note the “--network” flag, a bit of magic that lets the container use the build’s default credentials.

    I jumped into the Google Cloud Console and created a new Cloud Build trigger. Since I’m not (yet) using a git repo for configs, but I have to pick SOMETHING when building a trigger, I chose a random repo of mine. I chose an “inline” Cloud Build definition and pasted in the YAML above.

    That’s it. I saved the trigger, ensured the “Cloud Build” account had appropriate permissions to update BigQuery, and “ran” the Cloud Build job.

    I saw the new column in my BigQuery table as a result, and if I looked at the “change table” managed by Liquibase, I saw each of the three changes we’d made so far.

    Scenario #4 – Automate using Cloud Build and Cloud Deploy

    So far so good. But it doesn’t feel “done” yet. What I really want is to take a web application that writes to BigQuery, and deploy that, along with BigQuery changes, in one automated process. And I want to use the “right” tools, so I should use Cloud Build to package the app, and Google Cloud Deploy to push the app to GKE.

    I first built a new web app using Node.js. This very simple app asks you to enter the name of an employee, and it adds that employee to a BigQuery table. I’m seeking seed funding for this app now if you want to invest. The heart of this app’s functionality is in its router:

    router.post('/', async function(req, res, next) {
        console.log('called post - creating row for ' + req.body.inputname)
    
        const row = [
            {empid: uuidv4(), fullname: req.body.inputname}
          ];
    
        // Insert data into a table
        await bigquery
        .dataset('employee_dataset')
        .table('names_1')
        .insert(row);
        console.log(`Inserted 1 rows`);
    
    
        res.render('index', { title: 'Employee Entry Form' });
      });
    

    Before defining our Cloud Build process that packages the app, I wanted to create all the Cloud Deploy artifacts. These artifacts consist of a set of Kubernetes deployment files, a Skaffold configuration, and finally, a pipeline definition. The Kubernetes deployments get associated to a profile (dev/prod) in the Skaffold file, and the pipeline definition identifies the target GKE clusters.

    Let’s look at the Kubernetes deployment file for the “dev” environment. To execute the Liquibase container before deploying my Node.js application, I decided to use Kubernetes init containers. These run (and finish) before the actual container you care about. But I had the same challenge as with Cloud Build. How do I pass the config files and JAR files to the Liquibase container? Fortunately, Kubernetes offers up Volumes as well. Basically, the below deployment file does the following things:

    • Creates an empty volume called “workspace.”
    • Runs an init container that executes a script to create the “changelog” and “lib” folders in the workspace volume. For whatever reason, the Cloud Storage CLI wouldn’t do it automatically for me, so I added this distinct step.
    • Runs an init container that git clones the latest config files from my GitHub project (no longer using Cloud Storage) and stashes them in the “changelog” directory in the workspace volume.
    • Runs a third init container to retrieve the JAR files from Cloud Storage and stuff them into the “lib” directory in the workspace volume.
    • Runs a final init container that mounts each directory to the right place in the container (using subpath references), and runs the “liquibase update” command.
    • Runs the application container holding our web app.

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: db-ci-deployment-dev
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: web-data-app-dev
      template:
        metadata:
          labels:
            app: web-data-app-dev
        spec:
          volumes:
          - name: workspace
            emptyDir: {}
          initContainers:
            - name: create-folders
              image: alpine
              command:
              - /bin/sh
              - -c
              - |
                cd liquibase
                mkdir changelog
                mkdir lib
                ls
                echo "folders created"
              volumeMounts:
              - name: workspace
                mountPath: /liquibase
                readOnly: false      
            - name: preload-changelog
              image: bitnami/git
              command:
              - /bin/sh
              - -c
              - |
                git clone https://github.com/rseroter/web-data-app.git
                cp web-data-app/db_config/* liquibase/changelog
                cd liquibase/changelog
                ls
              volumeMounts:
              - name: workspace
                mountPath: /liquibase
                readOnly: false
            - name: preload-jars
              image: gcr.io/google.com/cloudsdktool/cloud-sdk
              command: ["gsutil"]
              args: ['cp', 'gs://liquibase-jars/*', '/liquibase/lib/']
              volumeMounts:
              - name: workspace
                mountPath: /liquibase
                readOnly: false
            - name: run-lq
              image: liquibase/liquibase
              command: ["liquibase"]
              args: ['update', '--include-system-classpath=true', '--changeLogFile=/changelog/changelog.yaml', '--defaultsFile=/liquibase/changelog/liquibase.properties']
              volumeMounts:
              - name: workspace
                mountPath: /liquibase/changelog
                subPath: changelog
                readOnly: false
              - name: workspace
                mountPath: /liquibase/lib
                subPath: lib
                readOnly: false
          containers:
          - name: web-data-app-dev
            image: web-data-app
            env:
            - name: PORT
              value: "3000"
            ports:
              - containerPort: 3000
            volumeMounts:
            - name: workspace
              mountPath: /liquibase
    

    The only difference between the “dev” and “prod” deployments is that I named the running containers something different. Each deployment also has a corresponding “service.yaml” file that exposes the container with a public endpoint.

    Ok, so we have configs. That’s the hard part, and took me the longest to figure out! The rest is straightforward.

    I defined a skaffold.yaml file, which Cloud Deploy uses to render the right assets for each environment.

    apiVersion: skaffold/v2beta16
    kind: Config
    metadata:
     name: web-data-app-config
    profiles:
     - name: prod
       deploy:
         kubectl:
           manifests:
             - deployment-prod.yaml
             - service-prod.yaml
     - name: dev
       deploy:
         kubectl:
           manifests:
             - deployment-dev.yaml
             - service-dev.yaml
    

    Skaffold is a cool tool for local development, but I won’t go into it here. The only other asset we need for Cloud Deploy is the actual pipeline definition! Here, I’m pointing to my two Google Kubernetes Engine clusters (with platform-wide access scopes) that represent dev and prod environments.

    apiVersion: deploy.cloud.google.com/v1
    kind: DeliveryPipeline
    metadata:
     name: data-app-pipeline
    description: application pipeline for app and BQ changes
    serialPipeline:
     stages:
     - targetId: devenv
       profiles:
       - dev
     - targetId: prodenv
       profiles:
       - prod
    ---
    
    apiVersion: deploy.cloud.google.com/v1
    kind: Target
    metadata:
     name: devenv
    description: development GKE cluster
    gke:
     cluster: projects/seroter-project-base/locations/us-central1-c/clusters/cluster-seroter-gke-1110
    
    ---
    
    apiVersion: deploy.cloud.google.com/v1
    kind: Target
    metadata:
     name: prodenv
    description: production GKE cluster
    gke:
     cluster: projects/seroter-project-base/locations/us-central1-c/clusters/cluster-seroter-gke-1117
    

    I then ran the single command to deploy that pipeline (which doesn’t yet care about the Skaffold and Kubernetes files):

    gcloud deploy apply --file=clouddeploy.yaml --region=us-central1 --project=seroter-project-base
    

    In the Cloud Console, I saw a visual representation of my jazzy new pipeline.

    The last step is to create the Cloud Build definition which builds my Node.js app, stashes it into Google Cloud Artifact Registry, and then triggers a Cloud Deploy “release.” You can see that I point to the Skaffold file, which in turn knows where the latest Kubernetes deployment/service YAML files are. Note that I use a substitution value here with --images, where the “web-data-app” value in each Kubernetes deployment file gets swapped out with the newly generated image identifier.

    steps:
      - name: 'gcr.io/k8s-skaffold/pack'
        id: Build Node app
        entrypoint: 'pack'
        args: ['build', '--builder=gcr.io/buildpacks/builder', '--publish', 'gcr.io/$PROJECT_ID/web-data-app:$COMMIT_SHA']
      - name: gcr.io/google.com/cloudsdktool/cloud-sdk
        id: Create Cloud Deploy release
        args: 
            [
              "deploy", "releases", "create", "test-release-$SHORT_SHA",
              "--delivery-pipeline", "data-app-pipeline",
              "--region", "us-central1",
              "--images", "web-data-app=gcr.io/$PROJECT_ID/web-data-app:$COMMIT_SHA",
              "--skaffold-file", "deploy_config/skaffold.yaml"
            ]
        entrypoint: gcloud
    

    To make all this magic work, I went into Google Cloud Build to set up my new trigger. It points at my GitHub repo and refers to the cloudbuild.yaml file there.
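
    If you’d rather script the trigger setup, the gcloud equivalent looks roughly like this (the repo owner and name are placeholders for your own):

    gcloud builds triggers create github \
      --repo-owner=YOUR_GITHUB_USER \
      --repo-name=YOUR_REPO \
      --branch-pattern="^main$" \
      --build-config=cloudbuild.yaml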

    I ran my trigger manually (I could also set it to run on every check-in) to build my app and initiate a release in Cloud Deploy. The first part ran quickly and successfully.

    The result? It worked! My “dev” GKE cluster got a new workload and service endpoint, and my BigQuery table got a new column.

    When I went back into Cloud Deploy, I “promoted” this release to production, and it ran the production-aligned files and popped a workload into the other GKE cluster. It didn’t make any BigQuery changes, because we already made them on the previous run. In reality, you would probably have different BigQuery tables or datasets for each environment!
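
    For what it’s worth, you can also do the promotion from the CLI. It looks roughly like this, substituting the actual release name generated by the build:

    gcloud deploy releases promote \
      --release=test-release-SHORT_SHA \
      --delivery-pipeline=data-app-pipeline \
      --region=us-central1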

    Wrap up

    Did you make it this far? You’re amazing. It might be time to shift from just shipping the easy stuff through automation to shipping ALL the stuff via automation. Software like Liquibase gets you further along in that journey, and it’s good to see Google Cloud make it easier.

  • Building a long-running, serverless, event-driven system with as little code as possible


    Is code a liability or an asset? What it does should be an asset, of course. But there’s a cost to running and maintaining code. Ideally, we take advantage of (managed) services that minimize how much code we have to write to accomplish something.

    What if I want to accept a document from a partner or legacy business system, send out a request for internal review of that document, and then continue processing? In ye olden days, I’d build file watchers, maybe a database to hold the state of in-progress reviews, a poller that notified reviewers, and a web service endpoint to handle responses and update state in the database. That’s potentially a lot of code. Can we get rid of most of that?

    Google Cloud Workflows recently added a “callback” functionality which makes it easier to create long-running processes with humans in the middle. Let’s build out an event-driven example with minimal code, featuring Cloud Storage, Eventarc, Cloud Workflows, and Cloud Run.

    Step 1 – Configure Cloud Storage

    Our system depends on new documents getting added to a storage location. That should initiate the processing. Google Cloud Storage is a good choice for an object store.

    I created a new bucket named “loan-application-submissions” in the us-east4 region. At the moment, the bucket is empty.
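
    If you prefer the command line, creating that bucket is a one-liner (bucket names are globally unique, so yours will differ):

    gsutil mb -l us-east4 gs://loan-application-submissions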

    Step 2 – Create Cloud Run app

    The only code in our system is the application that’s used to review the document and acknowledge it. The app accepts a querystring parameter that includes the “callback URL” that points to the specific Workflow instance waiting for the response.

    I built a basic Go app with a simple HTML page, and a couple of server-side handlers. Let’s go through the heart of it. Note that the full code sample is on GitHub.

    // imports added for completeness; the Template, workflowdata, and
    // OAuth2TokenInfo types are defined in the full sample on GitHub
    import (
    	"encoding/json"
    	"fmt"
    	"html/template"
    	"net/http"
    	"strings"
    
    	"cloud.google.com/go/compute/metadata"
    	"github.com/labstack/echo/v4"
    	"github.com/labstack/echo/v4/middleware"
    )
    
    func main() {
    
    	fmt.Println("Started up ...")
    
    	e := echo.New()
    	e.Use(middleware.Logger())
    	e.Use(middleware.Recover())
    
    	t := &Template{
    		Templates: template.Must(template.ParseGlob("web/home.html")),
    	}
    
    	e.Renderer = t
    	e.GET("/", func(c echo.Context) error {
    		//load up object with querystring parameters
    		wf := workflowdata{LoanId: c.QueryParam("loanid"), CallbackUrl: c.QueryParam("callbackurl")}
    
    		//passing in the template name (not file name)
    		return c.Render(http.StatusOK, "home", wf)
    	})
    
    	//respond to POST requests and send message to callback URL
    	e.POST("/ack", func(c echo.Context) error {
    		loanid := c.FormValue("loanid")
    		callbackurl := c.FormValue("callbackurl")
    
    		fmt.Println("Sending workflow callback to " + callbackurl)
    
    		wf := workflowdata{LoanId: loanid, CallbackUrl: callbackurl}
    
    		//fetch an OAuth2 access token from the metadata server
    		oauthToken, errAuth := metadata.Get("instance/service-accounts/default/token")
    		if errAuth != nil {
    			fmt.Println(errAuth)
    		}
    
    		//parse the token response (avoid logging the token itself)
    		data := OAuth2TokenInfo{}
    		errJson := json.Unmarshal([]byte(oauthToken), &data)
    		if errJson != nil {
    			fmt.Println(errJson.Error())
    		}
    
    		//set up the callback request
    		workflowReq, errWorkflowReq := http.NewRequest("POST", callbackurl, strings.NewReader("{}"))
    		if errWorkflowReq != nil {
    			fmt.Println(errWorkflowReq.Error())
    		}
    
    		//add the OAuth header
    		workflowReq.Header.Add("authorization", "Bearer "+data.Token)
    		workflowReq.Header.Add("accept", "application/json")
    		workflowReq.Header.Add("content-type", "application/json")
    
    		//invoke the callback URL
    		client := &http.Client{}
    		workflowResp, workflowErr := client.Do(workflowReq)
    		if workflowErr != nil {
    			fmt.Printf("Error making callback request: %s\n", workflowErr)
    			return c.Render(http.StatusOK, "home", wf)
    		}
    		defer workflowResp.Body.Close()
    		fmt.Printf("Status code: %d\n", workflowResp.StatusCode)
    
    		return c.Render(http.StatusOK, "home", wf)
    	})
    
    	//simple startup
    	e.Logger.Fatal(e.Start(":8080"))
    }
    
    

    The “get” request shows the details that came in via the querystrings. The “post” request generates the required OAuth2 token, adds it to the header, and calls back into Google Cloud Workflows. I got stuck for a while because I was sending an ID token and the service expects an access token. There’s a difference! My colleague Guillaume Laforge, who doesn’t even write Go, put together the code I needed to generate the necessary OAuth2 token.

    From a local terminal, I ran a single command to push this source code into our fully managed Cloud Run environment:

    gcloud run deploy
    

    After a few moments, the app deployed, and I loaded it up in the browser with some dummy querystring values.
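
    Those dummy values mimic what the Workflow will eventually pass along, so the URL looked something like this:

    https://[CLOUD RUN URL]/?loanid=600&callbackurl=[CALLBACK URL]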

    Step 3 – Create Workflow with event-driven trigger

    That was it for coding! The rest of our system is composed of managed services: Cloud Workflows, plus Eventarc, which routes events in Google Cloud and triggers consumers.

    I created a new Workflow called “workflow-loans” and chose the new “Eventarc” trigger. This means that the Workflow starts up as a result of an event happening elsewhere in Google Cloud.

    A new panel popped up and asked me to name my trigger and pick a source. We offer nearly every Google Cloud service as a source for events. See here that I chose Cloud Storage. Once I chose the event provider, I’m offered a contextual set of events. I selected the “finalized” event which fires for any new object added to the bucket.

    Then, I’m asked to choose my storage bucket, and we have a nice picker interface. No need to manually type it in. Once I chose my bucket, which resides in a different region from my Workflow, I’m told as much.

    The final step is to add the Workflow definition itself. These can be in YAML or JSON. My Workflow accepts some arguments (properties of the Cloud Storage doc, including the file name), and runs through a series of steps. It extracts the loan number from the file name, creates a callback endpoint, logs the URL, waits for a callback, and processes the response.

    The full Workflow definition is below, and also in my GitHub repo.

    main:
        params: [args]
        steps:
            - setup_variables:
                #define and assign variables for use in the workflow
                assign:
                    - version: 100                  #can be numbers
                    - filename: ${args.data.name}   #name of doc
            - log_receipt:
                #write a log to share that we started up
                call: sys.log          
                args:
                    text: ${"Loan doc received"}
            - extract_loan_number:
                #pull out substring containing loan number
                assign:
                    - loan_number: ${text.substring(filename, 5, 8)}
            - create_callback:
                #establish a callback endpoint
                call: events.create_callback_endpoint
                args:
                    http_callback_method: "POST"
                result: callback_details
            - print_callback_details:
                #print out formatted URL
                call: sys.log
                args:
                    severity: "INFO"
                    # update with the URL of your Cloud Run service
                    text: ${"Callback URL is https://[INSERT CLOUD RUN URL HERE]?loanid="+ loan_number +"&callbackurl=" + callback_details.url}
            - await_callback:
                #wait impatiently
                call: events.await_callback
                args:
                    callback: ${callback_details}
                    timeout: 3600
                result: callback_request
            - print_callback_request:
                #log the result
                call: sys.log
                args:
                    severity: "INFO"
                    text: ${"Received " + json.encode_to_string(callback_request.http_request)}
            - return_callback_result:
                return: ${callback_request.http_request}
    

    I deployed the Workflow which also generated the Eventarc trigger itself.

    Step 4 – Testing it all out

    Let’s see if this serverless, event-driven system now works! To start, I dropped a new PDF named “loan600.pdf” into the designated Storage bucket.

    Immediately, Eventarc triggered a Workflow instance because that PDF was uploaded to Cloud Storage. See that the Workflow instance is in an “await_callback” stage.

    On the same page, notice the logs for the Workflow instance, including the URL for my Cloud Run app with all the right querystring parameters loaded.

    I plugged that URL into my browser and got my app loaded with the right callback URL.

    After clicking the “acknowledge loan submission” button which called back to my running Workflow instance, I switched back to Cloud Workflows and saw that my instance completed successfully.

    Summary

    There are many ways to solve the problem I called out here. I like this solution. By using Google Cloud Eventarc and Workflows, I eliminated a LOT of code. And since all these services, including Cloud Run, are fully managed serverless services, it only costs me money when it does something. When idle, it costs zero. If you follow along and try it for yourself, let me know how it goes!

  • Invest in yourself with my new Pluralsight course about personal productivity


    I’m actively trying to be less productive. You don’t hear that very often, do ya? These past couple pandemic years limited my travel, whether to an office or on an airplane, and I found myself working more than ever. But that’s proven temporary, thankfully, and I want to establish a model where I do fewer things, better, while also taking more time to relax and goof around.

    So when Pluralsight folks reached out to me and asked if I wanted to revisit my 2014 course about personal productivity tips, I jumped at the chance.

    I’ve learned a lot since 2014, and this was a good chance to capture new lessons learned while re-imagining the course as a whole. The result? A 75-minute training course, Productivity Tips for the Busy Tech Professional, that I’m very proud of.

    Whether you’re in tech or not, this course can help you become more intentional about what you do, and complete more tasks that matter. The three modules of the course are:

    1. Productivity Explained. Here, we look at the over-emphasis on busy-ness, what productivity is all about, what we want to try and avoid, and the sorts of things that get in our way.
    2. Productivity Systems and Tools. There are formal and informal systems and habits you can adopt to become more productive. I dig into six different systems and six different categories of tools. In my life, I use a mix of all of it.
    3. Productivity Tips. This module includes a series of specific tips that you can adopt or tweak to establish more control over how you work. Each one includes some examples of how to put it in practice.

    As always, I learned a lot by preparing this course and studying the latest research about personal productivity. I hope you watch and enjoy, and share any tips you have!

  • Loading data directly into a warehouse via your messaging engine? Here’s how this handy new feature works in Google Cloud.


    First off, I am NOT a data analytics person. My advice is sketchy enough when it comes to app development and distributed systems that I don’t need to overreach into additional areas. That said, we at Google Cloud quietly shipped a new data-related feature this week that sparked my interest, and I figured that we could explore it together.

    To be sure, loading data into a data warehouse is a solved problem. Many of us have done this via ETL (extract-transform-load) tools and streaming pipelines for years. It’s all very mature technology, even when steering your data towards newfangled cloud data warehouses like Google Cloud’s fully managed BigQuery. Nowadays, app developers can also insert directly into these systems from their code. But what about your event-driven apps? It could be easier than it is today! That’s why I liked this new subscription type for Google Cloud Pub/Sub—our messaging engine for routing data between systems—that is explicitly for BigQuery. That’s right, you can directly subscribe your data warehouse to your messaging system.

    Let’s try it out, end to end.

    First, I needed some data. BigQuery offers an impressive set of public data sets, including those with crime statistics, birth data summaries, GitHub activity, census data, and even baseball statistics. I didn’t choose any of those, because I wanted to learn more about how BigQuery works. So, I built a silly comma-separated file of “pet visits” to my imaginary pet store chain.

    1,"store400","2022-07-26 06:22:10","Mittens","cat","camp",806
    2,"store405","2022-07-26 06:29:15","Jessie","dog","bath",804
    3,"store400","2022-07-26 07:01:34","Ellie","dog","nailtrim",880
    4,"store407","2022-07-26 07:02:00","Rocket","cat","bath",802
    5,"store412","2022-07-26 07:06:45","Frank","cat","bath",853
    6,"store400","2022-07-26 08:08:08","Nala","cat","nailtrim",880
    7,"store407","2022-07-26 08:15:04","Rocky","dog","camp",890
    8,"store402","2022-07-26 08:39:16","Cynthia","bird","spa",857
    9,"store400","2022-07-26 08:51:14","Watson","dog","haircut",831
    10,"store412","2022-07-26 09:05:58","Manny","dog","camp",818

    I saved this data as “pets.csv” and uploaded it into a private, regional Google Cloud Storage Bucket.
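
    For reference, the upload itself is just a copy into the bucket (the bucket name here is made up; use your own):

    gsutil cp pets.csv gs://my-pet-visits-bucket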

    Excellent. Now I wanted this data loaded into a BigQuery table that I could run queries against. And, eventually, load new data into it as it flows through Pub/Sub.

    I’m starting with no existing data sets or tables in BigQuery. You can see here that all I have is my “project.” And there’s no infrastructure to provision or manage here, so all we have to think about is our data. Amazing.

    As an aside, we make it very straightforward to pull in data from all sorts of sources, even those outside of Google Cloud. So, this really can be a single solution for all your data analytics needs. Just sayin’. In this scenario, I wanted to add data to a BigQuery table, so I started by selecting my project and choosing to “create a dataset”, which is really just a container for data tables.

    Next, I picked my data set and clicked the menu option to “create table.” Here’s where it gets fun. I can create an empty table, upload some data, or point to object storage repos like Google Cloud Storage, Amazon S3, or Azure Blob Storage. I chose Cloud Storage. Then I located my Storage bucket and chose “CSV” as the file format. Other options include JSON, Avro, and Parquet. Then I gave my table a name (“visits_table”). So far so good.

    The last part of this table creation process involves schema definition. BigQuery can autodetect the schema (data types and such), but I wanted to define it manually. The graphical interface offers a way to define column name, data type, and whether it’s a required data point or not.
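
    For the curious, the schema I clicked together is roughly equivalent to this DDL (the dataset name is a placeholder of mine, and I kept the timestamp column as a STRING so it lines up with the Pub/Sub schema we define later):

    CREATE TABLE `pets_dataset.visits_table` (
      apptid INT64,
      storeid STRING,
      visitstamp STRING,
      petname STRING,
      animaltype STRING,
      servicetype STRING,
      customerid INT64
    );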

    After creating the table, I could see the schema and run queries against the data. For example, this is a query that returns the count of each animal type coming into my chain of pet stores for service.
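
    The query itself is nothing fancy, something along these lines (same placeholder dataset name as above):

    SELECT animaltype, COUNT(*) AS visit_count
    FROM `pets_dataset.visits_table`
    GROUP BY animaltype
    ORDER BY visit_count DESC;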

    You could imagine there might be some geospatial analysis, machine learning models, or other things we constantly do with this data set over time. That said, let’s hook it up to Pub/Sub so that we can push a real-time stream of “visits” from our event-driven architecture.

    Before we forget, we need to change permissions to allow Pub/Sub to send data to BigQuery tables. From within Google Cloud IAM, I chose to “include Google-provided role grants” in the list of principals, located my built-in Pub/Sub service account, and added the “BigQuery Data Editor” and “BigQuery Metadata Viewer” roles.
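
    If you’d rather grant those roles from the command line, it looks roughly like this (swap in your own project ID and project number; the account shown is the Google-managed Pub/Sub service agent):

    gcloud projects add-iam-policy-binding my-project \
      --member="serviceAccount:service-PROJECT_NUMBER@gcp-sa-pubsub.iam.gserviceaccount.com" \
      --role="roles/bigquery.dataEditor"

    gcloud projects add-iam-policy-binding my-project \
      --member="serviceAccount:service-PROJECT_NUMBER@gcp-sa-pubsub.iam.gserviceaccount.com" \
      --role="roles/bigquery.metadataViewer"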

    When publishing from Pub/Sub to BigQuery you have a couple of choices for how to handle the data. One option is to dump the entire payload into a single “data” field, which doesn’t sound exciting. The other option is to use a Pub/Sub schema so that the data fields map directly to BigQuery table columns. That’s better. I navigated to the Pub/Sub “Schemas” dashboard and created a new schema.

    If kids are following along at home, the full schema looks like this:

    {
        "type": "record",
        "name": "Avro",
        "fields": [
          {
            "name": "apptid",
            "type": "int"
          },
          {
            "name": "storeid",
            "type": "string"
          },
          {
            "name": "visitstamp",
            "type": "string"
          },
          {
            "name": "petname",
            "type": "string"
          },
          {
            "name": "animaltype",
            "type": "string"
          },
          {
            "name": "servicetype",
            "type": "string"
          },
          {
            "name": "customerid",
            "type": "int"
          }
        ]
    }
    

    We’re almost there. Now we just needed to create the actual Pub/Sub topic and subscription. I defined a new topic named “pets-topic”, and selected the box to “use a schema.” Then I chose the schema we created above.

    Now for the subscription itself. As you see below, there’s a “delivery type” for “Write to BigQuery” which is super useful. Once I chose that, I was asked for the dataset and table, and I chose the option to “use topic schema” so that the message body would map to the individual columns in the table.

    This is still a “regular” Pub/Sub subscription, so if I wanted to, I could set properties like message retention duration, expiration period, subscription filters, and retry policies.
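
    I did all of this in the Cloud Console, but a rough CLI equivalent would be the following (the schema and subscription names are mine, and the table reference uses my placeholder project and dataset):

    gcloud pubsub topics create pets-topic \
      --schema=pets-schema \
      --message-encoding=json

    gcloud pubsub subscriptions create pets-bq-sub \
      --topic=pets-topic \
      --bigquery-table=my-project:pets_dataset.visits_table \
      --use-topic-schema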

    Nothing else to it. And we did it all from the Cloud Console. To test this out, I went to my topic in the Cloud Console, and chose to send a message. Here, I sent a single message that conformed to the topic schema.
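
    The message body just needs JSON fields that match the schema; mine looked something like this (values invented for the test):

    {"apptid": 11, "storeid": "store400", "visitstamp": "2022-07-26 09:30:00", "petname": "Biscuit", "animaltype": "dog", "servicetype": "bath", "customerid": 820}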

    Almost immediately, my BigQuery table got updated and I saw the new data in my query results.

    When I searched online, I saw various ways that people have stitched together their (cloud) messaging engines with their data warehouse. But from what I can tell, what we did here is the simplest, most integrated way to pull that off. Try it out and tell me what you think!