Category: Google Cloud

Exploring a fast inner dev loop for Spring Boot apps targeting Google Cloud Run
It’s a gift to the world that no one pays me to write software any longer. You’re welcome. But I still enjoy coding and trying out a wide variety of things. Given that I rarely have hours upon hours to focus on writing software, I seek things that make me more productive with the time I have. My inner development loop matters. You know, the iterative steps we perform to write, build, test, and commit code.

So let’s say I want to build a REST API in Java. This REST API stores and returns the names of television characters. What’s the bare minimum that I need to get going?
- An IDE or code editor
- A database to store records
- A web server to host the app
- A route to reach the app
What are things I personally don’t want to deal with, especially if I’m experimenting and learning quickly?
- Provisioning lots of infrastructure. Either locally to emulate the target platform, or elsewhere to actually run my app. It takes time, and I don’t know what I need.
- Creating database stubs or mocks, or even configuring Docker containers to stand-in for my database. I want the real thing, if possible.
- Finding a container registry to use. All this stuff just needs to be there.
- Writing Dockerfiles to package an app. I usually get them wrong.
- Configuring API gateways or network routing rules. Just give me an endpoint.
Based on this, one of the quickest inner loop I know of involves Spring Boot, the Google Cloud SDK, Cloud Firestore, and Google Cloud Run. Spring Boot makes it easy to spin up API projects and it’s ORM capabilities make it simple to interact with a database. Speaking of databases, Cloud Firestore is powerful and doesn’t force me into a schema. That’s great when I don’t know the final state of my data structure. And Cloud Run seems like the single best way to run custom-built apps in the cloud. How about we run through this together?

On my local machine, I’ve installed Visual Studio Code—the FASTEST possible inner loop might have involved using the Google Cloud Shell and skipping any local work, but I still like doing local dev—along with the latest version of Java, and the Google Cloud SDK. The SDK comes with lots of CLI tools and emulators, including one for Firestore and Datastore (an alternate API).

Time to get to work. I visited start.spring.io to generate a project. I could choose a few dependencies from the curated list, including a default one for Google Cloud services, and another for exposing my data repository as a series of REST endpoints.

I generated the project, and opened it in Visual Studio Code. Then, I opened the pom.xml file and added one more dependency. While I’m using the Firestore database, I’m using it in “Datastore mode” which works better with Spring Data REST. Here’s my finished pom file.
```
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
	xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
	<modelVersion>4.0.0</modelVersion>
	<parent>
		<groupId>org.springframework.boot</groupId>
		<artifactId>spring-boot-starter-parent</artifactId>
		<version>2.4.4</version>
		<relativePath/> 
	</parent>
	<groupId>com.seroter</groupId>
	<artifactId>boot-gcp-run-firestore</artifactId>
	<version>0.0.1-SNAPSHOT</version>
	<name>boot-gcp-run-firestore</name>
	<description>Demo project for Google Cloud and Spring Boot</description>
	<properties>
		<java.version>11</java.version>
		<spring-cloud-gcp.version>2.0.0</spring-cloud-gcp.version>
		<spring-cloud.version>2020.0.2</spring-cloud.version>
	</properties>
	<dependencies>
		<dependency>
			<groupId>org.springframework.boot</groupId>
			<artifactId>spring-boot-starter-data-rest</artifactId>
		</dependency>
		<dependency>
			<groupId>com.google.cloud</groupId>
			<artifactId>spring-cloud-gcp-starter</artifactId>
		</dependency>
		<dependency>
			<groupId>com.google.cloud</groupId>
			<artifactId>spring-cloud-gcp-starter-data-datastore</artifactId>
			<version>2.0.2</version>
		</dependency>
		<dependency>
			<groupId>org.springframework.boot</groupId>
			<artifactId>spring-boot-starter-test</artifactId>
			<scope>test</scope>
		</dependency>
	</dependencies>
	<dependencyManagement>
		<dependencies>
			<dependency>
				<groupId>org.springframework.cloud</groupId>
				<artifactId>spring-cloud-dependencies</artifactId>
				<version>${spring-cloud.version}</version>
				<type>pom</type>
				<scope>import</scope>
			</dependency>
			<dependency>
				<groupId>com.google.cloud</groupId>
				<artifactId>spring-cloud-gcp-dependencies</artifactId>
				<version>${spring-cloud-gcp.version}</version>
				<type>pom</type>
				<scope>import</scope>
			</dependency>
		</dependencies>
	</dependencyManagement>

	<build>
		<plugins>
			<plugin>
				<groupId>org.springframework.boot</groupId>
				<artifactId>spring-boot-maven-plugin</artifactId>
			</plugin>
		</plugins>
	</build>

</project>
```
Let’s sling a little code, shall we? Spring Boot almost makes this too easy. First, I created a class to describe a “character.” I started with just a couple of characteristics—full name, and role.
```
package com.seroter.bootgcprunfirestore;

import com.google.cloud.spring.data.datastore.core.mapping.Entity;
import org.springframework.data.annotation.Id;

@Entity
class Character {

    @Id
    private Long id;
    private String FullName;
    private String Role;
    
    public String getFullName() {
        return FullName;
    }
    public String getRole() {
        return Role;
    }
    public void setRole(String role) {
        this.Role = role;
    }
    public void setFullName(String fullName) {
        this.FullName = fullName;
    }
}
```
All that’s left is to create a repository resource and Spring Data handles the rest. Literally!
```
package com.seroter.bootgcprunfirestore;

import com.google.cloud.spring.data.datastore.repository.DatastoreRepository;
import org.springframework.data.rest.core.annotation.RepositoryRestResource;

@RepositoryRestResource
interface CharacterRepository extends DatastoreRepository<Character, Long> {
    
}
```
That’s kinda it. No other code is needed. Now I want to test it out and see if it works. The first option is to spin up an instance of the Datastore emulator—not Firestore since I’m using the Datastore API—when my app starts. That’s handy. It’s one line in my app.properties file.
```
spring.cloud.gcp.datastore.emulator.enabled=true
```
When I execute ./mvnw spring-boot:run I see the app compile, and get a notice that the Datastore emulator was started up. I went to Postman to call the API. First I added a record.

Then I called the endpoint to retrieve the store data. It worked. It’s great that Spring Data REST wires up all these endpoints automatically.

Now, I really like that I can start up the emulator as part of the build. But, that instance is ephemeral. When I stop running the app locally, my instance goes away. What if my inner loop involves constantly stopping the app to make changes, recompile, and start up again? Don’t worry. It’s also easy to stand up the emulator by itself, and attach my app to it. First, I ran gcloud beta emulators datastore start to get the local instance running in about 2 seconds.

Then I updated my app.properties file by commenting out the statement that enables local emulation, and replacing with this statement that points to the emulator:
```
spring.cloud.gcp.datastore.host=localhost:8081
```
Now I can start and stop the app as much as I want, and the data persists. Both options are great, depending on how you’re doing local development.

Let’s deploy. I wanted to see this really running, and iterate further after I’m confident in how it behaves in a production-like environment. The easiest option for any Spring Boot developer is Cloud Run. It’s quick, it’s serverless, and we support buildpacks, so you never need to see a container.

I issued a single CLI command—gcloud beta run deploy boot-app --memory=1024 --source .— to package up my app and get it to Cloud Run.

After a few moments, I had a container in the registry, and an instance of Cloud Run. I don’t have to do any other funny business to reach the endpoint. No gateways, proxies, or whatever. And everything is instantly wired up to Cloud Logging and Cloud Monitoring for any troubleshooting. And I can provision up to 8GB of RAM and 4 CPUs, while setting up to 250 concurrent connections per container, and 1000 maximum instances. There’s a lot you can run with that horsepower.

I pinged the public endpoint, and sure enough, it was easy to publish and retrieve data from my REST API …

… and see the data sitting in the database!

When I saw the results, I realized I wanted more data fields in here. No problem. I went back to my Spring Boot app, and added a new field, isHuman. There are lots of animals on my favorite shows.

This time when I deployed, I chose the “no traffic” flag—cloud beta run deploy boot-app --memory=1024 --source . --no-traffic—so that I could control who saw the new field. Once it deployed, I saw two “revisions” and had the ability to choose the amount of traffic to send to each.

I switched 50% of the traffic to the new revision, liked what I saw, and then flipped it to 100%.

So there you go. It’s possible to fly through this inner loop in minutes. Because I’m leaning on managed serverless technologies for things like application runtime and database, I’m not wasting any time building or managing infrastructure. The local dev tooling from Google Cloud is terrific, so I have easy use of IDE integrations, emulators and build tools. This stack makes it simple for me to iterate quickly, cheaply, and with tech that feels like the future, versus wrestling with old stuff that’s been retrofitted for today’s needs.
April 13, 2021
Want secure access to (cloud) services from your Kubernetes-based app? GKE Workload Identity is the answer.
My name is Richard, and I like to run as admin. There, I said it. You should rarely listen to me for good security advice since I’m now (always?) a pretend developer who does things that are easy, not necessarily right. But identity management is something I wanted to learn more about in 2021, so now I’m actually trying. Specifically, I’m exploring the best ways for my applications to securely access cloud services. In this post, I’ll introduce you to GKE Workload Identity, and why it seems like a terrific way to do the right thing.

First, let’s review some of your options for providing access to distributed components—think databases, storage, message queues, and the like—from your application.
- Store credentials in application variables. This is terrible. Which means I’ve done it before myself. Never do this, for roughly 500 different reasons.
- Store credentials in property files. This is also kinda awful. First, you tend to leak your secrets often because of this. Second, it might as well be in the code itself, as you still have to change, check in, do a build, and do a deploy to make the config change.
- Store credentials in environment variables. Not great. Yes, it’s out of your code and config, so that’s better. But I see at least three problems. First, it’s likely not encrypted. Second, you’re still exporting creds from somewhere and storing them here. Third, there’s no version history or easy management (although clouds offer some help here). Pass.
- Store credentials in a secret store. Better. At least this is out of your code, and in a purpose-built structure for securely storing sensitive data. This might be something robust like Vault, or something more basic like Kubernetes Secrets. The downside is still that you are replicating credentials outside the Identity Management system.
- Use identity federation. Here we go. How about my app runs under an account that has the access it needs to a given service? This way, we’re not extracting and stashing credentials. Seems like the ideal choice.
So, if identity federation is a great option, what’s the hard part? Well, if my app is running in Kubernetes, how do I run my workload with the right identity? Maybe through … Workload Identity? Basically, Workload Identity lets you map a Kubernetes service account to a given Google Cloud service account (there are similar types of things for EKS in AWS, and AKS in Azure). At no point does my app need to store or even reference any credentials. To experiment, I created a basic Spring Boot web app that uses Spring Cloud GCP to talk to Cloud Storage and retrieve all the files in a given bucket.
```
package com.seroter.gcpbucketreader;

import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

import com.google.api.gax.paging.Page;
import com.google.cloud.storage.Blob;
import com.google.cloud.storage.Storage;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.stereotype.Controller;
import org.springframework.ui.Model;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;

@Controller
@SpringBootApplication
public class GcpBucketReaderApplication {

	public static void main(String[] args) {
		SpringApplication.run(GcpBucketReaderApplication.class, args);
	}

	//initiate auto-configuration magic that pulls in the right credentials at runtime
	@Autowired(required=false)
	private Storage storage;

	@GetMapping("/")
	public String bucketList(@RequestParam(name="bucketname", required=false, defaultValue="seroter-bucket-logs") String bucketname, Model model) {

		List<String> blobNames = new ArrayList<String>();

		try {

			//get the objects in the bucket
			Page<Blob> blobs = storage.list(bucketname);
			Iterator<Blob> blobIterator = blobs.iterateAll().iterator();

			//stash bucket names in an array
			while(blobIterator.hasNext()) {
				Blob b = blobIterator.next();
				blobNames.add(b.getName());
			}
		}
		//if anything goes wrong, catch the generic error and add to view model
		catch (Exception e) {
			model.addAttribute("errorMessage", e.toString());
		}

		//throw other values into the view model
		model.addAttribute("bucketname", bucketname);
		model.addAttribute("bucketitems", blobNames);

		return "bucketviewer";
	}
}
```
I built and containerized this app using Cloud Build and Cloud Buildpacks. It only takes a few lines of YAML and one command (gcloud builds submit --config cloudbuild.yaml .) to initiate the magic.
```
steps:
# use Buildpacks to create a container image
- name: 'gcr.io/k8s-skaffold/pack'
  entrypoint: 'pack'
  args: ['build', '--builder=gcr.io/buildpacks/builder', '--publish', 'us-west1-docker.pkg.dev/seroter-anthos/seroter-images/boot-bucketreader:$COMMIT_SHA']
```
In a few moments, I had a container image in Artifact Registry to use for testing.

Then I loaded up a Cloud Storage bucket with a couple of nonsense files.

Let’s play through a few scenarios to get a better sense of what Workload Identity is all about.

Scenario #1 – Cluster runs as the default service account

Without Workload Identity, a pod in GKE assumes the identity of the service account associated with the cluster’s node pool.

When creating a GKE cluster, you choose a service account for a given node pool. All the nodes runs as this account.

I built a cluster using the default service account, which can basically do everything in my Google Cloud account. That’s fun for me, but rarely something you should ever do.

From within the GKE console, I went ahead and deployed an instance of our container to this cluster. Later, I’ll use Kubernetes YAML files to deploy pods and expose services, but the GUI is fun to use for basic scenarios.

Then, I created a service to route traffic to my pods.

Once I had a public endpoint to ping, I sent a request to the page and provided the bucket name as a querystring parameter.

That worked, as expected. Since the pod runs as a super-user, it had full permission to Cloud Storage, and every bucket inside. While that’s a fun party trick, there aren’t many cases where the workloads in a cluster should have access to EVERYTHING.

Scenario #2 – Cluster runs as a least privilege service account

Let’s do the opposite and see what happens. This time, I started by creating a new Google Cloud service account that only had “read” permissions to the Artifact Registry (so that it could pull container images) and Kubernetes cluster administration rights.

Then, I built another GKE cluster, but this time, chose this limited account as the node pool’s service account.

After building the cluster, I went ahead and deployed the same container image to the new cluster. Then I added a service to make these pods accessible, and called up the web page.

As expected, the attempt to read my Storage bucket failed, This least privilege account didn’t have rights to Cloud Storage.

This is a more secure setup, but now I need a way for this app to securely call the Cloud Storage service. Enter Workload Identity.

Scenario #3 – Cluster has Workload Identity configured with a mapped service account

I created yet another cluster. This time, I chose the least privilege account, and also chose to install Workload Identity. How does this work? When my app ran before, it used (via the Spring Cloud libraries) the Compute Engine metadata server to get a token to authenticate with Cloud Storage. When I configure Workload Identity, those requests to the metadata server get routed to the GKE metadata server. This server runs on each cluster node, mimics the Compute Engine metadata server, and gives me a token for whatever service account the pod has access to.

If I deploy my app now, it still won’t work. Why? I haven’t actually mapped a service account to the namespace my pod gets deployed into!

I created the namespace, created a Kubernetes service account, created a Google Cloud storage account, mapped the two together, and annotated our service account. Let’s go step by step.

First, I created the namespace to hold my app.

kubectl create namespace blog-demos

Next, I created a Kubernetes service account (“sa-storageapp”) that’s local to the cluster, and namespace.

kubectl create serviceaccount --namespace blog-demos sa-storageapp

After that, I created a new Google Cloud service account named gke-storagereader.

gcloud iam service-accounts create gke-storagereader

Now we’re ready for some account mapping. First, I made the Kubernetes service account a member of my Google Cloud storage account.
```
gcloud iam service-accounts add-iam-policy-binding \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:seroter-anthos.svc.id.goog[blog-demos/sa-storageapp]" \
  gke-storagereader@seroter-anthos.iam.gserviceaccount.com
```
Now, to give the Google Cloud service account the permission it needs to talk to Cloud Storage.
```
gcloud projects add-iam-policy-binding seroter-anthos \
    --member="serviceAccount:gke-storagereader@seroter-anthos.iam.gserviceaccount.com" \
    --role="roles/storage.objectViewer"
```
The final step? I had to add an annotation to the Kubernetes service account that links to the Google Cloud service account.
```
kubectl annotate serviceaccount \
  --namespace blog-demos \
  sa-storageapp \
  iam.gke.io/gcp-service-account=gke-storagereader@seroter-anthos.iam.gserviceaccount.com
```
Done! All that’s left is to deploy my Spring Boot application.

First I set my local Kubernetes context to the target namespace in the cluster.

kubectl config set-context --current --namespace=blog-demos

In my Kubernetes deployment YAML, I pointed to my container image, and provided a service account name to associate with the deployment.
```
apiVersion: apps/v1
kind: Deployment
metadata:
  name: boot-bucketreader
spec:
  replicas: 1
  selector:
    matchLabels:
      app: boot-bucketreader
  template:
    metadata:
      labels:
        app: boot-bucketreader
    spec:
      serviceAccountName: sa-storageapp
      containers:
      - name: server
        image: us-west1-docker.pkg.dev/seroter-anthos/seroter-images/boot-bucketreader:latest
        ports:
        - containerPort: 8080
```
I then deployed a YAML file to create a routable service, and pinged my application. Sure enough, I now had access to Cloud Storage.

Wrap

Thanks to Workload Identity for GKE, I created a cluster that had restricted permissions, and selectively gave permission to specific workloads. I could get even more fine-grained by tightening up the permissions on the GCP service account to only access a specific bucket (or database, or whatever). Or have different workloads with different permissions, all in the same cluster.

To me, this is the cleanest, most dev-friendly way to do access management in a Kubernetes cluster. And we’re bringing this functionality to GKE clusters that run anywhere, via Anthos.

What about you? Any other ways you really like doing access management for Kubernetes-based applications?
March 16, 2021
Let’s compare the cloud shells offered by AWS, Microsoft Azure, and Google Cloud Platform
I keep getting more and more powerful laptops, and then offloading more and more processing to the cloud. SOMETHING’S GOTTA GIVE! My local machine doesn’t just run web browsers and chat apps. No, my laptop is still loaded up with dev tools, while all my virtual machines and container clusters now live in the cloud. That helps. But we’re seeing more and more of the dev tools sneak into the cloud, too.

One of those dev tools is the shell experience. If you’re like me—actually, you’re probably much more advanced than me—you invest in a loaded terminal on your machine. On my Mac, I directly install a few tools (e.g. git, gcloud CLI) but use Homebrew to keep most of my favorite tools close by.

It’s no small effort to maintain a local terminal environment that’s up to date, and authenticated to various endpoints. To make all this easier, each of three hyperscalers now has a “cloud shell” experience that offers developers a hosted, pre-loaded terminal for working with that cloud.

In this blog post, I’m going to look at the cloud shells from AWS, Microsoft Azure, and Google Cloud, and see what they really have to offer. Specifically, I’m going to assess:
- Shell access. How exactly do you reach and use the shell?
- Shells offered. Bash? Powershell?
- Amount of storage provided. How much can you stash in your environment?
- Durability period. How long does each cloud hold onto your compute environment? Storage?
- Platform integrations. What ways does the shell integrate with the cloud experience?
- Embedded tools. What comes pre-loaded in the shell?
- Code editing options. Is there a way to edit files or build apps?
- Compute environment configuration/extensibility. Can you change the shell environment temporarily or permanently?
- UX and usability controls. What can you do to tweak the appearance or behavior?
Let’s take a look.

Disclaimer: I work for Google Cloud, so obviously I’ll have some biases. That said, I’ve used AWS for over a decade, was an Azure MVP for years, and can be mostly fair when comparing products and services. Please call out any mistakes I make!

Google Cloud Platform

GCP offers a Cloud Shell that runs within a Docker container on a dedicated Google Compute Engine virtual machine. Not that you see any of that. You just see a blinking cursor.

How do you reach that cursor? From within the GCP Console, there’s an ever-present button in the top navigation. Of note, you can also access it via a dedicated link at shell.cloud.google.com.

Once you launch the Cloud Shell—and if it’s the first time, you’ll see a brief message about provisioning your infrastructure—you see a new frame on your screen. Note that this is a globally distributed service, and you’re automatically assigned to the closest geographic region.

Each user gets 5GB of persistent storage that’s mounted into this underlying virtual machine. This VM terminates after 20 minutes of inactivity. If you don’t use Cloud Shell at all for 120 days, the home disk goes away too.

You have two default shell interpreters (Bash and sh) at your disposal here. Google Cloud Shell lets you create unique sessions via tabs, and see below that I’m using one tab to list all the shells. I was able to switch between shells, including PowerShell too!

Cloud Shell comes with lots of pre-loaded tools including gcloud, vim, emacs, gradle, helm, maven, npm, pip, git, docker, MySQL client, TensorFlow, and Terraform. It also has built-in language support for Java, Go, Python, Node.js, Ruby, PHP, and .NET Core.

If you want tools that aren’t pre-loaded by Google Cloud, you’ve got a few options. You can manually install tools during your session, or, create a customer_environment script that runs whenever your instance boots up.

What about platform integrations? If you call a Google Cloud API that requires credentials, there’s a prompt for authorization. There’s also an “Open in Cloud Shell” feature that makes it simple to create links that trigger opinionated Cloud Shell instances. If you’re writing tutorials or want people to try the code in your git repo, you can generate a link. There’s also a baked-in cloudshell CLI to launch tutorials, download files, and more. You can also use the gcloud CLI on your local workstation to tunnel into the Cloud Shell, thanks to the gcloud beta cloud-shell operation.

The Google Cloud Shell also has a full-fledged code editor built in. This editor—also available directly via ide.cloud.google.com—gets launched right from the Cloud Shell, either through the button on the Cloud Shell navigation or by invoking the cloudshell edit . command.

This editor is based on Eclipse Theia and has the Cloud Code extensions built in. This means I can create apps, use source control, link to GCP services, run tests, and more. Because Cloud Shell supports Web Preview, you can also start up web applications and hit a local endpoint.

Let’s look at the overall user experience. In the Cloud Shell navigation menu, I have options to send key combinations (e.g. Ctrl+V), change the look and feel (e.g. color, font), upload or download files, run in safe mode, restart the Cloud Shell instance, minimize the frame itself, break it out into its own window, or close the terminal entirely.

With this mix of free storage, a wide set of tools, a fully functional code editor, and easily extendible environments, the Google Cloud Shell feels like a very complete experience.

Microsoft Azure

Azure provides a Cloud Shell that runs on a temporary virtual machine. Like with GCP, all the infrastructure details are invisible, and users just get a virtual terminal.

You have a few ways to reach Azure’s Cloud Shell. There’s an always-there button in the Portal and a direct link available at shell.azure.com.

Once you trigger the Cloud Shell, you quickly get a new resizable frame holding your terminal instance.

The compute instance is available at no charge. These instances use a 5GB persistent storage image in your file share, and it appears that you pay for that. Like the Google Cloud Shell, the Azure one uses non-durable compute nodes that time out after 20 minutes of inactivity.

You have two shell experiences: bash or PowerShell. Storage is shared between each.

The Azure Cloud Shell comes absolutely loaded with tools. You have all the standard Azure tools (Azure CLI, azcopy, etc) along with things like vim, emacs, git, maven, npm, Docker, kubectl, Helm, MySQL client, PostgreSQL client, Cloud Foundry CLI, Terraform, Ansible, Packer, and more. There’s also built-in language support for Go, Ruby. .NET Core, Java, Node.js, PowerShell, and Python. I didn’t see any obvious way to customize the experience that lasts beyond a given session.

As far as integrations, it appears there is SSO with Azure Active Directory. There’s also a special PowerShell commandlet for managing Exchange Online. Try to control yourselves. Similar to GCP, the Azure Cloud Shell supports a URL format that lets tutorial creators launch the Cloud Shell from anywhere. Visual Studio Code users can also integrate the Azure Cloud Shell into their local dev experience.

Azure also provides a handy code editor within their Cloud Shell experience. Based on the open source Monaco editor, has a basic file explorer, command palette, and language highlighting.

Let’s look at the user experience. In the Cloud Shell navigation bar, you have buttons to restart the shell, configure font style and size, download files, upload files, open the code editor, trigger a local web server, minimize the frame, or shut it down.

All in all, it’s a solid experience. Not as rich as what GCP has, but entirely functional with nice touches like the code editor, and easy switching between bash and PowerShell.

AWS

AWS is the newest entrant to the cloud-based terminal with their AWS CloudShell. AWS seems careful to call the host a “computing environment” versus ever saying “virtual machine.” It’s possible that you get a container in a shared environment.

It looks like you have one way to reach the CloudShell. There’s a button in the AWS Console navigation bar.

Clicking that button pops up a new browser instance holding your terminal.

There’s no cost for AWS CloudShell and you get 1GB of persistent storage (also for free). The service is available in a handful of AWS regions (3 in the US, 1 in Ireland, 1 in Tokyo). Sessions expire after 20-30 minutes, and data is held for 120 days.

AWS CloudShell has three shell experiences including bash, PowerShell, and z shell.

The AWS CloudShell comes with a handful of useful pre-loaded tools. You get the AWS tools (e.g. AWS CLI, AWS SAM), as well as git, make, ssh, and vim. You can modify the default environment by creating a .bashrc script that runs whenever the bash shell fires up. There’s native language support for Node.js and Python.

There’s one platform integration I noticed, which helps you push and pull code from AWS CodeCommit.

There are some nice touches in the AWS CloudShell user experience. I like that you can stack tabs (session) or put them side by side. You can also download and upload files. AWS also offers settings to change the font size or switch from dark mode to light mode.

AWS offers a functional experience that’s basic, but useful for those living in an AWS world.

It’s great to see all the major clouds offering this functionality. GCP objectively has the most feature-rich experience, but each one is useful. Try them out, and see if they can make your dev environment simpler.
February 3, 2021
How GitOps and the KRM make multi-cloud less scary.
I’m seeing the usual blitz of articles that predict what’s going to happen this year in tech. I’m not smart enough to make 2021 predictions, but one thing that seems certain is that most every company is deploying more software to more places more often. Can we agree on that? Companies large and small are creating and buying lots of software. They’re starting to do more continuous integration and continuous delivery to get that software out the door faster. And yes, most companies are running that software in multiple places—including multiple public clouds.

So we have an emerging management problem, no? How do I create and maintain software systems made up of many types of components—virtual machines, containers, functions, managed services, network configurations—while using different clouds? And arguably the trickiest part isn’t building the system itself, but learning and working within each cloud’s tenancy hierarchy, identity system, administration tools, and API model.

Most likely, you’ll use a mix of different build orchestration tools and configuration management tools based on each technology and cloud you’re working with. Can we unify all of this without forcing a lowest-common-denominator model that keeps you from using each cloud’s unique stuff? I think so. In this post, I’ll show an example of how to provision and manage infrastructure, apps, and managed services in a consistent way, on any cloud. As a teaser for what we’re building here, see that we’ve got a GitHub repo of configurations, and 1st party cloud managed services deployed and configured in Azure and GCP as a result.

Before we start, let’s define a few things. GitOps—a term coined by Alexis and championed by the smart folks at Weaveworks—is about declarative definitions of infrastructure, stored in a git repo, and constantly applied to the environment so that you remain in the desired state.

@kelseyhightower discusses all things #GitOps with a personal twist in his latest talk delivered at @GitHub Universe 2020. If you're looking for a clear explanation of GitOps that includes an equally simple demo, this is it. https://t.co/umAqItpnsW
— Weaveworks (@weaveworks) January 6, 2021

Next, let’s talk about the Kubernetes Resource Model (KRM). In Kubernetes, you define resources (built in, or custom) and the system uses controllers to create and manage those resources. It treats configurations as data without forcing you to specify *how* to achieve your desired state. Kubernetes does that for you. And this model is extendable to more than just containers!

Infrastructure-as-Code isn't enough. There are challenges to an imperative model for building infrastructure.

Configuration as Data, based on the Kubernetes Resource Model, looks like the future. @kelseyhightower and @markbalch make a very strong case: https://t.co/7v5402wfFg
— Richard Seroter (@rseroter) November 19, 2020

The final thing I want you to know about is Google Cloud Anthos. That’s what’s tying all this KRM and GitOps stuff together. Basically, it’s a platform designed to create and manage distributed Kubernetes clusters that are consistent, connected, and application ready. There are four capabilities you need to know to grok this KRM/GitOps scenario we’re building:
1. Anthos clusters and the cloud control plane. That sounds like the title of a terrible children’s book. For tech folks, it’s a big deal. Anthos deploys GKE clusters to GCP, AWS, Azure (in preview), vSphere, and bare metal environments. These clusters are then visible to (and configured by) a control plane in GCP. And you can attach any existing compliant Kubernetes cluster to this control plane as well.
2. Config Connector. This is a KRM component that lets you manage Google Cloud services as if they were Kubernetes resources—think BigQuery, Compute Engine, Cloud DNS, and Cloud Spanner. The other hyperscale clouds liked this idea, and followed our lead by shipping their own flavors of this (Azure version, AWS version).
3. Environs. These are logical groupings of clusters. It doesn’t matter where the clusters physically are, and which provider they run on. An environ treats them all as one virtual unit, and lets you apply the same configurations to them, and join them all to the same service mesh. Environs are a fundamental aspect of how Anthos works.
4. Config Sync. This Google Cloud components takes git-stored configurations and constantly applies them to a cluster or group of clusters. These configs could define resources, policies, reference data, and more.
Now we’re ready. What are we building? I’m going to provision two Anthos clusters in GCP, then attach an Azure AKS cluster to that Anthos environ, apply a consistent configuration to these clusters, install the GCP Config Connector and Azure Service Operators into one cluster, and use Config Sync to deploy cloud managed services and apps to both clouds. Why? Once I have this in place, I have a single way to create managed services or deploy apps to multiple clouds, and keep all these clusters identically configured. Developers have less to learn, operators have less to do. GitOps and KRM, FTW!

Step 1: Create and Attach Clusters

I started by creating two GKE clusters in GCP. I can do this via the Console, CLI, Terraform, and more. Once I created these clusters (in different regions, but same GCP project), I registered both to the Anthos control plane. In GCP, the “project” (here, seroter-anthos) is also the environ.

Next, I created a new AKS cluster via the Azure Portal.

In 2020, our Anthos team added the ability to attach existing clusters an an Anthos environ. Before doing anything else, I created a new minimum-permission GCP service account that the AKS cluster would use, and exported the JSON service account key to my local machine.

From the GCP Console, I followed the option to “Add clusters to environ” where I provided a name, and got back a single command to execute against my AKS cluster. After logging into my AKS cluster, I ran that command—which installs the Connect agent—and saw that the AKS cluster connected successfully to Anthos.

I also created a service account in my AKS cluster, bound it to the cluster-admin role, and grabbed the password (token) so that GCP could log into that cluster. At this point, I can see the AKS cluster as part of my environ.

You know what’s pretty awesome? Once this AKS cluster is connected, I can view all sorts of information about cluster nodes, workloads, services, and configurations. And, I can even deploy workloads to AKS via the GCP Console. Wild.

But I digress. Let’s keep going.

Step 2: Instantiate a Git Repo

GitOps requires … a git repo. I decided to use GitHub, but any reachable git repository works. I created the repo via GitHub, opened it locally, and initialized the proper structure using the nomos CLI. What does a structured repo look like and why does the structure matter? Anthos Config Management uses this repo to figure out the clusters and namespaces for a given configuration. The clusterregistry directory contains ClusterSelectors that let me scope configs to a given cluster or set of clusters. The cluster directory holds any configs that you want applied to entire clusters versus individual namespaces. And the namespaces directory holds configs that apply to a specific namespace.

Now, I don’t want all my things deployed to all the clusters. I want some namespaces that span all clusters, and others that only sit in one cluster. To do this, I need ClusterSelectors. This lets me define labels that apply to clusters so that I can control what goes where.

For example, here’s my cluster definition for the AKS cluster (notice the “name” matches the name I gave it in Anthos) that applies an arbitrary label called “cloud” with a value of “azure.”
```
kind: Cluster
apiVersion: clusterregistry.k8s.io/v1alpha1
metadata:
  name: aks-cluster-1
  labels:
    environment: prod
    cloud: azure
```
And here’s the corresponding ClusterSelector. If my namespace references this ClusterSelector, it’ll only apply to clusters that match the label “cloud: azure.”
```
kind: ClusterSelector
apiVersion: configmanagement.gke.io/v1
metadata:
    name: selector-cloud-azure
spec:
    selector:
        matchLabels:
            cloud: azure
```
After creating all the cluster definitions and ClusterSelectors, I committed and published the changes. You can see my full repo here.

Step 3: Install Anthos Config Management

The Anthos Config Management (ACM) subsystem lets you do a variety of things such as synchronize configurations across clusters, apply declarative policies, and manage a hierarchy of namespaces.

Enabling and installing ACM on GKE clusters and attached clusters is straightforward. First, we need credentials to talk to our git repo. One option is to use an SSH keypair. I generated a new keypair, and added the public key to my GitHub account. Then, I created a secret in each Kubernetes cluster that references the private key value.
```
kubectl create ns config-management-system && \
kubectl create secret generic git-creds \
  --namespace=config-management-system \
  --from-file=ssh="[/path/to/KEYPAIR-PRIVATE-KEY-FILENAME]"
```
With that done, I went through the GCP Console (or you can do this via CLI) to add ACM to each cluster. I chose to use SSH as the authentication mechanism, and then pointed to my GitHub repo.

After walking through the GKE clusters, I could see that ACM was installed and configured. Then I installed ACM on the AKS cluster too, all from the GCP Console.

With that, the foundation of my multi-cloud platform was all set up.

Step 4: Install Config Connector and Azure Service Operator

As mentioned earlier, the Config Connector helps you treat GCP managed services like Kubernetes resources. I only wanted the Config Connector on a single GKE cluster, so I went to gke-cluster-2 in the GCP Console and “enabled” Workload Identity and the Config Connector features. Workload Identity connects Kubernetes service accounts to GCP identities. It’s pretty cool. I created a new service account (“seroter-cc”) that Config Connector would use to create managed services.

To confirm installation, I ran a “kubectl get crds” command to see all the custom resources added by the Config Connector.

There’s only one step to configure the Config Connector itself. I created a single configuration that referenced the service account and GCP project used by Config Connector.
```
# configconnector.yaml
apiVersion: core.cnrm.cloud.google.com/v1beta1
kind: ConfigConnector
metadata:
  # the name is restricted to ensure that there is only one
  # ConfigConnector instance installed in your cluster
  name: configconnector.core.cnrm.cloud.google.com
spec:
 mode: cluster
 googleServiceAccount: "seroter-cc@seroter-anthos.iam.gserviceaccount.com"
```
I ran “kubectl apply -f configconnector.yaml” for the configuration, and was all set.

Since I also wanted to provision Microsoft Azure services using the same GitOps + KRM mechanism, I installed the Azure Service Operators. This involved installing a cert manager, installing Helm, creating an Azure Service Principal (that has rights to create services), and then installing the operator.

Step 5: Check-In Configs to Deploy Managed Services and Applications

The examples for the Config Connector and Azure Service Operator talk about running “kubectl apply” for each service you want to create. But I want GitOps! So, that means setting up git directories that hold the configurations, and relying on ACM (and Config Sync) to “apply” these configurations on the target clusters.

I created five namespace directories in my git repo. The everywhere-apps namespace applies to every cluster. The gcp-apps namespace should only live on GCP. The azure-apps namespace only runs on Azure clusters. And the gcp-connector and azure-connector namespaces should only live on the cluster where the Config Connector and Azure Service Operator live. I wanted something like this:

How do I create configurations that make that above image possible? Easy. Each “namespace” directory in the repo has a namespace.yaml file. This file provides the name of the namespace, and optionally, annotations. The annotation for the gcp-connector namespace used the ClusterSelector that only applied to gke-cluster-2. I also added a second annotation that told the Config Connector which GCP project hosted the generated managed services.
```
apiVersion: v1
kind: Namespace
metadata:
  name: gcp-connector
  annotations:
    configmanagement.gke.io/cluster-selector: selector-specialrole-connectorhost
    cnrm.cloud.google.com/project-id: seroter-anthos
```
I added namespace.yaml files for each other namespace, with ClusterSelector annotations on all but the everywhere-apps namespace, since that one runs everywhere.

Now, I needed the actual resource configurations for my cloud managed services. In GCP, I wanted to create a Cloud Storage bucket. With this “configuration as data” approach, we just define the resource, and ask Anthos to instantiate and manage it. The Cloud Storage configuration looks like this:
```
  apiVersion: storage.cnrm.cloud.google.com/v1beta1
  kind: StorageBucket
  metadata:
    annotations:
      cnrm.cloud.google.com/project-id : seroter-anthos
      #configmanagement.gke.io/namespace-selector: config-supported
    name: seroter-config-bucket
  spec:
    lifecycleRule:
      - action:
          type: Delete
        condition:
          age: 7
    uniformBucketLevelAccess: true
```
The Azure example really shows the value of this model. Instead of programmatically sequencing the necessary objects—first create a resource group, then a storage account, then a storage blob—I just need to define those three resources, and Kubernetes reconciles each resource until it succeeds. The Storage Blob resource looks like:
```
apiVersion: azure.microsoft.com/v1alpha1
kind: BlobContainer
metadata:
  name: blobcontainer-sample
spec:
  location: westus
  resourcegroup: resourcegroup-operators
  accountname: seroterstorageaccount
  # accessLevel - Specifies whether data in the container may be accessed publicly and the level of access.
  # Possible values include: 'Container', 'Blob', 'None'
  accesslevel: Container
```
The image below shows my managed-service-related configs. I checked all these configurations into GitHub.

A few seconds later, I saw that Anthos was processing the new configurations.

Ok, it’s the moment of truth. First, I checked Cloud Storage and saw my brand new bucket, provisioned by Anthos.

Switching over to the Azure Portal, I navigated to Storage area and saw my new account and blob container.

How cool is that? Now i just have to drop resource definitions into my GitHub repository, and Anthos spins up the service in GCP or Azure. And if I delete that resource manually, Anthos re-creates it automatically. I don’t have to learn each API or manage code that provisions services.

Finally, we can also deploy applications this way. Imagine using a CI pipeline to populate a Kubernetes deployment template (using kpt, or something else) and dropping it into a git repo. Then, we use the Kubernetes resource model to deploy the application container. In the gcp-apps directory, I added Kubernetes deployment and service YAML files that reference a basic app I containerized.

As you might expect, once the repo synced to the correct clusters, Anthos created a deployment and service that resulted in a routable endpoint. While there are tradeoffs for deploying apps this way, there are some compelling benefits.

Step 6: “Move” App Between Clouds by Moving Configs in GitHub

This last step is basically my way of trolling the people who complain that multi-cloud apps are hard. What if I want to take the above app from GCP and move it to Azure? Does it require a four week consulting project and sacrificing a chicken? No. I just have to copy the Kubernetes deploy and service YAML files to the azure-apps directory.

After committing my changes to GitHub, ACM fired up and deleted the app from GCP, and inflated it on Azure, including an Azure Load Balancer instance to get a routable endpoint. I can see all of that from within the GCP Console.

Now, in real life, apps aren’t so easily portable. There are probably sticky connections to databases, and other services. But if you have this sort of platform in place, it’s definitely easier.

Thanks to deep support for GitOps and the KRM, Anthos makes it possible to manage infrastructure, apps, and managed services in a consistent way, on any cloud. Whether you use Anthos or not, take a look at GitOps and the KRM and start asking your preferred vendors when they’re going to adopt this paradigm!
January 12, 2021
I want to learn about these six things in 2021

I know a few things. I don’t know most things. Each year, I try to learn new stuff and challenge my existing knowledge/assumptions. There are always more things to learn than time available in the day, so I have to be selective. What should I focus on? Some folks choose to go deeper in their areas of expertise, others choose to bolster weak areas. Next year, I’m going to do the latter.

Here are six topics—four related to tech, two related to professional skills—I want to learn more about, and I’ll include some thoughts on my approach to learning each.

Technology Skills

Each year, I try out a variety of technologies. Next year won’t be different. Besides these four topics below, I suspect that I’ll keep messing around with serverless technologies, Kubernetes, service meshes, and public cloud services. But I’m going to spend special attention on:

Identity and access management

In my 20+ year career, I’ve learned enough about identity management to be dangerous. But in reality, I’m barely competent on this topic. It’s time to truly understand how all this works. With so many folks building increasingly distributed architectures, identity management seems more important than ever. I’d like to dig into things like authorization flows, application identities within clusters, and access management within cloud tenancy structures.

How? I plan on taking some Pluralsight courses on Google Cloud Identity, OAuth2 flows, and overall security practices. Then I’ll invest in some hands-on time with things like Workload Identity, Identity Aware Proxy, and the BeyondCorp assets we’ve created. May also read some Gartner and Forrester reports on the topic.

BigQuery

This is a crown jewel in Google Cloud’s portfolio. It’s a well-built, popular service that stands out among public cloud offerings. I’ve spent precious little time in the data analytics domain, and want to change that. A little. I’m not interested in being a full-on analytics guy, but I want to understand how BigQuery works and the role it can play for companies adopting cloud.

How? There are a handful of Pluralsight courses that look good here. I’ll also go hands on a lot. That may involve some QwikLabs, or just me playing with datasets.

Angular

I’ve mostly declared bankruptcy on front-end frameworks. My career has been server-side, with only enough investment in the front-end to build decent looking demos. But I like what I’m seeing here and it’s obvious how much processing we’re doing client-side now. There are roughly five hundred viable frameworks to choose from, so I might as well pick a popular one with some Google heritage.

How? Pluralsight has a great Angular learning path. I just need to get some reps with the tech, and make it second nature to use on any apps I build. Plus, learning this gives me an excuse to use compute platforms like Cloud Run and GKE to host my app.

Application deployment tools and strategies

While CI/CD is a fairly mature domain, I’m still seeing lots of fresh thinking here. I want to learn more about how forward-thinking companies are packaging up and shipping software. Shipping is more sophisticated now with so many components to factor in, and less tolerance for downtime. The tooling for continuous deployment (and progressive delivery) is getting better.

How? I’m looking forward to trying out a lot of technologies here. I’m sure i’ll find a lot of books or courses about what I’m after, so this is a very “hands on” journey.

Professional Skills

I’m also looking for to building up my business and management skills next year. The two things that I’ll invest the most in are:

Product management

Given my position in Google Cloud, I’m supposed to know what I’m doing. But I’m learning new things every day. In 2021, I want to double-down on the practices of product development and full product lifecycle management. I’ve got so much to learn on how to better identify customer problems, scope an experiment, communicate value, measure usage, and build a sustainable business around the product.

How? Much of this will happen by watching my peers. The product discipline at Google Cloud is excellent. In addition, I’ve got my eye on new books, and some product-focused conferences. I also plan on reading some of the good Gartner research on product management.

Coaching and sponsorship

I’ve done some mentorship in my career, but I haven’t done much coaching or sponsorship. Some of that is because of imposter syndrome (“why would anyone want to learn anything from ME?”) and some is because I haven’t made it a priority. I now have more appreciation for what I can give back to others. I’ve been making myself more available this year, and want to intentionally continue that next year.

How? Some of this will happen through study and watching others, and some by actually doing it! Our industry is full of high-potential individuals who haven’t had someone in their corner, and I’m going to do my part to fix that.

What about you? What topics deserve your special attention in 2021? I’m looking forward to learning in public and getting your feedback along the way.

December 7, 2020

Four reasons that Google Cloud Run is better than traditional FaaS offerings

Has the “serverless revolution stalled”? I dunno. I like serverless. Taught a popular course about it. But I reviewed and published an article written by Bernard Brode that made that argument, and it sparked a lot of discussion. If we can agree that serverless computing means building an architecture out of managed services that scale to zero—we’re not strictly talking about function-as-a-service—that’s a start. Has this serverless model crossed the chasm from early adopters to an early majority? I don’t think so. And the data shows that usage of FaaS—still a fundamental part of most people’s serverless architecture—has flattened a bit. Why is that? I’m no expert, but I wonder if some of the inherent friction of the 1st generation FaaS gets in the way.

We’re seeing a new generation of serverless computing that removes that friction and may restart the serverless revolution. I’m talking here about Google Cloud Run. Based on the Knative project, it’s a fully managed service that scales container-based apps to zero. To me, it takes the best attributes from three different computing paradigms:

Paradigm	Best Attributes
Platform-as-a-Service	– focus on the app, not underlying infrastructure – auto-wire networking components to expose your endpoint
Container-as-a-Service	– use portable app packages – develop and test locally
Function-as-a-Service	– improve efficiency by scaling to zero – trigger action based on events

Each of those above paradigms has standalone value. By all means, use any of them if they suit your needs. Right now, I’m interested in what it will take for large companies to adopt serverless computing more aggressively. I think it requires “fixing” some of the flaws of FaaS, and there are four reasons Cloud Run is positioned to do so.

1. It doesn’t require rearchitecting your systems

First-generation serverless doesn’t permit cheating. No, you have to actually refactor or rebuild your system to run this way. That’s different than all the previous paradigms. IaaS? You could take existing bare metal workloads and run them unchanged in a cloud VM platform. PaaS? It catered to 12-factor apps, but you could still run many existing things there. CaaS? You can containerize a lot of things without touching the source code. FaaS? Nope. Nothing in your data center “just works” in a FaaS platform.

While that’s probably a good thing from a purity perspective—stop shifting your debt from one abstraction to another without paying it down!—it’s impractical. Simultaneously, we’re asking staff at large companies to: redesign teams for agile, introduce product management, put apps on CI pipelines, upgrade their programming language/framework, introduce new databases, decouple apps into microservices, learn cloud and edge models, AND keep all the existing things up and running. It’s a lot. The companies I talk to are looking for ways to get incremental benefits for many workloads, and don’t have the time or people to rebuild many things at once.

This is where Cloud Run is better than FaaS. It hosts containers that respond to web requests or event-based triggers. You can write functions, or, containerize a complete app—Migrate for Anthos makes it easy. Your app’s entry point doesn’t have to conform to a specific method signature, and there are no annotations or code changes required to operate in Cloud Run. Take an existing custom-built app written in any language, or packaged (or no source-code-available) software and run it. You don’t have to decompose your existing API into a series of functions, or break down your web app into a dozen components. You might WANT to, but you don’t HAVE to. I think that’s powerful, and significantly lowers the barrier to entry.

2. It runs anywhere

Lock-in concerns are overrated. Everything is lock-in. You have to decide whether you’re getting unique value from the coupling. If so, go for it. A pristine serverless architecture consists of managed services with code (FaaS) in the gaps. The sticky part is all those managed services, not the snippets of code running in the FaaS. Just making a FaaS portable doesn’t give you all the benefits of serverless.

That said, I don’t need all the aspects of serverless to get some of the benefits. Replacing poorly utilized virtual machines with high-density nodes hosting scale-to-zero workloads is great. Improving delivery velocity by having an auto-wired app deployment experience versus ticket-defined networking is great. I think it’s naive to believe that most folks can skip from traditional software development directly to fully serverless architectures. There’s a learning and adoption curve. And one step on the journey is defining more distributed services, and introducing managed services. Cloud Run offers a terrific best-of-both-worlds model that makes the journey less jarring. And uniquely, it’s not only available on a single cloud.

Cloud Run is great on Google Cloud. Given the option, you should use it there. It’s fully managed and elastic, and integrates with all types of GCP-only managed services, security features, and global networking. But you won’t only use Google Cloud in your company. Or Azure. Or AWS. Or Cloudflare. Cloud Run for Anthos puts this same runtime most anywhere. Use it in your data center. Use it in your colocation or partner facility. Use it at the edge. Soon, use it on AWS or Azure. Get one developer-facing surface for apps running on a variety of hosts.

A portable Faas, based on open source software, is powerful. And I believe, necessary, to break into mainstream adoption within the enterprise. Bring the platform to the people!

3. It makes the underlying container as invisible, or visible, as you want

Cloud Run uses containers. On one hand, it’s a packaging mechanism, just like a ZIP file for AWS Lambda. On the other, it’s a way to bring apps written in any language, using any libraries, to a modern runtime. There’s no “supported languages” page on the website for Cloud Run. It’s irrelevant.

Now, I personally don’t like dealing with containers. I want to write code, and see that code running somewhere. Building containers is an intermediary step that should involve as little effort as possible. Fortunately, tools like Cloud Code make that a reality for me. I can use Visual Studio Code to sling some code, and then have it automatically containerized during deployment. Thanks Cloud Buildpacks! If I choose to, I can use Cloud Run while being blissfully unaware that there are containers involved.

That said, maybe I want to know about the container. My software may depend on specific app server settings, file system directories, or running processes. During live debugging, I may like knowing I can tunnel into the container and troubleshoot in sophisticated ways.

Cloud Run lets you choose how much you want to care about the container image and running container itself. That’s a flexibility that’s appealing.

4. It supports advanced use cases

Cloud Run is great for lots of scenarios. Do server-side streaming with gRPC. Build or migrate web apps or APIs that take advantage of our new API Gateway. Coordinate apps in Cloud Run with other serverless compute using the new Cloud Workflows. Trigger your Cloud Run apps based on events occurring anywhere within Google Cloud. Host existing apps that need a graceful shutdown before scaling to zero. Allocate more horsepower to new or existing apps by assigning up to 4 CPUs and 4GB of RAM, and defining concurrency settings. Decide if your app should always have an idle instance (no cold starts) and how many instances it should scale up to. Route traffic to a specific port that your app listens on, even if it’s not port 80.

If you use Cloud Run for Anthos (in GCP or on other infrastructure), you have access to underlying Kubernetes attributes. Create private services. Participate in the service mesh. Use secrets. Reference ConfigMaps. Turn on Workload Identity to secure access to GCP services. Even take advantage of GPUs in the cluster.

Cloud Run isn’t for every workload, of course. It’s not for background jobs. I wouldn’t run a persistent database. It’s ideal for web-based apps, new or old, that don’t store local state.

Give Cloud Run a look. It’s a fast-growing service, and it’s free to try out with our forever-free services on GCP. 2 million requests a month before we charge you anything! See if you agree that this is what the next generation of serverless compute should look like.

October 13, 2020

Let’s compare the CLI experiences offered by AWS, Microsoft Azure, and Google Cloud Platform
Real developers use the CLI, or so I’m told. That probably explains why I mostly use the portal experiences of the major cloud providers. But judging from the portal experiences offered by most clouds, they prefer you use the CLI too. So let’s look at the CLIs.

Specifically, I evaluated the cloud CLIs with an eye on five different areas:
1. API surface and patterns. How much of the cloud was exposed via CLI, and is there a consistent way to interact with each service?
2. Authentication. How do users identify themselves to the CLI, and can you maintain different user profiles?
3. Creating and viewing services. What does it feel like to provision instances, and then browse those provisioned instances?
4. CLI sweeteners. Are there things the CLI offers to make using it more delightful?
5. Utilities. Does the CLI offer additional tooling that helps developers build or test their software?
Let’s dig in.

Disclaimer: I work for Google Cloud, so obviously I’ll have some biases. That said, I’ve used AWS for over a decade, was an Azure MVP for years, and can be mostly fair when comparing products and services. Please call out any mistakes I make!

AWS

You have a few ways to install the AWS CLI. You can use a Docker image, or install directly on your machine. If you’re installing directly, you can download from AWS, or use your favorite package manager. AWS warns you that third party repos may not be up to date. I went ahead and installed the CLI on my Mac using Homebrew.

API surface and patterns

As you’d expect, the AWS CLI has wide coverage. Really wide. I think there’s an API in there to retrieve the name of Andy Jassy’s favorite jungle cat. The EC2 commands alone could fill a book. The documentation is comprehensive, with detailed summaries of parameters, and example invocations.

The command patterns are relatively consistent, with some disparities between older services and newer ones. Most service commands look like:
```
aws [service name] [action] [parameters]
```
Most “actions” start with create, delete, describe, get, list, or update.

For example:
```
aws elasticache create-cache-cluster --engine redis
aws kinesis describe-stream --stream-name seroter-stream
aws kinesis describe-stream --stream-name seroter-stream
aws qldb delete-ledger --name seroterledger
aws sqs list-queues
```
S3 is one of the original AWS services, and its API is different. It uses commands like cp, ls, and rm. Some services have modify commands, others use update. For the most part, it’s intuitive, but I’d imagine most people can’t guess the commands.

Authentication

There isn’t one way to authenticate to the AWS CLI. You might use SSO, an external file, or inline access key and ID, like I do below.

The CLI supports “profiles” which seems important when you may have different access to default values based on what you’re working on.

Creating and viewing service instances

By default, everything the CLI does occurs in the region of the active profile. You can override the default region by passing in a region flag to each command. See below that I created a new SQS queue without providing a region, and it dropped it into my default one (us-west-2). By explicitly passing in a target region, I created the second queue elsewhere.

The AWS Console shows you resources for a selected region. I don’t see obvious ways to get an all-up view. A few services, like S3, aren’t bound by region, and you see all resources at once. The CLI behaves the same. I can’t view all my SQS queues, or databases, or whatever, from around the world. I can “list” the items, region by region. Deletion behaves the same. I can’t delete the above SQS queue without providing a region flag, even though the URL is region-specific.

Overall, it’s fast and straightforward to provision, update, and list AWS services using the CLI. Just keep the region-by-region perspective in mind!

CLI sweeteners

The AWS CLI gives you control over the output format. I set the default for my profile to json, but you can also do yaml, text, and table. You can toggle this on a request by request basis.

You can also take advantage of command completion. This is handy, given how tricky it may be to guess the exact syntax of a command. Similarly, I really like you can be prompted for parameters. Instead of guessing, or creating giant strings, you can go parameter by parameter in a guided manner.

The AWS CLI also offers select opportunities to interact with the resources themselves. I can send and receive SQS messages. Or put an item directly into a DynamoDB table. There are a handful of services that let you create/update/delete data in the resource, but many are focused solely on the lifecycle of the resource itself.

Finally, I don’t see a way to self-update from within the CLI itself. It looks like you rely on your package manager or re-download to refresh it. If I’m wrong, tell me!

Utilities

It doesn’t look like the CLI ships with other tools that developers might use to build apps for AWS.

Microsoft Azure

The Microsoft Azure CLI also has broad coverage and is well documented. There’s no shortage of examples, and it clearly explains how to use each command.

Like AWS, Microsoft offers their CLI in a Docker image. They also offer direct downloads, or access via a package manager. I grabbed mine from Homebrew.

API surface and patterns

The CLI supports almost every major Azure service. Some, like Logic Apps or Blockchain, only show up in their experimental sandbox.

Commands follow a particular syntax:
```
az [service name] [object] create | list | delete | update [parameters]
```
Let’s look at a few examples:
```
az ad app create --display-name my-ad-app
az cosmosdb list --resource-group group1
az postgres db show --name mydb --resource-group group1 --server-name myserver
az service bus queue delete --name myqueue --namespace-name mynamespace --resource-group group1
```
I haven’t observed much inconsistency in the CLI commands. They all seem to follow the same basic patterns.

Authentication

Logging into the CLI is easy. You can simply do az login as I did below—this opens a browser window and has you sign into your Azure account to retrieve a token—or you can pass in credentials. Those credentials may be a username/password, service principal with a secret, or service principal with a client certificate.

Once you log in, you see all your Azure subscriptions. You can parse the JSON to see which one is active, and will be used as the default. If you wish to change the default, you can use az account set --subscription [name] to pick a different one.

There doesn’t appear to be a way to create different local profiles.

Creating and viewing service instances

It seems that most everything you create in Azure goes into a resource group. While a resource group has a “location” property, that’s related to the metadata, not a restriction on what gets deployed into it. You can set a default resource group (az configure --defaults group=[name]) or provide the relevant input parameter on each request.

Unlike other clouds, Azure has a lot of nesting. You have a root account, then a subscription, and then a resource group. And most resources also have parent-child relationships you must define before you can actually build the thing you want.

For example, if you want a service bus queue, you first create a namespace. You can’t create both at the same time. It’s two calls. Want a storage blob to upload videos into? Create a storage account first. A web application to run your .NET app? Provision a plan. Serverless function? Create a plan. This doesn’t apply to everything, but just be aware that there are often multiple steps involved.

The creation activity itself is fairly simple. Here are commands to create Service Bus namespace and then a queue
```
az servicebus namespace create --resource-group mydemos --name seroter-demos --location westus
az servicebus queue create --resource-group mydemos --namespace-name seroter-demos --name myqueue
```
Like with AWS, some Azure assets get grouped by region. With Service Bus, namespaces are associated to a geo. I don’t see a way to query all queues, regardless of region. But for the many that aren’t, you get a view of all resources across the globe. After I created a couple Redis caches in my resource group, a simple az redis list --resource-group mydemos showed me caches in two different parts of the US.

Depending on how you use resource groups—maybe per app or per project, or even by team—just be aware that the CLI doesn’t retrieve results across resource groups. I’m not sure the best strategy for viewing subscription-wide resources other than the Azure Portal.

CLI sweeteners

The Azure CLI has some handy things to make it easier to use.

There’s a find function for figuring out commands. There’s output formatting to json, tables, or yaml. You’ll also find a useful interactive mode to get auto-completion, command examples, and more. Finally, I like that the Azure CLI supports self-upgrade. Why leave the CLI if you don’t have to?

Utilities

I noticed a few things in this CLI that help developers. First, there’s an az rest command that lets you call Azure service endpoints with authentication headers taken care of for you. That’s a useful tool for calling secured endpoints.

Azure offers a wide array of extensions to the CLI. These aren’t shipped as part of the CLI itself, but you can easily bolt them on. And you can create your own. This is a fluid list, but az extension list-available shows you what’s in the pool right now. As of this writing, there are extensions for preview AKS capabilities, managing Azure DevOps, working with DataBricks, using Azure LogicApps, querying the Azure Resource Graph, and more.

Google Cloud Platform

I’ve only recently started seriously using the GCP CLI. What’s struck me most about the gcloud tool is that it feels more like a system—dare I say, platform—than just a CLI. We’ll talk more about that in a bit.

Like with other clouds, you can use the SDK/CLI within a supported Docker image, package manager, or direct download. I did a direct download, since this is also a self-updating CLI, so I didn’t want to create a zombie scenario with my package manager.

API surface and patterns

The gcloud CLI has great coverage for the full breadth of GCP. I can’t see any missing services, including things launched two weeks ago. There is a subset of services/commands available in the alpha or beta channels, and are fully integrated into the experience. Each command is well documented, with descriptions of parameters, and example calls.

CLI commands follow a consistent pattern:
```
gcloud [service] create | delete | describe | list | update [parameters]
```
Let’s see some examples:
```
gcloud bigtable instances create seroterdb --display-name=seroterdb --cluster=serotercluster --cluster-zone=us-east1-a
gcloud pubsub topics describe serotertopic
gcloud run services update --memory=1Gi
gcloud spanner instances delete myspanner
```
All the GCP services I’ve come across follow the same patterns. It’s also logical enough that I even guessed a few without looking anything up.

Authentication

A gcloud auth login command triggers a web-based authorization flow.

Once I’m authenticated, I set up a profile. It’s possible to start with this process, and it triggers the authorization flow. Invoking the gcloud init command lets me create a new profile/configuration, or update an existing one. A profile includes things like which account you’re using, the “project” (top level wrapper beneath an account) you’re using, and a default region to work in. It’s a guided processes in the CLI, which is nice.

And it’s a small thing, but I like that when it asks me for a default region, it actually SHOWS ME ALL THE REGION CODES. For the other clouds, I end up jumping back to their portals or docs to see the available values.

Creating and viewing service instances

As mentioned above, everything in GCP goes into Projects. There’s no regional affinity to projects. They’re used for billing purposes and managing permissions. This is also the scope for most CLI commands.

Provisioning resources is straightforward. There isn’t the nesting you find in Azure, so you can get to the point a little faster. For instance, provisioning a new PubSub topic looks like this:
```
gcloud pubsub topics create richard-topic
```
It’s quick and painless. PubSub doesn’t have regional homing—it’s a global service, like others in GCP—so let’s see what happens if I create something more geo-aware. I created two Spanner instances, each in different regions.
```
gcloud spanner instances create seroter-db1 --config=regional-us-east1 --description=ordersdb --nodes=1
gcloud spanner instances create seroter-db2 --config=regional-us-west1 --description=productsdb --nodes=1
```
It takes seconds to provision, and then querying with gcloud spanner instances list gives me all Spanner database instances, regardless of region. And I can use a handy “filter” parameter on any command to winnow down the results.

The default CLI commands don’t pull resources from across projects, but there is a new command that does enable searching across projects and organizations (if you have permission). Also note that Cloud Storage (gsutil) and Big Query (bq) use separate CLIs that aren’t part of gcloud directly.

CLI sweeteners

I used one of the “sweeteners” before: filter. It uses a simple expression language to return a subset of results. You’ll find other useful flags for sorting and limiting results. Like with other cloud CLIs, gcloud lets you return results as json, table, csv, yaml, and other formats.

There’s also a full interactive shell with suggestions, auto-completion, and more. That’s useful as you’re learning the CLI.

gcloud has a lot of commands for interacting with the services themselves. You can publish to a PubSub topic, execute a SQL statement against a Spanner database, or deploy and call a serverless Function. It doesn’t apply everywhere, but I like that it’s there for many services.

The GCP CLI also self-updates. We’ll talk about it more in the section below.

Utilities

A few paragraphs ago, I said that the gcloud CLI felt more like a system. I say that, because it brings a lot of components with it. When I type in gcloud components list, I see all the options:

We’ve got the core SDK and other GCP CLIs for Big Query, but also a potpourri of other handy tools. You’ve got Kubernetes development tools like minikube, Skaffold, Kind, kpt, and kubectl. And you get a stash of local emulators for cloud services like Bigtable, Firestore, Spanner, PubSub and Spanner.

I can install any or all of these, and upgrade them all from here. A gcloud components update command update all of them, and, shows me a nice change log.

There are other smaller utility functions included in gcloud. I like that I have commands to configure Docker to work with Google Container Registry, Or fetch Kubernetes cluster credentials and put them into my active profile. And print my identity token to inject into the auth headers of calls to secure endpoints.

Wrap

To some extent, each CLI reflects the ethos of their cloud. The AWS CLI is dense, powerful, and occasionally inconsistent. The Azure CLI is rich, easy to get started with, and 15% more complicated than it should be. And the Google Cloud CLI is clean, integrated, and evolving. All of these are great. You should use them and explore their mystery and wonder.
September 15, 2020
First look: Triggering Google Cloud Run with events generated by GCP services
When you think about “events” in an event-driven architecture, what comes to mind? Maybe you think of business-oriented events like “file uploaded”, “employee hired”, “invoice sent”, “fraud detected”, or “batch job completed.” You might emit (or consume) these types of events in your application to develop more responsive systems.

What I find even more interesting right now are the events generated by the systems beneath our applications. Imagine what your architects, security pros, and sys admins could do if they could react to databases being provisioned, users getting deleted, firewall being changed, or DNS zone getting updated. This sort of thing is what truly enables the “trust, but verify” approach for empowered software teams. Let those teams run free, but “listen” to things that might be out of compliance.

This week, the Google Cloud team announced Events for Cloud Run, in beta this September. What this capability does is let you trigger serverless containers when lifecycle events happen in most any Google Cloud service. These lifecycle events are in the CloudEvents format, and distributed (behind the scenes) to Cloud Run via Google Cloud PubSub. For reference, this capability bears some resemblance to AWS EventBridge and Azure Event Grid. In this post, I’ll give you a look at Events for Cloud Run, and show you how simple it is to use.

Code and deploy the Cloud Run service

Developers deploy containers to Cloud Run. Let’s not get ahead of ourselves. First, let’s build the app. This app is Seroter-quality, and will just do the basics. I’ll read the incoming event and log it out. This is a simple ASP.NET Core app, with the source code in GitHub.

I’ve got a single controller that responds to a POST command coming from the eventing system. I take that incoming event, serialize from JSON to a string, and print it out. Events for Cloud Run accepts either custom events, or CloudEvents from GCP services. If I detect a custom event, I decode the payload and print it out. Otherwise, I just log the whole CloudEvent.
```
namespace core_sample_api.Controllers
{
    [ApiController]
    [Route("")]
    public class Eventsontroller : ControllerBase
    {
        private readonly ILogger<Eventsontroller> _logger;
        public Eventsontroller(ILogger<Eventsontroller> logger)
        {
            _logger = logger;
        }
        [HttpPost]
        public void Post(object receivedEvent)
        {
            Console.WriteLine("POST endpoint called");
            string s = JsonSerializer.Serialize(receivedEvent);
            //see if custom event with "message" root property
            using(JsonDocument d = JsonDocument.Parse(s)){
                JsonElement root = d.RootElement;
                if(root.TryGetProperty("message", out JsonElement msg)) {
                    Console.WriteLine("Custom event detected");
                    JsonElement rawData = msg.GetProperty("data");
                    //decode
                    string data = System.Text.Encoding.UTF8.GetString(Convert.FromBase64String(rawData.GetString()));
                    Console.WriteLine("Data value is: " + data);
                }
            }
            Console.WriteLine("Data: " + s);
        }
    }
}
```
After checking all my source code into GitHub, I was ready to deploy it to Cloud Run. Note that you can use my same repo to continue on this example!

I switched over to the GCP Console, and chose to create a new Cloud Run service. I picked a region and service name. Then I could have chosen either an existing container image, or, continuous deployment from a git repo. I chose the latter. First I picked my GitHub repo to get source from.

Then, instead of requiring a Dockerfile, I picked the new Cloud Buildpacks support. This takes my source code and generates a container for me. Sweet.

After choosing my code source and build process, I kept the default HTTP trigger. After a few moments, I had a running service.

Add triggers to Cloud Run

Next up, adding a trigger. By default, the “triggers” tab shows the single HTTP trigger I set up earlier.

I wanted to show custom events in addition to CloudEvents ones, so I went to the PubSub dashboard and created a new queue that would trigger Cloud Run.

Back in the Cloud Run UX, I added a new trigger. I chose the trigger type of “com.google.cloud.pubsub.topic.publish” and picked the Topic I created earlier. After saving the trigger, I saw it show up in the list.

After this, I wanted to trigger my Cloud Run service with CloudEvents. If you’re receiving events from Google Cloud services, you’ll have to enable Data Access Logs so that events can be spun up from Cloud Logs. I’m going to listen for events from Cloud Storage and Cloud Build, so I turned on audit logging for each.

All that was left to define the final triggers. For Cloud Storage, I chose the storage.create.bucket trigger.

I wanted to react to Cloud Build, so that I could see whenever a build started.

Terrific. Now I was ready to test. I sent in a message to PubSub to trigger the custom event.

I checked the logs for Cloud Run, and almost immediately saw that the service ran, accepted the event, and logged the body.

Next, I tested Cloud Storage by adding a new bucket.

Almost immediately, I saw a CloudEvent in the log.

Finally, I kicked off a new Build pipeline, and saw an event indicating that Cloud Run received a message, and logged it.

If you care about what happens inside the systems your apps depend on, take a look at the new Events for Cloud Run and start tapping into the action.
August 25, 2020
I’m looking forward these 8 sessions at Google Cloud Next ’20 OnAir (Week 7)
It’s here. After six weeks of OTHER topics, we’re up to week seven of Google Cloud Next OnAir, which is all about my area: app modernization. The “app modernization” bucket in Google Cloud covers lots of cool stuff including Cloud Code, Cloud Build, Cloud Run, GKE, Anthos, Cloud Operations, and more. It basically addresses the end-to-end pipeline of modern apps. I recently sketched it out like this:

I think this the biggest week of Next, with over fifty breakout sessions. I like that most of the breakouts so far have been ~20 minutes, meaning you can log in, set playback speed to 1.5x, and chomp through lots of topic quickly.

Here are eight of the sessions I’m looking forward to most:
1. Ship Faster, Spend Less By Going Multi-Cloud with Anthos. This is the “keynote” for the week. We’re calling out a few product announcements, highlighting some new customers, and saying keynote-y things. You’ll like it.
2. GKE Turns 5: What’s New? All Kubernetes aren’t the same. GKE stands apart, and the team continues solving customer problems in new ways. This should be a great look back, and look ahead.
3. Cloud Run: What’s New? To me, Cloud Run has the best characteristics of PaaS, combined with the the event-driven, scale-to-zero of serverless functions. This is the best place I know of to run custom-built apps in the Google Cloud (or anywhere, with Anthos).
4. Modernize Legacy Java Apps Using Anthos. Whoever figures out how to unlock value from existing (Java) apps faster, wins. Here’s what Google Cloud is doing to help customers improve their Java apps and run them on a great host.
5. Running Anthos on Bare Metal and at the Edge with Major League Baseball (MLB). Baseball’s back, my Slam Diego Padres are fun again, and Anthos is part of the action. Good story here.
6. Getting Started with Anthos, Anthos Deep Dive: Part One, Anthos Deep Dive: Part Two. Am I cheating by making three sessions into one entry? Fine, you caught me. But this three part trilogy is a great way to grok Anthos and understand its value.
7. Develop for Cloud Run in the IDE with Cloud Code. Cloud Code extends your IDE to support Google Cloud, and Cloud Run is great. Combine the two, and you’ve got some good stuff.
8. Event-Driven Microservices with Cloud Run. You’re going to enjoy this one, and seeing what’s now possible.
I’m looking forward to this week. We’re sharing lots of fun progress, and demonstrating some fresh perspectives on what app modernization should look like. Enjoy watching!
August 24, 2020
I’m looking forward these 7 sessions at Google Cloud Next ’20 OnAir (Weeks 5 + 6)
Nearly halfway through the “Summer of Google”, there’s been some significant announcements, and plenty of interesting talks. We’ve shared the names of some new Google Cloud customers—Deutsche Bank, Goldman Sachs, Spotify, Best Buy and more—and announced some pretty cool stuff—think Confidential VMs, BigQuery Omni, a Certificate Authority Service, a new undersea cable, a cloud migration program, plus the usual barrage of new things we’re constantly shipping for you. The conference is still free to sign up for, and you can catch up on everything you’ve missed.

Week five content is available on August 11, and week six material is binge-able on August 18. Week five is all about Data Analytics and the focus of week six is Data Management and Databases.

Here’s what looks good to me in week five:
- Easy Access to Stream Analytics with Google Cloud. Data streaming. Get on that train. This might be a great session to see what’s possible.
- GCP for Bioinformatics. I’m a fan of (a) Lynn Langit, and (b) realistic use cases to try out. This session has both.
- Building an Internet-Scale Analytics Platform on Google Cloud. This talk features Verizon Media, and includes Looker, which I want to learn more about.
These sessions look appealing in week six:
- NYT: Building a Real-Time Collaborative Editor with Firestore. This sounds like a cool use case about collaborative work in a fast-moving world.
- How to Choose the Right Database For Your Workloads. I’m always a fan of “how do I choose what to use” sorts of talks.
- Lessons Learned from Evaluating Cloud Spanner at Uber Scale. Spanner continues to be unique and awesome, and the gang at Uber explain how they tested it out.
- Data Modernization: McKesson Story. Modernize your infrastructure and apps, but please, don’t forget your databases! This looks like a good talk by a company that can’t afford to get it wrong.
It was hard to just pick a few talks! Check those out, and stay tuned for a final look at the last three weeks of this summer extravaganza.
August 7, 2020