Category: Microservices

  • More than serverless: Why Cloud Run should be your first choice for any new web app.

    More than serverless: Why Cloud Run should be your first choice for any new web app.

    I’ll admit it, I’m a PaaS guy. Platform-as-a-Service is an ideal abstraction for those who don’t get joy from fiddling with infrastructure. From Google App Engine, to Heroku, to Cloud Foundry, I’ve appreciated attempts to deliver runtimes that make it easier to ship and run code. Classic PaaS-type services were great at what they did. The problem with all of them—and this includes the first generation of serverless products like AWS Lambda—was that they were limited. Some of the necessary compromises were well-meaning and even healthy: build 12-factor apps, create loose coupling, write less code and orchestrate managed services instead. But in the end, all these platforms, while successful in various ways, were too constrained to take on a majority of apps for a majority of people. Times have changed.

    Google Cloud Run started as a serverless product, but it’s more of an application platform at this point. It’s reminiscent of a PaaS, but much better. While not perfect for everything—don’t bring Windows apps, always-on background components, or giant middleware—it’s becoming my starting point for nearly every web app I build. There are ten reasons why Cloud Run isn’t limited by PaaS-t constraints, is suitable for devs at every skill level, and can run almost any web app.

    1. It’s for functions AND apps.
    2. You can run old AND new apps.
    3. Use by itself AND as part of a full cloud solution.
    4. Choose simple AND sophisticated configurations.
    5. Create public AND private services.
    6. Scale to zero AND scale to 1.
    7. Do one-off deploys AND set up continuous delivery pipelines.
    8. Own aspects of security AND offload responsibility.
    9. Treat as post-build target AND as upfront platform choice.
    10. Rely on built-in SLOs, logs, metrics AND use your own observability tools.

    Let’s get to it.

    #1. It’s for functions AND apps.

    Note that Cloud Run also has “jobs” for run-to-completion batch work. I’m focusing solely on Cloud Run web services here.

    I like “functions.” Write short code blocks that respond to events and perform an isolated piece of work. There are many great use cases for this.

    The new Cloud Run functions experience makes it easy to bang out a function in minutes. It’s baked into the CLI and UI. Once I decide to create a function ….

    I only need to pick a service name, region, language runtime, and whether access to this function is authenticated or not.

    Then, I see a browser-based editor where I can write, test, and deploy my function. Simple, and something most of us equate with “serverless.”

    But there’s more. Cloud Run does apps too. That means instead of a few standalone functions to serve a rich REST endpoint, you’re deploying one Spring Boot app with all the requisite listeners. Instead of serving out a static site, you could return a full web app with server-side capabilities. You’ve got nearly endless possibilities when you can serve any container that accepts HTTP, HTTP/2, WebSockets, or gRPC traffic.

    Use either abstraction, but stay above the infrastructure and ship quickly.

    Docs: Deploy container images, Deploy functions, Using gRPC, Invoke with an HTTPS request
    Code labs to try: Hello Cloud Run with Python, Getting Started with Cloud Run functions

    #2. You can run old AND new apps.

    This is where the power of containers shows up, and why many previous attempts at PaaS didn’t break through. It’s ok if a platform only supports new architectures and new apps. But then you’re accepting that you’ll need an additional stack for EVERYTHING ELSE.

    Cloud Run is a great choice because you don’t HAVE to start fresh to use it. Deploy from source in an existing GitHub repo or from cloned code on your machine. Maybe you’ve got an existing Next.js app sitting around that you want to deploy to Cloud Run. Run a headless CMS. Does your old app require local volume mounts for NFS file shares? Easy to do. Heck, I took a silly app I built 4 1/2 years ago, deployed it from Docker Hub, and it just worked.

    Of course, Cloud Run shines when you’re building new apps. Especially when you want fast experimentation with new paradigms. With its new GPU support, Cloud Run lets you do things like serve LLMs via tools like Ollama. Or deploy generative AI apps based on LangChain or Firebase Genkit. Build powerful web apps in Go, Java, Python, .NET, and more. Cloud Run’s clean developer experience and simple workflow makes it ideal for whatever you’re building next.

    Docs: Migrate an existing web service, Optimize Java applications for Cloud Run, Supported runtime base images, Run LLM inference on Cloud Run GPUs with Ollama
    Code labs to try: How to deploy all the JavaScript frameworks to Cloud Run, Django CMS on Cloud Run, How to run LLM inference on Cloud Run GPUs with vLLM and the OpenAI Python SDK

    #3. Use by itself AND as part of a full cloud solution.

    There aren’t many tech products that everyone seems to like. But folks seem to really like Cloud Run, and it regularly wins over the Hacker News crowd! Some classic PaaS solutions were lifestyle choices; you had to be all in. Use the platform and its whole way of working. Powerful, but limiting.

    You can choose to use Cloud Run all by itself. It’s got a generous free tier, doesn’t require complicated HTTP gateways or routers to configure, and won’t force you to use a bunch of other Google Cloud services. Call out to databases hosted elsewhere, respond to webhooks from SaaS platforms, or just serve up static sites. Use Cloud Run, and Cloud Run alone, and be happy.

    And of course, you can use it along with other great cloud services. Tack on a Firestore database for a flexible storage option. Add a Memorystore caching layer. Take advantage of our global load balancer. Call models hosted in Vertex AI. If you’re using Cloud Run as part of an event-driven architecture, you might also use built-in connections to Eventarc to trigger Cloud Run services when interesting things happen in your account—think file uploaded to object storage, user role deleted, database backup completes.

    Use it by itself or “with the cloud”, but either way, there’s value.

    Docs: Hosting webhook targets, Connect to a Firestore database, Invoke services from Workflows
    Code labs to try: How to use Cloud Run functions and Gemini to summarize a text file uploaded to a Cloud Storage bucket

    #4. Choose simple AND sophisticated configurations.

    One reason PaaS-like services are so beloved is because they often provide a simple onramp without requiring tons of configuration. “cf push” to get an app to Cloud Foundry. Easy! Getting an app to Cloud Run is simple too. If you have a container, it’s a single command:

    rseroter$ gcloud run deploy go-app --image=gcr.io/seroter-project-base/go-restapi

    If all you have is source code, it’s also a single command:

    rseroter$ gcloud run deploy node-app --source .

    In both cases, the CLI asks me to pick a region and whether I want requests authenticated, and that’s it. Seconds later, my app is running.

    This works because Cloud Run sets a series of smart, reasonable defaults.

    But sometimes you do want more control over service configuration, and Cloud Run opens up dozens of possible settings. What kind of sophisticated settings do you have control over?

    • CPU allocation. Do you want CPU to be always on, or quit when idle?
    • Ingress controls. Do you want VPC-only access or public access?
    • Multi-container services. Add a sidecar.
    • Container port. The default is 8080, but set to whatever you want.
    • Memory. The default value is 512 MiB per instance, but you can go up to 32 GiB.
    • CPU. It defaults to 1, but you can go less than 1, or up to 8.
    • Healthchecks. Define startup or liveness checks that ping specific endpoints on a schedule.
    • Variables and secrets. Define environment variables that get injected at runtime. Same with secrets that get mounted at runtime.
    • Persistent storage volumes. There’s ephemeral scratch storage in every Cloud Run instance, but you can also mount volumes from Cloud Storage buckets or NFS shares.
    • Request timeout. The default value is 5 minutes, but you can go up to 60 minutes.
    • Max concurrency. A given service instance can handle more than one request. The default value is 80, but you can go up to 1000!
    • and much more!

    You can do something simple, you can do something sophisticated, or a bit of both.
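
    To make this concrete, here’s a hedged sketch of a more sophisticated deploy from the CLI. The service and image names are hypothetical, but each flag maps to one of the settings above:

    # A sketch with made-up service and image names; the flags are real gcloud options.
    # Raise memory, CPU, concurrency, and the request timeout; use a custom port;
    # inject an environment variable; and keep CPU always allocated.
    gcloud run deploy orders-api \
      --image=us-docker.pkg.dev/my-project/my-repo/orders-api:v2 \
      --port=3000 \
      --memory=4Gi \
      --cpu=2 \
      --concurrency=250 \
      --timeout=15m \
      --set-env-vars=LOG_LEVEL=debug \
      --no-cpu-throttling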

    Docs: Configure container health checks, Maximum concurrent requests per instance, CPU allocation, Configure secrets, Deploying multiple containers to a service (sidecars)
    Code labs to try: How to use Ollama as a sidecar with Cloud Run GPUs and Open WebUI as a frontend ingress container

    #5. Create public AND private services.

    One of the challenges with early PaaS services was that they were just sitting on the public internet. That’s no good once you get to serious, internal-facing systems.

    First off, Cloud Run services are public by default. You control the authentication level (anonymous access or authenticated users) and need to set that explicitly. But the service itself is publicly reachable. What’s great is that this doesn’t require you to set up any weird gateways or load balancers to make it work. As soon as you deploy a service, you get a reachable address.

    Awesome! Very easy. But what if you want to lock things down? This isn’t difficult either.

    Cloud Run lets me specify that I’ll only accept traffic from my VPC networks. I can also choose to securely send messages to IPs within a VPC. This comes into play as well if you’re routing requests to a private on-premises network peered with a cloud VPC. We even just added support for adding Cloud Run services to a service mesh for more networking flexibility. All of this gives you a lot of control to create truly private services.
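
    As a rough sketch (hypothetical service name, real flags), locking a service down to internal traffic and authenticated callers only takes a couple of flags at deploy time:

    # Made-up service name; --ingress and --no-allow-unauthenticated are real flags.
    gcloud run deploy internal-api \
      --source . \
      --ingress=internal \
      --no-allow-unauthenticated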

    Docs: Private networking and Cloud Run, Restrict network ingress for Cloud Run, Cloud Service Mesh
    Code labs to try: How to configure a Cloud Run service to access an internal Cloud Run service using direct VPC egress, Configure a Cloud Run service to access both an internal Cloud Run service and public Internet

    #6. Scale to zero AND scale to 1.

    I don’t necessarily believe that cloud is more expensive than on-premises—regardless of some well-publicized stories—but keeping idle cloud services running isn’t helping your cost posture.

    Google Cloud Run truly scales to zero. If nothing is happening, nothing is running (or costing you anything). However, when you need to scale, Cloud Run scales quickly. Like, a-thousand-instances-in-seconds quickly. This is great for bursty workloads that don’t have a consistent usage pattern.

    But you probably want the option of an affordable way to keep a consistent pool of compute online to handle a steady stream of requests. No problem. Set minimum instances to 1 (or 2, or 10) and keep instances warm. And set concurrency high for apps that can handle it.

    If you don’t have CPU always allocated, but keep a minimum instance online, we actually charge you significantly less for that “warm” instance. And you can apply committed use discounts when you know you’ll have a service running for a while.
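
    Here’s a minimal sketch, assuming an existing service named my-service, that keeps one warm instance around and raises per-instance concurrency:

    # Hypothetical service name; the flags are real gcloud options.
    gcloud run services update my-service \
      --min-instances=1 \
      --concurrency=200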

    Run bursty workloads or steadily-used workloads all in a single platform.

    Docs: About instance autoscaling in Cloud Run services, Set minimum instances, Load testing best practices
    Code labs to try: Cloud Run service with minimum instances

    #7. Do one-off deploys AND set up continuous delivery pipelines.

    I mentioned above that it’s easy to use a single command or single screen to get an app to Cloud Run. Go from source code or container to running app in seconds. And you don’t have to set up any other routing middleware or cloud networking to get a routable service.

    Sometimes you just want to do a one-off deploy without all the ceremony. Run the CLI, use the Console UI, and get on with life. Amazing.

    But if that were your only option, you’d feel constrained. So you can use something like GitHub Actions to deploy to Cloud Run. Most major CI/CD products support it.

    Another great option is Google Cloud Deploy. This managed service takes container artifacts and deploys them to Google Kubernetes Engine or Google Cloud Run. It offers some sophisticated controls for canary deploys, parallel deploys, post-deploy hooks, and more.

    Cloud Deploy has built-in support for Cloud Run. A basic pipeline (defined in YAML, but also configured via point-and-click in the UI if you want) might show three stages for dev, test, and prod.
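
    For flavor, here’s a hedged sketch of that YAML; the pipeline, target, project, and region names are my own placeholders, not from any real setup:

    # clouddeploy.yaml — a sketch of a staged Cloud Run delivery pipeline.
    apiVersion: deploy.cloud.google.com/v1
    kind: DeliveryPipeline
    metadata:
      name: web-app-pipeline
    serialPipeline:
      stages:
      - targetId: dev
      - targetId: test
      - targetId: prod
    ---
    # Each stage references a Target; the test and prod targets follow the same shape.
    apiVersion: deploy.cloud.google.com/v1
    kind: Target
    metadata:
      name: dev
    run:
      location: projects/my-project/locations/us-central1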

    When the pipeline completes, we see three separate Cloud Run instances deployed, representing each stage of the pipeline.

    You want something more sophisticated? Ok. Cloud Deploy supports Cloud Run canary deployments. You’d use this if you want a subset of traffic to go to the new instance before deciding to cut over fully.

    This is taking advantage of Cloud Run’s built-in traffic management feature. When I check the deployed service, I see that after advancing my pipeline to 75% of production traffic for the new app version, the traffic settings are properly set in Cloud Run.

    Serving traffic in multiple regions? Cloud Deploy makes it possible to ship a release to dozens of places simultaneously. Here’s a multi-target pipeline. The production stage deploys to multiple Cloud Run regions in the US.

    When I checked Cloud Run, I saw instances in all the target regions. Very cool!

    If you want a simple deploy, do that with the CLI or UI. Nothing stops you. However, if you’re aiming for a more robust deployment strategy, Cloud Run readily handles it through services like Cloud Deploy.

    Docs: Use a canary deployment strategy, Deploy to multiple targets at the same time, Deploying container images to Cloud Run
    Code labs to try: How to Deploy a Gemini-powered chat app on Cloud Run, How to automatically deploy your changes from GitHub to Cloud Run using Cloud Build

    #8. Own aspects of security AND offload responsibility.

    One reason that you choose managed compute platforms is to outsource operational tasks. It doesn’t mean you’re not capable of patching infrastructure, scaling compute nodes, or securing workloads. It means you don’t want to, and there are better uses of your time.

    With Cloud Run, you can drive aspects of your security posture, and also let Cloud Run handle key aspects on your behalf.

    What are you responsible for? You choose an authentication approach, including public or private services. This includes control of how you want to authenticate developers who use Cloud Run. You can authenticate end users, internal or external ones, using a handful of supported methods.

    It’s also up to you to decide which service account the Cloud Run service should impersonate. This controls what a given instance has access to. If you want to ensure that only containers with verified provenance get deployed, you can also choose to turn on Binary Authorization.
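
    As a small, hedged example (the service and service account names are hypothetical), assigning a dedicated identity happens right at deploy time:

    # Made-up names; --service-account is a real gcloud run flag.
    gcloud run deploy orders-api \
      --image=us-docker.pkg.dev/my-project/my-repo/orders-api:v2 \
      --service-account=orders-api-sa@my-project.iam.gserviceaccount.com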

    So what are you offloading to Cloud Run and Google Cloud?

    You can outsource protection from DDoS and other threats by turning on Cloud Armor. The underlying infrastructure beneath Cloud Run is completely managed, so you don’t need to worry about upgrading or patching any of that. What’s also awesome is that if you deploy Cloud Run services from source, you can sign up for automatic base image updates. This means we’ll patch the OS and runtime of your containers. Importantly, it’s still up to you to patch your app dependencies. But this is still very valuable!

    Docs: Security design overview, Introduction to service identity, Use Binary Authorization, Configure automatic base image updates
    Code labs to try: How to configure a Cloud Run service to access an internal Cloud Run service using direct VPC egress, How to connect a Node.js application on Cloud Run to a Cloud SQL for PostgreSQL database

    #9. Treat as post-build target AND as upfront platform choice.

    You might just want a compute host for your finished app. You don’t want to have to pick that host up front, and just want a way to run your app. Fair enough! There aren’t “Cloud Run apps”; they’re just containers. That said, there are general tips that make an app more suitable for Cloud Run than not. But the key is, for modern apps, you can often choose to treat Cloud Run as a post-build decision.

    Or, you can design with Cloud Run in mind. Maybe you want to trigger Cloud Run based on a specific Eventarc event. Or you want to capitalize on Cloud Run concurrency so you code accordingly. You could choose to build based on a specific integration provided by Cloud Run (e.g. Memorystore, Firestore, or Firebase Hosting).

    There are times that you build with the target platform in mind. In other cases, you want a general purpose host. Cloud Run is suitable for either situation, which makes it feel unique to me.

    Docs: Optimize Java applications for Cloud Run, Integrate with Google Cloud products in Cloud Run, Trigger with events
    Code labs to try: Trigger Cloud Run with Eventarc events

    #10. Rely on built-in SLOs, logs, metrics AND use your own observability tools.

    If you want it to be, Cloud Run can feel like an all-in-one solution. Do everything from one place. That’s how classic PaaS was, and there was value in having a tightly-integrated experience. From within Cloud Run, you have built-in access to logs, metrics, and even setting up SLOs.

    The metrics experience is powered by Cloud Monitoring. I can customize the metric types, dashboards, time windows, and more. This even includes the ability to set uptime checks which periodically ping your service and let you know if everything is ok.

    The embedded logging experience is powered by Cloud Logging and gives you a view into all your system and custom logs.

    We’ve even added an SLO capability where you can define SLIs based on availability, latency, or custom metrics. Then you set up service level objectives for service performance.

    While all these integrations are terrific, you don’t have to use only them. You can feed metrics and logs into Datadog. Same with Dynatrace. You can also write out OpenTelemetry metrics or Prometheus metrics and consume those however you want.

    Docs: Monitor Health and Performance, Logging and viewing logs in Cloud Run, Using distributed tracing

    Kubernetes, virtual machines, and bare metal boxes all play a key role for many workloads. But you also may want to start with the highest abstraction possible so that you can focus on apps, not infrastructure. IMHO, Google Cloud Run is the best around and satisfies the needs of most any modern web app. Give it a try!

  • 4 ways to pay down tech debt by ruthlessly removing stuff from your architecture

    4 ways to pay down tech debt by ruthlessly removing stuff from your architecture

    What advice do you get if you’re lugging around a lot of financial debt? Many folks will tell you to start purging expenses. Stop eating out at restaurants, go down to one family car, cancel streaming subscriptions, and sell unnecessary luxuries. For some reason, I don’t see the same aggressive advice when it comes to technical debt. I hear soft language around “optimization” or “management” versus assertive stances that take a meat cleaver to your architectural excesses.

    What is architectural debt? I’m thinking about bloated software portfolios where you’re carrying eight products in every category. Brittle automation that only partially works and still requires manual workarounds and black magic. Unique customizations to packaged software that’s now keeping you from being able to upgrade to modern versions. Also half-finished “ivory tower” designs where the complex distributed system isn’t fully in place, and may never be. You might have too much coupling, too little coupling, unsupported frameworks, and all sorts of things that make deployments slow, maintenance expensive, and wholesale improvements impossible.

    This stuff matters. The latest Stack Overflow developer survey shows that the most common frustration is the “amount of technical debt.” It’s wasting up to eight hours a week for each developer! Numbers two and three are around stack complexity. Your code and architectural tech debt is slowing down your release velocity, creating attrition among your best employees, and limiting how much you can invest in new tech areas. It’s well past time to simplify by purging architecture components that have built up (and calcified) over time. Let’s write bigger checks to pay down this debt faster.

    Explore these four areas, all focused on simplification. There are obviously tradeoffs and costs with each suggestion, but you’re not going to make meaningful progress by being timid. Note there are other dimensions to fixing tech debt besides simplification, but this is the one I see discussed the least often. I’ll use Google Cloud to offer some examples of how you might specifically tackle each, given we’re the best cloud for those making a firm shift away from legacy tech debt.

    1. Stop moving so much data around.

    If you zoom out on your architecture, how many components do you have that get data from point A to point B? I’d bet that you have lots of ETL pipelines to consolidate data into a warehouse or data lake, messaging and event processing solutions to shunt data around, and even API calls that suck data from one system into another. That’s a lot of machinery you have to create, update, and manage every day.

    Can you get rid of some of this? Can you access more of the data where it rests, versus copying it all over the place? Or use software that acts on data in different ways without forcing you to migrate it for further processing? I think so.

    Let’s see some examples.

    Perform analytical queries against data sitting in different places? Google Cloud supports that with BigQuery Omni. We run BigQuery in AWS and Azure so that you can access data at rest, and not be forced to consolidate it in a single data lake. Here, I have an Excel file sitting in an Azure blob storage account. I could copy that data over to Google Cloud, but that’s more components for me to create and manage.

    Rather, I can set up a pointer to Azure from within BigQuery, and treat it like any other table. The data is processed in Azure, and only summary info travels across the wire.

    You might say “that’s cool, but I have related data in another cloud, so I’d have to move it anyway to do joins and such.” You’d think so. But we also offer cross-cloud joins with BigQuery Omni. Check this out. I’ve got that employee data in Azure, but timesheet data in Google Cloud.

    With a single SQL statement, I’m joining data across clouds. No data movement required. Less debt.
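
    A cross-cloud join like that reads like any other SQL join. In this sketch (project, dataset, and table names are made up), employees is a BigQuery Omni table backed by Azure storage and timesheets is a native BigQuery table:

    -- Hypothetical names; BigQuery resolves each table in its own cloud.
    SELECT e.name, SUM(t.hours) AS total_hours
    FROM `my-project.gcp_dataset.timesheets` AS t
    JOIN `my-project.azure_dataset.employees` AS e
      ON t.employee_id = e.employee_id
    GROUP BY e.name;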

    Enrich data in analytical queries from outside databases? You might have ETL jobs in place to bring reference data into your data warehouse to supplement what’s already there. That may be unnecessary.

    With BigQuery’s Federated Queries, I can reach live into PostgreSQL, MySQL, Cloud Spanner, and even SAP Datasphere sources. Access data where it rests. Here, I’m using the EXTERNAL_QUERY function to retrieve data from a Cloud SQL database instance.
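
    A query of that shape looks roughly like this sketch (the connection ID and inner table are hypothetical):

    -- EXTERNAL_QUERY runs the inner statement on the external database
    -- and returns the rows for BigQuery to work with.
    SELECT *
    FROM EXTERNAL_QUERY(
      'projects/my-project/locations/us/connections/my-cloudsql-conn',
      'SELECT id, name, region FROM customers;');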

    I could use that syntax to perform joins, and do all sorts of things without ever moving data around.

    Perform complex SQL analytics against log data? Does your architecture have data copying jobs for operational data? Maybe to get it into a system where you can perform SQL queries against logs? There’s a better way.

    Google Cloud Log Analytics lets you query, view, and analyze log data without moving it anywhere.

    You can’t avoid moving data around. It’s often required. But I’m fairly sure that through smart product selection and some redesign of the architecture, you could eliminate a lot of unnecessary traffic.

    2. Compress the stack by removing duplicative components.

    Break out the chainsaw. Do you have multiple products for each software category? Or too many fine-grained categories full of best-of-breed technology? It’s time to trim.

    My former colleague Josh McKenty used to say something along the lines of “if it’s emerging, buy a few; if it’s mature, no more than two.”

    You don’t need a dozen project management software products. Or more than two relational database platforms. In many cases, you can use multi-purpose services and embrace “good enough.”

    There should be a fifteen-day cooling-off period before you buy a specialized vector database. Just use PostgreSQL. Or any number of existing databases that now support vector capabilities. Maybe you can even skip RAG-based solutions (and infrastructure) altogether for certain use cases and just use Gemini with its long context.

    Do you have a half-dozen different event buses and stream processors? Maybe you don’t need all that? Composite services like Google Cloud Pub/Sub can be a publish/subscribe message broker, apply a log-like approach with a replay-able stream, and do push-based notifications.

    You could use Spanner Graph instead of a dedicated graph database, or Artifact Registry as a single place for OS and application packages.

    I’m keen on the new continuous queries for BigQuery where you can do stream analytics and processing as data comes into the warehouse. Enrich data, call AI models, and more. Instead of a separate service or component, it’s just part of the BigQuery engine. Turn off some stuff?

    I suspect that this one is among the hardest for folks to act upon. We often hold onto technology because it’s familiar, or even because of misplaced loyalty. But be bold. Simplify your stack by getting rid of technology that’s no longer differentiated. Make a goal of having 30% fewer software products or platforms in your architecture in 2025.

    3. Replace hyper-customized software and automation with managed services and vanilla infrastructure.

    Hear me out. You’re not that unique. There are a handful of things that your company does which are the “secret sauce” for your success, and the rest is the same as everyone else.

    More often than not, you should be fitting your team to the software, not your software to the team. I’ve personally configured and extended packaged software to a point that it was unrecognizable. For what? Because we thought our customer service intake process was SO MUCH different than anyone else’s? It wasn’t. So much tech debt happens because we want to shape technology to our existing requirements, or we want to avoid “lock-in” by committing to a vendor’s way of doing things. I think both are misguided.

    I read a lot of annual reports from public companies. I’ve never seen “we slayed at Kubernetes this year” called out. Nobody cares. A cleverly scripted, hyper-customized setup that looks like the CNCF landscape diagram is more boat anchor than accelerator. Consider switching to a fully automated managed cluster in something like GKE Autopilot. Pay per pod, and get automatic upgrades, secure-by-default configurations, and a host of GKE Enterprise features to create sameness across clusters.

    Or thank-and-retire that customized or legacy workflow engine (code framework, or software product) that only four people actually understand. Use a nicely API-enabled managed product with useful control-flow actions, or a full-fledged cloud-hosted integration engine.

    You probably don’t need a customized database, caching solution, or even CI/CD stack. These are all super mature solution spaces, where whatever is provided out of the box is likely suitable for what you really need.

    4. Tone it down on the microservices and distributed systems.

    Look, I get excited about technology and want to use all the latest things. But it’s often overkill, especially in the early (or late) stages of a product.

    You simply don’t need a couple dozen serverless functions to serve a static web app. Simmer down. Or a big complex JavaScript framework when your site has a pair of pages. So much technical debt comes from over-engineering systems to use the latest patterns and technology, when the classic ones will do.

    Smash most of your serverless functions back into an “app” hosted in Cloud Run. Fewer moving parts, and all the agility you want. Use vanilla JavaScript where you can. Use small, geo-located databases until you MUST do cross-region or global replication. Don’t build “developer platforms” and IDPs until you actually need them.

    I’m not going all DHH on you, but most folks would be better off defaulting to more monolithic systems running on a server or two. We’ve all over-distributed too many services and created unnecessarily complex architectures that are now brittle or impossible to understand. If you need the scale and resilience of distributed systems RIGHT NOW then go build one. But most of us have gotten burned from premature optimization because we assumed that our system had to handle 100x user growth overnight.

    Wrap Up

    Every company has tech debt, whether the business is 100 years old or started last week. Google has it, big banks have it, the governments have it, and YC companies have it. And “managing it” is probably a responsible thing to do. But sometimes, when you need to make a step-function improvement in how you work, incremental changes aren’t good enough. Simplify by removing the cruft, and take big cuts out of your architecture to do it!

  • Want to externalize app configuration with Spring Cloud Config and Google Cloud Secret Manager? Now you can.

    Want to externalize app configuration with Spring Cloud Config and Google Cloud Secret Manager? Now you can.

    You’re familiar with twelve-factor apps? This relates to a set of principles shared by Heroku over a decade ago. The thinking goes, if your app adheres to these principles, it’s more likely to be scalable, resilient, and portable. While twelve-factor apps were introduced before Docker, serverless, or mainstream cloud adoption were a thing, I think these principles remain relevant in 2022. One of those principles relates to externalizing your configuration so that environment-related settings aren’t in code. Spring Cloud Config is a fun project that externalizes configurations for your (Java) app. It operates as a web server that serves up configurations sourced from a variety of places including git repos, databases, Vault, and more. A month ago, I saw a single-line mention in the Spring Cloud release notes that said Spring Cloud Config now integrates with Google Cloud Secret Manager. No documentation or explanation of how to use this feature? CHALLENGE ACCEPTED.

    To be sure, a Spring Boot developer can easily talk to Google Cloud Secret Manager directly. We already have a nice integration here. Why add the Config Server as an intermediary? One key reason is to keep apps from caring where the configs come from. A (Spring Boot) app just needs to make an HTTP request or use the Config Client to pull configs, even if they came from GitHub, a PostgreSQL database, a Redis instance, or Google Cloud Secret Manager. Or any combination of those. Let’s see what you think once we’re through.

    Setting up our config sources

    Let’s pull configs from two different places. Maybe the general purpose configuration settings are stored in git, and the most sensitive values are stored in Secret Manager.

    My GitHub repo has a flat set of configuration files. The Spring Cloud Config Server reads all sorts of text formats. In this case, I used YAML. My “app1” has different configs for the “dev” and “qa” environments, as determined by their file names.

    Secret Manager configs work a bit differently than git-based ones. The Spring Cloud Config Server uses the file name in a git repo to determine the app name and profile (e.g. “app1-qa.yml”) and makes each key/value pair in that file available to Spring for binding to variables. So from the image above, those three properties are available to any instance of “app1” where the Spring profile is set to “qa.” Secret Manager itself is really a key/value store. So the secret name+value is what is available to Spring. The “app” and “profile” come from the labels attached to the secret. Since you can’t have two secrets with the same name, if you want one secret for “dev” and one for “qa”, you need to name them differently. So, using the Cloud Code extension for VS Code, I created three secrets.

    Two of the secrets (connstring-dev, connstring-qa) hold connection strings for their respective environments, and the other secret (serviceaccountcert) only applies to QA, and has the corresponding label values.
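
    I used the Cloud Code extension, but the rough equivalent with the gcloud CLI looks like this (the secret value is a placeholder):

    # Create a secret whose labels tell the Config Server the app and profile.
    gcloud secrets create connstring-dev \
      --replication-policy=automatic \
      --labels=application=app1,profile=dev

    # Store the actual value as the first version of the secret.
    echo -n "Server=dev-db;Database=app1" | \
      gcloud secrets versions add connstring-dev --data-file=-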

    Ok, so we have all our source configs. Now to create the server that swallows these up and flattens the results for clients.

    Creating and testing our Spring Cloud Config Server

    Creating a Spring Cloud Config Server is very easy. I started at the Spring Initializr site to bootstrap my application. In fact, you can click this link and get the same package I did. My dependencies are on the Actuator and Config Server.

    The Google Cloud Secret Manager integration was added to the core Config Server project, so there’s no Config Server-specific dependency to add. It does appear you need to add a reference to the Secret Manager package to enable connectivity and such. I added this to my POM file.

    <dependency>
    		<groupId>com.google.cloud</groupId>
    		<artifactId>google-cloud-secretmanager</artifactId>
    		<version>1.0.1</version>
    </dependency>
    

    There’s no new code required to get a Spring Cloud Config Server up and running. Seriously. You just add an annotation (@EnableConfigServer) to the primary class.

    @EnableConfigServer
    @SpringBootApplication
    public class BootConfigServerGcpApplication {
    
    	public static void main(String[] args) {
    		SpringApplication.run(BootConfigServerGcpApplication.class, args);
    	}
    }
    

    The final step is to add some settings. I created an application.yaml file that looks like this:

    server:
      port: ${PORT:8080}
    spring:
      application:
        name: config-server
      profiles:
        active:
          secret-manager, git
      cloud:
        config:
          server:
            gcp-secret-manager:
              #application-label: application
              #profile-label: profile
              token-mandatory: false
              order: 1
            git:
              uri: https://github.com/rseroter/spring-cloud-config-gcp
              order: 2
    

    Let’s unpack this. First I set the port to whatever the environment provides, or 8080. I’m setting two active profiles here, so that I activate the Secret Manager and git environments. For the “gcp-secret-manager” block, you see I have the option to set the label values to designate the application and profile. If I wanted to have my secret with a label “appname:app1” then I’d set the application-label property here to “appname.” Make sense? I fumbled around with this for a while until I understood it. And notice that I’m pointing at the GitHub repo as well.

    One big thing to be aware of on this Secret Manager integration with Config Server. Google Cloud has the concept of “projects.” It’s a key part of an account hierarchy. You need to provide the project ID when interacting with the Google Cloud API. Instead of accepting this as a setting, the creators of the Secret Manager integration look up the value using a metadata service that only works when the app is running in Google Cloud. It’s a curious design choice, and maybe I’ll submit an issue or pull request to make that optional. In the meantime, it means you can’t test locally; you need to deploy the app to Google Cloud.

    Fortunately, Google Cloud Run, Secret Manager, and Artifact Registry (for container storage) are all part of our free tier. If you’re logged into the gcloud CLI, all you have to do is type gcloud run deploy and we take your source code, containerize it using buildpacks, add it to Artifact Registry, and deploy a Cloud Run instance. Pretty awesome.

    After a few moments, I have a serverless container running Spring middleware. I can scale to zero, scale to 1, handle concurrent requests, and maybe pay zero dollars for it all.

    Let’s test this out. We can query a Config Server via HTTP and see what a Spring Boot client app would get back. The URL contains the address of the server and path entries for the app name and profile. Here’s the query for app1 and the dev profile.
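
    With curl, that query looks like this (using the Cloud Run URL my server received when I deployed it):

    curl https://boot-config-server-gcp-ofanvtevaa-uw.a.run.app/app1/dev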

    See that our config server found two property sources that matched a dev profile and app1. This gives a total of three properties for our app to use.

    Let’s swap “dev” for “qa” in the path and get the configurations for the QA environment.

    The config server used different sources, and returned a total of five properties that our app can use. Nice!

    Creating and testing our config client

    Consuming these configurations from a Spring Boot app is simple as well. I returned to the Spring Initializr site and created a new web application that depends on the Actuator, Web, and Config Client packages. You can download this starter project here.

    My demo-quality code is basic. I annotated the main class as a @RestController, exposed a single endpoint at the root, and returned a couple of configuration values. Since the “dev” and “qa” connection strings have different configuration names—remember, I can’t have two Secrets with the same name—I do some clunky work to choose the right one.

    @RestController
    @SpringBootApplication
    public class BootConfigClientGcpApplication {
    
    	public static void main(String[] args) {
    		SpringApplication.run(BootConfigClientGcpApplication.class, args);
    	}
    
    	@Value("${appversion}")
    	String appVersion;
    
    	@Value("${connstring-dev:#{null}}")
    	String devConnString;
    
    	@Value("${connstring-qa:#{null}}")
    	String qaConnString;
    
    	@GetMapping("/")
    	public String getData() {
    		String secret;
    		secret = (devConnString != null) ? devConnString : qaConnString;
    		return String.format("version is %s and secret is %s",appVersion, secret);
    	}
    }
    

    The application.yaml file for this application has a few key properties. First, I set the spring.application.name, which tells the Config Client which configuration properties to retrieve. It’ll query for those assigned to “app1”. Also note that I set the profile to “dev”, which also impacts the query. And, I’m exposing the “env” endpoint of the actuator, which lets me peek at all the environment variables available to my application.

    server:
      port: 8080
    management:
      endpoints:
        web:
          exposure:
            include: env
    spring:
      application:
        name: app1
      profiles:
        active: dev
      config:
        import: configserver:https://boot-config-server-gcp-ofanvtevaa-uw.a.run.app
    

    Ok, let’s run this. I can do it locally, since there’s nothing that requires this app to be running in any particular location.

    Cool, so it returned the values associated with the “dev” profile. If I stop the app, switch the spring.profiles.active to “qa” and restart, I get different property values.

    So the Config Client in my application is retrieving configuration properties from the Config Server, and my app gets whatever values make sense for a given environment with zero code changes. Nice!

    If we want, we can also check out ALL the environment variables visible to the client app. Just send a request to the /actuator/env endpoint and observe.

    Summary

    I like Spring Cloud Config. It’s a useful project that helps devs incorporate the good practice of externalizing configuration. If you want a bigger deep-dive into the project, check out my new Pluralsight course that covers it.

    Also, take a look at Google Cloud Run as a legit host for your Spring middleware and apps. Instead of over-provisioning VMs, container clusters, or specialized Spring runtimes, use a cloud service that scales automatically, offers concurrency, supports private traffic, and is pay-for-what-you-use.

  • Learn all about building and coordinating Java microservices with my two updated Pluralsight courses about Spring Cloud

    Learn all about building and coordinating Java microservices with my two updated Pluralsight courses about Spring Cloud

    Java retains its stubborn hold near the top of every language ranking. Developers continue to depend on it for all sorts of application types. Back when I joined Pivotal in 2016, I figured the best way to learn their flagship Java framework, Spring, was to teach a course about it. I shipped a couple of Pluralsight courses that have done well over the years, but feel dated. Spring Boot and Spring Cloud keep evolving, as good frameworks do. So, I agreed to refresh both courses, which in reality, meant starting over. The result? Forty new demos, a deep dive into thirteen Spring (Cloud) projects, and seven hours of family-friendly content.

    Spring Cloud is a set of Spring projects for developers who want to introduce distributed systems patterns into their apps. You can use these projects to build new microservices, or connect existing ones. Both of these areas are where I focused the courses.

    Java Microservices with Spring Cloud: Developing Services looks at four projects that help you build new services. We dig into:

    • Spring Cloud Config as a way to externalize configuration into a remote store. This is a pretty cool project that lets you stash and version config values in repositories like Git, Vault, databases, or public cloud secret stores. In my course, we use GitHub and show how Spring transparently caches and refreshes these values for your app to use.
    • Spring Cloud Function offers a great option for those developing event-driven, serverless-style apps. This is a new addition to the course, as it didn’t exist when I created the first version. But as serverless and event-driven architectures continue to take off, I thought it was important to add it. I throw in a bonus example of deploying one of these little fellas to a public cloud FaaS platform.
    • Spring Security. I have a confession. I don’t think I really understood how OAuth 2.0 worked until rebuilding this module of the course. The lightbulb finally went off as I set up and used Keycloak as an authorization server, and grokked what was really happening. If you’d like that same epiphany, watch this one!
    • Spring Cloud Sleuth. When building modern services, you can’t forget to build in some instrumentation. Sleuth does a pretty remarkable job of instrumenting most everything in your app, and offering export to something like Zipkin. 

    The second course is Java Microservices with Spring Cloud: Coordinating Services where we explore projects that make it easier to connect, route, and compose microservices. We spend time with:

    • Spring Cloud Eureka. This rock-solid framework is still plugging away, almost a decade after Netflix created it. It offers a mature way to easily register and discover services in your architecture. It’s lightweight, and also uses local caching so that there’s no SPOF to freak out about. 
    • Spring Cloud Circuit Breaker. This one is also new since my first run through the course. It replaced Hystrix, which is end of life. This project is an abstraction atop libraries like Resilience4j, Spring Retry, and Sentinel. Here, we spend time with Resilience4j and show how to configure services to fail fast, use sliding windows to determine whether to open a circuit, and more.
    • Spring Cloud LoadBalancer and Spring Cloud Gateway. Ribbon is also a deprecated project, so instead, I introduced LoadBalancer. This component works great with service discovery to do client-side load balancing. And Spring Cloud Gateway is an intriguing project that gives you lightweight, modular API gateway functionality for your architecture. We have fun with both of those here.
    • Spring Cloud Stream. This is probably my favorite Spring Cloud project because I’m still a messaging geek at heart. With the new functional interface (replacing the annotation-based model in my earlier course), it’s stupid-easy to build functions that publish to message brokers or receive messages (see the sketch after this list). Talking to Apache Kafka, RabbitMQ, or public cloud messaging engines has never been easier.
    • Spring Cloud Data Flow. Stitch a bunch of streaming (or batch-processing) apps into a data processing pipeline? That’s what Spring Cloud Data Flow is all about. Running atop modern platforms like Kubernetes, it’s a portable orchestration engine. Here we build pipelines, custom apps for our pipelines, and more.
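
    To give you a taste of that Spring Cloud Stream functional model, here’s a minimal sketch of mine (not course code); it assumes the Stream starter and a binder such as Kafka or RabbitMQ are on the classpath:

    package com.example.streamdemo;

    import java.util.function.Function;

    import org.springframework.boot.SpringApplication;
    import org.springframework.boot.autoconfigure.SpringBootApplication;
    import org.springframework.context.annotation.Bean;

    @SpringBootApplication
    public class StreamDemoApplication {

        public static void main(String[] args) {
            SpringApplication.run(StreamDemoApplication.class, args);
        }

        // Spring Cloud Stream binds this function to input and output destinations
        // (e.g. Kafka topics or RabbitMQ exchanges) purely through configuration.
        @Bean
        public Function<String, String> uppercase() {
            return String::toUpperCase;
        }
    }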

    It took me about 5 months to rebuild, record, and edit these courses, and I’m proud of the result. I think you’ll enjoy this look through an exciting, fun-to-use set of components made by my friends on the Spring team. 

  • Exploring a fast inner dev loop for Spring Boot apps targeting Google Cloud Run

    Exploring a fast inner dev loop for Spring Boot apps targeting Google Cloud Run

    It’s a gift to the world that no one pays me to write software any longer. You’re welcome. But I still enjoy coding and trying out a wide variety of things. Given that I rarely have hours upon hours to focus on writing software, I seek things that make me more productive with the time I have. My inner development loop matters. You know, the iterative steps we perform to write, build, test, and commit code.

    So let’s say I want to build a REST API in Java. This REST API stores and returns the names of television characters. What’s the bare minimum that I need to get going?

    • An IDE or code editor
    • A database to store records
    • A web server to host the app
    • A route to reach the app

    What are things I personally don’t want to deal with, especially if I’m experimenting and learning quickly?

    • Provisioning lots of infrastructure. Either locally to emulate the target platform, or elsewhere to actually run my app. It takes time, and I don’t know what I need.
    • Creating database stubs or mocks, or even configuring Docker containers to stand-in for my database. I want the real thing, if possible.
    • Finding a container registry to use. All this stuff just needs to be there.
    • Writing Dockerfiles to package an app. I usually get them wrong.
    • Configuring API gateways or network routing rules. Just give me an endpoint.

    Based on this, one of the quickest inner loops I know of involves Spring Boot, the Google Cloud SDK, Cloud Firestore, and Google Cloud Run. Spring Boot makes it easy to spin up API projects, and its ORM capabilities make it simple to interact with a database. Speaking of databases, Cloud Firestore is powerful and doesn’t force me into a schema. That’s great when I don’t know the final state of my data structure. And Cloud Run seems like the single best way to run custom-built apps in the cloud. How about we run through this together?

    On my local machine, I’ve installed Visual Studio Code—the FASTEST possible inner loop might have involved using the Google Cloud Shell and skipping any local work, but I still like doing local dev—along with the latest version of Java, and the Google Cloud SDK. The SDK comes with lots of CLI tools and emulators, including one for Firestore and Datastore (an alternate API).

    Time to get to work. I visited start.spring.io to generate a project. I could choose a few dependencies from the curated list, including a default one for Google Cloud services, and another for exposing my data repository as a series of REST endpoints.

    I generated the project, and opened it in Visual Studio Code. Then, I opened the pom.xml file and added one more dependency. While I’m using the Firestore database, I’m using it in “Datastore mode” which works better with Spring Data REST. Here’s my finished pom file.

    <?xml version="1.0" encoding="UTF-8"?>
    <project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    	xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
    	<modelVersion>4.0.0</modelVersion>
    	<parent>
    		<groupId>org.springframework.boot</groupId>
    		<artifactId>spring-boot-starter-parent</artifactId>
    		<version>2.4.4</version>
    		<relativePath/> <!-- lookup parent from repository -->
    	</parent>
    	<groupId>com.seroter</groupId>
    	<artifactId>boot-gcp-run-firestore</artifactId>
    	<version>0.0.1-SNAPSHOT</version>
    	<name>boot-gcp-run-firestore</name>
    	<description>Demo project for Google Cloud and Spring Boot</description>
    	<properties>
    		<java.version>11</java.version>
    		<spring-cloud-gcp.version>2.0.0</spring-cloud-gcp.version>
    		<spring-cloud.version>2020.0.2</spring-cloud.version>
    	</properties>
    	<dependencies>
    		<dependency>
    			<groupId>org.springframework.boot</groupId>
    			<artifactId>spring-boot-starter-data-rest</artifactId>
    		</dependency>
    		<dependency>
    			<groupId>com.google.cloud</groupId>
    			<artifactId>spring-cloud-gcp-starter</artifactId>
    		</dependency>
    		<dependency>
    			<groupId>com.google.cloud</groupId>
    			<artifactId>spring-cloud-gcp-starter-data-datastore</artifactId>
    			<version>2.0.2</version>
    		</dependency>
    		<dependency>
    			<groupId>org.springframework.boot</groupId>
    			<artifactId>spring-boot-starter-test</artifactId>
    			<scope>test</scope>
    		</dependency>
    	</dependencies>
    	<dependencyManagement>
    		<dependencies>
    			<dependency>
    				<groupId>org.springframework.cloud</groupId>
    				<artifactId>spring-cloud-dependencies</artifactId>
    				<version>${spring-cloud.version}</version>
    				<type>pom</type>
    				<scope>import</scope>
    			</dependency>
    			<dependency>
    				<groupId>com.google.cloud</groupId>
    				<artifactId>spring-cloud-gcp-dependencies</artifactId>
    				<version>${spring-cloud-gcp.version}</version>
    				<type>pom</type>
    				<scope>import</scope>
    			</dependency>
    		</dependencies>
    	</dependencyManagement>
    
    	<build>
    		<plugins>
    			<plugin>
    				<groupId>org.springframework.boot</groupId>
    				<artifactId>spring-boot-maven-plugin</artifactId>
    			</plugin>
    		</plugins>
    	</build>
    
    </project>
    

    Let’s sling a little code, shall we? Spring Boot almost makes this too easy. First, I created a class to describe a “character.” I started with just a couple of characteristics—full name, and role.

    package com.seroter.bootgcprunfirestore;
    
    import com.google.cloud.spring.data.datastore.core.mapping.Entity;
    import org.springframework.data.annotation.Id;
    
    @Entity
    class Character {
    
        @Id
        private Long id;
        private String FullName;
        private String Role;
        
        public String getFullName() {
            return FullName;
        }
        public String getRole() {
            return Role;
        }
        public void setRole(String role) {
            this.Role = role;
        }
        public void setFullName(String fullName) {
            this.FullName = fullName;
        }
    }
    

    All that’s left is to create a repository resource and Spring Data handles the rest. Literally!

    package com.seroter.bootgcprunfirestore;
    
    import com.google.cloud.spring.data.datastore.repository.DatastoreRepository;
    import org.springframework.data.rest.core.annotation.RepositoryRestResource;
    
    @RepositoryRestResource
    interface CharacterRepository extends DatastoreRepository<Character, Long> {
        
    }
    

    That’s kinda it. No other code is needed. Now I want to test it out and see if it works. The first option is to spin up an instance of the Datastore emulator—not Firestore, since I’m using the Datastore API—when my app starts. That’s handy. It’s one line in my application.properties file.

    spring.cloud.gcp.datastore.emulator.enabled=true
    

    When I execute ./mvnw spring-boot:run I see the app compile, and get a notice that the Datastore emulator was started up. I went to Postman to call the API. First I added a record.

    Then I called the endpoint to retrieve the stored data. It worked. It’s great that Spring Data REST wires up all these endpoints automatically.
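
    If you prefer curl to Postman, the equivalent calls look roughly like this (the sample character is my own):

    # Spring Data REST derives the /characters collection from CharacterRepository.
    curl -X POST http://localhost:8080/characters \
      -H "Content-Type: application/json" \
      -d '{"fullName": "Dwight Schrute", "role": "Salesman"}'

    # Retrieve everything stored so far.
    curl http://localhost:8080/characters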

    Now, I really like that I can start up the emulator as part of the build. But, that instance is ephemeral. When I stop running the app locally, my instance goes away. What if my inner loop involves constantly stopping the app to make changes, recompile, and start up again? Don’t worry. It’s also easy to stand up the emulator by itself, and attach my app to it. First, I ran gcloud beta emulators datastore start to get the local instance running in about 2 seconds.

    Then I updated my application.properties file by commenting out the statement that enables local emulation, and replacing it with this statement that points to the emulator:

    spring.cloud.gcp.datastore.host=localhost:8081
    

    Now I can start and stop the app as much as I want, and the data persists. Both options are great, depending on how you’re doing local development.

    Let’s deploy. I wanted to see this really running, and iterate further after I’m confident in how it behaves in a production-like environment. The easiest option for any Spring Boot developer is Cloud Run. It’s quick, it’s serverless, and we support buildpacks, so you never need to see a container.

    I issued a single CLI command—gcloud beta run deploy boot-app --memory=1024 --source .— to package up my app and get it to Cloud Run.

    After a few moments, I had a container in the registry, and an instance of Cloud Run. I don’t have to do any other funny business to reach the endpoint. No gateways, proxies, or whatever. And everything is instantly wired up to Cloud Logging and Cloud Monitoring for any troubleshooting. And I can provision up to 8GB of RAM and 4 CPUs, while setting up to 250 concurrent connections per container, and 1000 maximum instances. There’s a lot you can run with that horsepower.

    I pinged the public endpoint, and sure enough, it was easy to publish and retrieve data from my REST API …

    … and see the data sitting in the database!

    When I saw the results, I realized I wanted more data fields in here. No problem. I went back to my Spring Boot app, and added a new field, isHuman. There are lots of animals on my favorite shows.

    This time when I deployed, I chose the “no traffic” flag—gcloud beta run deploy boot-app --memory=1024 --source . --no-traffic—so that I could control who saw the new field. Once it deployed, I saw two “revisions” and had the ability to choose the amount of traffic to send to each.

    I switched 50% of the traffic to the new revision, liked what I saw, and then flipped it to 100%.

    So there you go. It’s possible to fly through this inner loop in minutes. Because I’m leaning on managed serverless technologies for things like application runtime and database, I’m not wasting any time building or managing infrastructure. The local dev tooling from Google Cloud is terrific, so I have easy use of IDE integrations, emulators and build tools. This stack makes it simple for me to iterate quickly, cheaply, and with tech that feels like the future, versus wrestling with old stuff that’s been retrofitted for today’s needs.

  • How GitOps and the KRM make multi-cloud less scary.

    How GitOps and the KRM make multi-cloud less scary.

    I’m seeing the usual blitz of articles that predict what’s going to happen this year in tech. I’m not smart enough to make 2021 predictions, but one thing that seems certain is that most every company is deploying more software to more places more often. Can we agree on that? Companies large and small are creating and buying lots of software. They’re starting to do more continuous integration and continuous delivery to get that software out the door faster. And yes, most companies are running that software in multiple places—including multiple public clouds.

    So we have an emerging management problem, no? How do I create and maintain software systems made up of many types of components—virtual machines, containers, functions, managed services, network configurations—while using different clouds? And arguably the trickiest part isn’t building the system itself, but learning and working within each cloud’s tenancy hierarchy, identity system, administration tools, and API model.

    Most likely, you’ll use a mix of different build orchestration tools and configuration management tools based on each technology and cloud you’re working with. Can we unify all of this without forcing a lowest-common-denominator model that keeps you from using each cloud’s unique stuff? I think so. In this post, I’ll show an example of how to provision and manage infrastructure, apps, and managed services in a consistent way, on any cloud. As a teaser for what we’re building here, see that we’ve got a GitHub repo of configurations, and first-party cloud managed services deployed and configured in Azure and GCP as a result.

    Before we start, let’s define a few things. GitOps—a term coined by Alexis Richardson and championed by the smart folks at Weaveworks—is about declarative definitions of infrastructure, stored in a git repo, and constantly applied to the environment so that you remain in the desired state.

    Next, let’s talk about the Kubernetes Resource Model (KRM). In Kubernetes, you define resources (built in, or custom) and the system uses controllers to create and manage those resources. It treats configurations as data without forcing you to specify *how* to achieve your desired state. Kubernetes does that for you. And this model is extensible to more than just containers!

    The final thing I want you to know about is Google Cloud Anthos. That’s what’s tying all this KRM and GitOps stuff together. Basically, it’s a platform designed to create and manage distributed Kubernetes clusters that are consistent, connected, and application ready. There are four capabilities you need to know to grok this KRM/GitOps scenario we’re building:

    1. Anthos clusters and the cloud control plane. That sounds like the title of a terrible children’s book. For tech folks, it’s a big deal. Anthos deploys GKE clusters to GCP, AWS, Azure (in preview), vSphere, and bare metal environments. These clusters are then visible to (and configured by) a control plane in GCP. And you can attach any existing compliant Kubernetes cluster to this control plane as well.
    2. Config Connector. This is a KRM component that lets you manage Google Cloud services as if they were Kubernetes resources—think BigQuery, Compute Engine, Cloud DNS, and Cloud Spanner. The other hyperscale clouds liked this idea, and followed our lead by shipping their own flavors of this (Azure version, AWS version).
    3. Environs. These are logical groupings of clusters. It doesn’t matter where the clusters physically are, and which provider they run on. An environ treats them all as one virtual unit, and lets you apply the same configurations to them, and join them all to the same service mesh. Environs are a fundamental aspect of how Anthos works.
    4. Config Sync. This Google Cloud component takes git-stored configurations and constantly applies them to a cluster or group of clusters. These configs could define resources, policies, reference data, and more.

    Now we’re ready. What are we building? I’m going to provision two Anthos clusters in GCP, then attach an Azure AKS cluster to that Anthos environ, apply a consistent configuration to these clusters, install the GCP Config Connector and Azure Service Operators into one cluster, and use Config Sync to deploy cloud managed services and apps to both clouds. Why? Once I have this in place, I have a single way to create managed services or deploy apps to multiple clouds, and keep all these clusters identically configured. Developers have less to learn, operators have less to do. GitOps and KRM, FTW!

    Step 1: Create and Attach Clusters

    I started by creating two GKE clusters in GCP. I can do this via the Console, CLI, Terraform, and more. Once I created these clusters (in different regions, but same GCP project), I registered both to the Anthos control plane. In GCP, the “project” (here, seroter-anthos) is also the environ.

    Next, I created a new AKS cluster via the Azure Portal.

    In 2020, our Anthos team added the ability to attach existing clusters to an Anthos environ. Before doing anything else, I created a new minimum-permission GCP service account that the AKS cluster would use, and exported the JSON service account key to my local machine.

    From the GCP Console, I followed the option to “Add clusters to environ” where I provided a name, and got back a single command to execute against my AKS cluster. After logging into my AKS cluster, I ran that command—which installs the Connect agent—and saw that the AKS cluster connected successfully to Anthos.
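
    For reference, the generated registration command looks roughly like this sketch; the cluster name, kubeconfig context, and key path here are stand-ins:

    # Register an external cluster with the Anthos control plane
    gcloud container hub memberships register aks-cluster-1 \
      --context=aks-cluster-1-admin \
      --kubeconfig=$HOME/.kube/config \
      --service-account-key-file=./anthos-connect-key.json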

    I also created a service account in my AKS cluster, bound it to the cluster-admin role, and grabbed the password (token) so that GCP could log into that cluster. At this point, I can see the AKS cluster as part of my environ.

    You know what’s pretty awesome? Once this AKS cluster is connected, I can view all sorts of information about cluster nodes, workloads, services, and configurations. And, I can even deploy workloads to AKS via the GCP Console. Wild.

    But I digress. Let’s keep going.

    Step 2: Instantiate a Git Repo

    GitOps requires … a git repo. I decided to use GitHub, but any reachable git repository works. I created the repo via GitHub, opened it locally, and initialized the proper structure using the nomos CLI. What does a structured repo look like and why does the structure matter? Anthos Config Management uses this repo to figure out the clusters and namespaces for a given configuration. The clusterregistry directory contains ClusterSelectors that let me scope configs to a given cluster or set of clusters. The cluster directory holds any configs that you want applied to entire clusters versus individual namespaces. And the namespaces directory holds configs that apply to a specific namespace.
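
    If you haven’t seen one before, here’s roughly what nomos init lays down (an abridged sketch):

    nomos init

    # resulting repo layout (abridged):
    # ├── system/            # repo metadata read by Config Sync
    # ├── clusterregistry/   # Cluster and ClusterSelector definitions
    # ├── cluster/           # configs applied to entire clusters
    # └── namespaces/        # configs scoped to specific namespaces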

    Now, I don’t want all my things deployed to all the clusters. I want some namespaces that span all clusters, and others that only sit in one cluster. To do this, I need ClusterSelectors. This lets me define labels that apply to clusters so that I can control what goes where.

    For example, here’s my cluster definition for the AKS cluster (notice the “name” matches the name I gave it in Anthos) that applies an arbitrary label called “cloud” with a value of “azure.”

    kind: Cluster
    apiVersion: clusterregistry.k8s.io/v1alpha1
    metadata:
      name: aks-cluster-1
      labels:
        environment: prod
        cloud: azure
    

    And here’s the corresponding ClusterSelector. If my namespace references this ClusterSelector, it’ll only apply to clusters that match the label “cloud: azure.”

    kind: ClusterSelector
    apiVersion: configmanagement.gke.io/v1
    metadata:
      name: selector-cloud-azure
    spec:
      selector:
        matchLabels:
          cloud: azure
    

    After creating all the cluster definitions and ClusterSelectors, I committed and published the changes. You can see my full repo here.

    Step 3: Install Anthos Config Management

    The Anthos Config Management (ACM) subsystem lets you do a variety of things such as synchronize configurations across clusters, apply declarative policies, and manage a hierarchy of namespaces.

    Enabling and installing ACM on GKE clusters and attached clusters is straightforward. First, we need credentials to talk to our git repo. One option is to use an SSH keypair. I generated a new keypair, and added the public key to my GitHub account. Then, I created a secret in each Kubernetes cluster that references the private key value.

    kubectl create ns config-management-system && \
    kubectl create secret generic git-creds \
      --namespace=config-management-system \
      --from-file=ssh="[/path/to/KEYPAIR-PRIVATE-KEY-FILENAME]"
    

    With that done, I went through the GCP Console (or you can do this via CLI) to add ACM to each cluster. I chose to use SSH as the authentication mechanism, and then pointed to my GitHub repo.

    After walking through the GKE clusters, I could see that ACM was installed and configured. Then I installed ACM on the AKS cluster too, all from the GCP Console.

    With that, the foundation of my multi-cloud platform was all set up.

    Step 4: Install Config Connector and Azure Service Operator

    As mentioned earlier, the Config Connector helps you treat GCP managed services like Kubernetes resources. I only wanted the Config Connector on a single GKE cluster, so I went to gke-cluster-2 in the GCP Console and “enabled” Workload Identity and the Config Connector features. Workload Identity connects Kubernetes service accounts to GCP identities. It’s pretty cool. I created a new service account (“seroter-cc”) that Config Connector would use to create managed services.

    To confirm installation, I ran a “kubectl get crds” command to see all the custom resources added by the Config Connector.

    There’s only one step to configure the Config Connector itself. I created a single configuration that referenced the service account and GCP project used by Config Connector.

    # configconnector.yaml
    apiVersion: core.cnrm.cloud.google.com/v1beta1
    kind: ConfigConnector
    metadata:
      # the name is restricted to ensure that there is only one
      # ConfigConnector instance installed in your cluster
      name: configconnector.core.cnrm.cloud.google.com
    spec:
      mode: cluster
      googleServiceAccount: "seroter-cc@seroter-anthos.iam.gserviceaccount.com"
    

    I ran “kubectl apply -f configconnector.yaml” for the configuration, and was all set.

    Since I also wanted to provision Microsoft Azure services using the same GitOps + KRM mechanism, I installed the Azure Service Operators. This involved installing a cert manager, installing Helm, creating an Azure Service Principal (that has rights to create services), and then installing the operator.

    Step 5: Check-In Configs to Deploy Managed Services and Applications

    The examples for the Config Connector and Azure Service Operator talk about running “kubectl apply” for each service you want to create. But I want GitOps! So, that means setting up git directories that hold the configurations, and relying on ACM (and Config Sync) to “apply” these configurations on the target clusters.

    I created five namespace directories in my git repo. The everywhere-apps namespace applies to every cluster. The gcp-apps namespace should only live on GCP. The azure-apps namespace only runs on Azure clusters. And the gcp-connector and azure-connector namespaces should only live on the cluster where the Config Connector and Azure Service Operator live. I wanted something like this:

    How do I create configurations that make the layout above possible? Easy. Each “namespace” directory in the repo has a namespace.yaml file. This file provides the name of the namespace and, optionally, annotations. The annotation for the gcp-connector namespace used the ClusterSelector that only applied to gke-cluster-2. I also added a second annotation that told the Config Connector which GCP project hosted the generated managed services.

    apiVersion: v1
    kind: Namespace
    metadata:
      name: gcp-connector
      annotations:
        configmanagement.gke.io/cluster-selector: selector-specialrole-connectorhost
        cnrm.cloud.google.com/project-id: seroter-anthos
    

    I added namespace.yaml files for each other namespace, with ClusterSelector annotations on all but the everywhere-apps namespace, since that one runs everywhere.

    Now, I needed the actual resource configurations for my cloud managed services. In GCP, I wanted to create a Cloud Storage bucket. With this “configuration as data” approach, we just define the resource, and ask Anthos to instantiate and manage it. The Cloud Storage configuration looks like this:

    apiVersion: storage.cnrm.cloud.google.com/v1beta1
    kind: StorageBucket
    metadata:
      annotations:
        cnrm.cloud.google.com/project-id: seroter-anthos
        #configmanagement.gke.io/namespace-selector: config-supported
      name: seroter-config-bucket
    spec:
      lifecycleRule:
        - action:
            type: Delete
          condition:
            age: 7
      uniformBucketLevelAccess: true
    

    The Azure example really shows the value of this model. Instead of programmatically sequencing the necessary objects—first create a resource group, then a storage account, then a storage blob—I just need to define those three resources, and Kubernetes reconciles each resource until it succeeds. The Storage Blob resource looks like:

    apiVersion: azure.microsoft.com/v1alpha1
    kind: BlobContainer
    metadata:
      name: blobcontainer-sample
    spec:
      location: westus
      resourcegroup: resourcegroup-operators
      accountname: seroterstorageaccount
      # accessLevel - Specifies whether data in the container may be accessed publicly and the level of access.
      # Possible values include: 'Container', 'Blob', 'None'
      accesslevel: Container
    

    The image below shows my managed-service-related configs. I checked all these configurations into GitHub.

    A few seconds later, I saw that Anthos was processing the new configurations.

    Ok, it’s the moment of truth. First, I checked Cloud Storage and saw my brand new bucket, provisioned by Anthos.

    Switching over to the Azure Portal, I navigated to Storage area and saw my new account and blob container.

    How cool is that? Now I just have to drop resource definitions into my GitHub repository, and Anthos spins up the service in GCP or Azure. And if I delete that resource manually, Anthos re-creates it automatically. I don’t have to learn each API or manage code that provisions services.

    Finally, we can also deploy applications this way. Imagine using a CI pipeline to populate a Kubernetes deployment template (using kpt, or something else) and dropping it into a git repo. Then, we use the Kubernetes resource model to deploy the application container. In the gcp-apps directory, I added Kubernetes deployment and service YAML files that reference a basic app I containerized.
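
    Stripped down, that pair of configs looks something like this sketch; the app name and image are hypothetical stand-ins:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: sample-app
      namespace: gcp-apps
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: sample-app
      template:
        metadata:
          labels:
            app: sample-app
        spec:
          containers:
          - name: sample-app
            image: gcr.io/seroter-anthos/sample-app:v1   # hypothetical image
            ports:
            - containerPort: 8080
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: sample-app
      namespace: gcp-apps
    spec:
      type: LoadBalancer
      selector:
        app: sample-app
      ports:
      - port: 80
        targetPort: 8080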

    As you might expect, once the repo synced to the correct clusters, Anthos created a deployment and service that resulted in a routable endpoint. While there are tradeoffs for deploying apps this way, there are some compelling benefits.

    Step 6: “Move” App Between Clouds by Moving Configs in GitHub

    This last step is basically my way of trolling the people who complain that multi-cloud apps are hard. What if I want to take the above app from GCP and move it to Azure? Does it require a four-week consulting project and sacrificing a chicken? No. I just have to copy the Kubernetes deployment and service YAML files to the azure-apps directory.

    After committing my changes to GitHub, ACM fired up and deleted the app from GCP, and inflated it on Azure, including an Azure Load Balancer instance to get a routable endpoint. I can see all of that from within the GCP Console.

    Now, in real life, apps aren’t so easily portable. There are probably sticky connections to databases, and other services. But if you have this sort of platform in place, it’s definitely easier.

    Thanks to deep support for GitOps and the KRM, Anthos makes it possible to manage infrastructure, apps, and managed services in a consistent way, on any cloud. Whether you use Anthos or not, take a look at GitOps and the KRM and start asking your preferred vendors when they’re going to adopt this paradigm!

  • Four reasons that Google Cloud Run is better than traditional FaaS offerings

    Has the “serverless revolution stalled”? I dunno. I like serverless. Taught a popular course about it. But I reviewed and published an article written by Bernard Brode that made that argument, and it sparked a lot of discussion. If we can agree that serverless computing means building an architecture out of managed services that scale to zero—we’re not strictly talking about function-as-a-service—that’s a start. Has this serverless model crossed the chasm from early adopters to an early majority? I don’t think so. And the data shows that usage of FaaS—still a fundamental part of most people’s serverless architecture—has flattened a bit. Why is that? I’m no expert, but I wonder if some of the inherent friction of first-generation FaaS gets in the way.

    We’re seeing a new generation of serverless computing that removes that friction and may restart the serverless revolution. I’m talking here about Google Cloud Run. Based on the Knative project, it’s a fully managed service that scales container-based apps to zero. To me, it takes the best attributes from three different computing paradigms:

    • Platform-as-a-Service
      – focus on the app, not underlying infrastructure
      – auto-wire networking components to expose your endpoint
    • Container-as-a-Service
      – use portable app packages
      – develop and test locally
    • Function-as-a-Service
      – improve efficiency by scaling to zero
      – trigger action based on events

    Each of the paradigms above has standalone value. By all means, use any of them if they suit your needs. Right now, I’m interested in what it will take for large companies to adopt serverless computing more aggressively. I think it requires “fixing” some of the flaws of FaaS, and there are four reasons Cloud Run is positioned to do so.

    1. It doesn’t require rearchitecting your systems

    First-generation serverless doesn’t permit cheating. No, you have to actually refactor or rebuild your system to run this way. That’s different than all the previous paradigms. IaaS? You could take existing bare metal workloads and run them unchanged in a cloud VM platform. PaaS? It catered to 12-factor apps, but you could still run many existing things there. CaaS? You can containerize a lot of things without touching the source code. FaaS? Nope. Nothing in your data center “just works” in a FaaS platform.

    While that’s probably a good thing from a purity perspective—stop shifting your debt from one abstraction to another without paying it down!—it’s impractical. Simultaneously, we’re asking staff at large companies to: redesign teams for agile, introduce product management, put apps on CI pipelines, upgrade their programming language/framework, introduce new databases, decouple apps into microservices, learn cloud and edge models, AND keep all the existing things up and running. It’s a lot. The companies I talk to are looking for ways to get incremental benefits for many workloads, and don’t have the time or people to rebuild many things at once.

    This is where Cloud Run is better than FaaS. It hosts containers that respond to web requests or event-based triggers. You can write functions, or, containerize a complete app—Migrate for Anthos makes it easy. Your app’s entry point doesn’t have to conform to a specific method signature, and there are no annotations or code changes required to operate in Cloud Run. Take an existing custom-built app written in any language, or packaged (or no source-code-available) software and run it. You don’t have to decompose your existing API into a series of functions, or break down your web app into a dozen components. You might WANT to, but you don’t HAVE to. I think that’s powerful, and significantly lowers the barrier to entry.

    2. It runs anywhere

    Lock-in concerns are overrated. Everything is lock-in. You have to decide whether you’re getting unique value from the coupling. If so, go for it. A pristine serverless architecture consists of managed services with code (FaaS) in the gaps. The sticky part is all those managed services, not the snippets of code running in the FaaS. Just making a FaaS portable doesn’t give you all the benefits of serverless.

    That said, I don’t need all the aspects of serverless to get some of the benefits. Replacing poorly utilized virtual machines with high-density nodes hosting scale-to-zero workloads is great. Improving delivery velocity by having an auto-wired app deployment experience versus ticket-defined networking is great. I think it’s naive to believe that most folks can skip from traditional software development directly to fully serverless architectures. There’s a learning and adoption curve. And one step on the journey is defining more distributed services, and introducing managed services. Cloud Run offers a terrific best-of-both-worlds model that makes the journey less jarring. And uniquely, it’s not only available on a single cloud.

    Cloud Run is great on Google Cloud. Given the option, you should use it there. It’s fully managed and elastic, and integrates with all types of GCP-only managed services, security features, and global networking. But you won’t only use Google Cloud in your company. Or Azure. Or AWS. Or Cloudflare. Cloud Run for Anthos puts this same runtime most anywhere. Use it in your data center. Use it in your colocation or partner facility. Use it at the edge. Soon, use it on AWS or Azure. Get one developer-facing surface for apps running on a variety of hosts.

    A portable FaaS, based on open source software, is powerful. And, I believe, necessary to break into mainstream adoption within the enterprise. Bring the platform to the people!

    3. It makes the underlying container as invisible, or visible, as you want

    Cloud Run uses containers. On one hand, it’s a packaging mechanism, just like a ZIP file for AWS Lambda. On the other, it’s a way to bring apps written in any language, using any libraries, to a modern runtime. There’s no “supported languages” page on the website for Cloud Run. It’s irrelevant.

    Now, I personally don’t like dealing with containers. I want to write code, and see that code running somewhere. Building containers is an intermediary step that should involve as little effort as possible. Fortunately, tools like Cloud Code make that a reality for me. I can use Visual Studio Code to sling some code, and then have it automatically containerized during deployment. Thanks Cloud Buildpacks! If I choose to, I can use Cloud Run while being blissfully unaware that there are containers involved.

    That said, maybe I want to know about the container. My software may depend on specific app server settings, file system directories, or running processes. During live debugging, I may like knowing I can tunnel into the container and troubleshoot in sophisticated ways.

    Cloud Run lets you choose how much you want to care about the container image and running container itself. That’s a flexibility that’s appealing.

    4. It supports advanced use cases

    Cloud Run is great for lots of scenarios. Do server-side streaming with gRPC. Build or migrate web apps or APIs that take advantage of our new API Gateway. Coordinate apps in Cloud Run with other serverless compute using the new Cloud Workflows. Trigger your Cloud Run apps based on events occurring anywhere within Google Cloud. Host existing apps that need a graceful shutdown before scaling to zero. Allocate more horsepower to new or existing apps by assigning up to 4 CPUs and 4GB of RAM, and defining concurrency settings. Decide if your app should always have an idle instance (no cold starts) and how many instances it should scale up to. Route traffic to a specific port that your app listens on, even if it’s not port 80.
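
    Several of those knobs are single flags at deploy time. A sketch with hypothetical values (some of these flags were still in beta when this was written):

    # Keep one warm instance (no cold starts), cap scale-out, set per-container
    # concurrency, and route traffic to a non-default port
    gcloud run deploy my-service \
      --image=gcr.io/my-project/my-app \
      --min-instances=1 \
      --max-instances=50 \
      --concurrency=80 \
      --port=9000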

    If you use Cloud Run for Anthos (in GCP or on other infrastructure), you have access to underlying Kubernetes attributes. Create private services. Participate in the service mesh. Use secrets. Reference ConfigMaps. Turn on Workload Identity to secure access to GCP services. Even take advantage of GPUs in the cluster.

    Cloud Run isn’t for every workload, of course. It’s not for background jobs. I wouldn’t run a persistent database. It’s ideal for web-based apps, new or old, that don’t store local state.

    Give Cloud Run a look. It’s a fast-growing service, and it’s free to try out with our forever-free services on GCP. 2 million requests a month before we charge you anything! See if you agree that this is what the next generation of serverless compute should look like.

  • First look: Triggering Google Cloud Run with events generated by GCP services

    First look: Triggering Google Cloud Run with events generated by GCP services

    When you think about “events” in an event-driven architecture, what comes to mind? Maybe you think of business-oriented events like “file uploaded”, “employee hired”, “invoice sent”, “fraud detected”, or “batch job completed.” You might emit (or consume) these types of events in your application to develop more responsive systems. 

    What I find even more interesting right now are the events generated by the systems beneath our applications. Imagine what your architects, security pros, and sys admins could do if they could react to databases being provisioned, users getting deleted, firewall rules being changed, or DNS zones getting updated. This sort of thing is what truly enables the “trust, but verify” approach for empowered software teams. Let those teams run free, but “listen” to things that might be out of compliance.

    This week, the Google Cloud team announced Events for Cloud Run, in beta this September. What this capability does is let you trigger serverless containers when lifecycle events happen in most any Google Cloud service. These lifecycle events are in the CloudEvents format, and distributed (behind the scenes) to Cloud Run via Google Cloud PubSub. For reference, this capability bears some resemblance to AWS EventBridge and Azure Event Grid. In this post, I’ll give you a look at Events for Cloud Run, and show you how simple it is to use.

    Code and deploy the Cloud Run service

    Developers deploy containers to Cloud Run. Let’s not get ahead of ourselves, though; first, let’s build the app. This app is Seroter-quality, and will just do the basics: I’ll read the incoming event and log it out. It’s a simple ASP.NET Core app, with the source code in GitHub.

    I’ve got a single controller that responds to a POST command coming from the eventing system. I take that incoming event, serialize from JSON to a string, and print it out. Events for Cloud Run accepts either custom events, or CloudEvents from GCP services. If I detect a custom event, I decode the payload and print it out. Otherwise, I just log the whole CloudEvent.

    using System;
    using System.Text.Json;
    using Microsoft.AspNetCore.Mvc;
    using Microsoft.Extensions.Logging;

    namespace core_sample_api.Controllers
    {
        [ApiController]
        [Route("")]
        public class EventsController : ControllerBase
        {
            private readonly ILogger<EventsController> _logger;

            public EventsController(ILogger<EventsController> logger)
            {
                _logger = logger;
            }

            [HttpPost]
            public void Post(object receivedEvent)
            {
                Console.WriteLine("POST endpoint called");
                string s = JsonSerializer.Serialize(receivedEvent);

                // custom Pub/Sub events arrive with a "message" root property
                using (JsonDocument d = JsonDocument.Parse(s))
                {
                    JsonElement root = d.RootElement;
                    if (root.TryGetProperty("message", out JsonElement msg))
                    {
                        Console.WriteLine("Custom event detected");
                        JsonElement rawData = msg.GetProperty("data");

                        // Pub/Sub payloads are base64-encoded; decode before logging
                        string data = System.Text.Encoding.UTF8.GetString(Convert.FromBase64String(rawData.GetString()));
                        Console.WriteLine("Data value is: " + data);
                    }
                }

                // log the full payload, whether CloudEvent or custom
                Console.WriteLine("Data: " + s);
            }
        }
    }
    

    After checking all my source code into GitHub, I was ready to deploy it to Cloud Run. Note that you can use my same repo to continue on this example!

    I switched over to the GCP Console, and chose to create a new Cloud Run service. I picked a region and service name. Then I could have chosen either an existing container image, or continuous deployment from a git repo. I chose the latter. First I picked my GitHub repo to get source from.

    Then, instead of requiring a Dockerfile, I picked the new Cloud Buildpacks support. This takes my source code and generates a container for me. Sweet. 

    After choosing my code source and build process, I kept the default HTTP trigger. After a few moments, I had a running service.

    Add triggers to Cloud Run

    Next up, adding a trigger. By default, the “triggers” tab shows the single HTTP trigger I set up earlier. 

    I wanted to show custom events in addition to CloudEvents ones, so I went to the PubSub dashboard and created a new topic that would trigger Cloud Run.

    Back in the Cloud Run UX, I added a new trigger. I chose the trigger type of “com.google.cloud.pubsub.topic.publish” and picked the Topic I created earlier. After saving the trigger, I saw it show up in the list.

    After this, I wanted to trigger my Cloud Run service with CloudEvents. If you’re receiving events from Google Cloud services, you’ll have to enable Data Access audit logs so that events can be generated from Cloud Audit Logs. I’m going to listen for events from Cloud Storage and Cloud Build, so I turned on audit logging for each.

    All that was left was to define the final triggers. For Cloud Storage, I chose the storage.create.bucket trigger.

    I wanted to react to Cloud Build, so that I could see whenever a build started.

    Terrific. Now I was ready to test. I sent in a message to PubSub to trigger the custom event.
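
    Publishing a test message is a one-liner; here I assume the topic was named my-run-topic:

    # Publish a message to the topic wired to the Cloud Run trigger
    gcloud pubsub topics publish my-run-topic --message="hello from Pub/Sub"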

    I checked the logs for Cloud Run, and almost immediately saw that the service ran, accepted the event, and logged the body.

    Next, I tested Cloud Storage by adding a new bucket.

    Almost immediately, I saw a CloudEvent in the log.

    Finally, I kicked off a new Build pipeline, and saw an event indicating that Cloud Run received a message, and logged it.

    If you care about what happens inside the systems your apps depend on, take a look at the new Events for Cloud Run and start tapping into the action.

  • I’m looking forward to these 8 sessions at Google Cloud Next ’20 OnAir (Week 7)

    I’m looking forward to these 8 sessions at Google Cloud Next ’20 OnAir (Week 7)

    It’s here. After six weeks of OTHER topics, we’re up to week seven of Google Cloud Next OnAir, which is all about my area: app modernization. The “app modernization” bucket in Google Cloud covers lots of cool stuff including Cloud Code, Cloud Build, Cloud Run, GKE, Anthos, Cloud Operations, and more. It basically addresses the end-to-end pipeline of modern apps. I recently sketched it out like this:

    I think this is the biggest week of Next, with over fifty breakout sessions. I like that most of the breakouts so far have been ~20 minutes, meaning you can log in, set playback speed to 1.5x, and chomp through lots of topics quickly.

    Here are eight of the sessions I’m looking forward to most:

    1. Ship Faster, Spend Less By Going Multi-Cloud with Anthos. This is the “keynote” for the week. We’re calling out a few product announcements, highlighting some new customers, and saying keynote-y things. You’ll like it.
    2. GKE Turns 5: What’s New? All Kubernetes aren’t the same. GKE stands apart, and the team continues solving customer problems in new ways. This should be a great look back, and look ahead.
    3. Cloud Run: What’s New? To me, Cloud Run has the best characteristics of PaaS, combined with the event-driven, scale-to-zero model of serverless functions. This is the best place I know of to run custom-built apps in Google Cloud (or anywhere, with Anthos).
    4. Modernize Legacy Java Apps Using Anthos. Whoever figures out how to unlock value from existing (Java) apps faster, wins. Here’s what Google Cloud is doing to help customers improve their Java apps and run them on a great host.
    5. Running Anthos on Bare Metal and at the Edge with Major League Baseball (MLB). Baseball’s back, my Slam Diego Padres are fun again, and Anthos is part of the action. Good story here.
    6. Getting Started with Anthos, Anthos Deep Dive: Part One, and Anthos Deep Dive: Part Two. Am I cheating by making three sessions into one entry? Fine, you caught me. But this three-part series is a great way to grok Anthos and understand its value.
    7. Develop for Cloud Run in the IDE with Cloud Code. Cloud Code extends your IDE to support Google Cloud, and Cloud Run is great. Combine the two, and you’ve got some good stuff.
    8. Event-Driven Microservices with Cloud Run. You’re going to enjoy this one, and seeing what’s now possible.

    I’m looking forward to this week. We’re sharing lots of fun progress, and demonstrating some fresh perspectives on what app modernization should look like. Enjoy watching!

  • Google Cloud’s support for Java is more comprehensive than I thought

    Google Cloud’s support for Java is more comprehensive than I thought

    Earlier this year, I took a look at how Microsoft Azure supports Java/Spring developers. With my change in circumstances, I figured it was a good time to dig deep into Google Cloud’s offerings for Java developers. What I found was a very impressive array of tools, services, and integrations. More than I thought I’d find, honestly. Let’s take a tour.

    Local developer environment

    What stuff goes on your machine to make it easier to build Java apps that end up on Google Cloud?

    Cloud Code extension for IntelliJ

    Cloud Code is a good place to start. Among other things, it delivers extensions to IntelliJ and Visual Studio Code. For IntelliJ IDEA users, you get starter templates for new projects, snippets for authoring relevant YAML files, tool windows for various Google Cloud services, app deployment commands, and more. Given that 72% of Java developers use IntelliJ IDEA, this extension helps many folks.

    Cloud Code in VS Code

    The Visual Studio Code extension is pretty great too. It’s got project starters and other command palette integrations, Activity Bar entries to manage Google Cloud services, deployment tools, and more. 4% of Java devs use Visual Studio Code, so, we’re looking out for you too. If you use Eclipse, take a look at the Cloud Tools for Eclipse.

    The other major thing you want locally is the Cloud SDK. Within this little gem you get client libraries for your favorite language, CLIs, and emulators. This means that as Java developers, we get a Java client library, command line tools for all-up Google Cloud (gcloud), Big Query (bq) and Storage (gsutil), and then local emulators for cloud services like Pub/Sub, Spanner, Firestore, and more. Powerful stuff.
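
    To give you a taste of the emulators, spinning one up and pointing client libraries at it takes two commands. A sketch for Pub/Sub:

    # Run the Pub/Sub emulator locally
    gcloud beta emulators pubsub start --host-port=localhost:8085

    # In another shell, export PUBSUB_EMULATOR_HOST so client libraries find it
    $(gcloud beta emulators pubsub env-init)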

    App development

    Our machine is set up. Now we need to do real work. As you’d expect, you can use all, some, or none of these things to build your Java apps. It’s an open model.

    Java devs have lots of APIs to work with in the Google Cloud Java client libraries, whether talking to databases or consuming world-class AI/ML services. If you’re using Spring Boot—and the JetBrains survey reveals that the majority of you are—then you’ll be happy to discover Spring Cloud GCP. This set of packages makes it super straightforward to interact with terrific managed services in Google Cloud. Use Spring Data with cloud databases (including Cloud Spanner and Cloud Firestore), Spring Cloud Stream with Cloud Pub/Sub, Spring Caching with Cloud Memorystore, Spring Security with Cloud IAP, Micrometer with Cloud Monitoring, and Spring Cloud Sleuth with Cloud Trace. And you get the auto-configuration, dependency injection, and extensibility points that make Spring Boot fun to use. Google offers Spring Boot starters, samples, and more to get you going quickly. And it works great with Kotlin apps too.

    Emulators available via gcloud

    As you’re building Java apps, you might directly use the many managed services in Google Cloud Platform, or, work with the emulators mentioned above. It might make sense to work with local emulators for things like Cloud Pub/Sub or Cloud Spanner. Conversely, you may decide to spin up “real” instances of cloud services to build apps using Managed Service for Microsoft Active Directory, Secret Manager, or Cloud Data Fusion. I’m glad Java developers have so many options.

    Where are you going to store your Java source code? One choice is Cloud Source Repositories. This service offers highly available, private Git repos—use it directly or mirror source code from GitHub or Bitbucket—with a nice source browser and first-party integration with many Google Cloud compute runtimes.

    Building and packaging code

    After you’ve written some Java code, you probably want to build the project, package it up, and prepare it for deployment.

    Artifact Registry

    Store your Java packages in Artifact Registry. Create private, secure artifact storage that supports Maven and Gradle, as well as Docker and npm. It’s the eventual replacement of Container Registry, which itself is a nice Docker registry (and more).

    Looking to build container images for your Java app? You can write your own Dockerfiles. Or, skip docker build|push by using our open source Jib as a Maven or Gradle plugin that builds Docker images. Jib separates the Java app into multiple layers, making rebuilds fast. A new project is Google Cloud Buildpacks which uses the CNCF spec to package and containerize Java 8|11 apps.
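
    As an example of how little ceremony Jib requires, a single Maven invocation builds and pushes the image; the image name here is a hypothetical:

    # No Dockerfile and no local Docker daemon required
    mvn compile jib:build -Dimage=gcr.io/my-project/my-java-app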

    Odds are, your build and containerization stages don’t happen in isolation; they happen as part of a build pipeline. Cloud Build is the highly-rated managed CI/CD service that uses declarative pipeline definitions. You can run builds locally with the open source local builder, or in the cloud service. Pull source from Cloud Source Repositories, GitHub and other spots. Use Buildpacks or Jib in the pipeline. Publish to artifact registries and push code to compute environments. 
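
    A minimal cloudbuild.yaml for a Java app might look like this sketch (the image name is hypothetical):

    steps:
    # Build and test the project with the Maven builder image
    - name: 'gcr.io/cloud-builders/mvn'
      args: ['package']
    # Containerize and push with Jib, straight from the pipeline
    - name: 'gcr.io/cloud-builders/mvn'
      args: ['compile', 'jib:build', '-Dimage=gcr.io/$PROJECT_ID/my-java-app']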

    Application runtimes

    As you’d expect, Google Cloud Platform offers a variety of compute environments to run your Java apps. Choose among:

    • Compute Engine. Pick among a variety of machine types, and Windows or Linux OSes. Customize the vCPU and memory allocations, opt into auto-patching of the OS, and attach GPUs.
    • Bare Metal. Pick a physical machine to host your Java app, with sizes ranging from 16 CPU cores up to 112.
    • Google Kubernetes Engine. The first, and still the best, managed Kubernetes service. Get fully managed clusters that are auto scaled, auto patched, and auto repaired. Run stateless or stateful Java apps.
    • App Engine. One of the original PaaS offerings, App Engine lets you just deploy your Java code without worrying about any infrastructure management.
    • Cloud Functions. Run Java code in this function-as-a-service environment.
    • Cloud Run. Based on the open source Knative project, Cloud Run is a managed platform for scale-to-zero containers. You can run any web app that fits into a container, including Java apps. 
    • Google Cloud VMware Engine. If you’re hosting apps in vSphere today and want to lift-and-shift your app over, you can use a fully managed VMware environment in GCP.

    Running in production

    Regardless of the compute host you choose, you want management tools that make your Java apps better, and help you solve problems quickly. 

    You might stick an Apigee API gateway in front of your Java app to secure or monetize it. If you’re running Java apps in multiple clouds, you might choose Google Cloud Anthos for consistency purposes. Java apps running on GKE in Anthos automatically get observability, transport security, traffic management, and SLO definition with Anthos Service Mesh.

    Anthos Service Mesh

    But let’s talk about day-to-day operations of Java apps. Send Java app logs to Cloud Logging and dig into them. Analyze application health and handle alerts with Cloud Monitoring. Do production profiling of your Java apps using Cloud Profiler. Hunt for performance problems via distributed tracing with Cloud Trace. And if you need to, debug in production by analyzing the running Java code (in your IDE!) with Cloud Debugger.

    Modernizing existing apps

    You probably have lots of existing Java apps. Some are fairly new, others were written a decade ago. Google Cloud offers tooling to migrate many types of existing VM-based apps to container or cloud VM environments. There are good reasons to do it, and Java apps see real benefits.

    Migrate for Anthos takes an existing Linux or Windows VM and creates artifacts (Dockerfiles, Kubernetes YAML, etc) to run that workload in GKE. Migrate for Compute Engine moves your Java-hosting VMs into Google Compute Engine.

    All-in-all, there’s a lot to like here if you’re a Java developer. You can mix-and-match these Google Cloud services and tools to build, deploy, run, and manage Java apps.