Happy Hump Day and enjoy a handful of pieces on dev productivity, continuous deployment, and whether AI will usher in a platform shift in the way cloud computing did.
[blog] Playing With Fire – ChatGPT. Steve Blank knows disruption, and he sounds the alarm on generative AI and its unknown implications for society.
[blog] Growth of AI Through a Cloud Lens. Will AI usher in a platform shift the same way cloud did seventeen years ago? Mitchell, one of the founders of Hashicorp, thinks it might.
[blog] Google Cloud Deploy adds canary and parallel deployment support. The DevOps tooling space is fairly mature, but there’s always room for improvement. I like that our fully managed continuous deployment service now supports parallel deployments to multiple targets along with canary deployments for staged rollouts.
[article] The changing world of Java. There’s a new survey out about the state of Java, and it offers a look into popular app types, developer challenges, where apps run, and more.
I came across a lot of pieces about software development today, which is always good. Check out a handful of posts from Martin Fowler about planning software work, a look at what software development will look like with AI in the mix, and why you should learn Python.
[article] Ditch the Template: Incident Write-ups They Want to Read. If you actually write incident reports, kudos. Many don’t. But does anyone read them? This post has good advice for sharing information in a digestible way.
[blog] ContinuousFlow. Martin Fowler published a handful of posts today that look at a modern way of scheduling work. He contrasts this with Timeboxed Iterations and also talks about the value of Slack.
[blog] The Cloud Chooses You. Here’s a list of 25 things Brian has learned on his cloud journey, and most of these mirror what I’ve seen from those who are doing it right.
[blog] CloudEvents Basics. This specification is used in a handful of messaging services that route events—including Knative, Azure Event Grid, and Google Cloud Eventarc—and Mete explains the fundamentals in this post.
[article] If you want a career in AI, learn Python. If it’s just money you’re after, go learn COBOL or C++. You can make a great living with those. But if you want to move towards emerging and hot tech, Python is a clear choice.
[blog] Groovy Datasets for Test Databases. What data source are you using when building a sample app or exploring data science? This post points to a handful of fun, free datasets.
[blog] Tracing Notifications. The Slack engineering team looks at what it takes to trace requests through the many stages of their distributed system.
[blog] Tired: Mostly Cloudy. Wired: Runtime. I read a lot of different sources each day to uncover the nuggets you find here, and I’m excited to have a quality aggregation on the way to make my life easier.
It definitely feels like Spring here in the States, although honestly, it never really felt like Winter in San Diego. Not after seven years of legit Winter in Washington. But here we are, enjoying Spring and reading tech news. Check out what I came across today.
[article] Culture & Methods Trends Report March 2023. Are you, or your team, doing bleeding-edge work from a culture and methods perspective? Or are you in the late majority? Take a look at this InfoQ trends report and tell me.
[blog] Our microservice stack. It’s rarely a good idea to copy someone’s tech stack without understanding their circumstances. But, we can still study and learn from each other. Here’s a stack that includes standard components like PostgreSQL, Kafka, and Prometheus.
[docs] Create an admin cluster using Anthos On-Prem API clients. Whether it’s a dozen clusters on-premises, or hundreds of clusters at the edge, you’ll want a lifecycle management strategy for Kubernetes that scales. Our team here has done great work enabling both API/CLI and UI for creating and managing GKE clusters on-premises.
[blog] Releasing Ververica Cloud – A Fully Managed Cloud Native Service. If you can help it, don’t build apps that use services from public cloud A, others from public cloud B. It’s unnecessary complexity. But some SaaS products should be part of your architecture. Here’s a new one to consider for data processing.
[blog] Load Testing for 2022 Wrapped. There are probably a few moments where your systems expect a flood of traffic all at once. At Spotify, it happens during those “year in review” Wrapped notices that give listeners a personalized look at their listening habits. Here’s how they prepare.
[blog] Cloud CISO Perspectives: March 2023. If you want to improve your awareness of security concepts—even if you don’t live and breathe this stuff—I highly recommend this monthly post.
[youtube-video] Is AlloyDB compatible with PostgreSQL? What does it mean when our new-ish database is “PostgreSQL compatible”? This video just laid it out.
[article] 5 priorities that cut cloud costs and improve IT ops. You’ll find some actionable advice in this article. If you’re just going to lift-and-shift to cloud, you’ll likely be disappointed, so follow some of this guidance instead.
[blog] Scaling vision transformers to 22 billion parameters. Read this, and your takeaways may be the same as mine: this space is evolving quickly, compute capacity matters (and TPUs are awesome), and we’re going to see some massive improvements in classification systems.
##
Want to get this update sent to you every day? Subscribe to my RSS feed or subscribe via email below:
Light reading day today, but still read, listened to, and watched some good things. Check out a handful of AI pieces, guidance on data processing, and clarification on what SRE is all about.
[article] Debunking Myths About Reliability. How much reliability is too much? What’s more important, innovation or reliability? Kit explores those questions here.
[podcast] Jennifer Mace on How Google Does SRE. If you kinda sorta understand what Site Reliability Engineering is about, but freeze up when someone asks you to explain it, take 30 minutes and listen to this podcast. It’s the best explanation I’ve heard.
More AI content today, but it’s a day that ends in “y.” You’ll also find some fresh perspective on serverless computing, cross-platform app frameworks, and learning the right lessons from failure.
[blog] Generative AI will 5-10x developer productivity. You are COMPLETELY justified in being skeptical of generative AI given the graveyard of “next big things” that were hyped over the past decade, including the metaverse, web3, blockchain, edge computing, 5G, and so on. But this one? It’s genuinely different.
[blog] The Edge Of The Age Of AI. This post echoes some of Brian’s points from the previous link. We’re at the start of something that will change all our jobs in some way over the next three years.
[blog] Flutter in 2023: strategy and roadmap. Very transparent look by this team. See their strategy doc, and then a link to a roadmap with specifics for this popular multi-platform app framework.
[article] Don’t Learn the Wrong Lessons from Failure. This article wasn’t what I thought it would be, but I still took something away from it. The main takeaway for me was to slow down, do a thorough analysis of failures, and not rush to a root cause.
Famed psychologist Daniel Kahneman once said “Nothing in life is as important as you think it is, while you are thinking about it.” There’s always new tech, new trends, and a feeling of FOMO. Deep breath. You’re doing fine. Stay aware, but don’t get too hyped up with all the things around you. Keep learning by reading through today’s links below.
[blog] The Misuse of User Stories. Are you relying on user stories to plan and deliver products? This post advises against using them the wrong way.
[article] Cloud Management Issues Are Coming to a Head. Folks are having challenges with running all their environments. That’s not too surprising to me. I wonder how folks who have purposely minimized their options and focused on a rapid transition to cloud architectures are doing? Better, I assume.
[article] Open source Kubeflow 1.7 set to ‘transform’ MLops. I haven’t kept a super-close eye on this project, but it’s good to see momentum here. If you’re doing notebook management, model training, model serving and ALSO using Kubernetes, you might like this project.
[article] Simplify Day 2 Operations on GCP — Active Assist. Plenty of folks don’t want to create their own AI stuff. They just want the things they use to be smarter. Take advantage of what’s already built into the platforms and clouds that you use today.
[blog] Rolling in the deep. The days of “plan once a year” in technology seem to be coming to a close. Things change too quickly. The McDonald’s engineering team talks about rolling planning with specific rhythms for different purposes.
[blog] Jump start your future career with Google Cloud certifications. People seem to either love or loathe certifications. I got MCSD certified from Microsoft years ago, and it was a good way to test myself on a topic. I didn’t see it as a job-finding tool. But however you view certifications, you might want to take a look at this post about starting your cloud career by taking a cert exam.
Whew, what a day. Still managed to read and watch some good stuff you’ll find below. Learn about platform teams, AI-washing, Kubernetes, and boosting creativity in your teams.
[blog] Top Ten Things Slowing Down Your Platform Team. If you have a platform team, you’re doing it right. But have you set them up for success? This post has good questions to ask yourself.
[article] From ‘cloud washing’ to ‘AI washing’. Unfortunately, it’s going to be on you to sift through the myriad AI-related announcements out there to discern which are legit.
[article] 5 Ways to Boost Creativity on Your Team. None of these tips are rocket science, but I still appreciate reminders and guidance for keeping our teams engaged and sharing cool ideas.
[blog] Google chip design team benefits from move to Google Cloud. People have asked me why Google doesn’t run on Google Cloud. Parts of it do, but Google Cloud itself is built upon many remarkable Google services. Here’s the story of one Google team (chip design) that’s moved into Google Cloud.
I found a lot of good advice in today’s reading. There was useful perspective on how to think about the value of AI assistance, tips for asynchronous systems, how to build an effective incident program, and how to write good docs. Dig in!
[article] AI and the future of software development. Many folks are focused on AI generating code, but Matt points out that understanding code, or reviewing code, might be where we’ll find significant impact from LLMs.
[blog] Serverless take the wheel. Simple post, but good reminder that fully managed services continue us on a journey of offloading plumbing work and focusing on the good stuff.
[blog] Don’t Fail Publishing Events! When dealing with message brokers and databases, you need to think through a variety of scenarios. What happens if publishing fails? Should you write to the database FIRST? This post explores the topic.
[blog] Best and Worst Practices in Technical Documentation. Docs aren’t an afterthought. At least they shouldn’t be! This post has applicable advice for those trying to create or improve their documentation.
[blog] The difference between libraries and frameworks. Are you using these words in the right way? I do not. This post reminded me to tighten up my language and not use certain words interchangeably.
[article] Getting Developer Self-Service Right. Read this, and ask yourself if you’d fall into the high-performing or low-performing team bucket, and react accordingly.
Is a serverless architecture realistic for every system? Of course not. But it’s never been easier to build robust solutions out of a bunch of fully-managed cloud services. For instance, what if I want to take uploaded files, inspect them, and route events to app instances hosted in different regions around the world? Such a solution might require a lot of machinery to set up and manage—file store, file listeners, messaging engines, workflow system, hosting infrastructure, and CI/CD products. Yikes. How about we do that with serverless technology such as:
The heart of this system is the app that processes “loan” events. The events produced by Eventarc are in the industry-standard CloudEvents format. Do I want to parse and process those events in code manually? No, no I do not. Two things will help here. First, our excellent engineers have built client libraries for every major language that you can use to process CloudEvents for various Google Cloud services (e.g. Storage, Firestore, Pub/Sub). My colleague Mete took it a step further by creating VS Code templates for serverless event-handlers in Java, .NET, Python, and Node. We’ll use those.
To add these templates to your Visual Studio Code environment, you start with Cloud Code, our Google Cloud extension to popular IDEs. Once Cloud Code is installed, I can click the “Cloud Code” menu and then choose the “New Application” option.
Then I chose the “Custom Application” option and “Import Sample from Repo” and added a link to Mete’s repo.
Now I have the option to pick a “Cloud Storage event” code template for Cloud Functions (traditional function as a service) or Cloud Run (container-based serverless). I picked the Java template for Cloud Run.
The resulting project is a complete Java application. It references the client library mentioned above, which you can see as google-cloudevent-types in the pom.xml file. The code is fairly straightforward and the core operation accepts the inbound CloudEvent and creates a typed StorageObjectData object.
This generated project has directions and scripts to test locally, if you’re so inclined. I went ahead and deployed an instance of this app to Cloud Run using this simple command:
gcloud run deploy --source .
That gave me a running instance and a container image I could use in our next step.
Step 2: Create parallel deployment of Java app to multiple Cloud Run locations
In our fictitious scenario, we want an instance of this Java app in three different regions. Let’s imagine that the internal employees in each geography need to work with a local application.
I’d like to take advantage of a new feature of Cloud Deploy, parallel deployments. This makes it possible to deploy the same workload to a set of GKE clusters or Cloud Run environments. Powerful! To be sure, the MOST applicable way to use parallel deployments is a “high availability” scenario where you’d deploy identical instances across locations and put a global load balancer in front of them. Here, I’m using this feature as a way to put copies of an app closer to specific users.
First, I need to create “service” definitions for each Cloud Run environment in my deployment pipeline. I’m being reckless, so let’s just have “dev” and “prod.”
My “dev” service definition looks like this. The “image” name can be anything, as I’ll replace this placeholder in realtime when I deploy the pipeline.
The “production” YAML service is identical except for a different service name.
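As a rough sketch, a Cloud Run service definition along these lines might look like the following. The service name `loan-app-dev` and the placeholder image name `app-image` are illustrative assumptions, not necessarily the values used in this walkthrough.

```yaml
# Hypothetical run-dev.yaml: a Knative-style Cloud Run service definition.
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: loan-app-dev          # assumed name; the prod file would differ only here
spec:
  template:
    spec:
      containers:
        - image: app-image    # placeholder, swapped for a real image when a release is created
```

Keeping the image as a placeholder means the same definition can be reused for every release, with the real image reference supplied at release time.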
Next, I need a Skaffold file that identifies the environments for my pipeline, and points to the respective YAML files that represent each environment.
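A minimal sketch of such a Skaffold file is below. It assumes two profiles named `dev` and `prod` and service files named `run-dev.yaml` and `run-prod.yaml`. Those names are hypothetical, and the right `apiVersion` depends on the Skaffold version in use.

```yaml
# Hypothetical skaffold.yaml for a Cloud Run delivery pipeline.
apiVersion: skaffold/v4beta7   # assumption: adjust to the schema version your Skaffold supports
kind: Config
metadata:
  name: loan-app               # assumed config name
profiles:
  - name: dev                  # profile referenced by the "dev" pipeline stage
    manifests:
      rawYaml:
        - run-dev.yaml
  - name: prod                 # profile referenced by the "prod" pipeline stage
    manifests:
      rawYaml:
        - run-prod.yaml
deploy:
  cloudrun: {}                 # deploy the rendered manifests to Cloud Run
```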
The final artifact I need is a DeliveryPipeline definition. It calls out two stages (dev and prod), and for production that points to a multiTarget that refers to three Cloud Run targets.
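To make the shape of that definition concrete, here is a hedged sketch of a two-stage pipeline whose production stage fans out through a multi-target. The pipeline name, the target names, `PROJECT_ID`, and the three regions are all assumptions for illustration.

```yaml
# Hypothetical clouddeploy.yaml: pipeline plus targets.
apiVersion: deploy.cloud.google.com/v1
kind: DeliveryPipeline
metadata:
  name: loan-app-pipeline            # assumed pipeline name
description: Two stages, with a parallel rollout for production
serialPipeline:
  stages:
    - targetId: dev
      profiles: [dev]
    - targetId: prod-multi           # points at the multi-target below
      profiles: [prod]
---
apiVersion: deploy.cloud.google.com/v1
kind: Target
metadata:
  name: dev
run:
  location: projects/PROJECT_ID/locations/us-central1
---
apiVersion: deploy.cloud.google.com/v1
kind: Target
metadata:
  name: prod-multi
multiTarget:
  targetIds: [prod-east, prod-west, prod-north]   # child targets deployed in parallel
---
apiVersion: deploy.cloud.google.com/v1
kind: Target
metadata:
  name: prod-east
run:
  location: projects/PROJECT_ID/locations/us-east1
---
apiVersion: deploy.cloud.google.com/v1
kind: Target
metadata:
  name: prod-west
run:
  location: projects/PROJECT_ID/locations/us-west1
---
apiVersion: deploy.cloud.google.com/v1
kind: Target
metadata:
  name: prod-north
run:
  location: projects/PROJECT_ID/locations/northamerica-northeast1
```

The pipeline stage only knows about `prod-multi`. Adding or removing a region is then a change to the `targetIds` list and its child targets, not to the pipeline itself.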
In the Google Cloud Console, I can see my deployed pipeline with two stages and multiple destinations for production.
Now it’s time to create a release for this deployment and see everything provisioned.
The command to create a release might be included in your CI build process (whether that’s Cloud Build, GitHub Actions, or something else), or you can run the command manually. I’ll do that for this example. I named the release, gave it the name of the above pipeline, and swapped the placeholder image name in my service YAML files with a reference to the container image generated by the previously-deployed Cloud Run instance.
After a few moments, I see a deployment to “dev” rolling out.
When that completed, I “promoted” the release to production and saw a simultaneous deployment to three different cloud regions.
Sweet. Once this is done, I check and see four total Cloud Run instances (one for dev, three for prod) created. I like the simplicity here for shipping the same app instance to any cloud region. For GKE clusters, this also works with Anthos environments, meaning you could deploy to edge, on-prem or other clouds as part of a parallel deploy.
We’re done with this step. I have an event-receiving app deployed around North America.
Step 3: Set up Cloud Storage bucket
This part is simple. I use the Cloud Console to create a new object storage bucket named seroter-loan-applications. We’ll assume that an application drops files into this bucket.
Step 4: Write Cloud Workflow that routes events to correct Cloud Run instance
There are MANY ways one could choose to architect this solution. Maybe you upload files to a specific bucket and route directly to the target Cloud Run instance using a trigger. Or you route all bucket uploads to a Cloud Function and decide there where you’ll send it next. Plus dozens of other options. I’m going to use a Cloud Workflow that receives an event, and figures out where to send it next.
A Cloud Workflow is described with a declarative definition written in YAML or JSON. It’s got a standard library of functions, supports control flow, and has adapters to lots of different cloud services. This Workflow needs to parse an incoming CloudEvent and route to one of our three (secured) Cloud Run endpoints. I do a very simple switch statement that looks at the file name of the uploaded file, and routes it accordingly. This is a terrible idea in real life, but go with me here.
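A simplified sketch of a routing workflow in that spirit follows. It assumes the file name contains a region hint such as "east" or "west", and it uses placeholder service URLs. The step names, the matching rules, and the choice to forward only the event data are assumptions, and a receiver that expects a full CloudEvent would also need the CloudEvents headers passed along.

```yaml
# Hypothetical workflow: route a Cloud Storage event by file name.
main:
  params: [event]                       # the CloudEvent delivered by Eventarc
  steps:
    - read_file_name:
        assign:
          - fileName: ${event.data.name}
    - pick_region:
        switch:
          - condition: ${text.match_regex(fileName, "east")}
            next: send_east
          - condition: ${text.match_regex(fileName, "west")}
            next: send_west
        next: send_north                # default when no condition matches
    - send_east:
        call: http.post
        args:
          url: https://EAST_SERVICE_URL/   # placeholder Cloud Run URL
          auth:
            type: OIDC                     # call the secured service with an identity token
          body: ${event.data}
        result: response
        next: finish
    - send_west:
        call: http.post
        args:
          url: https://WEST_SERVICE_URL/   # placeholder Cloud Run URL
          auth:
            type: OIDC
          body: ${event.data}
        result: response
        next: finish
    - send_north:
        call: http.post
        args:
          url: https://NORTH_SERVICE_URL/  # placeholder Cloud Run URL
          auth:
            type: OIDC
          body: ${event.data}
        result: response
    - finish:
        return: ${response.code}           # HTTP status from whichever service was called
```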
This YAML results in a workflow that looks like this:
Step 5: Configure Eventarc trigger to kick off a Cloud Workflow
Our last step is to wire up the “file upload” event to this workflow. For that, we use Eventarc. Eventarc handles the machinery for listening to events and routing them. See here that I chose Cloud Storage as my event source (there are dozens and dozens), and then the event I want to listen to. Next I selected my source bucket, and chose a destination. This could be Cloud Run, Cloud Functions, GKE, or Workflows. I chose Workflows and then my specific Workflow that should kick off.
All good. Now I have everything wired up and can see this serverless solution in action.
Step 6: Test and enjoy
Testing this solution is straightforward. I dropped three “loan application” files into the bucket, each named with a different target region.
Sure enough, three Workflows kick off and complete successfully. Clicking into one of them shows the Workflow’s input and output.
Looking at the Cloud Run logs, I see that each instance received an event corresponding to its location.
Wrap Up
No part of this solution required me to stand up hardware, worry about operating systems, or configure networking. Except for storage costs for my bucket objects, there’s no cost to this solution when it’s not running. That’s amazing. As you look to build more event-driven systems, consider stitching together some fully managed services that let you focus on what matters most.
On this Friday, I read a whole stash of interesting things. There’s thought-provoking content on “build versus buy”, analyzing logs in security scenarios, and helping robots navigate your living room.
[blog] Visual language maps for robot navigation. If reading this changes what you plan to do at work on Monday, I want to be friends with you. For most of us, it’s just neat research to read about.
[blog] Buy vs Build… Over Time. Good post that emphasizes opportunity cost and the context of the current situation when deciding when to build or buy a solution.
[blog] Nobody cares, train harder. Edgy post, but it resonates with me. Everyone’s got challenges and rarely is everything handed to you. If you want more, push through.
[article] What’s the Difference between Flutter and React Native? Virtually every year, I threaten to learn a frontend web framework or mobile framework, and every year I don’t. But, I still like to pay attention to what’s out there. This is a good breakdown of two popular options.
[blog] Improving Istio Propagation Delay. This is a good example of (a) why good infrastructure monitoring matters and (b) the value of open source software that you can explore and change if needed. The Airbnb engineering team walks through their experiences here.