It’s the end of a solid workweek. Next week, I’ll be at Google I/O delivering the cloud keynote, and then jetting off to Vegas to talk to analysts at the Gartner event going on there. Those events have wildly different audiences, and I deeply enjoy both.
[blog] Kubernetes 1.30 is now available in GKE in record time. Staying up to date with your Kubernetes cluster isn’t just about features; it’s about security and stability updates. Nobody keeps you up to date like our GKE team.
[blog] What’s new with Active Assist: New Hub UI and four new recommendations. Even if the AI hype train has taken focus off cost and security optimization, those things still matter a ton. I like the new security and audit-related “recommendations” we’re offering Cloud customers now.
Each day, I just read whatever pops into my feeds and newsletters. I’m not looking for a theme, but sometimes one pops out at me. Today? It seemed like a lot of content focused on optimization and doing things the right way. For example, check out items below about improving dev experience, efficient hosting of streaming platforms, doing CI well, controlling ops metrics volume, and scaling Kubernetes.
[paper] Capabilities of Gemini Models in Medicine. There are 30+ pages of description and data in this new paper, and it may inspire you for use cases outside of medicine.
[article] How is Flutter Platform-Agnostic? This framework renders interfaces across desktop, web, and mobile. How does it do that? Good deep dive here.
[blog] Optimizing CI in Google Cloud Build. Darren wrote a fantastic post that’s helpful whether you’re using the Google Cloud services he mentions, or not.
[blog] A tour of Gemini 1.5 Pro samples. Speaking of useful things offered on dev websites, Mete looks at code samples here. He looks at how to use Gemini to process audio, video, and multiple modalities at the same time.
[blog] The biggest effect on code quality. This author says it’s not about tools or skills; code quality is impacted by teams working in “crunch mode” and making mistakes.
[blog] alpine, distroless or scratch? A small, minimum-dependency base image for your containers is a smart bet. This post looks at 4 options, and the implications of each.
[article] What software developers hate. Short post, but Matt looks at three things—scope creep, pace of learning, lack of time to code—that drive developers bonkers.
[article] Platform Engineering for a Mainframe: Design Thinking Drives Change. You can apply modern practices (with varying levels of impact) to nearly any technology stack. I like efforts like this which try to help users of “legacy” systems get a better dev experience.
[blog] Does Your Marketing Pass the Duck Test? This is a good read, especially for folks who crave plain-English descriptions that actually tell you something.
##
Want to get this update sent to you every day? Subscribe to my RSS feed or subscribe via email below:
It was a beautiful weekend down here in San Diego. Summer feels close. For those readers in the Northern Hemisphere, hopefully you’re feeling the same!
[blog] A Useful Productivity Measure? What did this VP of Engineering do after being asked to define a productivity metric for his team? I like where he ended up.
[blog] How LLMs Work, Explained Without Math. Unless you live and breathe LLMs, you’ll probably have one or two lightbulb moments when reading this. I sure did.
[article] MongoDB takes data streaming service GA. Kafka integrations today, apparently more coming in the future. Check this out if you use MongoDB yourself.
[article] Friday Forward – Chasing Butterflies. Good post from Bob. Sometimes feelings are just feelings, and not indicative of something that needs therapy or medication.
[blog] When to use Gemini or purpose-built AI models in BigQuery. This argument applies to almost everything in tech: use the general purpose thing or the purpose-built thing? Shane looks at how BigQuery handles some specific data prep and analysis tasks with AI models.
I spent most of today allowing the “unread” number in my inbox to grow as I attended meetings that required my full attention. For an “inbox zero” person, that’s painful. But fortunately the last half hour of the day included some productive catch-up. Enjoy your weekend!
[blog] Private networking patterns to Vertex AI workloads. Your best option for performant, innovative, and complete AI stacks is in the cloud. But what are the right patterns for securely interfacing with those services? This post is about networking.
[article] How Slack automates deploys. Good post that looks at some recent work Slack did to improve dev experience and how they do release management.
[blog] Simulate a zone failure in GKE regional clusters. You know what shouldn’t be unknown? How your app responds when it loses a cloud zone. This is a good guide for how to simulate that scenario.
It was a wild, frantic day, and now I’m sitting in the airport waiting to fly home. My days are never boring, which is a gift! A little more boring might be nice.
[blog] Thin Events: The lean muscle of event-driven architecture. Gosh, I remember writing about this topic fifteen years ago. Even today, it’s important to make conscious choices about how you shape your messages—”thick” messages that contain all the data, or “thin” messages that are only notifications.
[article] Atlassian launches Rovo, its new AI teammate. While I’m not smart enough to understand why every vendor needs to create their own foundation model, I do get why everyone is offering agent-builder experiences.
[blog] C# and Vertex AI Gemini streaming API bug and workaround. We know that our interactions with APIs are sometimes affected by choices within our programming language itself. This issue within C# causes a problem when calling the Gemini API, but a fix is on the way.
[blog] The Backend for Frontend Pattern. If you’re doing single-page apps, or just looking for more ideas around authentication in distributed systems, check out this post.
[blog] Evolving the Go Standard Library with math/rand/v2. Developers often use random number generators in their code. This is a big, interesting post on the new generator in Go, and how we introduced a breaking change.
[blog] Managing Cloud Storage soft delete at scale. Everyone makes mistakes, and it’s good to be able to recover “deleted” files if needed. I was impressed reading this post and seeing the breadth we considered (metrics, Terraform, APIs) to manage this new type of object.
[article] WebAssembly Adoption: It’s Complicated, Says CNCF Survey. I can’t say I’ve ever had Wasm come up in a customer conversation during the past four years. Maybe it’s who I talk to. Or it’s just not that relevant to most developers.
I had a very enlightening day of leadership meetings and am very bullish on the things Google Cloud is working on for customers. Energy level is high around here! I started my day with lots of reading, which you’ll find below.
[blog] How Konfig provides an enterprise platform with GitLab and Google Cloud. I’d suspect that if you’re reading my post, you probably have a source control system in place. But if not, or if you need an upgrade, I like what Real Kinetic is doing to make it easier to set up an enterprise-grade deployment.
[article] AI still has a ways to go in code refactoring. Readability and maintainability matter as much as (more than?) coding speed, and Matt points out the need to supervise what AI is generating.
[blog] Supercharged Developer Portals. Good for Spotify for commercializing their open source tech and making developer portals easier to set up and use.
[article] AI, Your Task: Create Autonomous Agents. Vik (with help from AI) wrote this piece about “foundation agents” that learn and adapt to their environments.
[article] Who Takes a Risk on New Technology? That new technology won’t take off if there aren’t people willing to make personal bets on it. This article starts with a story about directors in Hollywood, and connects it to technology adoption.
Travel day as I jetted up to Sunnyvale for a week of meetings. The year is 1/3 over after tomorrow, so hopefully you’re tracking towards some of your 2024 goals!
[article] DevEx Success: How Pfizer Scaled to 1,000 Engineers. Building a great developer experience for an individual team is wonderful, and not trivial. Scaling that to hundreds or thousands? Entirely different problem. This article has good advice.
What exactly is a “generative AI app”? Do you think of chatbots, image creation tools, or music makers? What about document analysis services, text summarization capabilities, or widgets that “fix” your writing? These all seem to apply in one way or another! I see a lot written about tools and techniques for training, fine-tuning, and serving models, but what about us app builders? How do we actually build generative AI apps without obsessing over the models? Here’s what I’d consider using in 2024. And note that there’s much more to cover besides just building—think designing, testing, deploying, operating—but I’m just focusing on the builder tools today.
Find a sandbox for experimenting with prompts
A successful generative AI app depends on a useful model, good data, and quality prompts. Before going too deep on the app itself, it’s good to have a sandbox to play in.
You can definitely start with chat tools like Gemini and ChatGPT. That’s not a bad way to get your hands dirty. There’s also a set of developer-centric surfaces such as Google Colab or Google AI Studio. Once you sign in with a Google ID, you get free access to environments to experiment.
Let’s look at Google AI Studio. Once you’re in this UI, you have the ability to simulate a back-and-forth chat, create freeform prompts that include uploaded media, or even structured prompts for more complex interactions.
If you find yourself staring at an empty console wondering what to try, check out this prompt gallery that shows off a lot of unique scenarios.
Once you’re doing more “serious” work, you might upgrade to a proper cloud service that offers a sandbox along with SLAs and prompt lifecycle capabilities. Google Cloud Vertex AI is one example. Here, I created a named prompt.
With my language prompts, I can also jump into a nice “compare” experience where I can try out variations of my prompt and see if the results are graded as better or worse. I can even set one as “ground truth” used as a baseline for all comparisons.
Whatever sandbox tools you use, make sure they help you iterate quickly, while also matching the enterprise-y needs of the use case or company you work for.
Consume native APIs when working with specific models or platforms
At this point, you might be ready to start building your generative AI app. There seems to be a new, interesting foundation model up on Hugging Face every couple of days. You might have a lot of affection for a specific model family, or not. If you care about the model, you might choose the APIs for that specific model or provider.
For example, let’s say you were making good choices and anchored your app to the Gemini model. I’d go straight to the Vertex AI SDK for Python, Node, Java, or Go. I might even jump to the raw REST API and build my app with that.
If I were baking a chat-like API call into my Node.js app, the quickest way to get the code I need is to go into Vertex AI, create a sample prompt, and click the “get code” button.
I took that code, ran it in a Cloud Shell instance, and it worked perfectly. I could easily tweak it for my specific needs from here. Drop this code into a serverless function, Kubernetes pod, or VM and you’ve got a working generative AI app.
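The post doesn’t reproduce that generated snippet, so here’s a rough sketch of the shape it takes, assuming the `@google-cloud/vertexai` Node.js package and application-default credentials. The project ID, region, and model name below are placeholders, not values from the original post.

```javascript
// Sketch of a chat-style Gemini call from Node.js via the Vertex AI SDK.
// Project, location, and model name are placeholder assumptions.

// Build the request payload separately so it's easy to inspect and tweak.
function buildRequest(prompt) {
  return {contents: [{role: 'user', parts: [{text: prompt}]}]};
}

async function askGemini(prompt) {
  // The SDK is loaded lazily here so the payload helper stays dependency-free.
  const {VertexAI} = require('@google-cloud/vertexai');
  const vertexAI = new VertexAI({project: 'my-project', location: 'us-central1'});
  const model = vertexAI.getGenerativeModel({model: 'gemini-1.5-pro'});
  const result = await model.generateContent(buildRequest(prompt));
  return result.response.candidates[0].content.parts[0].text;
}

// Example: askGemini('Tell me a joke').then(console.log);
```

From there, the same function drops cleanly into a serverless handler or any Node.js service.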
If you have a specific model preference, you might choose to use the API for Gemini, Llama, Mistral, or whatever. And you might choose to directly interact with database or function APIs to augment the input to those models. That’s cool, and is the right choice for many scenarios.
Use meta-frameworks for consistent experiences across models and providers
As expected, the AI builder space is now full of higher-order frameworks that help developers incorporate generative AI into their apps. These frameworks help you call LLMs, work with embeddings and vector databases, and even support actions like function calling.
LangChain is a big one. You don’t need to be bothered with many model details, and you can chain together tasks to get results. It’s for Python devs, so your choice is either to use Python or to embrace one of the many offshoots. There’s LangChain4J for Java devs, LangChain Go for Go devs, and LangChain.js for JavaScript devs.
You have other choices if LangChain-style frameworks aren’t your jam. There’s Spring AI, which has a fairly straightforward set of objects and methods for interacting with models. I tried it out for interacting with the Gemini model, and almost found it easier to use than our native API! It takes one update to my POM file:
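The POM snippet itself isn’t shown in the post; the dependency below is a hedged sketch based on the Spring AI milestone starters, and the exact artifact coordinates and version may differ in your Spring AI release.

```xml
<!-- Hypothetical sketch: Spring AI starter for Vertex AI Gemini.
     Coordinates vary by Spring AI release; check the Spring AI docs. -->
<dependency>
  <groupId>org.springframework.ai</groupId>
  <artifactId>spring-ai-vertex-ai-gemini-spring-boot-starter</artifactId>
  <version>${spring-ai.version}</version>
</dependency>
```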
And then an autowired chat object that I call from anywhere, like in this REST endpoint.
// Imports added for completeness; the VertexAiGeminiChatClient package path
// may vary by Spring AI version.
import org.springframework.ai.vertexai.gemini.VertexAiGeminiChatClient;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@SpringBootApplication
public class DemoApplication {

    public static void main(String[] args) {
        SpringApplication.run(DemoApplication.class, args);
    }

    private final VertexAiGeminiChatClient chatClient;

    @Autowired
    public DemoApplication(VertexAiGeminiChatClient chatClient) {
        this.chatClient = chatClient;
    }

    @GetMapping("/")
    public String getGeneratedText() {
        return chatClient.call("Tell me a joke");
    }
}
Super easy. There are other frameworks too. Use something like AI.JSX for building JavaScript apps and components. BotSharp is a framework for .NET devs building conversational apps with LLMs. Hugging Face has frameworks that help you abstract the LLM, including Transformers.js and agents.js.
There’s no shortage of these types of frameworks. If you’re iterating through LLMs and want consistent code regardless of which model you use, these are good choices.
Create with low-code tools when available
If I had an idea for a generative AI app, I’d want to figure out how much I actually had to build myself. There are a LOT of tools for building entire apps, components, or widgets, and many require very little coding.
Everyone’s in this game. Zapier has some cool integration flows. Gradio lets you expose models and APIs as web pages. Langflow got snapped up by DataStax, but still offers a way to create AI apps without much required coding. Flowise offers some nice tooling for orchestration or AI agents. Microsoft’s Power Platform is useful for low-code AI app builders. AWS is in the game now with Amazon Bedrock Agents. ServiceNow is baking generative AI into their builder tools, Salesforce is doing their thing, and basically every traditional low-code app vendor is playing along. See OutSystems, Mendix, and everyone else.
As you would imagine, Google does a fair bit here as well. The Vertex AI Agent Builder offers four different app types that you basically build through point-and-click. These include personalized search engines, chat apps, recommendation engines, and connected agents.
Search apps can tap into a variety of data sources including crawled websites, data warehouses, relational databases, and more.
What’s fairly new is the “agent app” so let’s try building one of those. Specifically, let’s say I run a baseball clinic (sigh, someday) and help people tune their swing in our batting cages. I might want a chat experience for those looking for help with swing mechanics, and then also offer the ability to book time in the batting cage. I need data, but also interactivity.
Before building the AI app, I need a Cloud Function that returns available times for the batting cage.
This Node.js function returns an array of book-able timeslots. I’ve hard-coded the data, but you get the idea.
I also jumped into the Google Cloud IAM interface to ensure that the Dialogflow service account (which the AI agent operates as) has permission to invoke the serverless function.
Let’s build the agent. Back in the Vertex AI Agent Builder interface, I choose “new app” and pick “agent.”
Now I’m dropped into the agent builder interface. On the left, I have navigation for agents, tools, test cases, and more. In the next column, I set the goal of the agent, the instructions, and any tools I want to use with the agent. On the right, I preview my agent.
I set a goal of “Answer questions about baseball and let people book time in the batting cage” and then get to the instructions. There’s a “sample” set of instructions that are useful for getting started. I used those, but removed references to other agents or tools, as we don’t have that yet.
But now I want to add a tool, as I need a way to show available booking times if the user asks. I have a choice of adding a data store—this is useful if you want to source Q&A from a BigQuery table, crawl a website, or get data from an API. I clicked the “manage all tools” button and chose to add a new tool. Here I give the tool a name, and very importantly, a description. This description is used by the AI agent to figure out when to invoke it.
Because I chose OpenAPI as the tool type, I need to provide an OpenAPI spec for my Cloud Function. There’s a sample provided, and I used that to put together my spec. Note that the URL is the function’s base URL, and the path contains the specific function name.
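The spec itself isn’t reproduced in the post; the YAML below is a minimal sketch of what such a spec could look like. The server URL, path, and response schema are placeholder assumptions, with the path segment matching a hypothetical deployed function name.

```yaml
# Hypothetical sketch of the tool's OpenAPI spec; URL, path, and schema
# are placeholders. The path segment is the function's name.
openapi: 3.0.0
info:
  title: Batting Cage Availability
  version: 1.0.0
servers:
  - url: https://us-central1-my-project.cloudfunctions.net
paths:
  /getCageTimes:
    get:
      summary: Returns open batting cage timeslots
      operationId: getCageTimes
      responses:
        "200":
          description: A list of bookable timeslots
          content:
            application/json:
              schema:
                type: object
                properties:
                  slots:
                    type: array
                    items:
                      type: object
                      properties:
                        date:
                          type: string
                        start:
                          type: string
                        end:
                          type: string
```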
Finally, in this “tool setup” I define the authentication to that API. I chose “service agent token” and because I’m calling a specific instance of a service (versus the platform APIs), I picked “ID token.”
After saving the tool, I go back to the agent definition and update the instructions to invoke the tool. I used the syntax, and appreciated the auto-completion help.
Let’s see if it works. I went to the right-hand preview pane and asked it a generic baseball question. Good. Then I asked it for open times in the batting cage. Look at that! It didn’t just return a blob of JSON; it parsed the result and worded it well.
Very cool. There are some quirks with this tool, but it’s early, and I like where it’s going. This was MUCH simpler than me building a RAG-style or function-calling solution by hand.
Summary
The AI assistance and model building products get a lot of attention, but some of the most interesting work is happening in the tools for AI app builders. Whether you’re experimenting with prompts, coding up a solution, or assembling an app out of pre-built components, it’s a fun time to be a developer. What products, tools, or frameworks did I miss from my assessment?