Skip to content

Richard Seroter's Architecture Musings

About
Daily Reading List
Contact Me

Author: Richard Seroter

Three Ways to Run Apache Kafka in the Public Cloud
Yes, people are doing things besides generative AI. You’ve still got other problems to solve, systems to connect, and data to analyze. Apache Kafka remains a very popular product for event and data processing, and I was thinking about how someone might use it in the cloud right now. I think there are three major options, and one of them (built-in managed service) is now offered by Google Cloud. So we’ll take that for a spin.

Option 1: Run it yourself on (managed) infrastructure

Many companies choose to run Apache Kafka themselves on bare metal, virtual machines, or Kubernetes clusters. It’s easy to find stories about companies like Netflix, Pinterest, and Cloudflare running their own Apache Kafka instances. Same goes for big (and small) enterprises that choose to setup and operate dedicated Apache Kafka environments.

Why do this? It’s the usual reasons why people decide to manage their own infrastructure! Kafka has a lot of configurability, and experienced folks may like the flexibility and cost profile of running Apache Kafka themselves. Pick your infrastructure, tune every setting, and upgrade on your timetable. On the downside, self-managed Apache Kafka can result in a higher total cost of ownership, requires specialized skills in-house, and could distract you from other high-priority work.

If you want to go that route, I see a few choices.
- Download the components and install them. Grab the latest release and throw it onto a set of appropriate virtual machine instances or bare metal machines. You might use Terraform or something similar to template out the necessary activities.
- Use a pre-packaged virtual machine image. Providers like Bitnami (part of VMware, part of Broadcom) offer a catalog of packaged and supported images that contain popular software packages, including Apache Kafka. These can be deployed directly from your cloud provider as well, as I show here with Google Cloud.
- Deploy to Kubernetes. Nowadays, it’s reasonable to deploy rich, stateful workloads to a Kubernetes cluster. You might use a Helm chart from someone like Bitnami. Here’s great documentation for deploying a highly available Apache Kafka cluster to GKE using Terraform. I also like the Kubernetes operator pattern and Strimzi makes this fairly easy. Check out this documentation for using Strimzi and operators to create Apache Kafka clusters in GKE.
There’s no shame in going this route! It’s actually very useful to know how to run software like Apache Kafka yourself, even if you decide to switch to a managed service later.

Option 2: Use a built-in managed service

You might want Apache Kafka, but not want to run Apache Kafka. I’m with you. Many folks, including those at big web companies and classic enterprises, depend on managed services instead of running the software themselves.

Why do this? You’d sign up for this option when you want the API, but not the ops. It may be more elastic and cost-effective than self-managed hosting. Or, it might cost more from a licensing perspective, but provide more flexibility on total cost of ownership. On the downside, you might not have full access to every raw configuration option, and may pay for features or vendor-dictated architecture choices you wouldn’t have made yourself.

AWS offers an Amazon Managed Streaming for Apache Kafka product. Microsoft doesn’t offer a managed Kafka product, but does provide a subset of the Apache Kafka API in front of their Azure Event Hubs product. Oracle cloud offers self-managed infrastructure with a provisioning assist, but also appears to have a compatible interface on their Streaming service.

Google Cloud didn’t offer any native service until just a couple of months ago. The Apache Kafka for BigQuery product is now in preview and looks pretty interesting. It’s available in a global set of regions, and provides a fully-managed set of brokers that run in a VPC within a tenant project. Let’s try it out.

Set up prerequisites

First, I needed to enable the API within Google Cloud. This gave me the ability to use the service. Note that this is NOT FREE while in preview, so recognize that you’ll incur changes.

Next, I wanted a dedicated service account for accessing the Kafka service from client applications. The service supports OAuth and SASL_PLAIN with service account keys. The latter is appropriate for testing, so I chose that.

I created a new service account named seroter-bq-kafka and gave it the roles/managedkafka.client role. I also created a JSON private key and saved it to my local machine.

That’s it. Now I was ready to get going with the cluster.

Provision the cluster and topic

I went into the Apache Kafka for BigQuery dashboard in the Google Cloud console—I could have also used the CLI which has the full set of control plane commands—to spin up a new cluster. I get very few choices, and that’s not a bad thing. You give the CPU and RAM capacity for the cluster, and Google Cloud creates the right shape for the brokers, and creates a highly available architecture. You’ll also see that I choose the VPC for the cluster, but that’s about it. Pretty nice!

In about twenty minutes, my cluster was ready. Using the console or CLI, I could see the details of my cluster.

Topics are a core part of Apache Kafka represent the resource you publish and subscribe to. I could create a topic via the UI or CLI. I created a topic called “topic1”.

Build the producer and consumer apps

I wanted two client apps. One to publish new messages to Apache Kafka, and another to consume messages. I chose Node.js and JavaScript as the language for the app. There are a handful of libraries for interacting with Apache Kafka, and I chose the mature kafkajs.

Let’s start with the consuming app. I need (a) the cluster’s bootstrap server URL and (b) the encoded client credentials. We access the cluster through the bootstrap URL and it’s accessible via the CLI or the cluster details (see above). The client credentials for SASL_PLAIN authentication consists of the base64 encoded service account key JSON file.

My index.js file defines a Kafka object with the client ID (which identifies our consumer), the bootstrap server URL, and SASL credentials. Then I define a consumer with a consumer group ID and subscribe to the “topic1” we created earlier. I process and log each message before appending to an array variable. There’s an HTTP GET endpoint that returns the array. See the whole index.js below, and the GitHub repo here.
```
const express = require('express');
const { Kafka, logLevel } = require('kafkajs');
const app = express();
const port = 8080;

const kafka = new Kafka({
  clientId: 'seroter-consumer',
  brokers: ['bootstrap.seroter-kafka.us-west1.managedkafka.seroter-project-base.cloud.goog:9092'],
  ssl: {
    rejectUnauthorized: false
  },
  logLevel: logLevel.DEBUG,
  sasl: {
    mechanism: 'plain', // scram-sha-256 or scram-sha-512
    username: 'seroter-bq-kafka@seroter-project-base.iam.gserviceaccount.com',
    password: 'tybgIC ... pp4Fg=='
  },
});

const consumer = kafka.consumer({ groupId: 'message-retrieval-group' });

//create variable that holds an array of "messages" that are strings
let messages = [];

async function run() {
  await consumer.connect();
  //provide topic name when subscribing
  await consumer.subscribe({ topic: 'topic1', fromBeginning: true }); 

  await consumer.run({
    eachMessage: async ({ topic, partition, message }) => {
      console.log(`################# Received message: ${message.value.toString()} from topic: ${topic}`);
      //add message to local array
      messages.push(message.value.toString());
    },
  });
}

app.get('/consume', (req, res) => {
    //return the array of messages consumed thus far
    res.send(messages);
});

run().catch(console.error);

app.listen(port, () => {
  console.log(`App listening at http://localhost:${port}`);
});
```
Now we switch gears and go through the producer app that publishes to Apache Kafka.

This app starts off almost identically to the consumer app. There’s a Kafka object with a client ID (different for the producer) and the same pointer to the bootstrap server URL and credentials. I’ve got an HTTP GET endpoint that takes the querystring parameters and publishes the key and value content to the request payload. The code is below, and the GitHub repo is here.
```
const express = require('express');
const { Kafka, logLevel } = require('kafkajs');
const app = express();
const port = 8080; // Use a different port than the consumer app

const kafka = new Kafka({
    clientId: 'seroter-publisher',
    brokers: ['bootstrap.seroter-kafka.us-west1.managedkafka.seroter-project-base.cloud.goog:9092'],
    ssl: {
      rejectUnauthorized: false
    },
    logLevel: logLevel.DEBUG,
    sasl: {
      mechanism: 'plain', // scram-sha-256 or scram-sha-512
      username: 'seroter-bq-kafka@seroter-project-base.iam.gserviceaccount.com',
      password: 'tybgIC ... pp4Fg=='
    },
  });

const producer = kafka.producer();

app.get('/publish', async (req, res) => {
  try {
    await producer.connect();

    const _key = req.query.key; // Extract key from querystring
    console.log('key is ' + _key);
    const _value = req.query.value // Extract value from querystring
    console.log('value is ' + _value);

    const message = {
      key: _key, // Optional key for partitioning
      value: _value
    };

    await producer.send({
      topic: 'topic1', // Replace with your topic name
      messages: [message]
    });

    res.status(200).json({ message: 'Message sent successfully' });

  } catch (error) {
    console.error('Error sending message:', error);
    res.status(500).json({ error: 'Failed to send message' });
  }
});

app.listen(port, () => {
  console.log(`Producer listening at http://localhost:${port}`);
});
```
Next up, containerizing both apps so that I could deploy to a runtime.

I used Google Cloud Artifact Registry as my container store, and created a Docker image from source code using Cloud Native buildpacks. It took one command for each app:
```
gcloud builds submit --pack image=gcr.io/seroter-project-base/seroter-kafka-consumer
```
```
gcloud builds submit --pack image=gcr.io/seroter-project-base/seroter-kafka-publisher
```
Now we had everything needed to deploy and test our client apps.

Deploy apps to Cloud Run and test it out

I chose Google Cloud Run because I like nice things. It’s still one of the best two or three ways to host apps in the cloud. We also make it much easier now to connect to a VPC, which is what I need. Instead of creating some tunnel out of my cluster, I’d rather access it more securely.

Here’s how I configured the consuming app. I first picked my container image and a target location.

Then I chose to use always-on CPU for the consumer, as I had connection issues when I had a purely ephemeral container.

The last setting was the VPC egress that made it possible for this instance to talk to the Apache Kafka cluster.

About three seconds later, I had a running Cloud Run instance ready to consume.

I ran through a similar deployment process for the publisher app, except I kept the true “scale to zero” setting turned on since it doesn’t matter if the publisher app comes and goes.

With all apps deployed, I fired up the browser and issued a pair of requests to the “publish” endpoint.

I checked the consumer app’s logs and saw that messages were successfully retrieved.

Sending a request to the GET endpoint on the consumer app returns the pair of messages I sent from the publisher app.

Sweet! We proved that we could send messages to the Apache Kafka cluster, and retrieve them. I get all the benefits of Apache Kafka, integrated into Google Cloud, with none of the operational toil.

Read more in the docs about this preview service.

Option 3: Use a managed provider on your cloud(s) of choice

The final way you might choose to run Apache Kafka in the cloud is to use a SaaS product designed to work on different infrastructures.

The team at Confluent does much of the work on open source Apache Kafka and offers a managed product via Confluent Cloud. It’s performant, feature-rich, and runs in AWS, Azure, and Google Cloud. Another option is Redpanda, who offer a managed cloud service that they operate on their infrastructure in AWS or Google Cloud.

Why do this? Choosing a “best of breed” type of managed service is going to give you excellent feature coverage and operational benefits. These platforms are typically operated by experts and finely tuned for performance and scale. Are there any downside? These platforms aren’t free, and don’t always have all the native integrations into their target cloud (logging, data services, identity, etc) that a built-in service does. And you won’t have all the configurability or infrastructure choice that you’d have running it yourself.

Wrap up

It’s a great time to run Apache Kafka in the cloud. You can go full DIY or take advantage of managed services. As always, there are tradeoffs with each. You might even use a mix of products and approaches for different stages (dev/test/prod) and departments within your company. Are there any options I missed? Let me know!
July 15, 2024
Daily Reading List – July 12, 2024 (#354)

Finished the week mostly caught up and ready for the weekend. I’m trying to finish writing a tech blog post for publication here, but my demo app is misbehaving. Hopefully better luck this weekend.

[article] How To Measure Platform Engineering. This article proposes using a series of specific metrics to measure how useful your platform is for developers and the business overall.

[blog] How to Implement OAuth 2.0 into a Golang App. I’ll be honest with you, authentication protocols (SAML, OAuth, OIDC) are not my love language. I like my security stuff to be as invisible as possible. But this is a good walkthrough of the plumbing.

[blog] IAM so lost: A guide to identity in Google Cloud. Relatedly, identity management is my kryptonite. It’s easy for it to kill my momentum when building an app. However, I liked the way it was explained in this post.

[blog] How to Run Hugging Face Models Programmatically Using Ollama and Testcontainers. It’s been interesting to watch how the Docker team has leveraged their acquisition of AtomicJar (testcontainers). Posts like this make me think this was a great match.

[docs] From multi-cluster edge to mesh: Globally distributed applications exposed through GKE Gateway and Cloud Service Mesh. I read through this brand new architecture guide that shows you how to build a resilient architecture with Kubernetes clusters and a service mesh.

[article] CIOs resist vendor-led AI hype, seeking out transparency. Make good choices, pick tech for the right reasons. But it’s also interesting to read this take where CIOs are saying their being too cautious on using generative AI. I don’t know what to believe anymore.

[blog] Firebase Just Got Smarter: Meet Your New AI Development Assistant. This offers a deeper dive into the AI assistance included in this mobile and app platform.

[article] Barriers to AI adoption. So good. This summary of a recent research paper provides such a relatable set of concerns that AI superfans need to consider as they try to grow adoption.

[blog] Google BigQuery vs. Snowflake: A head-to-head comparison of cloud data warehouses. Here’s a fairly massive deep-dive into these two platforms, with details that may help you choose the right one for your situation.

[paper] Speculative RAG: Enhancing Retrieval Augmented Generation through Drafting. This new paper proposes a different RAG framework that can increase accuracy and decrease latency.

Want to get this update sent to you every day? Subscribe to my RSS feed or subscribe via email below:

Type your email…

July 12, 2024
Daily Reading List – July 11, 2024 (#353)

You’ll find a whole lot of advice in today’s reading list. Advice on public speaking, changing IT culture, being resilient in a crisis, and even how to A/B test!

[blog] Long Document Summarization Techniques with Java, Langchain4J and Gemini models. Dan looks at three techniques you can use to summarize big docs with Java and your favorite LLM.

[blog] The Product Model in Traditional IT. Can you bring the product model to other types of orgs? This post looks at when it can work with traditional IT teams, when it doesn’t, and what has to change.

[blog] Programming, Fluency, and AI. Important post from Mike. If you’re not fluent in the thing you’re doing with the AI, you’re easily replaceable by AI. Don’t skip depth.

[article] The Anatomy of Slow Code Reviews. Slow reviews are a motivation killer. Google measures this closely. Here’s a post that explores the source of code review challenges.

[blog] The secrets of public speaking. Brian offers up some very good advice for those trying to improve their public speaking, or build up their confidence in the first place.

[blog] How I sent 500 million HTTP requests to 2.5 million hosts. Loved this. It’s a good answer to “what happens when I submit a web request” while also showing how you’d optimize a lot of calls.

[article] How To Be More Resilient: 6 Steps To Success When Life Gets Hard. There’s no pill or item to purchase to make you more resilient. Most of this advice from Eric relates to attitude.

[blog] BigQuery QUALIFY Clause: Towards Cleaner SQL Queries. I’ve got very dated SQL syntax knowledge. QUALIFY is not in my wheelhouse, but now I know more about it.

[article] Why Ruby on Rails Is Still Worth Your While as a Developer. Ruby is a tier-2 language at this point—viable, but not as vendor-supported or used as others—but that doesn’t mean it’s not a good language to work with.

[article] There’s a Smarter Way to A/B Test. Running A/B tests on your product can be expensive. This article proposes different assignment rules that let you run shorter trials.

[article] AI Success Depends on Tackling “Process Debt.” Don’t use AI to automate or augment bad processes. Use this technology moment to rethink processes and start fresh.

Want to get this update sent to you every day? Subscribe to my RSS feed or subscribe via email below:

Type your email…

July 11, 2024
Daily Reading List – July 10, 2024 (#352)

Folks on my team are analyzing (digital) watering holes where developers hang out. There’s no one place! It seems to differ by experience, region, and whether you’re in a big company or a freelancer. What are your top three hangouts? Tell me in the comments.

[paper] Searching for Best Practices in Retrieval-Augmented Generation. Good paper that expresses opinions after an investigation of techniques and practices for RAG with LLMs.

[blog] Vertex AI Studio vs. Google AI Studio: Choosing the Right AI Tool for Your Startup. We offer a couple of different surfaces today for those building with AI. I thought this was a solid comparison.

[blog] State of the Cloud 2024. I’m not ready to start using “AI Cloud” as a term, but what are tech leaders and investors thinking about the future of cloud? This report sheds some light.

[article] You could learn a lot from a CIO with a $17B IT budget. JPMorgan Chase has a massive IT budget and team, and does a ton of meaningful work. Here’s a good story about their approach.

[blog] Building PDF Open Source Services with Angular & GCP — Handling long processing tasks. Yesterday’s reading list contained an item about long-running HTTP requests. This post shows a real implementation of such a pattern.

[article] Creating Stability Is Just as Important as Managing Change. I work in a fairly dynamic place where things change often to meet the changing market needs. This post on “stability management” is important for those that don’t want to overly unnerve teams during times of change.

[blog] Optimizing CI with Bazel and Kaniko in Cloud Build. Bazel is used by lots of folks as a flexible build tool. This post goes keep into how to use it with a cloud-based CI tool.

[blog] Meta’s approach to machine learning prediction robustness. Learn more about how the Meta Ads business delivers a reliable and high-quality ML predictions solution.

[blog] Get started with Gemma on Ray on Vertex AI. Do some fine-tuning of the newest open Gemma model and see what the steps look like.

[article] The Big Interview: Solo.io CEO Idit Levine gets “Ambient.” I don’t see many long-form interviews of tech folks anymore, so this was refreshing. It’s a good read about a leader and space that’s evolved over time.

[article] 8 reasons developers love Go—and 8 reasons they don’t. Love and hate are strong emotions, but better than feeling “meh” about something. These are valid reasons you’ll like or dislike Go.

Want to get this update sent to you every day? Subscribe to my RSS feed or subscribe via email below:

Type your email…

July 10, 2024
Daily Reading List – July 9, 2024 (#351)

I’m still emptying out a long reading queue from the vacation break. You’ll find some good advice in the items listed below.

[blog] Avoiding long-running HTTP API requests. Good post from Derek. There are a few patterns to consider when dealing with a request that can take a long time to process.

[blog] Building Flexible and Maintainable Go-Lang Apps. I admittedly didn’t know Google created a dependency injection framework for Go named Wire. Now I do.

[blog] RAG API powered by LlamaIndex on Vertex AI. This offers up a good walkthrough of a powerful LLM orchestration framework that you can use to customize results based on your corpus of data.

[blog] Chrome Prompt Playground. AI is definitely going to replace some work that engineers do. Simon used Claude to quickly build a playground interface for the built-in Chrome LLM.

[blog] Understanding BigQuery data canvas: how to easily transform data into insights with AI. I need to play with this more, but I like the idea behind this UX for exploring and analyzing data. The post points to a demonstration you can run yourself.

[blog] Boost performance of Go applications with profile-guided optimization. This is a very cool Go feature. Pass in CPU profiling info and get an optimized build. Here’s how to test that out.

[blog] Share your streaming data with Pub/Sub topics in Analytics Hub. This is very cool. Most hyperscale clouds offer a “data exchange” type experience to sell your business data. But we’ve added the ability to also publish streaming Pub/Sub topics for outsiders to subscribe to.

[blog] Introduction to Federated Learning. Train locally (on device, in your data center), upload a trained model, and aggregate all those models into a complete model. It’s a powerful concept that’s applied today in a few cases.

[article] Survey Surfaces Lots of Software Supply Chain Insecurity. AI chatter can drown out other topics, but don’t sleep on supply chain security. Security professionals aren’t.

[blog] AI Conundrums. Here’s a series of musings from Stephen that is well worth your time to read.

[blog] Scaling Chick-fil-A’s Design System from Tool to Service. I don’t think I’ve heard the term “DesignOps” before, but let’s roll with it. The CFA team takes a broad approach to design team work.

[blog] Counting Gemini text tokens locally. Maybe you use on LLM for small requests, and another for giant ones? How can you count input tokens to know which LLM to call? Our latest Python SDK has a local tokenizer that does the job.

Want to get this update sent to you every day? Subscribe to my RSS feed or subscribe via email below:

Type your email…

July 9, 2024
Daily Reading List – July 8, 2024 (#350)

And I’m back. That was an excellent week off full of sun and fun. I tinkered around with a few tech things—reading books on data science, building a Kafka demo—but I mostly disconnected and embraced the free time. My reading queue was full of interesting items, hot takes, and educational material. Dig in.

[article] How DevProd teams got funded: 20 real-world examples. Do you have a formal developer productivity program in your team? Where does the funding come from?

[article] This AI cloud: How Google Gemini will help everyone build things faster, cheaper, better. I chatted with David at ZDNet about AI-assisted development and the future of this type of technology.

[blog] Serving a billion web requests with boring code. Here’s an excellent post with a good set of lessons learned about technology choices and design patterns.

[blog] FleetOps: Can the GKE Enterprise Stack Help Self-service Platforms Sync and Swim? Are integrated stacks “back”? Did they ever leave? Paul writes up a post that looks at an integrated GKE experience that forms the foundation of a good developer platform.

[docs] Migrate from AWS to Google Cloud: Migrate from Amazon RDS for SQL Server to Cloud SQL for SQL Server. I thought this provided a good set of advice for most any “from –> to” database migration. There are obviously some specific details to the target and destination listed here, but many parts of the workflow are general purpose.

[blog] What is Spring Modulith? Introduction to modular monoliths. If you like some of the loose coupling of a distributed system but appreciate the understand-ability of a monolithic app, you might like the “modular monolith” pattern. Here’s an example in Spring Boot.

[blog] TDD. You’re Doing it Wrong. Test-driven development is a useful practice, but there are wrong ways to do it, as John points out here.

[podcast] Navigating Corporate Giants: Jeffrey Snover and the Making of PowerShell. Listen (or read the transcript) to this conversation with Jeffrey who fought hard to build PowerShell within Microsoft.

[blog] DevRel’s Death as Zero Interest Rate Phenomenon. Not wrong, and it’s a good thing. No more devrel talking to devrel and working on whatever seems interesting. Now it’s about being aligned to biz priorities, demonstrating deep expertise, and owning the delivery of an outstanding dev experience.

[article] AI’s moment of disillusionment. Matt’s been on this “skeptic” thread for a while, but always with an eye towards folks making good choices. Don’t fall into either extreme of recklessly applying AI to everything, or, sitting entirely on the sidelines.

[blog] Latest Gemini features support in LangChain4j 0.32.0. I’m not disillusioned about AI when I keep seeing the positive progress towards quality tooling and libraries that help folks build useful systems. Great new stuff in this Java library.

Want to get this update sent to you every day? Subscribe to my RSS feed or subscribe via email below:

Type your email…

July 8, 2024
Daily Reading List – June 28, 2024 (#349)

I’m glad to be home after a busy work week out of town. I’m a single dad next week while my wife and son are on a trip together, so I’m taking it as vacation. I’ll be back with this reading list on July 8th!

[blog] Making Vertex AI the most enterprise-ready generative AI platform. Lots of news here, including Gemini 1.5 Flash going GA, Gemini 1.5 Pro introducing 2 million input tokens, the Imagen 3 image-generation model is in preview, context caching, provisioned throughput, and much more.

[blog] Google Cloud expands grounding capabilities on Vertex AI. There’s a really good story here for those building low-code or code-based AI agents.

[blog] 5 Myths About Zero Trust in the Cloud, Busted. There’s a vendor angle here, but it’s still a useful look at zero trust if you’re unfamiliar with the idea.

[blog] Doubling calculation speed and other new innovations in Google Sheets. I might spend more time in Google Sheets than VS Code. That’s not a flex; it’s a cry for help. But, it’s a good thing I like Sheets.

[blog] Open challenges for AI engineering. Great presentation and writeup from Simon that looks at what’s new in AI and open challenges for the industry.

[blog] Gemma 2 is now available to researchers and developers. This open model has proven fairly popular, and the new version is performant, fast, and efficient.

[blog] Developer Experience Is Still Developing. Forrester folks have been doing solid research into developer experience, and this post teases a new report they have. Worth reading!

[blog] How to: Build a Chat Web Application with Streamlit, Cloud Run and Gemini Flash. Very straightforward demo, and it shows off the power of APIs, frameworks, and serverless.

[blog] Deploy Python pipelines on Kubernetes using the Flink runner. If you want to geek out on data streaming this weekend, you could try following along with this post.

[blog] My 5 favorite ways of keeping the technical axe sharp. How do you stay on top of things? Maybe reading this daily list is one of your strategies. This post looks at a few ways to stay sharp.

##

Want to get this update sent to you every day? Subscribe to my RSS feed or subscribe via email below:

Type your email…

June 28, 2024
Daily Reading List – June 27, 2024 (#348)

We had a great customer council meeting today in New York, and I love it when customers can be in a room with us for a day to provide feedback and hear about roadmaps. Back home tomorrow.

[blog] Gemini 1.5 Pro 2M context window, code execution capabilities, and Gemma 2 are available today. Each of these is super valuable. Even larger input context for one-shot requests into the LLM. Use Gemini to execute and iterate on Python code. And access our latest open model.

[article] What GitHub Pull Requests Reveal about Your Team’s Dev Habits. Looking for hidden patterns in your git transactions? Some researchers did just that.

[blog] “It’s a Balance” isn’t always the answer. I liked this post that explored a handful of decision making approaches to use based on the situation at hand.

[blog] Infrastructure as Code Landscape Overview 2024. Brian uses this post to look at declarative resource-oriented provisioning tools, so not things like Ansible, Chef, or Docker. If you’ve heard of all of these tools, you are a wizard.

[blog] Transformation Regrets. We often just hear survivor stories about how great something was. History is written by the winners! But where have things gone wrong on product transformations? Marty covers that here.

[blog] 110 new languages are coming to Google Translate. More than half a billion people speak the languages we just added to Google Translate.

[blog] Structuring Go Code for CLI Applications. The “right” project structure is the one that works for you, but most of us can pick up good practices from others.

[article] How Expedia Group Moved From 21 Platform Stacks to 1. Consolidation is so hot right now! Good teams are looking to reduce redundancy is their platforms, tools, and stacks.

[blog] Enhancing Netflix Reliability with Service-Level Prioritized Load Shedding. Cool post from Netflix about how they smartly throttle pre-fetch requests to ensure that user-initiated requests get priority under load.

##

Want to get this update sent to you every day? Subscribe to my RSS feed or subscribe via email below:

Type your email…

June 27, 2024
Daily Reading List – June 26, 2024 (#347)

I like East Coast trips because my West Coast meeting times don’t kick in until late morning. The downside? It’s relentless after that. In today’s reading list, I can across some solid advice on a variety of topics.

[blog] Unlock the Power of Conversational AI: RAG 101 with Gemini & LangChain. The post is accompanied by a notebook if you want to follow along with this scenario.

[article] Survey Surfaces Varying Levels of Enthusiasm for AI Coding Tools. Execs are ready to go on AI-assisted coding, developers not as enthusiastic. And I’m surprised that so few are running POCs to help introduce these tools.

[article] AI Work Assistants Need a Lot of Handholding. This Wall Street Journal piece looks at productivity tools, and features a short silly quote from me.

[article] The Right Way to Go “All In”. Good piece that talks about striving for greatness not JUST be being obsessive about one “identity”, but by maintaining your self-complexity.

[blog] Tips for troubleshooting Google Cloud Load Balancing backends. Whether on-premises or in the cloud, your application traffic doesn’t follow a straight path from the user to the app. There’s lots of proxy components in between, and it’s helpful to know how to troubleshoot them!

[article] Building a Platform Team at a 153-Year-Old Company. Here’s a nice roadmap for those with a deeply established IT team that wants to embrace an engineering transformation.

[article] Why Cross-Functional Collaboration Stalls, and How to Fix It. The answer is not “better executive alignment” which I found refreshing. Read this for ways to fix your “collaboration drag.”

[blog] Advancing systems research: Synthesized Google storage I/O traces now available to the community. Do you want 2.5 billion storage I/O traces from Google? I don’t know your life; maybe this is the best day ever for you. Either way, it’s cool that we’re opening up this data for others to access and explore.

[blog] Old Books that Every Architect Should Read. Plenty of tech books slink into irrelevance once a new version of software gets released, or the industry trends shift significantly. But others remain fairly timeless, as Gregor points out here.

[article] DevTools Marketing That Works, According to Developers Themselves. Adam offers up some good advice for those trying to authentically market their developer tools to a skeptical audience.

##

Want to get this update sent to you every day? Subscribe to my RSS feed or subscribe via email below:

Type your email…

June 26, 2024
Daily Reading List – June 25, 2024 (#346)

Long day in New York today, but a great chance to chat with customers and new friends. The challenge is real with implementing AI, but wise leaders see the potential!

[blog] What Is ChatGPT Doing … and Why Does It Work? I probably shared this last year when it came out, but I came across this post again and found it useful.

[blog] Leveling up FinOps: 5 cost management innovations from FinOps X 2024. I was inexplicably staying at the hotel where this event was going on last week. I like the new scenario modeling tool.

[article] 4 keys to writing modern Python. Lots of folks are using Python nowadays, and this article highlights some new language features.

[blog] Is your AI workload secure? Sita offers up a good perspective about AI security frameworks and why they matter.

[article] This Is How To Have A Long Awesome Life: 4 Secrets From Research. None of these are shocking, nor do they require a weird pod to sleep in. Just make good choices.

[blog] Build Rag With Llamaindex To Make LLM Answer About Yourself, Like in an Interview or About General Information. LangChain has a lot of attention, but there are a handful of viable orchestration frameworks out there. Check out LlamaIndex.

[blog] why we no longer use LangChain for building our AI agents. Or maybe don’t use an LLM orchestration framework at all? This post says these frameworks can add unnecessary overhead.

[blog] GKE under the hood: What’s new with Cluster Autoscaler. With benchmarks! These Kubernetes autoscaling improvements are baked into GKE, and we’re seeing some fairly significant results.

[article] Measuring Developer Experience at Google. This piece looks at how we measure, not what. Abi digs into a recent research paper that explains our Engineering Satisfaction survey.

[article] Platform Engineering: Lessons, queries from the First Hype Cycle. Check out this take on a recent Gartner assessment of the platform engineering space.

[article] JetBrains AI Assistant to integrate Google Gemini AI models. The IDE maker is mixing and matching models for developers who want AI assistance. Maybe a trend?

##

Want to get this update sent to you every day? Subscribe to my RSS feed or subscribe via email below:

Type your email…

June 25, 2024

←Previous Page

1 … 42 43 44 45 46 … 138

Richard Seroter's Architecture Musings

Loading Comments...

Write a Comment...

Email (Required)

Name (Required)

Website

Subscribe Subscribed
- Richard Seroter's Architecture Musings
- Already have a WordPress.com account? Log in now.