If you’re in the US, have a happy Thanksgiving holiday. I’ve got countless things to be thankful for. And if you’re outside the States, enjoy these two quiet days of work!
[blog] 2024: The State of Generative AI in the Enterprise. Lots of interesting data here. You all are majorly expanding your spend on generative AI apps and finding AI use cases. See here for some viable predictions.
[blog] My review is stuck – what now? Wow, this post offers some outstanding advice for those who are trying to get a pull request approved or even looked at.
[article] What causes ‘bad days’ for developers? Maybe stuck code reviews cause bad days, but there are other things too. This piece looks at research into what disrupts developers’ workdays.
Today was the last workday of the week as I’m taking the rest of the week off for the Thanksgiving holiday. Stay tuned for a reading list tomorrow, though!
[article] It’s Time to Reimagine Scale. Speaking of scale, I thought this was a thought-provoking piece about what it means to scale and how we might reassess success.
[blog] 7 examples of Gemini’s multimodal capabilities in action. Multimodal still feels underrated, and people stick with chatbots. Expand your lens! There is so much more that LLMs like Gemini can do besides handle textual input.
AI eats the world. Flip through this very readable online slide deck for a good view of the opportunity and landscape for AI.
[blog] Introducing Google Developer Program premium membership. The free program is excellent with training credits, insider access, and more. But if you want Cloud credits, unlimited access to training, a cert voucher and more, this new premium (paid) tier looks great.
Want to get this update sent to you every day? Subscribe to my RSS feed or subscribe via email below:
Happy Monday. It’s a short week here in the States, and I seemed to fit a week’s worth of meetings into today. I’ll do reading lists through Wednesday!
[blog] Introducing the Model Context Protocol. It wouldn’t be surprising if we’re at the point in the AI ecosystem where open standards get proposed, and adopted.
[article] Start Presentations on the Second Slide. I like this advice. Too many folks spend excessive time on the setup and lose the audience before they get to the good stuff.
[blog] GoMLX: ML in Go without Python. Terrific post from Eli, who shows that machine learning is expanding beyond its Python base.
[blog] Make IAM for GKE easier to use with Workload Identity Federation. Accessing cloud services from workloads running in Kubernetes? You don’t want to have to impersonate accounts or embed credentials in the workload. What are the other options? This shows a great one.
It’s been a good week, even though I’ve been under the weather. I’ve got a chance to rest up over this weekend and most of next week. I hope you also take a chance to recharge!
[article] In The Context Of Long Context. Here’s a wonderful post from Steven (who’s part of the NotebookLM effort) about the magic and importance of long context for LLMs.
[article] Auditors blast Pentagon over insecure, “antiquated” IT systems. Even when high-profile audits result in embarrassing findings, it’s hard to modernize the offending legacy systems. I can’t imagine how hard it is when the findings aren’t on the front page.
[blog] What’s new in Spring Modulith 1.3? I’ve been keeping an eye on this “modular monolith” framework from the Spring team. It’s making good progress.
[blog] Building a User Signals Platform at Airbnb. I always enjoy these types of posts where companies go deep into their architectures. This one looks at a stream processing system.
[article] Google’s Gemini chatbot now has memory. It remembers. I mean, not in a creepy way. But if you want our Gemini agent to use past info to make future suggestions, you got it.
Had another good day in Sunnyvale today and I’m enjoying these chances to plan what, and how, we work in 2025. The industry keeps moving, even if I’m in all-day meetings, and there are some good updates below.
[blog] How we built Google Meet’s adaptive audio feature. I assumed this was just a terrible fact of life. But no, it doesn’t HAVE to sound like death when people join a conference call from the same room.
[blog] Why Spring AI: The Seamless Path to Generative AI. Don’t sleep on Spring AI as a framework that stands to get traction. There’s a lot of Spring Boot out there, and this looks like a solid option for Java devs.
[blog] Playground Wisdom: Threads Beat Async/Await. Big, interesting post about blocking code, async processing, and why threads are better than the async/await abstraction.
[blog] Meet Angular v19. Big, big release of this popular frontend framework. Dig through this to see what matters, and what’s new/improved.
[blog] Batch prediction in Gemini. You want predictions for dozens, hundreds, or thousands of items in your store, portfolio, or account? Mete has a clear post that shows how to submit batches of requests to an LLM.
[article] p-Hacking your A/B tests. You might be making a mistake when running A/B tests for your app, marketing campaigns, or whatever. Jason says you shouldn’t stop a test once it looks like you got a “conclusive” result.
[blog] Scaling Ambient In Your Sleep. Quick post, but good to read if you skipped Istio the first time around because of too much complexity or too many scaling concerns.
I’m in Sunnyvale for the week. Today was a product leadership offsite, and I had fun talking to our senior product leaders. Enjoy the reading list below.
[blog] Platform Engineering as a Service. Does platform engineering bring order to the chaos that DevOps creates? Tyler thinks so, and offers up a solution.
[report] CNCF Technology Landscape Radar. If you’re a platform builder who is looking for open tech to bake into your stack, look at this new CNCF report that explores sentiment about some leading projects.
[blog] Meta Prompt Guard. Nervous about people tricking your AI agents into doing bad things? This post explores a lightweight classifier model from Meta that looks for malicious input.
[article] Let’s End Toxic Productivity. We have more conveniences than our grandparents did, but also seem to work more hours. Our standard of living is higher, as are our expectations, but still. This deep dive looks at how to step back and not overdo everything.
[blog] Gemini in Firebase for Data Connect queries. Here’s a good use case for LLMs. Firebase is using natural language prompts to generate GraphQL instead of you needing to know the full syntax.
The ability to use your own codebase to customize the suggestions from an AI coding assistant is a big deal. This feature—available in products like Gemini Code Assist, GitHub Copilot, and Tabnine—gives developers coding standards, data objects, error messages, and method signatures that they recognize from previous projects. Data shows that the acceptance rate for AI coding assistants goes way up when devs get back trusted results that look familiar. But I don’t just want up-to-date and familiar code that *I* wrote. How can I make sure my AI coding assistant gives me the freshest and best code possible? I used code customization in Gemini Code Assist to reference Google Cloud’s official code sample repos, and now I get AI suggestions that feature the latest Cloud service updates and best practices for my preferred programming languages. Let me show you how I did it.
Last month, I showed how to use local codebase awareness in Gemini Code Assist (along with its 128,000 input token window) to “train” the model on the fly using code samples or docs that an LLM hasn’t been trained on yet. It’s a cool pattern, but also requires upfront understanding of what problem you want to solve, and work to stash examples into your code repo. Can I skip both steps?
Yes, Gemini Code Assist Enterprise is now available and I can point to existing code repos in GitHub or GitLab. When I reference a code repo, Google Cloud automatically crawls it, chunks it up, and stores it (encrypted) in a vector database within a dedicated project in my Google Cloud environment. Then, the Gemini Code Assist plugin uses that data as part of a RAG pattern when I ask for coding suggestions. By pointing at Google Cloud’s code sample repos—any best practice repo would apply here—I supercharge my recommendations with data the base LLM doesn’t have (or doesn’t prioritize).
Step #0 – Prerequisites and considerations
Code customization is an “enterprise” feature of Gemini Code Assist, so it requires a subscription to that tier of service. There’s a promotional $19-per-month price until March of 2025, so tell your boss to get moving.
Also, this is currently available in US, European, and Asian regions; you may need to request feature access via a form (depending on when you read this); and today it works with GitHub.com and GitLab.com repos, although on-premises indexing is forthcoming. Good? Good. Let’s keep going.
Step #1 – Create the source repo
One wrinkle here is that you need to own the repos you ask Gemini Code Assist to index. You can’t just point at any random repo to index. Deal breaker? Nope.
I can just fork an existing repo into my own account! For example, here’s the Go samples repo from Google Cloud, and the Java one. Each one is stuffed with hundreds of coding examples for interacting with most of Google Cloud’s services. These repos are updated multiple times per week to ensure they include support for all the latest Cloud service features.
I went ahead and forked each repo in GitHub. You can do it via the CLI or in the web console.
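If you go the CLI route, GitHub’s `gh` tool can create the forks in a couple of commands. A minimal sketch, assuming the public `GoogleCloudPlatform/golang-samples` and `GoogleCloudPlatform/java-docs-samples` repos as the sources:

```shell
# Fork the Google Cloud samples repos into your own GitHub account.
# --clone=false skips a local clone, since only the remote fork is needed
# for indexing.
gh repo fork GoogleCloudPlatform/golang-samples --clone=false
gh repo fork GoogleCloudPlatform/java-docs-samples --clone=false
```

You’ll need to be authenticated with `gh auth login` first; the forks land under your own account with the same repository names by default.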
I didn’t overthink it and kept the repository name the same.
Gemini Code Assist can index up to 950 repos (and more if really needed), so you could liberally refer to best-practice repos that will help your developers write better code.
Any time I want to refresh my fork to grab the latest code sample updates, I can do so.
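With the `gh` CLI, refreshing the fork is a one-liner per repo. A sketch, where `YOUR_GITHUB_USER` is a placeholder for your own account name:

```shell
# Pull the latest upstream commits into each fork's default branch.
gh repo sync YOUR_GITHUB_USER/golang-samples --source GoogleCloudPlatform/golang-samples
gh repo sync YOUR_GITHUB_USER/java-docs-samples --source GoogleCloudPlatform/java-docs-samples
```

You can also click the "Sync fork" button in the GitHub web console if you prefer.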
Step #2 – Add a reference to the source repo
Now I needed to reference these repos for later code customization. Google Cloud Developer Connect is a service that maintains connections to source code sitting outside Google Cloud.
I started by choosing GitHub.com as my source code environment.
Then I named my Developer Connect connection.
Then I installed a GitHub app into my GitHub account. This app is what enables the loading of source data into the customization service. From here, I chose the specific repos that I wanted available to Developer Connect.
When finished, I had one of my own repos, and two best practice repos all added to Developer Connect.
That’s it! Now to point these linked repos to Gemini Code Assist.
Step #3 – Add a Gemini Code Assist customization index
I had just two CLI commands to execute.
First, I created a code customization index. You’ve got one index per Cloud project (although you can request more) and you create it with one command.
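At the time of writing, the index-creation command looked roughly like the sketch below; the project ID, region, and index name are all placeholder values, so check the current `gcloud` reference for exact syntax before running it:

```shell
# Create the (single) code customization index for this Cloud project.
# my-index, my-project, and us-central1 are placeholders.
gcloud gemini code-repository-indexes create my-index \
  --project=my-project \
  --location=us-central1
```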
Next, I created a repository group for the index. You use these to control access to repos, and could have different ones for different dev audiences. Here’s where you actually point to a given repo that has the Developer Connect app installed.
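The repository-group command looked something like the following sketch. All of the names here (group, project, region, connection, repo link) are placeholders, and the `--repositories` value references the Git repository link that Developer Connect created when you linked the repo; again, verify the flags against the current `gcloud` docs:

```shell
# Create a repository group on the index and point it at a linked repo.
gcloud gemini code-repository-indexes repository-groups create my-repo-group \
  --project=my-project \
  --location=us-central1 \
  --code-repository-index=my-index \
  --repositories='[{"resource": "projects/my-project/locations/us-central1/connections/my-connection/gitRepositoryLinks/golang-samples", "branchPattern": "main"}]'
```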
I ran this command a few times to ensure that each of my three repos was added to the repository group (and index).
Indexing can take up to 24 hours, so here’s where you wait. After a day, I saw that all my target repos were successfully indexed.
Whenever I sync the fork with the latest updates to code samples, Gemini Code Assist will index the updated code automatically. And my IDE with Gemini Code Assist will have the freshest suggestions from our samples repo!
Step #4 – Use updated coding suggestions
Let’s prove that this worked.
I looked for a recent commit to the Go samples repo that the base Gemini Code Assist LLM wouldn’t know about yet. Here’s one that has new topic-creation parameters for our Managed Kafka service. I gave the prompt below to Gemini Code Assist. First, I used a project and account that was NOT tied to the code customization index.
//function to create a topic in Google Cloud Managed Kafka and include parameters for setting replicationfactor and partitioncount
The coding suggestion was good, but incomplete as it was missing the extra configs the service can now accept.
When I went to my Code Assist environment that did have code customization turned on, the same prompt gave me a result that mirrored the latest Go sample code.
I tried a handful of Java and Go prompts, and I regularly (admittedly, not always) got back exactly what I wanted. Good prompt engineering might have helped me reach 100%, but I still appreciated the big increase in quality results. It was amazing to have hundreds of up-to-date Google-tested code samples to enrich my AI-provided suggestions!
AI coding assistants that offer code customization from your own repos are a difference maker. But don’t stop at your own code. Index other great code repos that represent the coding standards and fresh content your developers need!
Happy Friday. I’ve got lots of links again for you today! Enjoy your weekend and see y’all next week.
[blog] The 5 Cs: Configuring access to backing services. How do you set up configurations between app code and its database? Brian looks at what is needed, and wonders if there’s a better way.
[blog] Inference with Gemma using Dataflow and vLLM. I learned at least a half dozen things from reading this post. Cool look at what it takes to use an LLM in a streaming pipeline.
[article] “Reducing Complexity”. John makes great points here about how we use “complexity” as a shorthand for a lot of different problems.
[blog] Announcing .NET 9. I’m admittedly not doing much with C# right now, but these language updates are still a big milestone. I’m sure devs will pick up this version quickly.
[article] Why designing landing pages is hard. It is hard. Know who you’re targeting, and just accept that for pages with a wide audience, you can’t please everyone.
[blog] Generative epiphany. Analogies don’t have to be perfect to be helpful. I like how Katie used ideas from the containerization world to grok LLMs.
[blog] Spring Boot and Temporal. Sometimes we feel like pioneers as we navigate the mashup of technologies. Cornelia goes through an exploration to get this workflow engine to play with a Java Spring Boot app.