It’s been a long week! I spent some time this afternoon setting up a new work laptop, and I’m going to foolishly bring it on a trip tomorrow. What could go wrong?
[blog] Keys to a resilient Open Source future. Is AI going to be the best option for open source security? It might be, given the scale of code we’re talking about and the volunteer-heavy approach.
[blog] Introducing Netflix’s Key-Value Data Abstraction Layer. Abstractions are tricky to maintain, and can accidentally block you from using unique features underneath. But for scenarios like this, the use case makes sense to me.
[blog] Quitting Time. Perseverance is important, but so is knowing when to quit. What are your criteria, and can you stick to them?
[blog] Apache Airflow ETL in Google Cloud. The spectrum of hosting options is typically raw compute, managed compute, and managed services. That applies here as well.
Want to get this update sent to you every day? Subscribe to my RSS feed or subscribe via email below:
Today was another day of team meetings so I was offline for much of it. But I got some good early-morning reading done, and you’ll find the results in my 400th list below.
[article] Study Finds No DevOps Productivity Gains from Generative AI. Hmm. I haven’t seen findings like this, which is why the article caught my eye. Run your own analysis when you introduce these tools into your environment to see where it makes a positive difference.
I had team meetings today, which made me grateful to work with excellent folks. Before starting this offsite, I read through some fascinating pieces that you’ll find below.
[blog] Advancing Our Chef Infrastructure. I can’t imagine that companies swap out infrastructure-as-code or configuration management investments often. Slack is all-in on Chef, and explains their evolution.
[blog] Should we decompose our monolith? This post looks at the messy world of today’s environments where there’s a mix of monolithic and microservice approaches.
[blog] Legacy Modernization meets GenAI. The ThoughtWorks crew has an interesting tool they’ve built to make modernizations better and more approachable.
[article] How (and why) should teams try distributed pair programming? While some companies are pushing return-to-office, many still have distributed teams that need to work together. This post looks at a study showing what makes distributed pair programming successful.
As hoped for, I mostly skipped doing work this weekend and spent longer than expected building an Apollo 11 lunar lander with one of my kids. We had fun. Today was back to the work that pays for these Lego sets.
[blog] Building LLM-powered applications in Go. Eli reminds us that the vast majority of “AI apps” are really just like apps that call LLMs. He shows off a handful of ways to create a RAG server in Go.
[blog] Open Source Foundations Considered Helpful. Do open source foundations just exist to throw lavish conferences? Nah. They add legit value, as James points out here.
[blog] RAG API. Here’s a good post about a managed solution for augmenting your LLM requests with customized data sources. I hadn’t tried this service out yet, and now I want to.
[youtube-video] Safe RAG for LLMs. *Three* RAG pieces today? Let’s embrace the discomfort. I liked this video which looked at patterns for RAG that avoid data leakage.
[article] Predicting developer attrition. How do you know a developer is on their way out the door? This paper looks at attrition and notes how burnout prevention and “opportunities to learn” are big factors.
Today was a good day. I got enough done that I think I can skip most work this weekend and build that Lego set sitting on my shelf, the one my kiddo has been asking me about: “when can we build this?” This weekend, son.
[blog] Intro to Ray for AI on Kubernetes. Kaslin explains this open source framework, why you’d use it, and how to get it running on Kubernetes.
[blog] Message Queues in System Design. Classic technology, but there’s always someone learning about it for the first time. Posts like this are a solid intro to how and when to use queues.
I’m starting to wonder if I’m writing too much. Between this daily list, blog posts (new one today), and a couple newsletters each week, maybe I need to use other mediums too. I dunno. Hopefully you’re enjoying it!
[blog] How big up-front design fails. Here are some lessons learned from projects with big upfront planning and design, and why a more iterative approach works better.
[podcast] 274: The Cloud Pod is Still Not Open Source. This latest episode of the podcast that reviews news from AWS, Microsoft and Google spent some time discussing my recent post about technical debt. Neat!
School is back in session, and I just met with a handful of teachers at a recent back-to-school night. They’re all figuring out how to account for generative AI tools that students have access to. I say, let’s give teachers the same tools to use. Specifically, what if a teacher wants a quick preliminary grade on book reports submitted by their students? To solve this, I used Gemini 1.5 Flash in Google Cloud Vertex AI in three different ways: one-off in the prompt editor, through code, and via declarative workflow.
Grade Homework in Vertex AI Studio
Maybe you just have one or two papers to grade. Something like Vertex AI Studio is a good choice. Even if you’re not a Google Cloud customer, you can use it for free through this link.
You don’t have any book reports handy to test this with? Me neither. In Vertex AI Studio, I prompted with something like “Write a 300 word book report for Pride and Prejudice from the perspective of an 8th grade student with a good vocabulary and strong writing skills.”
With that handy, let’s “grade” the paper. Here I’m using system instructions to direct the model. I used Gemini to generate the initial version and then I tweaked it a bit. Here are the instructions:
Task: Evaluate 8th-grade book reports for an honors English class. You are a tough grader. Input: Book report text. Output: Initial letter grade (A, B, C, D, or F) based on: Structure: Clear introduction, body, and conclusion Grammar: Spelling, punctuation, sentence structure. Content: Understanding of the book, critical thinking. Consider: Age and language proficiency of the student.
This tells the model what to look for, the expected quality (“honors English”), and the desired output.
I added the system instructions to the marked section within Vertex AI Studio.
Then I simply pasted the book report into the “prompt” section and submitted the prompt.
I thought this was a pretty good book report, but Gemini 1.5 Flash had a high bar after obeying my system instructions.
This book report shows good understanding of the book and a strong grasp of the themes. However, there are some issues:
Structure: The introduction is a little long and could be more concise. The conclusion is also repetitive. The report could benefit from more clear paragraph breaks to organize the information more effectively.
Grammar: No significant grammar issues.
Content: The student demonstrates good understanding of the novel's major themes and characters. There's a strong focus on plot summary, which is okay, but the report could benefit from more critical analysis. The student needs to go beyond just describing the events to analyze how they impact the themes, characters, or messages of the book. Overall: This report shows good effort, but it needs to be refined.
Grade: B-
If I’m a teacher, this is a fascinating way to get a preliminary assessment of a writing assignment!
Grade Homework Through Code
The above solution works fine for one-off experiences, but how might you scale this AI-assisted grader? Another option is code.
To try this scenario out, I used Cloud Firestore as my document database holding the book reports. I created a collection named “Papers” in the default database and added three documents. Each one holds a different book report.
I then used the Firestore API and Vertex AI API to write some simple Go code that iterates through each Firestore document, calls Vertex AI using the provided system instructions, and then logs out the grade for each report. Note that I could have used a meta framework like LangChain, LlamaIndex, or Firebase Genkit, but I didn’t see the need.
package main

import (
	"context"
	"fmt"
	"log"
	"os"

	"cloud.google.com/go/firestore"
	"cloud.google.com/go/vertexai/genai"
	"google.golang.org/api/iterator"
)

func main() {
	// get configuration from environment variables
	projectID := os.Getenv("PROJECT_ID")
	collectionName := os.Getenv("COLLECTION_NAME") // "Papers"
	location := os.Getenv("LOCATION")              // "us-central1"
	modelName := os.Getenv("MODEL_NAME")           // "gemini-1.5-flash-001"

	ctx := context.Background()

	// initialize Vertex AI client
	vclient, err := genai.NewClient(ctx, projectID, location)
	if err != nil {
		log.Fatalf("error creating vertex client: %v\n", err)
	}
	defer vclient.Close()
	gemini := vclient.GenerativeModel(modelName)

	// add system instructions
	gemini.SystemInstruction = &genai.Content{
		Parts: []genai.Part{genai.Text(`Task: Evaluate 8th-grade book reports for an honors English class. You are a tough grader. Input: Book report text. Output: Initial letter grade (A, B, C, D, or F) based on: Structure: Clear introduction, body, and conclusion Grammar: Spelling, punctuation, sentence structure. Content: Understanding of the book, critical thinking. Consider: Age and language proficiency of the student.`)},
	}

	// initialize Firestore client
	client, err := firestore.NewClient(ctx, projectID)
	if err != nil {
		log.Fatalf("error creating Firestore client: %v", err)
	}
	defer client.Close()

	// get documents from the collection
	iter := client.Collection(collectionName).Documents(ctx)
	for {
		doc, err := iter.Next()
		if err == iterator.Done {
			break
		}
		if err != nil {
			log.Fatalf("error iterating through documents: %v\n", err)
		}

		// create the prompt from the document's "Contents" field
		contents, ok := doc.Data()["Contents"].(string)
		if !ok {
			log.Printf("document %s has no string Contents field; skipping", doc.Ref.ID)
			continue
		}
		prompt := genai.Text(contents)

		// call the model and get back the result
		resp, err := gemini.GenerateContent(ctx, prompt)
		if err != nil {
			log.Fatalf("error generating content: %v\n", err)
		}

		// print out the top candidate part in the response
		log.Println(resp.Candidates[0].Content.Parts[0])
	}
	fmt.Println("Successfully iterated through documents!")
}
The code isn’t great, but the results were. I’m also getting more verbose responses from the model, which is cool. This is a much more scalable way to quickly grade all the homework.
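If you wanted the program to record just the letter grade rather than the model’s full response, you could parse it out before storing it. Here’s a minimal, hypothetical helper (not part of the original program) that assumes the model follows the system instructions and emits a line like “Grade: B-” somewhere in its output:

```go
package main

import (
	"fmt"
	"regexp"
)

// extractGrade pulls the letter grade out of a verbose model response.
// It assumes the rubric's output format: a letter A-D or F, with an
// optional plus or minus, following the word "Grade:".
func extractGrade(response string) (string, bool) {
	re := regexp.MustCompile(`Grade:\s*([A-DF][+-]?)`)
	m := re.FindStringSubmatch(response)
	if m == nil {
		return "", false
	}
	return m[1], true
}

func main() {
	sample := "Overall: This report shows good effort.\nGrade: B-"
	if grade, ok := extractGrade(sample); ok {
		fmt.Println(grade) // B-
	}
}
```

Since the model is free to phrase its feedback however it likes, the boolean return lets you fall back to storing the full response when no grade line is found.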
Grade Homework in Cloud Workflows
I like the code solution, but maybe I want to run this preliminary grading on a scheduled basis? Every Tuesday night? I could do that with my above code, but how about using a no-code workflow engine? Our Google Cloud Workflows product recently got a Vertex AI connector. Can I make it work with the same system instructions as the above two examples? Yes, yes I can.
I might be the first person to stitch all this together, but it works great. I first retrieved the documents from Firestore, looped through them, and called Vertex AI with the provided system instructions. Here’s the workflow’s YAML definition:
main:
  params: [args]
  steps:
    - init:
        assign:
          - collection: ${args.collection_name}
          - project_id: ${args.project_id}
          - location: ${args.location}
          - model: ${args.model_name}
    - list_documents:
        call: googleapis.firestore.v1.projects.databases.documents.list
        args:
          collectionId: ${collection}
          parent: ${"projects/" + project_id + "/databases/(default)/documents"}
        result: documents_list
    - process_documents:
        for:
          value: document
          in: ${documents_list.documents}
          steps:
            - ask_llm:
                call: googleapis.aiplatform.v1.projects.locations.endpoints.generateContent
                args:
                  model: ${"projects/" + project_id + "/locations/" + location + "/publishers/google/models/" + model}
                  region: ${location}
                  body:
                    contents:
                      role: "USER"
                      parts:
                        text: ${document.fields.Contents.stringValue}
                    systemInstruction:
                      role: "USER"
                      parts:
                        text: "Task: Evaluate 8th-grade book reports for an honors English class. You are a tough grader. Input: Book report text. Output: Initial letter grade (A, B, C, D, or F) based on: Structure: Clear introduction, body, and conclusion Grammar: Spelling, punctuation, sentence structure. Content: Understanding of the book, critical thinking. Consider: Age and language proficiency of the student."
                    generation_config:
                      temperature: 0.5
                      max_output_tokens: 2048
                      top_p: 0.8
                      top_k: 40
                result: llm_response
            - log_result:
                call: sys.log
                args:
                  text: ${llm_response}
No code! I executed the workflow, passing in all the runtime arguments.
In just a moment, I saw my workflow running, and “grades” being logged to the console. In real life, I’d probably update the Firestore document with this information. I’d also use Cloud Scheduler to run this on a regular basis.
While I made this post about rescuing educators from the toil of grading papers, you can apply these patterns to all sorts of scenarios. Use prompt editors like Vertex AI Studio for experimentation and finding the right prompt phrasing. Then jump into code to interact with models in a repeatable, programmatic way. And consider low-code tools when model interactions are scheduled, or part of long running processes.
23 years later, I still think about where I was the moment I heard that a plane crashed into the Twin Towers. Let’s never forget.
[youtube-video] Decoding Google Gemini with Jeff Dean. I liked this episode of the DeepMind podcast featuring our Chief Scientist. Jeff shares some history, and a look at the future.
[blog] Kaggle Model Upload Made Easy. Millions of folks use Kaggle to try models, access datasets, compete in AI competitions, and more. This post looks at how you upload models.
[repo] Automatic Password Rotation. Do you regularly rotate your database passwords? Is that a manual or automated process? This reference architecture (and IaC scripts) shows you how to do it.
Big reading list today. Sorry, not sorry. Hopefully you find a few fun things that catch your eye.
[blog] Free Tools Every ML Beginner Should Use. There’s seemingly a new AI/ML tool released every day, but these are mature options for beginners and experts.
[blog] Cloud Run job with a Python Module. Background jobs with Cloud Run are great. Mazlum offers a detailed post about using Python modules with them.
[paper] In Defense of RAG in the Era of Long-Context Language Models. Gemini got things going with its 1M (now 2M) token context window, and now other models are starting to grow their input token limits. Does that negate the need for RAG? Not according to this paper.
[site] Illuminate. Coolest thing I saw today. Generate an on-demand “podcast” for research papers and books. Try it with the paper above. Worked amazingly well.
[article] Coaching Founder Mode. Marty offers a must-read essay that proposes all product leaders be coached into a “founder mode” mindset.
[article] Take Your First Steps with Git. Git is for more than just developers; I’ve seen content authors, ops folks, database pros, and product managers rely on it. Having a base understanding is helpful.
[blog] Tinder API Style Guide — Part 1. Creating some API standards around URI patterns and headers? Take a look at what the Tinder engineering team put together.
Did you have a good start to your week? After writing a few thousand words this weekend about Cloud Run, I’m hunting for the next tech to mess around with. Suggestions?
[article] Measuring developers’ jobs-to-be-done. This story reviews a recent Google paper that explored “developer goals” that help internal teams optimize the dev experience.
[article] Why Generalists Own the Future. Specialists or generalists? It’s a common debate. Dan argues that generalists are well positioned in this dynamic future because they will adapt faster than those with specific expertise.
[blog] New Gemini model in LangChain4j. If you want a simple model-as-a-service experience with Gemini, the Google AI edition is a great place to start. If you’re looking for more advanced AI/ML features to go with it, Google Cloud’s Vertex AI is an ideal starting point. For the former, you can now use LangChain4j.
[blog] The insidious problem of configuration sprawl. If you’ve got all your system, application, and infrastructure configurations in a single place … you’re a unicorn. I’d like to meet you. Most folks have configs all over the place, as Brian talks about here.