It’s been a long week! I spent some time this afternoon setting up a new work laptop, and I’m going to foolishly bring it on a trip tomorrow. What could go wrong?
[blog] Keys to a resilient Open Source future. Is AI going to be the best option for open source security? It might be, given the scale of code we’re talking about and the volunteer-heavy approach.
[blog] Introducing Netflix’s Key-Value Data Abstraction Layer. Abstractions are tricky to maintain, and can accidentally block you from using unique features underneath. But for scenarios like this, the use case makes sense to me.
[blog] Quitting Time. Perseverance is important, but so is knowing when to quit. What are your criteria, and can you stick to them?
[blog] Apache Airflow ETL in Google Cloud. The spectrum of hosting options is typically raw compute, managed compute, and managed services. That applies here as well.
Want to get this update sent to you every day? Subscribe to my RSS feed or subscribe via email below:
Today was another day of team meetings so I was offline for much of it. But I got some good early-morning reading done, and you’ll find the results in my 400th list below.
[article] Study Finds No DevOps Productivity Gains from Generative AI. Hmm. I haven’t seen findings like this, which is why the article caught my eye. Run your own analysis when you introduce these tools into your environment to see where it makes a positive difference.
I had team meetings today, which made me grateful to work with excellent folks. Before starting this offsite, I read through some fascinating pieces that you’ll find below.
[blog] Advancing Our Chef Infrastructure. I can’t imagine that companies swap out infrastructure-as-code or configuration management investments often. Slack is all-in on Chef, and explains their evolution.
[blog] Should we decompose our monolith? This post looks at the messy world of today’s environments where there’s a mix of monolithic and microservice approaches.
[blog] Legacy Modernization meets GenAI. The ThoughtWorks crew has an interesting tool they’ve built to make modernizations better and more approachable.
[article] How (and why) should teams try distributed pair programming? While some companies are pushing return-to-office, many still have distributed teams that need to work together. This post looks at a study showing what makes distributed pair programming successful.
As hoped for, I mostly skipped doing work this weekend and spent longer than expected building an Apollo 11 lunar lander with one of my kids. We had fun. Today was back to the work that pays for these Lego sets.
[blog] Building LLM-powered applications in Go. Eli reminds us that the vast majority of “AI apps” are really just like apps that call LLMs. He shows off a handful of ways to create a RAG server in Go.
[blog] Open Source Foundations Considered Helpful. Do open source foundations just exist to throw lavish conferences? Nah. They add legit value, as James points out here.
[blog] RAG API. Here’s a good post about a managed solution for augmenting your LLM requests with customized data sources. I hadn’t tried this service out yet, and now I want to.
[youtube-video] Safe RAG for LLMs. *Three* RAG pieces today? Let’s embrace the discomfort. I liked this video which looked at patterns for RAG that avoid data leakage.
[article] Predicting developer attrition. How do you know a developer is on their way out the door? This paper looks at attrition and notes how burnout prevention and “opportunities to learn” are big factors.
Today was a good day. I got enough done that I think I can skip most work this weekend and build that Lego set sitting on my shelf, the one my kiddo has been asking me about: “when can we build this?” This weekend, son.
[blog] Intro to Ray for AI on Kubernetes. Kaslin explains this open source framework, why you’d use it, and how to get it running on Kubernetes.
[blog] Message Queues in System Design. Classic technology, but there’s always someone learning about it for the first time. Posts like this are a solid intro to how and when to use queues.
I’m starting to wonder if I’m writing too much. Between this daily list, blog posts (new one today), and a couple newsletters each week, maybe I need to use other mediums too. I dunno. Hopefully you’re enjoying it!
[blog] How big up-front design fails. Here are some lessons learned from projects with big upfront planning and design, and why a more iterative approach works better.
[podcast] 274: The Cloud Pod is Still Not Open Source. This latest episode of the podcast that reviews news from AWS, Microsoft and Google spent some time discussing my recent post about technical debt. Neat!
School is back in session, and I just met with a handful of teachers at a recent back-to-school night. They’re all figuring out how to account for generative AI tools that students have access to. I say, let’s give teachers the same tools to use. Specifically, what if a teacher wants a quick preliminary grade on book reports submitted by their students? To solve this, I used Gemini 1.5 Flash in Google Cloud Vertex AI in three different ways: one-off in the prompt editor, through code, and via declarative workflow.
Grade Homework in Vertex AI Studio
Maybe you just have one or two papers to grade. Something like Vertex AI Studio is a good choice. Even if you’re not a Google Cloud customer, you can use it for free through this link.
You don’t have any book reports handy to test this with? Me neither. In Vertex AI Studio, I prompted with something like “Write a 300 word book report for Pride and Prejudice from the perspective of an 8th grade student with a good vocabulary and strong writing skills.”
With that handy, let’s “grade” the paper. Here I’m using system instructions to direct the model. I used Gemini to generate the initial version and then I tweaked it a bit. Here are the instructions:
Task: Evaluate 8th-grade book reports for an honors English class. You are a tough grader. Input: Book report text. Output: Initial letter grade (A, B, C, D, or F) based on: Structure: Clear introduction, body, and conclusion Grammar: Spelling, punctuation, sentence structure. Content: Understanding of the book, critical thinking. Consider: Age and language proficiency of the student.
This tells the model what to look for, the expected quality (“honors English”), and the desired output.
I added the system instructions to the marked section within Vertex AI Studio.
Then I simply pasted the book report into the “prompt” section and submitted the prompt.
I thought this was a pretty good book report, but Gemini 1.5 Flash had a high bar after obeying my system instructions.
This book report shows good understanding of the book and a strong grasp of the themes. However, there are some issues:
Structure: The introduction is a little long and could be more concise. The conclusion is also repetitive. The report could benefit from more clear paragraph breaks to organize the information more effectively.
Grammar: No significant grammar issues.
Content: The student demonstrates good understanding of the novel's major themes and characters. There's a strong focus on plot summary, which is okay, but the report could benefit from more critical analysis. The student needs to go beyond just describing the events to analyze how they impact the themes, characters, or messages of the book. Overall: This report shows good effort, but it needs to be refined.
Grade: B-
If I’m a teacher, this is a fascinating way to get a preliminary assessment of a writing assignment!
Grade Homework Through Code
The above solution works fine for one-off experiences, but how might you scale this AI-assisted grader? Another option is code.
To try this scenario out, I used Cloud Firestore as my document database holding the book reports. I created a collection named “Papers” in the default database and added three documents. Each one holds a different book report.
I then used the Firestore API and Vertex AI API to write some simple Go code that iterates through each Firestore document, calls Vertex AI using the provided system instructions, and then logs out the grade for each report. Note that I could have used a meta framework like LangChain, LlamaIndex, or Firebase Genkit, but I didn’t see the need.
package main

import (
	"context"
	"fmt"
	"log"
	"os"

	"cloud.google.com/go/firestore"
	"cloud.google.com/go/vertexai/genai"
	"google.golang.org/api/iterator"
)

func main() {
	// get configuration from environment variables
	projectID := os.Getenv("PROJECT_ID")
	collectionName := os.Getenv("COLLECTION_NAME") // "Papers"
	location := os.Getenv("LOCATION")              // "us-central1"
	modelName := os.Getenv("MODEL_NAME")           // "gemini-1.5-flash-001"

	ctx := context.Background()

	// initialize Vertex AI client
	vclient, err := genai.NewClient(ctx, projectID, location)
	if err != nil {
		log.Fatalf("error creating vertex client: %v\n", err)
	}
	defer vclient.Close()
	gemini := vclient.GenerativeModel(modelName)

	// add system instructions
	gemini.SystemInstruction = &genai.Content{
		Parts: []genai.Part{genai.Text(`Task: Evaluate 8th-grade book reports for an honors English class. You are a tough grader. Input: Book report text. Output: Initial letter grade (A, B, C, D, or F) based on: Structure: Clear introduction, body, and conclusion Grammar: Spelling, punctuation, sentence structure. Content: Understanding of the book, critical thinking. Consider: Age and language proficiency of the student.`)},
	}

	// initialize Firestore client
	client, err := firestore.NewClient(ctx, projectID)
	if err != nil {
		log.Fatalf("error creating Firestore client: %v", err)
	}
	defer client.Close()

	// get documents from the collection
	iter := client.Collection(collectionName).Documents(ctx)
	for {
		doc, err := iter.Next()
		if err == iterator.Done {
			break
		}
		if err != nil {
			log.Fatalf("error iterating through documents: %v\n", err)
		}

		// create the prompt from the document's "Contents" field
		contents, ok := doc.Data()["Contents"].(string)
		if !ok {
			log.Printf("document %s has no string Contents field; skipping", doc.Ref.ID)
			continue
		}
		prompt := genai.Text(contents)

		// call the model and get back the result
		resp, err := gemini.GenerateContent(ctx, prompt)
		if err != nil {
			log.Fatalf("error generating content: %v\n", err)
		}

		// print out the top candidate part in the response
		log.Println(resp.Candidates[0].Content.Parts[0])
	}
	fmt.Println("Successfully iterated through documents!")
}
The code isn’t great, but the results were. I’m also getting more verbose responses from the model, which is cool. This is a much more scalable way to quickly grade all the homework.
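If you wanted the program to record just the letter grade rather than the model’s full response, you could parse it out before storing it. Here’s a minimal, hypothetical helper (not part of the original program) that assumes the model follows the system instructions and emits a line like “Grade: B-” somewhere in its output:

```go
package main

import (
	"fmt"
	"regexp"
)

// extractGrade pulls the letter grade out of a verbose model response.
// It assumes the rubric's output format: a letter A-D or F, with an
// optional plus or minus, following the word "Grade:".
func extractGrade(response string) (string, bool) {
	re := regexp.MustCompile(`Grade:\s*([A-DF][+-]?)`)
	m := re.FindStringSubmatch(response)
	if m == nil {
		return "", false
	}
	return m[1], true
}

func main() {
	sample := "Overall: This report shows good effort.\nGrade: B-"
	if grade, ok := extractGrade(sample); ok {
		fmt.Println(grade) // B-
	}
}
```

Since the model is free to phrase its feedback however it likes, the boolean return lets you fall back to storing the full response when no grade line is found.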
Grade Homework in Cloud Workflows
I like the code solution, but maybe I want to run this preliminary grading on a scheduled basis? Every Tuesday night? I could do that with my above code, but how about using a no-code workflow engine? Our Google Cloud Workflows product recently got a Vertex AI connector. Can I make it work with the same system instructions as the above two examples? Yes, yes I can.
I might be the first person to stitch all this together, but it works great. I first retrieved the documents from Firestore, looped through them, and called Vertex AI with the provided system instructions. Here’s the workflow’s YAML definition:
main:
  params: [args]
  steps:
    - init:
        assign:
          - collection: ${args.collection_name}
          - project_id: ${args.project_id}
          - location: ${args.location}
          - model: ${args.model_name}
    - list_documents:
        call: googleapis.firestore.v1.projects.databases.documents.list
        args:
          collectionId: ${collection}
          parent: ${"projects/" + project_id + "/databases/(default)/documents"}
        result: documents_list
    - process_documents:
        for:
          value: document
          in: ${documents_list.documents}
          steps:
            - ask_llm:
                call: googleapis.aiplatform.v1.projects.locations.endpoints.generateContent
                args:
                  model: ${"projects/" + project_id + "/locations/" + location + "/publishers/google/models/" + model}
                  region: ${location}
                  body:
                    contents:
                      role: "USER"
                      parts:
                        text: ${document.fields.Contents.stringValue}
                    systemInstruction:
                      role: "USER"
                      parts:
                        text: "Task: Evaluate 8th-grade book reports for an honors English class. You are a tough grader. Input: Book report text. Output: Initial letter grade (A, B, C, D, or F) based on: Structure: Clear introduction, body, and conclusion Grammar: Spelling, punctuation, sentence structure. Content: Understanding of the book, critical thinking. Consider: Age and language proficiency of the student."
                    generation_config:
                      temperature: 0.5
                      max_output_tokens: 2048
                      top_p: 0.8
                      top_k: 40
                result: llm_response
            - log_result:
                call: sys.log
                args:
                  text: ${llm_response}
No code! I executed the workflow, passing in all the runtime arguments.
In just a moment, I saw my workflow running, and “grades” being logged to the console. In real life, I’d probably update the Firestore document with this information. I’d also use Cloud Scheduler to run this on a regular basis.
While I made this post about rescuing educators from the toil of grading papers, you can apply these patterns to all sorts of scenarios. Use prompt editors like Vertex AI Studio for experimentation and finding the right prompt phrasing. Then jump into code to interact with models in a repeatable, programmatic way. And consider low-code tools when model interactions are scheduled, or part of long running processes.
23 years later, I still think about where I was the moment I heard that a plane crashed into the Twin Towers. Let’s never forget.
[youtube-video] Decoding Google Gemini with Jeff Dean. I liked this episode of the DeepMind podcast featuring our Chief Scientist. Jeff shares some history, and a look at the future.
[blog] Kaggle Model Upload Made Easy. Millions of folks use Kaggle to try models, access datasets, compete in AI competitions, and more. This post looks at how you upload models.
[repo] Automatic Password Rotation. Do you regularly rotate your database passwords? Is that a manual or automated process? This reference architecture (and IaC scripts) shows you how to do it.
Big reading list today. Sorry, not sorry. Hopefully you find a few fun things that catch your eye.
[blog] Free Tools Every ML Beginner Should Use. There’s seemingly a new AI/ML tool released every day, but these are mature options for beginners and experts.
[blog] Cloud Run job with a Python Module. Background jobs with Cloud Run are great. Mazlum offers a detailed post about using Python modules with them.
[paper] In Defense of RAG in the Era of Long-Context Language Models. Gemini got things going with its 1M (now 2M) token context window, and now other models are starting to grow their input token limits. Does that negate the need for RAG? Not according to this paper.
[site] Illuminate. Coolest thing I saw today. Generate an on-demand “podcast” for research papers and books. Try it with the paper above. Worked amazingly well.
[article] Coaching Founder Mode. Marty offers a must-read essay that proposes all product leaders be coached into a “founder mode” mindset.
[article] Take Your First Steps with Git. Git is for more than just developers; I’ve seen content authors, ops folks, database pros, and product managers rely on it. Having a base understanding is helpful.
[blog] Tinder API Style Guide — Part 1. Creating some API standards around URI patterns and headers? Take a look at what the Tinder engineering team put together.
Did you have a good start to your week? After writing a few thousand words this weekend about Cloud Run, I’m hunting for the next tech to mess around with. Suggestions?
[article] Measuring developers’ jobs-to-be-done. This story reviews a recent Google paper that explored “developer goals” that help internal teams optimize the dev experience.
[article] Why Generalists Own the Future. Specialists or generalists? It’s a common debate. Dan argues that generalists are well positioned in this dynamic future because they will adapt faster than those with specific expertise.
[blog] New Gemini model in LangChain4j. If you want a simple model-as-a-service experience with Gemini, the Google AI edition is a great place to start. If you’re looking for more advanced AI/ML features to go with it, Google Cloud’s Vertex AI is an ideal starting point. For the former, you can now use LangChain4j.
[blog] The insidious problem of configuration sprawl. If you’ve got all your system, application, and infrastructure configurations in a single place … you’re a unicorn. I’d like to meet you. Most folks have configs all over the place, as Brian talks about here.