Field Notes · AI Engineering

250+ pages of free AI agent knowledge, from Google

Google and Kaggle released five production-grade whitepapers covering the whole arc of building AI agents — from vibe-coded prototype to secured, spec-driven system. Here's what each one covers, and where to read it.

Whitepapers: 5
Pages: 250+
Cost to read: $0
Expert authors: Google

Source: Google & Kaggle's 5-Day AI Agents Intensive. Each paper links to its original on Kaggle.

If you build with AI agents, this is one of the most useful free reading lists of the year: five papers, written by Google practitioners, that move past the demo and into what it actually takes to ship an agent.

We read through the set and pulled out the core idea of each one. The short version: writing agent code is now the easy part — the hard parts are giving an agent the right tools and context, keeping it secure when its behavior is non-deterministic, and turning a promising prototype into something you can put in front of real users. Below is the map, in the order the curriculum lays it out.

The five papers

01

The New SDLC

Vibe Coding & the Reshaped Development Lifecycle

The starting point: how AI agents change the software development lifecycle itself. The paper traces the shift from writing syntax by hand to working at the level of intent (describing what you want and letting an agent draft it) and argues this only scales when paired with the discipline of "agentic engineering" rather than ad-hoc prompting.

SDLC · intent-driven dev · agentic engineering
Read the paper
02

Tools & Interoperability

How Agents Connect, Communicate & Act

An agent is only as capable as the tools it can reach. This paper covers the functions that let an agent take actions and pull real-time data beyond its training set, and lays out best practices for designing tools that agents can use reliably — clear contracts, predictable inputs, and interoperability across systems.
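
To make "clear contracts, predictable inputs" concrete, here is a minimal sketch in Python. It is not taken from the paper: the `Tool` class, the `convert_currency` function, and the shape of the result dictionary are all hypothetical names chosen for illustration. The idea is simply that a tool declares its parameters up front and rejects bad input before any side effect happens.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Tool:
    """Hypothetical tool contract: a name, a description, and typed parameters."""
    name: str
    description: str
    params: dict[str, type]      # parameter name -> expected Python type
    func: Callable[..., Any]

    def call(self, **kwargs: Any) -> dict[str, Any]:
        # Reject missing or unknown arguments before running anything.
        missing = sorted(set(self.params) - set(kwargs))
        unknown = sorted(set(kwargs) - set(self.params))
        if missing or unknown:
            return {"ok": False, "error": f"missing={missing} unknown={unknown}"}
        # Check each argument against the declared type.
        for key, expected in self.params.items():
            if not isinstance(kwargs[key], expected):
                return {"ok": False, "error": f"{key} must be {expected.__name__}"}
        return {"ok": True, "result": self.func(**kwargs)}


def convert_currency(amount: float, rate: float) -> float:
    """Illustrative stand-in for a tool that would fetch real-time data."""
    return round(amount * rate, 2)


convert_tool = Tool(
    name="convert_currency",
    description="Multiply an amount by an exchange rate.",
    params={"amount": float, "rate": float},
    func=convert_currency,
)

print(convert_tool.call(amount=10.0, rate=1.5))
print(convert_tool.call(amount="10", rate=1.5))
```

Errors come back as structured results rather than exceptions, on the assumption that the agent reads the tool output and can retry with corrected arguments.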

tool calling · interoperability · real-time data
Read the paper
03

Agent Skills

The Capabilities Agents Need in Real Environments

What does an agent actually need to operate in production, not a sandbox? This paper looks at the skills and context that separate a clever demo from a dependable system — dynamically assembling the right information in the context window so the agent stays stateful, grounded, and personalized across a real task.
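
One way to picture "dynamically assembling the right information in the context window" is a small packing routine. The sketch below is our own illustration, not the paper's method: the `Snippet` type, the relevance scores, and the word-count token estimate are all assumptions. It greedily keeps the most relevant snippets that fit a fixed budget.

```python
from dataclasses import dataclass


@dataclass
class Snippet:
    source: str       # illustrative labels such as "memory", "profile", "retrieval"
    text: str
    relevance: float  # higher means more useful for the current task


def estimate_tokens(text: str) -> int:
    # Crude assumption: one token per whitespace-separated word.
    return len(text.split())


def assemble_context(snippets: list[Snippet], budget: int) -> str:
    """Greedily pack the most relevant snippets that fit within the token budget."""
    chosen: list[Snippet] = []
    used = 0
    for snip in sorted(snippets, key=lambda s: s.relevance, reverse=True):
        cost = estimate_tokens(snip.text)
        if used + cost <= budget:
            chosen.append(snip)
            used += cost
    return "\n".join(f"[{s.source}] {s.text}" for s in chosen)


snippets = [
    Snippet("profile", "User prefers metric units", 0.9),
    Snippet("memory", "Last order was delayed by two days", 0.7),
    Snippet("retrieval", "Shipping policy allows refunds after five days of delay", 0.8),
]
print(assemble_context(snippets, budget=14))
```

With a budget of 14, the profile and retrieval snippets fit and the lower-ranked memory snippet is dropped. A real system would use the model's tokenizer and a learned or retrieval-based relevance score instead of these stand-ins.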

context engineering · state & memory · grounding
Read the paper
04

Security & Evaluation

Securing & Evaluating Vibe-Coded Agents

The chapter most teams skip. Because agent behavior is non-deterministic, you can't test it the old way. The paper proposes building continuous "effective trust" through a layered architecture, with practical safeguards: ephemeral sandboxing, defenses against "slopsquatting" (malicious packages registered under names that models tend to hallucinate), and trajectory-level evaluation with OpenTelemetry.
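
As a rough illustration of one defense against hallucinated package names, the sketch below gates an agent's install requests behind an allowlist. This is an assumption on our part about what such a safeguard could look like, not the paper's design: `APPROVED_PACKAGES` and `vet_install` are hypothetical, and the only borrowed piece is the standard PEP 503 name normalization.

```python
import re

# Hypothetical allowlist; in practice this might come from a reviewed lockfile.
APPROVED_PACKAGES = {"requests", "numpy", "pydantic"}


def normalize(name: str) -> str:
    # PEP 503 normalization: collapse runs of "-", "_", "." into "-" and lowercase.
    return re.sub(r"[-_.]+", "-", name).lower()


def vet_install(requested: list[str], approved: set[str] = APPROVED_PACKAGES) -> dict[str, list[str]]:
    """Split an agent's requested packages into allowed and blocked lists."""
    approved_norm = {normalize(p) for p in approved}
    allowed: list[str] = []
    blocked: list[str] = []
    for name in requested:
        if normalize(name) in approved_norm:
            allowed.append(name)
        else:
            blocked.append(name)  # unknown name: hold for human review
    return {"allowed": allowed, "blocked": blocked}


print(vet_install(["Requests", "reqeusts-toolbelt-pro"]))
```

Anything in `blocked` would be held for review rather than installed, so a plausible-sounding name invented by the model never reaches the package manager.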

effective trust · sandboxing · slopsquatting · trajectory eval
Read the paper
05

Spec-Driven Development

From Prototype to Production-Grade

How to bridge the gap between a fragile vibe-coded prototype and enterprise software you can stand behind. The approach: treat code as disposable and make behavior — captured as clear, Gherkin-style specifications — the source of truth, so the system can be regenerated and verified against what it's supposed to do.
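
To show what "behavior as the source of truth" can look like, here is a toy sketch: a Gherkin-style scenario plus a tiny runner that checks an implementation against it. Everything in it is invented for illustration (the discount scenario, the `step` decorator, `run_spec`); real projects would more likely use an established BDD tool.

```python
import re

# A Gherkin-style scenario; the discount example is purely illustrative.
SPEC = """\
Scenario: Apply a discount code
  Given a cart total of 100
  When the user applies a 20 percent discount
  Then the total is 80
"""

STEPS = []  # list of (compiled pattern, function) pairs


def step(pattern: str):
    """Register a function as the implementation of a spec step."""
    def register(func):
        STEPS.append((re.compile(pattern), func))
        return func
    return register


@step(r"a cart total of (\d+)")
def given_total(state: dict, total: str) -> None:
    state["total"] = int(total)


@step(r"the user applies a (\d+) percent discount")
def when_discount(state: dict, percent: str) -> None:
    state["total"] = state["total"] * (100 - int(percent)) // 100


@step(r"the total is (\d+)")
def then_total(state: dict, expected: str) -> None:
    assert state["total"] == int(expected), f"expected {expected}, got {state['total']}"


def run_spec(spec: str) -> int:
    """Run each Given/When/Then line and return the number of steps executed."""
    state: dict = {}
    executed = 0
    for line in spec.splitlines():
        keyword = re.match(r"\s*(Given|When|Then|And)\s+(.*)", line)
        if not keyword:
            continue  # skip the Scenario title and blank lines
        text = keyword.group(2)
        for pattern, func in STEPS:
            found = pattern.fullmatch(text)
            if found:
                func(state, *found.groups())
                executed += 1
                break
        else:
            raise LookupError(f"no step implementation for: {text}")
    return executed


print(run_spec(SPEC))
```

The step functions are the disposable part: they could be regenerated by an agent at any time, and the unchanged `SPEC` decides whether the new version is acceptable.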

spec-driven · Gherkin specs · production-grade
Read the paper

Where Kaizen fits

This is the exact arc we work in: tools and context engineering, agent evaluation and observability, and the spec-driven discipline that gets an AI prototype to production. If your team is moving from "it works in the demo" to "it's running for customers," that's the conversation we have every day.
