Course starts

Apr 9

AI Evals

Gain a practical playbook for defining, measuring, and improving the quality of AI agents

Hosted by

Justin Bauer and Sandhya Hegde

Course outcomes

"Writing evals is going to become a core skill for product managers. It is such a critical part of making a good product with AI." — Kevin Weil, CPO@OpenAI

Traditional software was designed to guide users through a single "happy path," but AI products behave differently: their interactions produce a wide distribution of personalized outcomes for every user. This means the PM craft is fundamentally changing, all the way from writing PRDs to analytics. The AI PM must adapt to this new reality to ship high-quality AI products that customers love.

This course is a practical introduction to AI evaluation for product managers. We will walk through the product management process for AI agents and give you a step-by-step playbook for building an eval-driven roadmap, including AI PRDs, golden datasets, eval rubrics, trace analysis, and automated evaluators. We will use examples, case studies, demos, and DIY exercises to demonstrate what this process looks like in real life. Our goal is that, by the end of the course, you are on your way to becoming a full-stack AI PM who can lead the development of high-quality AI products.

Who this course is for

  • PMs and Senior PMs who are building (or want to learn how to build) AI features or agents and want a practical way to reason about quality, not just ship demos

  • Directors, Heads, and VPs of Product who are responsible for setting standards for how their teams build, evaluate, and improve AI products

  • Product leaders designing AI Product Reviews who feel their current toolkit is out of date and want to develop modern, AI-native practices

If you’re responsible for deciding what “good” looks like for an AI product (or expect to be in the next 6 to 12 months), then this course is for you.

Who this course is not for

This is an introductory course, so it is not intended for those with advanced technical knowledge (e.g., ML engineers).

Course syllabus

  • Module 0: The Evaluation Gap

  • Module 1: AI PRDs & Eval-driven Roadmaps

  • Module 2: Eval Lifecycle and Introduction to Trace Analysis

  • Module 3: Automated Evaluation: Code-based & LLM-as-Judge

  • Module 4: Evaluating Agent Architectures

Live course schedule

  • Week 1 (Thu, April 9): AI PRDs & Eval-driven Roadmaps

  • Week 2 (Thu, April 16): Eval Lifecycle and Introduction to Trace Analysis

  • Week 3 (Thu, April 23): Automated Evaluation: Code-based & LLM-as-Judge

  • Week 4 (Thu, April 30): Evaluating Agent Architectures

Unique benefits of live courses

Tackle challenging projects with direct access to top tech experts who have hands-on experience with the exact problem you're facing.

Live courses

On-demand courses

  • Study material: Step-by-step lessons, frameworks, tips and common mistakes, and examples

  • Direct access to tech experts: Live insights and advice from experts, plus unlimited chat access with the course host

  • Weekly live sessions and recordings: Dive deeper into the lessons with course host insights and lively Q&A discussions

  • Relevant case studies: Contextualize lessons with real-world insights from guest experts

  • Collaborative learning: Dedicated Slack channel to engage with a group of peers who share similar objectives

  • Structure and accountability: Weekly priorities to help you stay on track with your learning schedule

Course created by

Live course

Starts
Apr 9, 2026

Hosted by

Justin Bauer and Sandhya Hegde
