Course starts
Apr 9
AI Evals
Gain a practical playbook for defining, measuring, and improving the quality of AI agents
Hosted by
Justin Bauer and Sandhya Hegde



Course outcomes
"Writing evals is going to become a core skill for product managers. It is such a critical part of making a good product with AI." — Kevin Weil, CPO@OpenAI
Traditional software was designed to guide users through a single "happy path," but AI products behave differently: interactions produce a wide distribution of personalized outcomes across users. This means the PM craft is fundamentally changing, all the way from writing PRDs to analytics. The AI PM must adapt to this new reality to ship high-quality AI products that customers love.
This course is a practical introduction to AI evaluation for product managers. We will walk through the product management process for AI agents and give you a step-by-step playbook for building an eval-driven roadmap, including AI PRDs, golden datasets, eval rubrics, trace analysis, and automated evaluators. We will use examples, case studies, demos, and DIY exercises to demonstrate what this process looks like in real life. By the end of the course, our goal is to help you grow into a full-stack AI PM who can lead the development of high-quality AI products.
Who this course is for
PMs and Senior PMs who are building (or want to learn how to build) AI features or agents and want a practical way to reason about quality, not just ship demos
Directors, Heads, and VPs of Product who are responsible for setting standards for how their teams build, evaluate, and improve AI products
Product leaders designing AI Product Reviews who feel their current toolkit is out of date and want to develop modern, AI-native practices
If you’re responsible for deciding what “good” looks like for an AI product (or expect to be in the next 6-12 months), then this course is for you.
Who this course is not for
Because this is an introductory course, it is not intended for those with advanced technical knowledge (e.g., ML engineers).
Course syllabus
Module 0: The Evaluation Gap
Module 1: AI PRDs & Eval-driven Roadmaps
Module 2: Eval Lifecycle and Introduction to Trace Analysis
Module 3: Automated Evaluation: Code-based & LLM-as-Judge
Module 4: Evaluating Agent Architectures
Live course schedule
Week 1 (Thu, April 9): AI PRDs & Eval-driven Roadmaps
Week 2 (Thu, April 16): Eval Lifecycle and Introduction to Trace Analysis
Week 3 (Thu, April 23): Automated Evaluation: Code-based & LLM-as-Judge
Week 4 (Thu, April 30): Evaluating Agent Architectures
Unique benefits of live courses
Tackle challenging projects with direct access to top tech experts who have hands-on experience with the exact problem you're facing.
Live courses vs. on-demand courses
Study material (live and on-demand): Step-by-step lessons, frameworks, tips and common mistakes, and examples
Direct access to tech experts (live only): Live insights and advice from experts, plus unlimited chat access with the course host
Weekly live sessions and recordings (live only): Dive deeper into the lessons with course host insights and lively Q&A discussions
Relevant case studies (live only): Contextualize lessons with real-world insights from guest experts
Collaborative learning (live only): Dedicated Slack channel to engage with a group of peers who share similar objectives
Structure and accountability (live only): Weekly priorities to help you stay on track with your learning schedule
AI Evals
Gain a practical playbook for defining, measuring, and improving the quality of AI agents
Live course
Starts
Apr 9, 2026
Hosted by
Justin Bauer and Sandhya Hegde