AI Evals Course | Reforge Learning

AI Evals

Gain a practical playbook for defining, measuring, and improving the quality of AI agents

Justin Bauer and Sandhya Hegde

Course outcomes

"Writing evals is going to become a core skill for product managers. It is such a critical part of making a good product with AI." — Kevin Weil, CPO@OpenAI

Traditional software was designed to guide users through a single "happy path," but AI products behave differently: the same interaction can produce a wide distribution of personalized outcomes across users. As a result, the PM craft is fundamentally changing, all the way from writing PRDs to analytics. The AI PM must adapt to this new reality to ship high-quality AI products that customers love.

This course is a practical introduction to AI evaluation for product managers. We will walk through the product management process for AI agents and give you a step-by-step playbook for building an eval-driven roadmap, including AI PRDs, golden datasets, eval rubrics, trace analysis, and automated evaluators. We will use examples, case studies, demos, and DIY exercises to demonstrate what this process looks like in real life. By the end of the course, our goal is to help you grow into a full-stack AI PM who can lead the development of high-quality AI products.

Who this course is for

PMs and Senior PMs who are building AI features or agents (or want to learn how to) and want a practical way to reason about quality, not just ship demos
Directors, Heads, and VPs of Product who are responsible for setting standards for how their teams build, evaluate, and improve AI products
Product leaders designing AI Product Reviews who feel their current toolkit is out of date and want to develop modern, AI-native practices

If you’re responsible for deciding what “good” looks like for an AI product (or expect to be in the next 6-12 months), then this course is for you.

Who this course is not for

This is an introductory course, so it is not intended for those with advanced technical knowledge (e.g., ML engineers).

Course syllabus

Module: The Evaluation Gap

Module: AI PRDs & Eval-driven Roadmaps

Module: Eval Lifecycle and Introduction to Trace Analysis

Module: Automated Evaluation - Code-based & LLM-as-Judge

Module: Evaluating Agent Architectures

Course created by

Justin Bauer

Co-founder, Calibre

Justin is a product leader and former founder with over 15 years of experience turning data and AI into scalable, customer-driven products. As Chief Product Officer at Amplitude, he helped pioneer the discipline of product analytics—enabling thousands of teams to build better products with data science and experimentation.

Sandhya Hegde

Co-founder, Calibre

Sandhya is an AI investor (Sequoia Capital, Khosla Ventures) and startup executive (Amplitude) with a background in software entrepreneurship and data science. She has led early rounds in high-growth AI startups like AirOps and Vizcom while consulting with some of the largest SaaS companies in the world on AI strategy and evaluation.

This course is not available on-demand.

Related courses

AI Prototyping

Learn how to use AI-powered prototyping to test concepts, validate assumptions, and ship better products faster.

Created by

Ravi Mehta

AI Growth

In the world of AI, understand what is changing, what isn't, and how to build a sustainable growth system that still wins

Created by

Brian Balfour, Lauryn Motamedi, Bruno Estrella

AI Strategy

Fundamentally rethink how you compete and win in the most intense strategic environment product leaders have ever faced

Created by

Ravi Mehta, Brian Balfour

AI Productivity

Learn how leading product managers use AI to become faster and smarter, and to gain superpowers beyond their traditional role

Created by

Sachin Rekhi

AI Product Leadership

Learn how to lead effectively in the AI era: evolving strategy, tooling, team structure, and go-to-market to drive speed, leverage, and defensibility

Created by

Jiaona Zhang and Britt Jamison

AI Foundations

Master the building blocks of everything from foundation models to multi-agent systems

Created by

Brian Balfour
