On-Demand

Scaling AI Reliability: Real-World Lessons from Mistral AI

Learn how one of the world's leading AI companies monitors its infrastructure, manages incidents, and prepares for a future where agents respond to pages before humans do.

Date: Jan 26, 2026

For AI infrastructure operators, reliability isn't optional; it's existential. Devon Mizelle of Mistral AI shares how the team moved to automated monitoring workflows, including synthetic checks that are generated dynamically for production models with no manual setup.

What we'll cover

  1. Automated monitoring generation for production models
  2. Monitoring as code — eliminating manual configuration at scale
  3. Standardized alerting thresholds across services
  4. Autonomous incident resolution — capabilities and limitations
  5. The evolution of on-call practices toward AI-agent workflows

What you'll walk away with

  • A real-world blueprint for automating monitoring at AI-infrastructure scale
  • Clarity on where autonomous incident resolution works today and where it doesn't
  • Practical patterns for standardizing alerts and checks across a growing model fleet

Speakers

Sylvain Kalache

Head of AI Labs · Rootly AI

Giovanni Rago

Head of Customer Solutions · Checkly

Devon Mizelle

Sr. SRE · Mistral AI