On-Demand

Scaling AI Reliability: Real-world lessons from Mistral AI

Learn how one of the world's leading AI companies monitors its infrastructure, manages incidents, and prepares for a future where agents respond to pages before humans do.

Date

Jan 2026


For AI infrastructure operators, reliability isn't optional; it's existential. Devon Mizelle from Mistral AI shares how the team moved to automated monitoring workflows, including synthetic checks that are generated dynamically for production models with no manual setup.

What we'll cover

Automated monitoring generation for production models
Monitoring as code — eliminating manual configuration at scale
Standardized alerting thresholds across services
Autonomous incident resolution — capabilities and limitations
The evolution of on-call practices toward AI-agent workflows

What you'll walk away with

A real-world blueprint for automating monitoring at AI-infrastructure scale
Clarity on where autonomous incident resolution works today and where it doesn't
Practical patterns for standardizing alerts and checks across a growing model fleet

Speakers

Sylvain Kalache

Head of AI Labs · Rootly AI

Giovanni Rago

Head of Customer Solutions · Checkly

Devon Mizelle

Sr. SRE · Mistral AI
