Chalk Talk | Advanced
Eval Me Maybe: Twilio's approach to faster LLM development
Details

Want more confidence in your AI outputs? 

This session dives into how Twilio uses evaluations (automated benchmarks, human assessments, and LLMs as evaluators) to measure understanding, performance, and error avoidance in large language models. You'll learn how our Emerging Tech and Innovation team accelerates development with evals, and what we've built to help customers run their own.

Walk away with practical guidance on integrating evaluations into your LLM workflows using diverse metrics, contextual relevance, and transparent practices, so you can ship smarter, faster, and with fewer surprises.

Speakers
Emily Shenfield
Product Marketing Engineer
Twilio