Certification: Production, Deployment & Scaling
This section tests your understanding of running LiveKit voice agents in production — testing strategies, evaluation metrics, monitoring, scaling, and deployment patterns.
Topics covered: Testing, Evaluation, Monitoring, Scaling, Self-hosting, Load testing
Test your knowledge
Question 1 of 5
A client asks how to test their voice agent before deploying to production. What testing strategy do you recommend?
A. A layered approach: (1) Unit tests for individual tools and functions, (2) Integration tests that verify the agent calls the right tools for given inputs, (3) Conversation-level evaluation with scripted scenarios measuring task completion rate, (4) Load testing to verify concurrent call capacity. Automated evaluation is essential because voice agents are non-deterministic — you need statistical confidence, not single-pass verification.

B. Focus on end-to-end conversation testing with LLM-as-judge evaluation. Record 50-100 representative conversations, then use a separate LLM to score each conversation on task completion, tone, and accuracy. Unit testing tools is unnecessary because the LLM handles tool selection — if the conversation passes, the tools are working correctly.

C. Use A/B testing in production with a shadow agent. Deploy the new agent alongside the existing one, route 10% of traffic to the new agent, and compare metrics. This is more realistic than synthetic tests because it uses real caller behavior. Roll back automatically if task completion rate drops below the baseline by more than 5%.

D. Build a comprehensive test suite using recorded audio files from real calls. Replay each recording through the full pipeline (STT → LLM → TTS) and use speech recognition on the agent's audio output to verify correctness. This audio-in/audio-out testing captures the complete pipeline including STT accuracy and TTS quality.
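The "statistical confidence, not single-pass verification" point in option A can be made concrete: because agent runs are non-deterministic, a scenario's task completion rate should be reported with a confidence interval rather than a single pass/fail. Below is a minimal, hypothetical sketch: `run_scenario` is a stand-in for driving one scripted conversation (in real code it would invoke the agent and have an LLM judge score the transcript), and a Wilson score interval summarizes the repeated runs.

```python
import math
import random

def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """95% Wilson score interval for a binomial proportion (task completion rate)."""
    if n == 0:
        return (0.0, 1.0)
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    margin = (z / denom) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return (max(0.0, center - margin), min(1.0, center + margin))

def run_scenario(seed: int) -> bool:
    """Placeholder for one scripted conversation run.

    A real implementation would start the agent, play the scripted
    caller turns, and return an LLM-judge verdict on task completion.
    Here we simulate a ~90% completion rate deterministically by seed.
    """
    return random.Random(seed).random() < 0.9

n = 100  # number of repeated runs of the same scenario
successes = sum(run_scenario(i) for i in range(n))
low, high = wilson_interval(successes, n)
print(f"task completion: {successes}/{n}, 95% CI [{low:.2f}, {high:.2f}]")
```

A release gate can then compare the interval's lower bound (not the point estimate) against the required completion rate, which guards against shipping a regression that a single lucky pass would hide.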