Provider Selection Guide

Choose the right STT, LLM, and TTS for your use case

A decision framework for selecting STT, LLM, and TTS providers in LiveKit's modular pipeline. Compare latency, accuracy, cost, and language support across providers with concrete recommendations for common use cases.

What You Build

Select and configure the optimal provider stack for a given business scenario.

Prerequisites

Course 1.1

Chapters

Pipeline selection & latency budgets

20m

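How LiveKit's modular pipeline lets you choose each provider independently. Understand the latency budget (STT + LLM + TTS combined target: ~500 ms) and how streaming overlap, where each stage starts consuming the previous stage's output before that stage has finished, reduces perceived latency.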

Latency budget, Provider swapping, Streaming overlap, Plugin architecture
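
The latency budget and the effect of streaming overlap can be sketched with a small model. This is a simplified illustration, not a measurement: the `Stage` class, the helper functions, and every millisecond value below are hypothetical placeholders, and real pipelines have extra costs (network, end-of-speech detection) that this ignores. The only number taken from the chapter description is the ~500 ms target.

```python
from dataclasses import dataclass

BUDGET_MS = 500  # combined STT + LLM + TTS target from the chapter description


@dataclass(frozen=True)
class Stage:
    name: str
    first_output_ms: int  # time until the stage emits its first usable chunk
    total_ms: int         # time until the stage has finished completely


def sequential_latency(stages):
    """Latency if each stage waits for the previous one to finish entirely."""
    return sum(s.total_ms for s in stages)


def streamed_latency(stages):
    """Perceived latency if each stage starts on the previous stage's first chunk."""
    return sum(s.first_output_ms for s in stages)


def within_budget(latency_ms, budget_ms=BUDGET_MS):
    return latency_ms <= budget_ms


# Placeholder numbers for illustration only; these are not provider benchmarks.
stages = [
    Stage("stt", first_output_ms=150, total_ms=300),
    Stage("llm", first_output_ms=200, total_ms=900),
    Stage("tts", first_output_ms=100, total_ms=600),
]

for label, value in [
    ("sequential", sequential_latency(stages)),
    ("streamed", streamed_latency(stages)),
]:
    print(f"{label}: {value} ms, within budget: {within_budget(value)}")
```

In this model the user hears the first audio once every stage has produced its first chunk, so perceived latency depends on time-to-first-output rather than on how long each stage takes to finish.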

STT, LLM & TTS provider comparison

25m

Side-by-side comparison of providers at each pipeline stage: Deepgram vs Whisper vs Azure for STT, GPT-4o vs Claude vs Gemini for LLM, Cartesia vs ElevenLabs vs PlayHT for TTS. Latency, accuracy, cost, and language support.

STT providers, LLM providers, TTS providers, Latency vs quality tradeoffs
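
One way to turn a side-by-side comparison into a decision is a weighted score over the four criteria named above. The sketch below assumes ratings on a 1 to 5 scale where higher is better (so a cheaper or faster provider gets a higher rating). The function `rank_providers`, the provider names `provider_a` and `provider_b`, and all ratings and weights are hypothetical and do not describe any real provider.

```python
def rank_providers(ratings, weights):
    """Return (provider, score) pairs sorted from best to worst.

    ratings: {provider: {criterion: rating}}, higher rating is better.
    weights: {criterion: weight}, relative importance for the use case.
    """
    total_weight = sum(weights.values())
    scores = {
        name: sum(weights[c] * criteria[c] for c in weights) / total_weight
        for name, criteria in ratings.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical weights for a latency-sensitive use case.
weights = {"latency": 0.4, "accuracy": 0.3, "cost": 0.2, "languages": 0.1}

# Hypothetical ratings; replace with your own measurements.
ratings = {
    "provider_a": {"latency": 5, "accuracy": 3, "cost": 4, "languages": 2},
    "provider_b": {"latency": 3, "accuracy": 5, "cost": 2, "languages": 5},
}

ranked = rank_providers(ratings, weights)
for name, score in ranked:
    print(f"{name}: {score:.2f}")
```

Changing the weights changes the outcome: a use case that values accuracy and language coverage over latency would rank the same two candidates differently, which is the tradeoff this chapter examines.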

Cost analysis & use case recommendations

15m

Cost breakdown for a typical 5-minute conversation across stacks ranging from budget to premium. Concrete recommendations for customer service, healthcare, education, and entertainment use cases.

Cost per conversation, Stack optimization, Use case matching, Decision framework
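
A per-conversation cost estimate can be sketched as the sum of the three stage costs. The sketch assumes a pricing model in which STT is billed per audio minute, the LLM per 1,000 input and output tokens, and TTS per 1,000 characters; actual billing units vary by provider. `StackPricing`, `conversation_cost`, and every rate and usage figure below are hypothetical placeholders, not real prices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StackPricing:
    stt_per_minute: float         # USD per minute of transcribed audio
    llm_input_per_1k: float       # USD per 1,000 input tokens
    llm_output_per_1k: float      # USD per 1,000 output tokens
    tts_per_1k_chars: float       # USD per 1,000 synthesized characters


def conversation_cost(pricing, minutes, input_tokens, output_tokens, tts_chars):
    """Estimate the cost in USD of one conversation for a given stack."""
    stt = pricing.stt_per_minute * minutes
    llm = (
        pricing.llm_input_per_1k * input_tokens / 1000
        + pricing.llm_output_per_1k * output_tokens / 1000
    )
    tts = pricing.tts_per_1k_chars * tts_chars / 1000
    return stt + llm + tts


# Placeholder rates for illustration only.
example_stack = StackPricing(
    stt_per_minute=0.01,
    llm_input_per_1k=0.002,
    llm_output_per_1k=0.008,
    tts_per_1k_chars=0.05,
)

# A 5-minute conversation with assumed token and character counts.
total = conversation_cost(
    example_stack, minutes=5, input_tokens=4000, output_tokens=1000, tts_chars=3000
)
print(f"Estimated cost per conversation: ${total:.3f}")
```

Running the same usage figures through several `StackPricing` instances gives a like-for-like comparison of budget and premium stacks.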

What You Walk Away With

Ability to evaluate and select STT, LLM, and TTS providers based on latency, accuracy, cost, and use case requirements.