Chapter 115

Deployment architecture

Your voice AI agent works beautifully on localhost. Now it needs to work for real users, at scale, around the clock. This chapter maps out the three deployment models for LiveKit voice agents -- Cloud, self-hosted, and hybrid -- so you can choose the right architecture before writing a single deployment script.


What you'll learn

  • The three deployment models for LiveKit voice agents and what each entails
  • Pros and cons of managed, self-hosted, and hybrid architectures
  • How to evaluate which model fits your constraints (compliance, cost, team size)
  • The high-level architecture of each deployment model

LiveKit Cloud (fully managed)

LiveKit Cloud is the fastest path to production. You push your agent code, and LiveKit handles everything else: WebRTC infrastructure, container orchestration, scaling, TLS, monitoring, and global edge routing.

```bash
# The entire deployment workflow for LiveKit Cloud
lk agent deploy
```

How it works:

Your agent is packaged into a Docker container. LiveKit Cloud pulls the image, runs it on managed infrastructure, and routes incoming sessions to available instances. When call volume spikes, new instances spin up automatically. When volume drops, they scale back down.
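The scale-up/scale-down behavior can be sketched as a toy capacity model. The 25-sessions-per-instance figure below is an illustrative assumption, not a LiveKit number:

```python
import math

def instances_needed(concurrent_sessions: int, sessions_per_instance: int = 25) -> int:
    """Toy model of session-based autoscaling: run just enough instances
    to cover current load, and scale to zero when there are no sessions.
    The per-instance capacity of 25 is an illustrative assumption."""
    if concurrent_sessions <= 0:
        return 0
    return math.ceil(concurrent_sessions / sessions_per_instance)

# A spike from 10 to 300 concurrent sessions scales from 1 to 12 instances.
print(instances_needed(10))   # 1
print(instances_needed(300))  # 12
print(instances_needed(0))    # 0
```

The key property is the ceiling division: capacity always meets or exceeds load, and idle instances are released as volume drops.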

What LiveKit Cloud manages for you:

| Component | Managed by LiveKit Cloud |
| --- | --- |
| WebRTC SFU | Yes -- global edge network |
| Container orchestration | Yes -- automatic scheduling |
| Auto-scaling | Yes -- based on concurrent sessions |
| TLS and TURN | Yes -- automatic certificate management |
| Monitoring and traces | Yes -- Cloud Insights dashboard |
| Secret management | Yes -- encrypted at rest |

Cloud is the right default

LiveKit Cloud is the right choice for the vast majority of teams. It handles scaling, monitoring, upgrades, and global distribution -- plus exclusive features like Krisp noise cancellation, GPU-accelerated turn detection, and intelligent barge-in handling that are not available on self-hosted infrastructure. Even for data residency needs, contact LiveKit about custom Cloud deployments before considering self-hosting.

Self-hosted (your infrastructure)

Self-hosting means running the full LiveKit stack on your own servers: the LiveKit Server (SFU), your agent workers, Redis for coordination, and any supporting infrastructure like monitoring and load balancers. This is a significant operational commitment that most teams should avoid.

What you take on:

  • Server provisioning and OS updates
  • LiveKit Server deployment and upgrades
  • Network configuration (UDP port ranges, TURN servers, TLS)
  • Container orchestration (Docker Compose, Kubernetes, ECS)
  • Scaling policies and capacity planning
  • Monitoring, logging, and alerting
  • Redis cluster for multi-node setups
  • 24/7 on-call for media infrastructure
  • Security patching and CVE tracking

You also lose access to Cloud-exclusive features: Krisp noise cancellation, GPU-accelerated turn detection, and the intelligent barge-in model — all of which meaningfully improve voice AI quality.

What's happening

Self-hosting is like running your own phone system versus using a cloud PBX. You own every outage, every security patch, and every 3 AM alert. The operational burden is significant -- plan for at least one engineer spending meaningful time on infrastructure every week, indefinitely.

Self-hosting is rarely the right choice

Before committing to self-hosting, contact LiveKit about custom Cloud deployments for your data residency or compliance needs. Self-hosting only makes sense for air-gapped environments or specific government contracts that mandate on-premises infrastructure. For everything else, Cloud or hybrid is the better path.

Hybrid (split responsibility)

The hybrid model uses LiveKit Cloud for WebRTC infrastructure while running your agent workers on your own servers. This is a practical middle ground: LiveKit handles the hard parts (SFU routing, global edge, TURN, TLS) while you keep agent execution in your environment.

```bash
# Your agent connects to LiveKit Cloud but runs on your servers
export LIVEKIT_URL=wss://your-project.livekit.cloud
export LIVEKIT_API_KEY=APIxxxxxxx
export LIVEKIT_API_SECRET=xxxxxxxxxxxxxxx

python agent.py start
```

Why hybrid makes sense:

Your agent code often needs access to internal systems -- databases, CRMs, EHR systems -- that sit behind your firewall. With hybrid, the agent runs inside your network and connects outbound to LiveKit Cloud. No inbound firewall rules needed. LiveKit Cloud handles media routing; your agent handles business logic on your infrastructure.
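Because the agent only needs outbound connectivity, the deployment surface reduces to three environment variables. Here is a minimal startup check, a sketch using the variable names from the snippet above (the validation logic itself is an assumption, not part of the LiveKit SDK):

```python
import os

REQUIRED_VARS = ("LIVEKIT_URL", "LIVEKIT_API_KEY", "LIVEKIT_API_SECRET")

def check_livekit_env(env: dict) -> list:
    """Return a list of problems with the hybrid-deployment environment.
    An empty list means the agent can attempt its outbound connection."""
    problems = [f"missing {name}" for name in REQUIRED_VARS if not env.get(name)]
    url = env.get("LIVEKIT_URL", "")
    if url and not url.startswith("wss://"):
        problems.append("LIVEKIT_URL should be a secure websocket URL (wss://)")
    return problems

if __name__ == "__main__":
    # Fail fast before starting the agent process.
    problems = check_livekit_env(dict(os.environ))
    if problems:
        raise SystemExit("; ".join(problems))
```

Failing fast on misconfiguration keeps a bad deploy from silently registering zero workers.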

| Aspect | Cloud | Self-hosted | Hybrid |
| --- | --- | --- | --- |
| Time to production | Hours | Weeks | Days |
| Ops burden | Minimal | High | Medium |
| Data residency control | Limited | Full | Agent-side only |
| WebRTC expertise needed | None | Significant | None |
| Cost at low scale | Low | High (fixed infra) | Medium |
| Cost at high scale | Usage-based | Seemingly lower, but ops costs add up | Usage-based SFU |
| Access to internal systems | Via public APIs | Direct | Direct |
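The two cost rows can be made concrete with a rough model. Every number below is an invented assumption for illustration (not LiveKit pricing): managed options scale with usage, while self-hosting carries fixed infrastructure plus ongoing ops cost.

```python
def monthly_cost(model: str, session_minutes: int) -> float:
    """Rough monthly cost sketch for the three deployment models.
    All rates and fixed costs are invented for illustration only."""
    if model == "cloud":
        return 0.01 * session_minutes          # purely usage-based
    if model == "hybrid":
        return 0.005 * session_minutes + 500   # usage-based SFU + your own servers
    if model == "self-hosted":
        return 3000 + 2000                     # fixed infra + ops engineering time
    raise ValueError(f"unknown model: {model}")

# Compare the models at low (10k) and high (1M) session-minutes per month.
for minutes in (10_000, 1_000_000):
    for model in ("cloud", "hybrid", "self-hosted"):
        print(minutes, model, monthly_cost(model, minutes))
```

Under these assumptions, self-hosting's fixed costs dominate at low volume; at high volume its raw infrastructure bill looks lower, which is exactly where the table's "ops costs add up" caveat applies.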

Hybrid is increasingly common

Many production deployments use the hybrid model. It gives you the reliability of LiveKit Cloud's WebRTC infrastructure while keeping sensitive data processing inside your own network boundary.

Architecture decision framework

Use this framework to make the choice concrete:

1

Check compliance requirements

If regulations require data residency controls, contact LiveKit about custom Cloud deployments first. They offer dedicated infrastructure in specific regions and custom data processing agreements. Only consider self-hosting for air-gapped or on-premises-only mandates. If only agent-side data (transcripts, tool calls) needs to stay internal, hybrid works.

2

Assess team capacity

Self-hosting LiveKit requires WebRTC expertise, Kubernetes operations, and on-call coverage. If your team does not have this -- and most teams do not -- use Cloud or hybrid.

3

Estimate scale and cost

At any scale, Cloud is almost always cheaper when you factor in engineering time, on-call burden, and lost Cloud-exclusive features (Krisp noise cancellation, GPU turn detection, barge-in model). The infrastructure savings from self-hosting get eaten by operational overhead.

4

Consider internal system access

If your agent needs to reach databases, APIs, or services behind a firewall, hybrid gives you direct access without exposing those systems to the internet.
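The four steps above can be folded into a single sketch function. The boolean inputs and their precedence are a simplification of the prose, not an official rubric:

```python
def choose_deployment(air_gapped: bool,
                      needs_data_residency: bool,
                      needs_internal_access: bool) -> str:
    """Sketch of the decision framework above. Air-gapped or on-premises
    mandates are the only case that forces self-hosting; data residency
    starts a conversation with LiveKit about custom Cloud; internal
    system access points to hybrid; everything else defaults to Cloud."""
    if air_gapped:
        return "self-hosted"                         # step 1: hard mandate only
    if needs_data_residency:
        return "contact LiveKit about custom Cloud"  # step 1: residency
    if needs_internal_access:
        return "hybrid"                              # step 4: firewall-side systems
    return "cloud"                                   # steps 2-3: the default

print(choose_deployment(False, False, True))  # hybrid
```

The ordering matters: compliance constraints are checked before convenience, mirroring the framework's step order.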

What's happening

There is no universally correct answer. The right deployment model depends on your constraints today and your growth trajectory. The good news is that switching between models is straightforward -- your agent code is the same regardless of where it runs. The deployment wrapper changes, not the agent logic.


What you learned

  • LiveKit Cloud is the fastest path to production with minimal ops burden -- best for most teams starting out
  • Self-hosted gives you full control but requires significant infrastructure expertise and operational investment
  • Hybrid splits the difference: LiveKit Cloud handles WebRTC, your infrastructure handles agent execution
  • Your agent code is portable across all three models -- only the deployment configuration changes

Next up

With your deployment model chosen, the next step is containerizing your agent. In the next chapter, you will write optimized Dockerfiles for both Python and Node.js agents using multi-stage builds, layer caching, and minimal base images.

Concepts covered: Cloud, Self-hosted, Hybrid