Chapter 115m

Deployment architecture

Your voice AI agent works beautifully on localhost. Now it needs to work for real users, at scale, around the clock. This chapter maps out the three deployment models for LiveKit voice agents -- Cloud, self-hosted, and hybrid -- so you can choose the right architecture before writing a single deployment script.

What you'll learn

  • The three deployment models for LiveKit voice agents and what each entails
  • Pros and cons of managed, self-hosted, and hybrid architectures
  • How to evaluate which model fits your constraints (compliance, cost, team size)
  • The high-level architecture of each deployment model

LiveKit Cloud (fully managed)

LiveKit Cloud is the fastest path to production. You push your agent code, and LiveKit handles everything else: WebRTC infrastructure, container orchestration, scaling, TLS, monitoring, and global edge routing.

```bash
# The entire deployment workflow for LiveKit Cloud
lk agent deploy
```

How it works:

Your agent is packaged into a Docker container. LiveKit Cloud pulls the image, runs it on managed infrastructure, and routes incoming sessions to available instances. When call volume spikes, new instances spin up automatically. When volume drops, they scale back down.
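The scale-up/scale-down behavior can be pictured as a simple capacity rule. The sketch below is purely illustrative -- the function name, the sessions-per-instance figure, and the warm minimum are invented for this example and are not LiveKit Cloud's actual policy:

```python
import math

def desired_instances(concurrent_sessions: int,
                      sessions_per_instance: int = 25,
                      min_instances: int = 1) -> int:
    """Toy capacity rule: run enough instances to host every session,
    but never scale below a warm minimum that can absorb new calls."""
    needed = math.ceil(concurrent_sessions / sessions_per_instance)
    return max(min_instances, needed)

# Traffic spike: 180 concurrent sessions at 25 per instance
print(desired_instances(180))  # 8
# Quiet hours: scale back down to the warm minimum
print(desired_instances(3))    # 1
```

The point is not the numbers but the shape: capacity tracks concurrent sessions, and you never manage the loop yourself on Cloud.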

What LiveKit Cloud manages for you:

| Component | Managed by LiveKit Cloud |
| --- | --- |
| WebRTC SFU | Yes -- global edge network |
| Container orchestration | Yes -- automatic scheduling |
| Auto-scaling | Yes -- based on concurrent sessions |
| TLS and TURN | Yes -- automatic certificate management |
| Monitoring and traces | Yes -- Cloud Insights dashboard |
| Secret management | Yes -- encrypted at rest |

When to choose Cloud

Choose LiveKit Cloud when you want to move fast, your team is small, and you do not have strict data residency requirements that prevent using managed infrastructure. Most teams start here and only move to self-hosting when they have a concrete reason.

Self-hosted (your infrastructure)

Self-hosting means running the full LiveKit stack on your own servers: the LiveKit Server (SFU), your agent workers, Redis for coordination, and any supporting infrastructure like monitoring and load balancers.

```yaml
# docker-compose.yml
version: "3.9"
services:
  livekit-server:
    image: livekit/livekit-server:latest
    ports:
      - "7880:7880"                     # HTTP
      - "7881:7881"                     # WebRTC TCP
      - "50000-60000:50000-60000/udp"   # WebRTC UDP
    volumes:
      - ./livekit-config.yaml:/etc/livekit.yaml
    command: --config /etc/livekit.yaml

  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"

  agent-worker:
    build: .
    environment:
      - LIVEKIT_URL=ws://livekit-server:7880
      - LIVEKIT_API_KEY=${LIVEKIT_API_KEY}
      - LIVEKIT_API_SECRET=${LIVEKIT_API_SECRET}
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - DEEPGRAM_API_KEY=${DEEPGRAM_API_KEY}
    depends_on:
      - livekit-server
      - redis
```
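A worker container that starts with a missing key usually fails deep inside an SDK call, long after boot. A small startup check keeps the failure obvious. The variable names below match the compose file above; `load_config` itself and its validation rules are an illustrative sketch, not part of the LiveKit SDK:

```python
import os

REQUIRED_VARS = (
    "LIVEKIT_URL",
    "LIVEKIT_API_KEY",
    "LIVEKIT_API_SECRET",
    "OPENAI_API_KEY",
    "DEEPGRAM_API_KEY",
)

def load_config(env=os.environ) -> dict:
    """Collect required settings, raising one clear error listing every gap."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    url = env["LIVEKIT_URL"]
    if not url.startswith(("ws://", "wss://")):
        raise RuntimeError(f"LIVEKIT_URL must be a ws:// or wss:// URL, got {url!r}")
    return {name: env[name] for name in REQUIRED_VARS}
```

Calling this first thing in your entrypoint turns a cryptic mid-call authentication error into a one-line message naming every missing variable.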

What you manage:

  • Server provisioning and OS updates
  • LiveKit Server deployment and upgrades
  • Network configuration (UDP port ranges, TURN servers, TLS)
  • Container orchestration (Docker Compose, Kubernetes, ECS)
  • Scaling policies and capacity planning
  • Monitoring, logging, and alerting
  • Redis cluster for multi-node setups

What's happening

Self-hosting is like running your own phone system versus using a cloud PBX. You get complete control over every component, but you also own every outage, every security patch, and every 3 AM alert. The operational burden is significant -- plan for at least one engineer spending meaningful time on infrastructure.

When to choose self-hosted

Choose self-hosting when you have hard data residency requirements (healthcare, government, finance), need to run in air-gapped environments, or when your scale justifies the engineering investment in a dedicated infrastructure team.

Hybrid (split responsibility)

The hybrid model uses LiveKit Cloud for WebRTC infrastructure while running your agent workers on your own servers. This is a practical middle ground: LiveKit handles the hard parts (SFU routing, global edge, TURN, TLS) while you keep agent execution in your environment.

```bash
# Your agent connects to LiveKit Cloud but runs on your servers
export LIVEKIT_URL=wss://your-project.livekit.cloud
export LIVEKIT_API_KEY=APIxxxxxxx
export LIVEKIT_API_SECRET=xxxxxxxxxxxxxxx

python agent.py start
```

Why hybrid makes sense:

Your agent code often needs access to internal systems -- databases, CRMs, EHR systems -- that sit behind your firewall. With hybrid, the agent runs inside your network and connects outbound to LiveKit Cloud. No inbound firewall rules needed. LiveKit Cloud handles media routing; your agent handles business logic on your infrastructure.
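That outbound connection is authorized by a signed access token: a JWT built from your API key and secret. In production you would generate tokens with the LiveKit server SDKs; the standard-library sketch below only illustrates what the credential contains (the claim names follow LiveKit's token format, but treat the details as a sketch, not a reference implementation):

```python
import base64
import hashlib
import hmac
import json
import time

def b64url(data: bytes) -> str:
    """Base64url without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def make_access_token(api_key: str, api_secret: str,
                      identity: str, room: str, ttl: int = 3600) -> str:
    """Sign an HS256 JWT carrying a LiveKit-style room grant."""
    header = {"alg": "HS256", "typ": "JWT"}
    now = int(time.time())
    payload = {
        "iss": api_key,    # API key identifies your project
        "sub": identity,   # who this participant is
        "exp": now + ttl,
        "video": {"roomJoin": True, "room": room},
    }
    signing_input = (
        f"{b64url(json.dumps(header).encode())}."
        f"{b64url(json.dumps(payload).encode())}"
    )
    sig = hmac.new(api_secret.encode(), signing_input.encode(),
                   hashlib.sha256).digest()
    return f"{signing_input}.{b64url(sig)}"
```

Because the secret never leaves your network and the connection is initiated from inside, the firewall posture stays simple: one outbound WebSocket, no inbound rules.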

| Aspect | Cloud | Self-hosted | Hybrid |
| --- | --- | --- | --- |
| Time to production | Hours | Weeks | Days |
| Ops burden | Minimal | High | Medium |
| Data residency control | Limited | Full | Agent-side only |
| WebRTC expertise needed | None | Significant | None |
| Cost at low scale | Low | High (fixed infra) | Medium |
| Cost at high scale | Usage-based | Lower marginal | Usage-based SFU |
| Access to internal systems | Via public APIs | Direct | Direct |

Hybrid is increasingly common

Many production deployments use the hybrid model. It gives you the reliability of LiveKit Cloud's WebRTC infrastructure while keeping sensitive data processing inside your own network boundary.

Architecture decision framework

Use this framework to make the choice concrete:

1. Check compliance requirements

   If regulations require all data processing on your infrastructure, self-hosted is the only option. If only agent-side data (transcripts, tool calls) needs to stay internal, hybrid works.

2. Assess team capacity

   Self-hosting LiveKit requires WebRTC expertise, Kubernetes operations, and on-call coverage. If your team does not have this, start with Cloud or hybrid.

3. Estimate scale and cost

   At low to medium scale, Cloud is almost always cheaper when you factor in engineering time. At very high scale (thousands of concurrent sessions), self-hosting can reduce per-session costs.

4. Consider internal system access

   If your agent needs to reach databases, APIs, or services behind a firewall, hybrid gives you direct access without exposing those systems to the internet.
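The scale-and-cost step reduces to simple arithmetic once you have real numbers. Every price in the sketch below is a hypothetical placeholder -- substitute your own cloud quote and infrastructure budget:

```python
def breakeven_sessions(fixed_monthly_infra: float,
                       cloud_cost_per_session_hour: float,
                       self_hosted_cost_per_session_hour: float,
                       hours_per_month: float = 730.0) -> float:
    """Concurrent sessions above which self-hosting becomes cheaper.

    Solves: cloud_rate * s * hours = fixed + self_rate * s * hours
    """
    marginal = (cloud_cost_per_session_hour
                - self_hosted_cost_per_session_hour) * hours_per_month
    if marginal <= 0:
        return float("inf")  # cloud never costs more per session-hour
    return fixed_monthly_infra / marginal

# Hypothetical inputs: $8,000/mo fixed infra + engineering time,
# $0.12/session-hour on cloud vs $0.04/session-hour self-hosted
print(round(breakeven_sessions(8000, 0.12, 0.04)))  # 137
```

If your sustained concurrency sits well below the break-even point, the comparison table's "Cost at low scale" row applies and Cloud wins once engineering time is priced in.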

What's happening

There is no universally correct answer. The right deployment model depends on your constraints today and your growth trajectory. The good news is that switching between models is straightforward -- your agent code is the same regardless of where it runs. The deployment wrapper changes, not the agent logic.


What you learned

  • LiveKit Cloud is the fastest path to production with minimal ops burden -- best for most teams starting out
  • Self-hosted gives you full control but requires significant infrastructure expertise and operational investment
  • Hybrid splits the difference: LiveKit Cloud handles WebRTC, your infrastructure handles agent execution
  • Your agent code is portable across all three models -- only the deployment configuration changes

Next up

With your deployment model chosen, the next step is containerizing your agent. In the next chapter, you will write optimized Dockerfiles for both Python and Node.js agents using multi-stage builds, layer caching, and minimal base images.
