Chapter 1 · 25m

Self-Hosting Architecture and Decision Framework

Running LiveKit on your own infrastructure gives you total control over data, latency, and cost at scale. It also means you own every operational concern: upgrades, security patches, monitoring, scaling, and disaster recovery. This chapter maps out the architecture you will build across this course and helps you decide whether self-hosting is the right call for your project.

What you'll learn

  • A decision framework for choosing between LiveKit Cloud and self-hosting
  • The full architecture of a self-hosted deployment: SFU, signaling, TURN, Redis, agent workers
  • Network topology: ports, protocols, and NAT traversal
  • Hardware requirements and sizing for target concurrent sessions

When to self-host vs LiveKit Cloud

LiveKit Cloud is the fastest path to production. It handles scaling, monitoring, upgrades, and global distribution. For most teams, it is the right starting point. But specific scenarios make self-hosting not just preferred but required.

Data sovereignty. Regulated industries -- healthcare, finance, government -- often mandate that media streams never leave specific geographic boundaries or networks. Self-hosting lets you run LiveKit inside your own VPC, on-premises data center, or air-gapped network.

Latency control. When your users are concentrated in a region without a LiveKit Cloud point of presence, self-hosting in a nearby data center shaves critical milliseconds off the media path.

Cost at scale. At very high volumes -- thousands of concurrent rooms -- the economics of dedicated infrastructure can become favorable. That crossover point is higher than most teams expect.
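To make the crossover concrete, here is a minimal break-even sketch. All prices are hypothetical placeholders -- substitute your actual infrastructure quote and your provider's per-minute rate.

```python
import math

def breakeven_minutes(fixed_monthly_cost: float, cloud_price_per_min: float,
                      self_host_price_per_min: float = 0.0) -> float:
    """Monthly participant-minutes at which dedicated infra becomes cheaper.

    All prices here are hypothetical placeholders, not real quotes.
    """
    saving_per_min = cloud_price_per_min - self_host_price_per_min
    if saving_per_min <= 0:
        return math.inf  # self-hosting never breaks even at these prices
    return fixed_monthly_cost / saving_per_min

# Example: $2,000/month of servers + ops time vs. $0.004 per participant-minute
minutes = breakeven_minutes(2000, 0.004)  # 500,000 participant-minutes/month
```

Run the numbers for your own workload before committing; ops time is usually the dominant and most underestimated term in `fixed_monthly_cost`.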

Custom configuration. Self-hosting gives you access to every configuration knob: codec settings, TURN behavior, port ranges, Redis topology, and resource limits.

| Factor | Self-hosted | LiveKit Cloud |
| --- | --- | --- |
| Time to production | Days to weeks | Minutes |
| Data location | Full control | Cloud regions |
| Operational burden | You own it all | Managed for you |
| Cost at low volume | Higher (fixed infra) | Lower (pay per use) |
| Cost at high volume | Potentially lower | Scales with usage |
| Upgrades | Manual, you schedule | Automatic, zero-downtime |
| Global distribution | You build it | Built in |
| Custom configuration | Full access | Standard options |

Self-hosting is an ongoing commitment

Self-hosting is not a one-time setup. You take on responsibility for upgrades, security patches, monitoring, scaling, and disaster recovery. Budget for ongoing operational effort, not just initial deployment.

Start with Cloud, migrate later

The LiveKit client SDKs and agent framework connect to any LiveKit server -- Cloud or self-hosted. You can develop against Cloud, validate your application, and migrate to self-hosted infrastructure later without changing application code.
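In practice, "no application code changes" means the endpoint and credentials live in the environment. A minimal sketch, using the `LIVEKIT_URL`, `LIVEKIT_API_KEY`, and `LIVEKIT_API_SECRET` variable names that LiveKit tooling conventionally reads:

```python
import os

def livekit_connection() -> dict:
    """Resolve the LiveKit endpoint from the environment.

    Pointing these variables at Cloud (wss://...livekit.cloud) or at a
    self-hosted server is the only change a migration requires.
    """
    return {
        "url": os.environ.get("LIVEKIT_URL", "ws://localhost:7880"),
        "api_key": os.environ.get("LIVEKIT_API_KEY", ""),
        "api_secret": os.environ.get("LIVEKIT_API_SECRET", ""),
    }
```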

Architecture overview

A production self-hosted LiveKit deployment consists of six components working together. Every component after the first is optional for development, but all are required for production.

1. LiveKit Server (SFU)

The core Selective Forwarding Unit. It handles WebRTC connections, room management, track routing, and the signaling protocol. Clients establish a WebSocket for signaling (room join, track subscription, data messages) and a separate UDP connection for media (audio and video packets). You can run a single instance for development or multiple instances behind a load balancer for production.

2. Redis

The coordination layer for multi-node deployments. Redis stores room-to-node mappings, enables inter-node messaging via pub/sub, and provides distributed locking. Every LiveKit node points at the same Redis instance and discovers other nodes automatically. A single-node deployment can skip Redis, but production should always include it.
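The coordination Redis provides can be sketched conceptually. A plain dict stands in for Redis here (a real deployment uses SET/GET and pub/sub against the shared instance), and the room and node names are illustrative:

```python
# Stand-in for the shared Redis store every LiveKit node points at.
room_to_node: dict[str, str] = {}

def claim_room(room: str, node: str) -> str:
    """First node to claim a room hosts it; later claims route to the owner."""
    return room_to_node.setdefault(room, node)

owner = claim_room("support-call-42", "node-a")       # node-a hosts the room
same_owner = claim_room("support-call-42", "node-b")  # node-b routes to node-a
```

This is why every node must point at the same Redis: a node that cannot see the mapping will happily host a duplicate copy of the room.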

3. TURN server

LiveKit includes a built-in TURN server for NAT traversal. When participants sit behind restrictive firewalls or symmetric NATs, TURN relays media through a known port. Without TURN, a significant percentage of users will fail to connect. The built-in server listens on TLS port 5349 and optionally UDP port 3478.

4. Reverse proxy / load balancer

Terminates TLS for signaling (WebSocket over HTTPS) and routes traffic to LiveKit instances. The media path (UDP) bypasses the proxy -- clients connect directly to the server node hosting their room.

5. Agent workers

If you are running voice AI or other agent-based applications, agent workers connect to LiveKit as participants. They can run on the same Kubernetes cluster or on separate GPU-equipped nodes. Agents register with LiveKit through the same Redis instance, so dispatch routing works automatically.

6. Monitoring stack

Prometheus scrapes metrics from LiveKit and agent workers. Grafana provides dashboards. In production, you monitor room counts, participant counts, packet loss, CPU usage, bandwidth, and agent health.

What's happening

The critical distinction in this architecture is the separation of signaling and media. Signaling -- room creation, participant join, track subscription -- flows through WebSocket over HTTPS and can be proxied and load-balanced normally. Media -- the actual audio and video packets -- flows over UDP directly between the client and the LiveKit server node hosting the room. Your network and firewall configuration must account for both paths.

Network topology

LiveKit uses three distinct port ranges, each with different networking requirements.

| Port | Protocol | Purpose | Routing |
| --- | --- | --- | --- |
| 7880 | TCP | HTTP API + WebSocket signaling | Through load balancer / ingress |
| 7881 | TCP | RTC over TCP (fallback) | Direct to server node |
| 50000-60000 | UDP | RTC media (audio/video) | Direct to server node; cannot pass through HTTP proxy |
| 5349 | TCP | TURN over TLS | Direct to server node |
| 3478 | UDP | TURN over UDP (optional) | Direct to server node |
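If you manage your perimeter with `ufw`, the port table above translates directly into allow rules. A sketch that generates them (adjust for your actual firewall tooling; only 7880 should additionally sit behind the load balancer):

```python
# Port/protocol pairs from the table above.
PORTS = [
    ("7880", "tcp"),         # signaling (also exposed via the load balancer)
    ("7881", "tcp"),         # RTC over TCP fallback
    ("50000:60000", "udp"),  # RTC media range
    ("5349", "tcp"),         # TURN over TLS
    ("3478", "udp"),         # TURN over UDP (optional)
]

rules = [f"ufw allow {port}/{proto}" for port, proto in PORTS]
print("\n".join(rules))
```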

UDP ports must be directly reachable

The UDP port range 50000-60000 cannot pass through an HTTP reverse proxy or most Kubernetes ingress controllers. Use hostNetwork: true in your pod spec, or a UDP-capable load balancer (AWS NLB, GCP external LB). This is the most common deployment mistake.
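On Kubernetes, the relevant part of the pod spec looks like this (a minimal sketch; names are illustrative, and `dnsPolicy: ClusterFirstWithHostNet` keeps cluster DNS working once host networking is on):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: livekit-server
spec:
  hostNetwork: true                    # UDP media reaches the node directly
  dnsPolicy: ClusterFirstWithHostNet   # preserve cluster DNS with hostNetwork
  containers:
    - name: livekit
      image: livekit/livekit-server
```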

The livekit-server config.yaml structure

Every LiveKit server reads a single YAML configuration file. Here is the structure with all key sections annotated.

config.yaml

```yaml
# HTTP API and WebSocket signaling port
port: 7880

# WebRTC media configuration
rtc:
  port_range_start: 50000
  port_range_end: 60000
  tcp_port: 7881
  use_external_ip: true          # Discover public IP for ICE candidates

# Redis for multi-node coordination
redis:
  address: redis:6379
  # password: your-redis-password
  # use_tls: true
  # db: 0

# API key/secret pairs (multiple supported for rotation)
keys:
  your-api-key: your-api-secret

# Built-in TURN server
turn:
  enabled: true
  domain: turn.example.com
  tls_port: 5349
  # udp_port: 3478

# Logging
logging:
  level: info                     # debug, info, warn, error
  json: true                      # Structured JSON logs for production

# Resource limits
limit:
  num_tracks: 0                   # 0 = unlimited
  bytes_per_sec: 0                # 0 = unlimited
```

What's happening

The use_external_ip: true setting is required for any cloud deployment where LiveKit runs behind a NAT. Without it, LiveKit advertises its private IP during ICE negotiation and external clients cannot connect. If your server has a public IP directly assigned to its network interface (bare metal, Elastic IP), this setting still works correctly.
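The key/secret pairs in config.yaml exist to sign access tokens: LiveKit tokens are JWTs signed with the API secret, carrying the API key as issuer and a video grant. A minimal HS256 sketch of that shape, using only the standard library (in practice, use the LiveKit server SDKs, which add the full grant set and validation):

```python
import base64
import hashlib
import hmac
import json
import time

def b64url(data: bytes) -> str:
    """Base64url without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def make_access_token(api_key: str, api_secret: str,
                      identity: str, room: str) -> str:
    """Minimal HS256 JWT in the shape LiveKit access tokens use. Sketch only."""
    header = {"alg": "HS256", "typ": "JWT"}
    now = int(time.time())
    claims = {
        "iss": api_key,                 # API key identifies the signer
        "sub": identity,                # participant identity
        "nbf": now,
        "exp": now + 6 * 3600,          # 6-hour validity
        "video": {"roomJoin": True, "room": room},
    }
    signing_input = (
        f"{b64url(json.dumps(header).encode())}."
        f"{b64url(json.dumps(claims).encode())}"
    )
    sig = hmac.new(api_secret.encode(), signing_input.encode(),
                   hashlib.sha256).digest()
    return f"{signing_input}.{b64url(sig)}"
```

Because tokens are validated against every configured pair, listing two pairs under `keys:` lets you rotate credentials without downtime.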

Hardware requirements and sizing

Requirements vary significantly based on expected load. LiveKit is CPU-bound for packet forwarding -- CPU is almost always the bottleneck before memory or network.

Single-node development or testing:

  • 2 CPU cores, 4 GB RAM
  • 100 Mbps network
  • Any modern Linux distribution (Ubuntu 22.04+ recommended)
  • Docker or direct binary installation

Production (per LiveKit server node):

  • 4-8 CPU cores, 8-16 GB RAM
  • 1 Gbps network (dedicated, not shared)
  • Low-latency storage for logs
  • Linux with kernel 5.4+ for optimal UDP performance

Agent worker nodes (if running AI agents):

  • CPU-only agents: 2-4 cores, 4-8 GB RAM per worker
  • GPU agents (STT/TTS): 1 GPU (T4 or better), 4 cores, 16 GB RAM
  • Scale agent replicas independently from LiveKit server nodes
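
Since agent replicas scale independently, a simple sizing calculation gives a starting replica count. The 30% headroom default here is an illustrative choice, not a rule, and `sessions_per_worker` must come from load-testing your own agent:

```python
import math

def worker_replicas(peak_sessions: int, sessions_per_worker: int,
                    headroom: float = 0.3) -> int:
    """Replicas needed to cover a session peak, with spare headroom."""
    needed = peak_sessions * (1 + headroom) / sessions_per_worker
    return max(1, math.ceil(needed))

# e.g. 120 concurrent sessions, 25 sessions per CPU-only worker -> 7 replicas
replicas = worker_replicas(120, 25)
```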
terminal

```bash
# Quick check: verify your server meets minimum requirements
echo "CPU cores: $(nproc)"
echo "RAM: $(free -h | awk '/^Mem:/ {print $2}')"
echo "Kernel: $(uname -r)"
echo "Docker: $(docker --version 2>/dev/null || echo 'not installed')"

# Check that required ports are not already in use
ss -tulnp | grep -E ':(7880|7881|3478|5349) '
```

Capacity estimates (4-core, 8 GB node):

| Workload | Rooms per node | Notes |
| --- | --- | --- |
| Audio-only, 2 participants | ~500 | Typical voice AI scenario |
| Audio-only, 5 participants | ~200 | Group voice calls |
| Video 720p, 4 participants | ~50 | Video conferencing |
| Video 1080p, 2 participants | ~80 | High-quality 1:1 video |

Measure, don't guess

These estimates are starting points. Deploy your actual workload on a single node, monitor CPU and bandwidth (covered in Chapter 3), and use those real numbers for capacity planning.

What you learned

  • Self-hosting is justified by data sovereignty, latency requirements, cost at scale, or configuration needs -- not by default
  • A production deployment includes LiveKit Server, Redis, TURN, a reverse proxy, agent workers, and monitoring
  • Signaling (WebSocket/HTTPS) and media (UDP) follow different network paths with different requirements
  • The config.yaml controls ports, Redis, keys, TURN, logging, and resource limits
  • Hardware sizing is CPU-bound; measure your actual workload before planning capacity

Next up

In the next chapter, you will deploy LiveKit on Kubernetes using Helm charts, configure Redis for multi-node coordination, and set up TURN for NAT traversal.

Concepts covered

  • Cloud vs self-hosted
  • Architecture overview
  • Hardware requirements
  • Agent worker deployment