Self-Hosting Architecture and Decision Framework
Running LiveKit on your own infrastructure gives you total control over data, latency, and cost at scale. It also means you own every operational concern: upgrades, security patches, monitoring, scaling, and disaster recovery. This chapter maps out the architecture you will build across this course and helps you decide whether self-hosting is the right call for your project.
What you'll learn
- A decision framework for choosing between LiveKit Cloud and self-hosting
- The full architecture of a self-hosted deployment: SFU, signaling, TURN, Redis, agent workers
- Network topology: ports, protocols, and NAT traversal
- Hardware requirements and sizing for target concurrent sessions
When to self-host vs LiveKit Cloud
LiveKit Cloud is the fastest path to production. It handles scaling, monitoring, upgrades, and global distribution. For most teams, it is the right starting point. But specific scenarios make self-hosting not just preferred but required.
Data sovereignty. Regulated industries -- healthcare, finance, government -- often mandate that media streams never leave specific geographic boundaries or networks. Self-hosting lets you run LiveKit inside your own VPC, on-premises data center, or air-gapped network.
Latency control. When your users are concentrated in a region without a LiveKit Cloud point of presence, self-hosting in a nearby data center shaves critical milliseconds off the media path.
Cost at scale. At very high volumes -- thousands of concurrent rooms -- the economics of dedicated infrastructure can become favorable. The crossover point is higher than most teams expect.
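You can estimate where that crossover sits with simple arithmetic. The sketch below is illustrative only: the per-participant-minute price, node cost, utilization, and operational overhead are all placeholder assumptions, not real LiveKit pricing.

```python
# Back-of-the-envelope cloud vs. self-hosted cost model.
# Every price and rate below is a placeholder, NOT real LiveKit pricing.

MINUTES_PER_MONTH = 30 * 24 * 60  # 43,200

def cloud_cost(peak_rooms: int, participants: int,
               price_per_participant_minute: float,
               utilization: float = 0.05) -> float:
    """Usage-based: pay per participant-minute actually consumed.
    `utilization` is the fraction of the month a peak room is live."""
    minutes = MINUTES_PER_MONTH * utilization
    return peak_rooms * participants * minutes * price_per_participant_minute

def self_hosted_cost(peak_rooms: int, rooms_per_node: int,
                     node_cost: float, ops_overhead: float = 15000.0) -> float:
    """Fixed: provision enough nodes for peak concurrency, plus a flat
    monthly operational overhead (on-call, upgrades, monitoring)."""
    nodes = -(-peak_rooms // rooms_per_node)  # ceiling division
    return nodes * node_cost + ops_overhead

# Hypothetical inputs: 2-participant audio rooms, $0.005/participant-minute,
# 500 rooms per node, $500/node/month.
for rooms in (50, 500, 5000):
    print(f"{rooms:>5} rooms: cloud ${cloud_cost(rooms, 2, 0.005):>9,.0f}  "
          f"self-hosted ${self_hosted_cost(rooms, 500, 500):>9,.0f}")
```

With these placeholder numbers, the flat operational overhead dominates until volume is large, so self-hosting only wins somewhere past 500 concurrent rooms -- which is the point: plug in your own prices before deciding.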
Custom configuration. Self-hosting gives you access to every configuration knob: codec settings, TURN behavior, port ranges, Redis topology, and resource limits.
| Factor | Self-hosted | LiveKit Cloud |
|---|---|---|
| Time to production | Days to weeks | Minutes |
| Data location | Full control | Cloud regions |
| Operational burden | You own it all | Managed for you |
| Cost at low volume | Higher (fixed infra) | Lower (pay per use) |
| Cost at high volume | Potentially lower | Scales with usage |
| Upgrades | Manual, you schedule | Automatic, zero-downtime |
| Global distribution | You build it | Built in |
| Custom configuration | Full access | Standard options |
Self-hosting is an ongoing commitment
Self-hosting is not a one-time setup. You take on responsibility for upgrades, security patches, monitoring, scaling, and disaster recovery. Budget for ongoing operational effort, not just initial deployment.
Start with Cloud, migrate later
The LiveKit client SDKs and agent framework connect to any LiveKit server -- Cloud or self-hosted. You can develop against Cloud, validate your application, and migrate to self-hosted infrastructure later without changing application code.
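This works because clients need only a server URL plus a signed access token, and the token format is identical for both targets. As an illustration, here is a minimal standard-library sketch of what such a token contains -- in practice you would use the official server SDKs, which build this for you. The claim layout (iss set to the API key, sub to the participant identity, a video grant object with roomJoin) follows LiveKit's published JWT format.

```python
import base64, hashlib, hmac, json, time

def b64url(data: bytes) -> str:
    # JWT uses unpadded base64url encoding
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def mint_token(api_key: str, api_secret: str, identity: str,
               room: str, ttl: int = 3600) -> str:
    """Build an HS256 JWT shaped like a LiveKit access token."""
    header = {"alg": "HS256", "typ": "JWT"}
    now = int(time.time())
    claims = {
        "iss": api_key,    # which API key signed this token
        "sub": identity,   # participant identity
        "nbf": now,
        "exp": now + ttl,
        "video": {"room": room, "roomJoin": True},  # grant: join this room
    }
    signing_input = (b64url(json.dumps(header).encode()) + "."
                     + b64url(json.dumps(claims).encode()))
    sig = hmac.new(api_secret.encode(), signing_input.encode(),
                   hashlib.sha256).digest()
    return signing_input + "." + b64url(sig)

# Only the server URL changes between Cloud and self-hosted;
# the token and the rest of your application code stay the same.
token = mint_token("your-api-key", "your-api-secret", "alice", "demo-room")
print(token.count("."))  # a JWT has three dot-separated segments -> prints 2
```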
Architecture overview
A production self-hosted LiveKit deployment consists of six components working together. Every component after the first is optional for development, but all are required for production.
LiveKit Server (SFU)
The core Selective Forwarding Unit. It handles WebRTC connections, room management, track routing, and the signaling protocol. Clients establish a WebSocket for signaling (room join, track subscription, data messages) and a separate UDP connection for media (audio and video packets). You can run a single instance for development or multiple instances behind a load balancer for production.
Redis
The coordination layer for multi-node deployments. Redis stores room-to-node mappings, enables inter-node messaging via pub/sub, and provides distributed locking. Every LiveKit node points at the same Redis instance and discovers other nodes automatically. A single-node deployment can skip Redis, but production should always include it.
TURN server
LiveKit includes a built-in TURN server for NAT traversal. When participants sit behind restrictive firewalls or symmetric NATs, TURN relays media through a known port. Without TURN, a significant percentage of users will fail to connect. The built-in server listens on TLS port 5349 and optionally UDP port 3478.
Reverse proxy / load balancer
Terminates TLS for signaling (WebSocket over HTTPS) and routes traffic to LiveKit instances. The media path (UDP) bypasses the proxy -- clients connect directly to the server node hosting their room.
Agent workers
If you are running voice AI or other agent-based applications, agent workers connect to LiveKit as participants. They can run on the same Kubernetes cluster or on separate GPU-equipped nodes. Agents register with LiveKit through the same Redis instance, so dispatch routing works automatically.
Monitoring stack
Prometheus scrapes metrics from LiveKit and agent workers. Grafana provides dashboards. In production, you monitor room counts, participant counts, packet loss, CPU usage, bandwidth, and agent health.
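A minimal scrape job might look like the fragment below. This assumes your LiveKit version exposes metrics via a prometheus_port key in config.yaml (check the docs for your release); the agent target and its port are hypothetical placeholders.

```yaml
# prometheus.yml fragment
# Assumes `prometheus_port: 6789` is set in LiveKit's config.yaml.
scrape_configs:
  - job_name: livekit
    static_configs:
      - targets: ["livekit-1:6789", "livekit-2:6789"]
  - job_name: agents
    static_configs:
      - targets: ["agent-worker-1:8080"]  # hypothetical agent metrics endpoint
```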
The critical distinction in this architecture is the separation of signaling and media. Signaling -- room creation, participant join, track subscription -- flows through WebSocket over HTTPS and can be proxied and load-balanced normally. Media -- the actual audio and video packets -- flows over UDP directly between the client and the LiveKit server node hosting the room. Your network and firewall configuration must account for both paths.
Network topology
LiveKit uses three distinct port ranges, each with different networking requirements.
| Port | Protocol | Purpose | Routing |
|---|---|---|---|
| 7880 | TCP | HTTP API + WebSocket signaling | Through load balancer / ingress |
| 7881 | TCP | RTC over TCP (fallback) | Direct to server node |
| 50000-60000 | UDP | RTC media (audio/video) | Direct to server node, cannot pass through HTTP proxy |
| 5349 | TCP | TURN over TLS | Direct to server node |
| 3478 | UDP | TURN over UDP (optional) | Direct to server node |
UDP ports must be directly reachable
The UDP port range 50000-60000 cannot pass through an HTTP reverse proxy or most Kubernetes ingress controllers. Use hostNetwork: true in your pod spec, or a UDP-capable load balancer (AWS NLB, GCP external LB). This is the most common deployment mistake.
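A sketch of the hostNetwork approach in a pod spec is shown below. The names and mount paths are illustrative; livekit/livekit-server is the public image, and dnsPolicy must be set so cluster DNS keeps working alongside hostNetwork.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: livekit-server
spec:
  hostNetwork: true                    # expose UDP 50000-60000 directly on the node
  dnsPolicy: ClusterFirstWithHostNet   # keep cluster DNS resolution with hostNetwork
  containers:
    - name: livekit
      image: livekit/livekit-server:latest
      args: ["--config", "/etc/livekit/config.yaml"]
      volumeMounts:
        - name: config
          mountPath: /etc/livekit
  volumes:
    - name: config
      configMap:
        name: livekit-config   # hypothetical ConfigMap holding config.yaml
```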
The livekit-server config.yaml structure
Every LiveKit server reads a single YAML configuration file. Here is the structure with all key sections annotated.
```yaml
# HTTP API and WebSocket signaling port
port: 7880

# WebRTC media configuration
rtc:
  port_range_start: 50000
  port_range_end: 60000
  tcp_port: 7881
  use_external_ip: true # Discover public IP for ICE candidates

# Redis for multi-node coordination
redis:
  address: redis:6379
  # password: your-redis-password
  # use_tls: true
  # db: 0

# API key/secret pairs (multiple supported for rotation)
keys:
  your-api-key: your-api-secret

# Built-in TURN server
turn:
  enabled: true
  domain: turn.example.com
  tls_port: 5349
  # udp_port: 3478

# Logging
logging:
  level: info # debug, info, warn, error
  json: true # Structured JSON logs for production

# Resource limits
limit:
  num_tracks: 0 # 0 = unlimited
  bytes_per_sec: 0 # 0 = unlimited
```

The use_external_ip: true setting is required for any cloud deployment where LiveKit runs behind a NAT. Without it, LiveKit advertises its private IP during ICE negotiation and external clients cannot connect. If your server has a public IP directly assigned to its network interface (bare metal, Elastic IP), the setting still works correctly.
Hardware requirements and sizing
Requirements vary significantly based on expected load. LiveKit is CPU-bound for packet forwarding -- CPU is almost always the bottleneck before memory or network.
Single-node development or testing:
- 2 CPU cores, 4 GB RAM
- 100 Mbps network
- Any modern Linux distribution (Ubuntu 22.04+ recommended)
- Docker or direct binary installation
Production (per LiveKit server node):
- 4-8 CPU cores, 8-16 GB RAM
- 1 Gbps network (dedicated, not shared)
- Low-latency storage for logs
- Linux with kernel 5.4+ for optimal UDP performance
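High-throughput UDP forwarding also benefits from larger kernel socket buffers. The fragment below is a common Linux tuning sketch, not a LiveKit-mandated configuration; rmem_max and wmem_max are standard kernel knobs, and the 16 MB values are illustrative starting points.

```
# /etc/sysctl.d/99-livekit.conf
net.core.rmem_max = 16777216   # max UDP receive buffer (16 MB)
net.core.wmem_max = 16777216   # max UDP send buffer (16 MB)
```

Apply with sudo sysctl --system and re-measure before and after.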
Agent worker nodes (if running AI agents):
- CPU-only agents: 2-4 cores, 4-8 GB RAM per worker
- GPU agents (STT/TTS): 1 GPU (T4 or better), 4 cores, 16 GB RAM
- Scale agent replicas independently from LiveKit server nodes
```shell
# Quick check: verify your server meets minimum requirements
echo "CPU cores: $(nproc)"
echo "RAM: $(free -h | awk '/^Mem:/ {print $2}')"
echo "Kernel: $(uname -r)"
echo "Docker: $(docker --version 2>/dev/null || echo 'not installed')"

# Check that required ports are not already in use
ss -tulnp | grep -E ':(7880|7881|3478|5349) '
```

Capacity estimates (4-core, 8 GB node):
| Workload | Rooms per node | Notes |
|---|---|---|
| Audio-only, 2 participants | ~500 | Typical voice AI scenario |
| Audio-only, 5 participants | ~200 | Group voice calls |
| Video 720p, 4 participants | ~50 | Video conferencing |
| Video 1080p, 2 participants | ~80 | High-quality 1:1 video |
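These room counts imply per-node bandwidth you should sanity-check against the 1 Gbps recommendation. A back-of-the-envelope sketch, using typical bitrate assumptions (~32 kbps Opus audio, ~3 Mbps 1080p video) rather than measured values:

```python
# SFU bandwidth sanity check. Bitrates are assumptions, not measurements.

def node_bandwidth_mbps(rooms: int, participants: int, bitrate_kbps: float) -> float:
    """In an SFU, each participant uploads one stream and downloads
    (participants - 1) streams; the node carries both directions."""
    ingress = rooms * participants * bitrate_kbps
    egress = rooms * participants * (participants - 1) * bitrate_kbps
    return (ingress + egress) / 1000.0

print(node_bandwidth_mbps(500, 2, 32))    # audio-only, 2 participants -> 64.0 Mbps
print(node_bandwidth_mbps(80, 2, 3000))   # 1080p 1:1 -> 960.0 Mbps
```

Note that under these assumptions the 1080p case already sits at the edge of a 1 Gbps NIC, so for video-heavy workloads check bandwidth alongside CPU.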
Measure, don't guess
These estimates are starting points. Deploy your actual workload on a single node, monitor CPU and bandwidth (covered in Chapter 3), and use those real numbers for capacity planning.
Test your knowledge
Why does a production LiveKit deployment require separate network configuration for signaling and media traffic?
What you learned
- Self-hosting is justified by data sovereignty, latency requirements, cost at scale, or configuration needs -- not by default
- A production deployment includes LiveKit Server, Redis, TURN, a reverse proxy, agent workers, and monitoring
- Signaling (WebSocket/HTTPS) and media (UDP) follow different network paths with different requirements
- The config.yaml controls ports, Redis, keys, TURN, logging, and resource limits
- Hardware sizing is CPU-bound; measure your actual workload before planning capacity
Next up
In the next chapter, you will deploy LiveKit on Kubernetes using Helm charts, configure Redis for multi-node coordination, and set up TURN for NAT traversal.