Self-Hosting Architecture and Decision Framework
Before you commit to self-hosting LiveKit, you need to understand exactly what you are signing up for. This chapter walks through the full scope of a self-hosted deployment — every component you will build, operate, and maintain — so you can make an honest decision about whether it is worth it. For most teams, it is not.
What you'll learn
- The full operational reality of self-hosting: what you build, what you maintain, what breaks at 3am
- Cloud-exclusive features you cannot replicate on your own — Krisp noise cancellation, GPU-accelerated turn detection, the barge-in model
- The complete architecture of a self-hosted deployment and every component you become responsible for
- How to handle data sovereignty concerns without self-hosting
- Hardware requirements and capacity planning
Start with the honest question: do you actually need to self-host?
LiveKit Cloud is not a stepping stone. It is the production-grade deployment that most teams should use permanently. It handles scaling, monitoring, upgrades, global distribution, and — critically — ships features that are physically impossible to replicate on self-hosted infrastructure.
Self-hosting means you are choosing to build and operate your own distributed real-time media platform. You are signing up for:
- Infrastructure provisioning — servers, networking, load balancers, DNS, TLS certificates
- Kubernetes orchestration — Helm charts, pod specs, resource limits, rolling deployments
- Redis cluster management — persistence, Sentinel HA, failover testing, memory tuning
- TURN server operation — NAT traversal, TLS termination, port management
- Monitoring from scratch — Prometheus, Grafana dashboards, alerting rules, on-call rotations
- Security hardening — network policies, API key rotation, TLS everywhere, CVE patching
- Upgrade management — testing new releases, rolling upgrades, rollback procedures
- Disaster recovery — backup procedures, recovery runbooks, failover drills
- Capacity planning — load testing, scaling decisions, cost modeling
- 24/7 on-call — because media infrastructure does not wait for business hours
That is not a one-time project. That is an ongoing operational commitment that will consume engineering hours every single week.
Self-hosting is a full-time job
Teams consistently underestimate the operational cost of self-hosting. The initial deployment is the easy part. The hard part is month 3 when you need to upgrade across a breaking change, month 6 when a Redis failover exposes a config bug, and month 12 when the engineer who set it all up leaves and nobody else understands the monitoring stack.
What you lose by self-hosting: Cloud-exclusive features
This is the part most teams do not consider until it is too late. LiveKit Cloud includes features that run on infrastructure you cannot replicate with the open-source server. These are not premium upsells — they are capabilities baked into Cloud's architecture.
Krisp noise cancellation
LiveKit Cloud integrates Krisp's enterprise noise cancellation directly into the media pipeline. Background noise — keyboard clatter, construction, barking dogs, cafe ambiance — is removed server-side before it reaches the agent or other participants. This runs on specialized infrastructure within LiveKit's Cloud and is not available as a standalone component you can deploy.
For voice AI agents, noise cancellation is not a nice-to-have. Noisy audio degrades STT accuracy, which degrades LLM responses, which degrades the entire user experience. Without it, you will spend engineering time building client-side workarounds that never match server-side quality.
GPU-accelerated turn detection
LiveKit Cloud runs turn detection models on dedicated GPU infrastructure. This means faster, more accurate detection of when a user has finished speaking — the single most important factor in making a voice agent feel natural. The GPU-based models process audio with lower latency and higher accuracy than CPU-based alternatives.
On self-hosted infrastructure, you are limited to CPU-based turn detection or provisioning and managing your own GPU fleet — which adds another layer of infrastructure complexity, driver management, and cost.
The barge-in model
LiveKit Cloud ships a purpose-built barge-in model that distinguishes between intentional interruptions ("actually, wait —") and background noise or filler words ("um", "uh"). This is a trained model running on Cloud infrastructure that dramatically reduces false interruptions — one of the most common complaints about voice AI agents.
Self-hosted deployments fall back to simpler energy-based or basic VAD interruption detection, which means your agents will either interrupt too aggressively (annoying) or not respond to real interruptions quickly enough (also annoying).
These features compound
Noise cancellation, GPU turn detection, and intelligent barge-in work together. Clean audio feeds better turn detection. Better turn detection feeds smarter barge-in handling. The result is a noticeably more natural conversation. Self-hosted deployments miss all three layers of this stack.
The Cloud advantage summary
| Capability | LiveKit Cloud | Self-hosted |
|---|---|---|
| Krisp noise cancellation | Built-in, server-side | Not available |
| GPU turn detection | Dedicated GPU fleet | CPU-only (or manage your own GPUs) |
| Barge-in model | Purpose-built model | Basic VAD / energy-based |
| Global edge network | 10+ regions, auto-routing | You build and maintain every region |
| Zero-downtime upgrades | Automatic | Manual rolling upgrades you schedule and test |
| Auto-scaling | Built-in | You build it with HPA, load testing, capacity planning |
| DDoS protection | Built-in | You configure it |
| 99.99% SLA | Contractual | Hope and monitoring |
"But we have data sovereignty requirements"
This is the most common — and most legitimate — reason teams consider self-hosting. Regulated sectors such as healthcare, finance, and government often mandate that media streams stay within specific geographic or network boundaries.
Before you spin up a Kubernetes cluster, talk to LiveKit.
LiveKit Cloud offers custom deployment options for teams with strict data residency needs. This includes dedicated infrastructure in specific regions, custom data processing agreements, and HIPAA-compliant configurations. The Cloud team actively works with enterprises to solve compliance requirements without pushing them into self-hosting.
Reach out to LiveKit for custom Cloud solutions
If data sovereignty is your primary driver for considering self-hosting, contact the LiveKit team first. They offer dedicated Cloud deployments, custom region configurations, and enterprise data processing agreements that solve most compliance requirements while keeping you on managed infrastructure. You get compliance without the operational burden. Start at livekit.io/cloud or reach out to the sales team directly.
Self-hosting for data sovereignty only makes sense if you have requirements that truly cannot be met by any third-party infrastructure — air-gapped networks, on-premises-only mandates from specific government contracts, or classified environments. These scenarios exist, but they are rarer than most teams think.
If you still need to self-host: the full architecture
If, after all of the above, you still have a genuine, validated reason to self-host, here is what you are building. Every component after the first is optional for development but required for production.
LiveKit Server (SFU)
The core Selective Forwarding Unit. It handles WebRTC connections, room management, track routing, and the signaling protocol. Clients establish a WebSocket for signaling (room join, track subscription, data messages) and a separate UDP connection for media (audio and video packets). You can run a single instance for development or multiple instances behind a load balancer for production.
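For development, a single node is enough to exercise both paths. Below is a minimal sketch using Docker Compose, assuming the official livekit/livekit-server image and a config.yaml like the one shown later in this chapter; the file paths are placeholders.

```yaml
# docker-compose.yaml (sketch): single-node LiveKit for development.
# Host networking avoids mapping the 50000-60000/udp media range port by port.
services:
  livekit:
    image: livekit/livekit-server:latest
    command: --config /etc/livekit.yaml
    network_mode: host            # exposes 7880/tcp, 7881/tcp, and the UDP media range directly
    volumes:
      - ./config.yaml:/etc/livekit.yaml:ro
    restart: unless-stopped
```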
Redis
The coordination layer for multi-node deployments. Redis stores room-to-node mappings, enables inter-node messaging via pub/sub, and provides distributed locking. Every LiveKit node points at the same Redis instance and discovers other nodes automatically. A single-node deployment can skip Redis, but production should always include it. You are responsible for persistence, Sentinel HA, failover testing, and memory tuning.
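For the high-availability setup, LiveKit's config points at the Sentinel endpoints rather than a single Redis address. A sketch with placeholder hostnames follows; verify the exact key names against the config reference for your server version.

```yaml
# Sketch: redis section of config.yaml for a Sentinel-backed deployment.
redis:
  sentinel_master_name: livekit-master     # name of the monitored master in Sentinel
  sentinel_addresses:                      # placeholder hostnames
    - redis-sentinel-0:26379
    - redis-sentinel-1:26379
    - redis-sentinel-2:26379
  # password: your-redis-password
```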
TURN server
LiveKit includes a built-in TURN server for NAT traversal. When participants sit behind restrictive firewalls or symmetric NATs, TURN relays media through a known port. Without TURN, a significant percentage of users will fail to connect. The built-in server listens on TLS port 5349 and optionally UDP port 3478. You manage TLS certificates, domain configuration, and port accessibility.
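If you terminate TURN TLS on the node itself, the server needs a certificate for the TURN domain. A sketch with placeholder paths and domain; the cert_file and key_file keys should be confirmed against the config reference for your server version.

```yaml
# Sketch: TURN section of config.yaml with TLS terminated on the node.
turn:
  enabled: true
  domain: turn.example.com                   # must resolve to this node's public IP
  tls_port: 5349
  udp_port: 3478                             # optional UDP relay
  cert_file: /etc/livekit/certs/turn.crt     # placeholder certificate paths
  key_file: /etc/livekit/certs/turn.key
```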
Reverse proxy / load balancer
Terminates TLS for signaling (WebSocket over HTTPS) and routes traffic to LiveKit instances. The media path (UDP) bypasses the proxy — clients connect directly to the server node hosting their room. You configure, maintain, and monitor this yourself.
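On Kubernetes, the signaling path is typically exposed through a standard Ingress. A sketch assuming ingress-nginx, with placeholder names and hostnames; the timeout annotations keep long-lived WebSocket connections from being dropped.

```yaml
# Sketch: Ingress for the signaling path only (7880/tcp). Media (UDP) bypasses this entirely.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: livekit-signaling
  annotations:
    nginx.ingress.kubernetes.io/proxy-read-timeout: "3600"   # keep WebSockets open
    nginx.ingress.kubernetes.io/proxy-send-timeout: "3600"
spec:
  ingressClassName: nginx
  tls:
    - hosts: ["livekit.example.com"]
      secretName: livekit-tls
  rules:
    - host: livekit.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: livekit-server
                port:
                  number: 7880
```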
Agent workers
If you are running voice AI or other agent-based applications, agent workers connect to LiveKit as participants. They can run on the same Kubernetes cluster or on separate GPU-equipped nodes. Agents register with LiveKit through the same Redis instance, so dispatch routing works automatically. You handle scaling, health checks, and resource allocation.
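A sketch of an agent worker Deployment follows, assuming a CPU-only agent and a hypothetical image name; the LIVEKIT_* environment variables are the ones the Agents framework commonly reads to find your cluster.

```yaml
# Sketch: agent worker Deployment. The image is your own agent build (placeholder name).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: voice-agent
spec:
  replicas: 2                      # scale independently of LiveKit server nodes
  selector:
    matchLabels: { app: voice-agent }
  template:
    metadata:
      labels: { app: voice-agent }
    spec:
      containers:
        - name: agent
          image: registry.example.com/voice-agent:latest   # placeholder image
          env:
            - name: LIVEKIT_URL
              value: wss://livekit.example.com
            - name: LIVEKIT_API_KEY
              valueFrom: { secretKeyRef: { name: livekit-keys, key: api-key } }
            - name: LIVEKIT_API_SECRET
              valueFrom: { secretKeyRef: { name: livekit-keys, key: api-secret } }
          resources:
            requests: { cpu: "2", memory: 4Gi }
            limits: { cpu: "4", memory: 8Gi }
```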
Monitoring stack
Prometheus scrapes metrics from LiveKit and agent workers. Grafana provides dashboards. You build everything: dashboards, alerting rules, on-call rotations, runbooks. In production, you monitor room counts, participant counts, packet loss, CPU usage, bandwidth, and agent health. None of this exists out of the box — you create it all.
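A sketch of the Prometheus side, assuming metrics are exposed via the prometheus_port setting in config.yaml (verify the key and port against your server version). Node addresses are placeholders, and agent workers need a similar job pointed at whatever metrics endpoint your agent exposes.

```yaml
# Sketch: prometheus.yml scrape job for LiveKit server nodes.
scrape_configs:
  - job_name: livekit
    static_configs:
      - targets: ["livekit-node-1:6789", "livekit-node-2:6789"]   # port set via prometheus_port
```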
The critical distinction in this architecture is the separation of signaling and media. Signaling — room creation, participant join, track subscription — flows through WebSocket over HTTPS and can be proxied and load-balanced normally. Media — the actual audio and video packets — flows over UDP directly between the client and the LiveKit server node hosting the room. Your network and firewall configuration must account for both paths. Getting this wrong is the single most common self-hosting failure.
Network topology
LiveKit uses three groups of ports (signaling, RTC media, and TURN), each with different networking requirements. You are responsible for configuring all of them correctly.
| Port | Protocol | Purpose | Routing |
|---|---|---|---|
| 7880 | TCP | HTTP API + WebSocket signaling | Through load balancer / ingress |
| 7881 | TCP | RTC over TCP (fallback) | Direct to server node |
| 50000-60000 | UDP | RTC media (audio/video) | Direct to server node, cannot pass through HTTP proxy |
| 5349 | TCP | TURN over TLS | Direct to server node |
| 3478 | UDP | TURN over UDP (optional) | Direct to server node |
UDP ports must be directly reachable
The UDP port range 50000-60000 cannot pass through an HTTP reverse proxy or most Kubernetes ingress controllers. Use hostNetwork: true in your pod spec, or a UDP-capable load balancer (AWS NLB, GCP external LB). This is the most common deployment mistake and the kind of thing LiveKit Cloud handles for you automatically.
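A minimal sketch of the relevant pod template fields when taking the hostNetwork route:

```yaml
# Sketch: pod template fields that make the UDP media range directly reachable.
spec:
  template:
    spec:
      hostNetwork: true                      # pod shares the node's network namespace
      dnsPolicy: ClusterFirstWithHostNet     # keep cluster DNS working with hostNetwork
```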
The livekit-server config.yaml structure
Every LiveKit server reads a single YAML configuration file. Here is the structure with all key sections annotated.
```yaml
# HTTP API and WebSocket signaling port
port: 7880

# WebRTC media configuration
rtc:
  port_range_start: 50000
  port_range_end: 60000
  tcp_port: 7881
  use_external_ip: true   # Discover public IP for ICE candidates

# Redis for multi-node coordination
redis:
  address: redis:6379
  # password: your-redis-password
  # use_tls: true
  # db: 0

# API key/secret pairs (multiple supported for rotation)
keys:
  your-api-key: your-api-secret

# Built-in TURN server
turn:
  enabled: true
  domain: turn.example.com
  tls_port: 5349
  # udp_port: 3478

# Logging
logging:
  level: info   # debug, info, warn, error
  json: true    # Structured JSON logs for production

# Resource limits
limit:
  num_tracks: 0      # 0 = unlimited
  bytes_per_sec: 0   # 0 = unlimited
```

The use_external_ip: true setting is required for any cloud deployment where LiveKit runs behind a NAT. Without it, LiveKit advertises its private IP during ICE negotiation and external clients cannot connect. This is one configuration line among dozens you will need to get right — and debug when something breaks.
Hardware requirements and sizing
Requirements vary significantly based on expected load. LiveKit is CPU-bound for packet forwarding — CPU is almost always the bottleneck before memory or network. Remember: on Cloud, you never think about any of this.
Single-node development or testing:
- 2 CPU cores, 4 GB RAM
- 100 Mbps network
- Any modern Linux distribution (Ubuntu 22.04+ recommended)
- Docker or direct binary installation
Production (per LiveKit server node):
- 4-8 CPU cores, 8-16 GB RAM
- 1 Gbps network (dedicated, not shared)
- Low-latency storage for logs
- Linux with kernel 5.4+ for optimal UDP performance
Agent worker nodes (if running AI agents):
- CPU-only agents: 2-4 cores, 4-8 GB RAM per worker
- GPU agents (STT/TTS): 1 GPU (T4 or better), 4 cores, 16 GB RAM
- Scale agent replicas independently from LiveKit server nodes
```bash
# Quick check: verify your server meets minimum requirements
echo "CPU cores: $(nproc)"
echo "RAM: $(free -h | awk '/^Mem:/ {print $2}')"
echo "Kernel: $(uname -r)"
echo "Docker: $(docker --version 2>/dev/null || echo 'not installed')"

# Check that required ports are not already in use
ss -tulnp | grep -E ':(7880|7881|3478|5349) '
```

Capacity estimates (4-core, 8 GB node):
| Workload | Rooms per node | Notes |
|---|---|---|
| Audio-only, 2 participants | ~500 | Typical voice AI scenario |
| Audio-only, 5 participants | ~200 | Group voice calls |
| Video 720p, 4 participants | ~50 | Video conferencing |
| Video 1080p, 2 participants | ~80 | High-quality 1:1 video |
Measure, don't guess
These estimates are starting points. Deploy your actual workload on a single node, monitor CPU and bandwidth (covered in Chapter 3), and use those real numbers for capacity planning. This is another thing you are responsible for — LiveKit Cloud auto-scales without you thinking about it.
The real cost of self-hosting
Teams fixate on infrastructure cost savings. Here is what they forget to account for:
| Cost | Cloud | Self-hosted |
|---|---|---|
| Infrastructure | Pay per minute | Servers, networking, storage, GPUs |
| Engineering setup | None | 2-4 weeks of senior engineer time |
| Ongoing maintenance | None | 4-8 hours/week of operations work |
| On-call burden | LiveKit's problem | Your team's weekends |
| Upgrade testing | Automatic | Manual testing before every release |
| Security patching | Automatic | You track CVEs and patch promptly |
| Incident response | LiveKit SRE team | Your team, at 3am |
| Feature gap | Krisp, GPU turn detection, barge-in | You go without |
Even at significant scale, the total cost of ownership for self-hosting — including engineering time, on-call burden, and feature gaps — often exceeds LiveKit Cloud. The infrastructure savings get eaten by operational overhead.
What you learned
- Self-hosting is an ongoing operational commitment — not a one-time setup — that consumes engineering hours every week
- LiveKit Cloud includes features you cannot self-host: Krisp noise cancellation, GPU-accelerated turn detection, and the intelligent barge-in model
- For data sovereignty concerns, contact LiveKit about custom Cloud deployments before defaulting to self-hosting
- A production self-hosted deployment includes LiveKit Server, Redis, TURN, a reverse proxy, agent workers, and a full monitoring stack — all of which you build and maintain
- The total cost of ownership for self-hosting often exceeds Cloud when you account for engineering time, on-call burden, and lost features
Next up
If you have decided that self-hosting is genuinely required for your situation, the next chapter walks through the Kubernetes deployment — Helm charts, server configuration, Redis setup, and TURN configuration.