Chapter 8

The universal SDK ecosystem


In this chapter, you will learn the full breadth of LiveKit's SDK ecosystem — client SDKs for every major platform, server SDKs for backend operations, agent SDKs for AI, and pre-built UI components that accelerate development. You will see how a consistent API design across all SDKs means that learning LiveKit once lets you build everywhere, and why this ecosystem breadth is itself a significant competitive advantage.


Why SDK breadth matters

A realtime platform is only as useful as the places you can run it. A brilliant SFU architecture means nothing if you cannot connect to it from an iOS app, an Android app, a web browser, a Unity game, a Raspberry Pi, and a backend server. Every missing SDK is a missing use case — a customer who cannot adopt the platform, an application that cannot be built.

LiveKit's ecosystem covers more platforms than any competing realtime infrastructure. This is not accidental. It reflects a strategic understanding that realtime communication is not a browser-only problem, and that the winning platform is the one developers can use wherever they need it.

Client SDKs: every platform, one API

Client SDKs connect end-user devices to LiveKit rooms. Each SDK implements the same core abstractions — Room, Participant, Track, publication, subscription — adapted to the idioms and conventions of its target platform.

| SDK | Platform | Language | Notes |
| --- | --- | --- | --- |
| livekit-client | Web browsers | TypeScript/JavaScript | The reference implementation. Works in all modern browsers. |
| LiveKit Swift | iOS, macOS, visionOS | Swift | Native Apple platform support with SwiftUI integration |
| LiveKit Android | Android | Kotlin | Native Android with Jetpack Compose support |
| LiveKit Flutter | iOS, Android, Web, Desktop | Dart | Cross-platform from a single codebase |
| LiveKit React Native | iOS, Android | TypeScript | For React Native mobile applications |
| LiveKit Unity | Windows, macOS, Linux, WebGL | C# | For games, simulations, and 3D experiences |
| LiveKit C++ | Desktop, embedded | C++ | For native desktop apps and embedded systems |
| LiveKit Rust | Desktop, server, embedded | Rust | High-performance native applications |

Learn once, build everywhere

Every client SDK exposes the same conceptual API. Connecting to a room, subscribing to tracks, publishing media, handling participant events — these operations work the same way whether you are writing TypeScript for the web, Swift for iOS, or Kotlin for Android. The method names adapt to platform conventions, but the mental model is identical.

This consistency matters for teams. A developer who has built a LiveKit web application can contribute to the mobile app without learning a new conceptual framework. The Room is still a Room. A Participant is still a Participant. Tracks still publish and subscribe. The architecture chapter you read earlier — rooms, participants, tracks — is the API you use in every SDK.

Server SDKs: backend operations

Server SDKs run on your backend and handle operations that should never happen on a client: generating access tokens, creating and managing rooms, ejecting participants, starting egress, and controlling the server-side lifecycle of your LiveKit deployment.

| SDK | Language | Primary use cases |
| --- | --- | --- |
| livekit-server-sdk-python | Python | Token generation, room management, webhooks |
| livekit-server-sdk-js | Node.js / TypeScript | Token generation, room management, webhooks |
| livekit-server-sdk-go | Go | Token generation, room management, webhooks |
| livekit-server-sdk-ruby | Ruby | Token generation, room management, webhooks |
| livekit-server-sdk-kotlin | Kotlin / Java | Token generation, room management, webhooks |
| livekit-server-sdk-php | PHP | Token generation, room management, webhooks |
| livekit-server-sdk-rust | Rust | Token generation, room management, webhooks |

Every server SDK provides the same capabilities: create access tokens with specific grants, list and manage rooms, send data to rooms, start and stop egress, and receive webhook events. The choice of server SDK is driven purely by your backend's language — not by any feature difference.

Room Service API

The Room Service API is the server-side control plane for rooms and participants. It lets your backend manage the lifecycle of rooms and participants programmatically — useful for admin dashboards, moderation, and orchestration.

| Operation | What it does |
| --- | --- |
| Create Room | Explicitly create a room with specific settings (max participants, empty timeout, metadata) before anyone joins |
| List Rooms | Enumerate all active rooms — useful for dashboards and monitoring |
| Delete Room | Force-close a room and disconnect all participants |
| List Participants | Get all participants in a specific room with their metadata and track info |
| Get Participant | Retrieve details for a specific participant by identity |
| Remove Participant | Eject a participant from a room (moderation, policy enforcement) |
| Mute Published Track | Server-side mute a participant's track (moderation) |
| Update Participant | Change a participant's metadata, permissions, or name |
| Update Room Metadata | Change room-level metadata visible to all participants |
| Send Data | Push data messages to participants from the server side |

Most agents do not need the Room Service API

For typical voice AI workflows, rooms are created automatically when participants join. The Room Service API is for when you need server-side control: building admin tools, enforcing policies, orchestrating multi-room workflows, or integrating with external systems that need to inspect or modify room state.

Webhooks for server-side events

Server SDKs also handle webhook validation. LiveKit sends webhooks for events like participant joined, participant left, room started, room finished, egress completed, and more. Your server SDK validates the webhook signature and provides typed event objects, making it straightforward to react to room lifecycle events from your backend.

Key webhook events include:

| Event | When it fires |
| --- | --- |
| room_started | A new room is created |
| room_finished | A room is closed (all participants left) |
| participant_joined | A participant enters a room |
| participant_left | A participant leaves a room |
| track_published | A participant publishes a new track |
| track_unpublished | A participant removes a track |
| egress_started | An egress (recording/stream) begins |
| egress_ended | An egress completes |
| ingress_started | An ingress (RTMP/WHIP ingest) begins |
| ingress_ended | An ingress ends |

Webhooks are essential for server-side automation: logging call records, triggering post-call processing, updating CRM systems when calls end, or alerting when rooms exceed expected durations.

Agent SDKs: building AI participants

The Agents Framework — LiveKit's toolkit for building AI participants — has dedicated SDKs in two languages:

| SDK | Language | Focus |
| --- | --- | --- |
| livekit-agents (Python) | Python | Full agent framework with plugin ecosystem for STT, LLM, TTS |
| livekit-agents (Node.js) | Node.js / TypeScript | Full agent framework, same capabilities as the Python SDK |

Agent SDKs are more than thin wrappers around the client SDK. They provide the Agents Framework — a structured way to build AI participants that handle the full voice AI pipeline (STT, LLM, TTS), manage conversation state, use tools, and interact with room participants. The framework handles the hard parts: voice activity detection, interruption handling, turn-taking, audio buffering, and graceful degradation.

Python is the primary language for AI/ML development, and the Python agent SDK has the broadest plugin ecosystem — integrations with OpenAI, Anthropic, Deepgram, ElevenLabs, Cartesia, Silero, and many others. The Node.js SDK provides the same core framework for teams whose backend is JavaScript-native.

What's happening

The three SDK tiers — client, server, agent — mirror the three roles in a LiveKit application. Clients connect users to rooms. Servers manage infrastructure and authorization. Agents add intelligence. Each tier has SDKs in the languages that make sense for that role, and all three tiers share the same underlying concepts.

UI components: pre-built interfaces

Building realtime UI from scratch is tedious and error-prone. Managing video tile layouts, audio level visualizations, connection state indicators, mute buttons, and participant lists involves substantial frontend work that is the same across most applications. LiveKit provides pre-built UI component libraries that handle this common work.

| Library | Platform | What it provides |
| --- | --- | --- |
| @livekit/components-react | React (Web) | Video/audio renderers, layout components, controls, chat |
| LiveKit Components (Swift) | SwiftUI (iOS/macOS) | Native video views, audio visualizers, participant lists |
| LiveKit Components (Android) | Jetpack Compose | Native video renderers, controls, participant management |

The React component library is the most mature, providing composable components for video grids, audio visualizers, chat interfaces, media device selectors, and connection management. These components handle the visual complexity of realtime communication — tile reflow when participants join or leave, audio-level-driven speaker highlighting, responsive layouts for different screen sizes — so you can focus on your application's unique features.

Composable, not monolithic

LiveKit's UI components are designed to be composed, not consumed as a monolith. You can use the full pre-built layout, or you can use individual primitives — a single video renderer, a single audio visualizer — and arrange them however your design requires. The components handle the hard realtime parts; you control the design.

ESP32: voice AI on microcontrollers

At the far edge of the ecosystem sits something remarkable: LiveKit support for the ESP32, a low-cost microcontroller commonly used in IoT devices. This means you can build a hardware device — a smart speaker, a voice-controlled appliance, an embedded kiosk — that connects directly to a LiveKit room and participates in voice AI conversations.

The ESP32 integration represents the logical extreme of LiveKit's "everything is a participant" philosophy. A five-dollar microcontroller with a microphone and speaker becomes a full room participant, publishing and subscribing to audio tracks, receiving text streams, and interacting with AI agents — all through the same room model that powers browser and mobile applications.

This is not a toy. Voice AI on embedded hardware opens use cases that browser and mobile apps cannot serve: always-on devices in physical spaces, industrial equipment with voice interfaces, accessibility devices, and hardware products with conversational AI built in.


The ecosystem as competitive advantage

Consider the full picture:

| Category | Platforms covered |
| --- | --- |
| Web | All modern browsers via JavaScript/TypeScript |
| Mobile | iOS (Swift), Android (Kotlin), cross-platform (Flutter, React Native) |
| Desktop | macOS, Windows, Linux via C++, Rust, Unity |
| Games/3D | Unity (C#) |
| Embedded/IoT | ESP32, any platform via C++ or Rust |
| Server | Python, Node.js, Go, Ruby, Kotlin/Java, PHP, Rust |
| AI agents | Python, Node.js |

No other realtime platform covers this range. Competing platforms typically cover web and mobile, with limited or no support for desktop, games, embedded, or AI agents. The breadth of LiveKit's ecosystem means you can start with a web prototype, add a mobile app, integrate a voice AI agent, deploy to hardware — all using the same platform, the same room model, the same conceptual API.

What's happening

SDK ecosystem breadth is a compounding advantage. Every new SDK expands the set of applications that can be built on the platform, which attracts more developers, which justifies investment in more SDKs. LiveKit's coverage — from ESP32 microcontrollers to Unity games to server-side Go — represents years of sustained investment in making the platform accessible everywhere developers need it. For teams evaluating realtime infrastructure, this breadth significantly de-risks the choice: wherever your application needs to go next, LiveKit can follow.

Concepts covered: SDKs · Consistent API · Agents · UI · Server SDKs · ESP32