Physical AI: Voice on Hardware
Connect microcontrollers and edge devices to LiveKit
Deploy voice AI agents on physical hardware. Connect ESP32 microcontrollers to LiveKit Cloud over WebRTC, handle audio streaming with Opus, implement wake word detection, and control hardware through agent tools.
What You Build
ESP32 voice device that connects to LiveKit Cloud, streams audio, responds to wake words, and controls physical hardware via agent tools.
Prerequisites
- Course 1.1
Embedded voice architecture
25m · How an ESP32 connects to LiveKit Cloud: hardware setup (INMP441 mic, MAX98357A speaker), I2S audio configuration, the Opus codec on constrained devices, and the architecture that ties them together.
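To see why Opus matters on a constrained device, it helps to compare raw I2S PCM against an encoded stream. The numbers below (16 kHz mono, 16-bit samples, 20 ms frames, 24 kbps Opus) are illustrative assumptions for a voice pipeline, not values mandated by LiveKit:

```python
# Rough audio budget for an ESP32 voice device: raw I2S PCM vs. Opus.
# All parameters are assumptions chosen as typical values for voice capture.

SAMPLE_RATE = 16_000       # Hz; common for speech
BITS_PER_SAMPLE = 16
FRAME_MS = 20              # Opus frames range 2.5-60 ms; 20 ms is typical
OPUS_BITRATE = 24_000      # bps; a plausible voice bitrate over Wi-Fi

samples_per_frame = SAMPLE_RATE * FRAME_MS // 1000              # 320 samples
pcm_bytes_per_frame = samples_per_frame * BITS_PER_SAMPLE // 8  # 640 bytes
pcm_bps = SAMPLE_RATE * BITS_PER_SAMPLE                         # 256 kbps raw
opus_bytes_per_frame = OPUS_BITRATE * FRAME_MS // (1000 * 8)    # ~60 bytes

print(f"raw: {pcm_bps} bps, {pcm_bytes_per_frame} B/frame; "
      f"opus: {OPUS_BITRATE} bps, ~{opus_bytes_per_frame} B/frame")
```

The roughly 10x reduction is what makes continuous streaming practical on a microcontroller's Wi-Fi link and RAM budget.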
Wake word detection & audio streaming
25m · Implement always-on wake word detection with Porcupine or ESP-SR, transition from low-power listening to full audio streaming when activated, and manage power for battery-operated devices.
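The listen-to-stream transition is essentially a small state machine. A minimal sketch, assuming a hypothetical per-frame interface (this is not the Porcupine or ESP-SR API; a real device would feed PCM frames into the detector and bring up the WebRTC session on a hit):

```python
from enum import Enum, auto

class DeviceState(Enum):
    LISTENING = auto()   # low-power: only the wake-word engine runs
    STREAMING = auto()   # full audio path up, frames flowing to the agent

class WakeWordSession:
    """Illustrative listen->stream state machine (names are assumptions)."""

    def __init__(self, timeout_frames: int = 50):
        self.state = DeviceState.LISTENING
        self.silence = 0                     # consecutive non-voice frames
        self.timeout_frames = timeout_frames

    def on_frame(self, wake_detected: bool, voice_active: bool) -> DeviceState:
        if self.state == DeviceState.LISTENING and wake_detected:
            self.state = DeviceState.STREAMING   # activate full audio path
            self.silence = 0
        elif self.state == DeviceState.STREAMING:
            self.silence = 0 if voice_active else self.silence + 1
            if self.silence >= self.timeout_frames:
                self.state = DeviceState.LISTENING  # back to low power
        return self.state
```

The silence timeout is what lets a battery-powered device drop back to its low-power listening mode instead of streaming indefinitely.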
Hardware control via agent tools
25m · Use LiveKit data channels for bidirectional hardware control: register agent tools that drive LEDs, relays, and servos, and send sensor data from the device back to the agent.
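Data channels carry opaque bytes, so the agent and device need an agreed message format. A minimal sketch of one possible JSON wire format (the field names `cmd`, `pin`, and `value` are illustrative assumptions, not a LiveKit API):

```python
import json

def encode_command(cmd: str, pin: int, value: int) -> bytes:
    """Agent side: serialize a control message for the device.
    A LiveKit agent tool would publish these bytes on the data channel."""
    return json.dumps({"cmd": cmd, "pin": pin, "value": value}).encode()

def handle_command(payload: bytes, gpio_write) -> dict:
    """Device side: parse an incoming message and drive a GPIO through
    a supplied writer callback (stands in for the real pin driver)."""
    msg = json.loads(payload)
    if msg["cmd"] == "set_gpio":
        gpio_write(msg["pin"], msg["value"])
    return msg

# Simulated round trip: a dict stands in for physical GPIO state.
pins = {}
msg = handle_command(encode_command("set_gpio", 2, 1), pins.__setitem__)
```

The same framing works in reverse for sensor readings: the device serializes a JSON payload and the agent parses it before passing values to the LLM.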
Production: OTA updates & fleet management
20m · Ship embedded voice devices to production: OTA firmware updates with secure boot, graceful offline fallback via local command recognition, health telemetry, and fleet management with staged rollouts.
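One common way to stage an OTA rollout is to hash each device ID into a stable bucket, then have a release target buckets below the current rollout percentage. A sketch under that assumption (the function names and fleet IDs are hypothetical, not a LiveKit or ESP-IDF API):

```python
import hashlib

def rollout_bucket(device_id: str) -> int:
    """Map a device ID to a stable bucket in [0, 100)."""
    digest = hashlib.sha256(device_id.encode()).digest()
    return int.from_bytes(digest[:4], "big") % 100

def should_update(device_id: str, rollout_percent: int) -> bool:
    """A device takes the OTA update only once its bucket is in range,
    so raising the percentage only ever adds devices to the wave."""
    return rollout_bucket(device_id) < rollout_percent

# Example: pick a ~5% canary wave from a simulated 1000-device fleet.
fleet = [f"esp32-{i:03d}" for i in range(1000)]
canary = [d for d in fleet if should_update(d, 5)]
```

Because bucketing is deterministic, the canary wave is always a subset of later waves, and a device that fails health telemetry checks can block the percentage from advancing.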
What You Walk Away With
Ability to connect embedded devices to LiveKit Cloud for voice AI, with wake word detection, hardware control, and offline fallback.