What is LiveKit?
LiveKit is an open-source platform that enables scalable, multi-user conferencing with WebRTC. It provides the tools you need to add real-time video, audio, and data capabilities to your applications. By combining LiveKit with Cerebras's ultra-fast inference, you can build responsive voice AI agents that handle conversations with minimal latency. Learn more at LiveKit.io.
Prerequisites
Before you begin, ensure you have:
- Cerebras API Key - Get a free API key here
- OpenAI API Key - Required for speech-to-text (Whisper). Get one at OpenAI
- LiveKit Account - Visit LiveKit Cloud and create an account to get your API credentials
- Python 3.11 - 3.13 - LiveKit agents require Python < 3.14. Verify your version with python --version.
Cerebras provides ultra-fast LLM inference but does not currently offer speech-to-text (STT) models. This guide uses OpenAI’s Whisper for STT and Cerebras for the LLM, giving you the best of both worlds.
Configure LiveKit with Cerebras
1. Create and activate a virtual environment
Set up an isolated Python environment for your project. This keeps dependencies organized and prevents conflicts with other projects.
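For example, on macOS or Linux (on Windows, use `py -m venv .venv` and `.venv\Scripts\activate`):

```shell
# Create an isolated environment in a .venv directory
python3 -m venv .venv
# Activate it for the current shell session (bash/zsh)
source .venv/bin/activate
```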
2. Install LiveKit agents and dependencies
Install the LiveKit agents framework with the necessary plugins. This includes OpenAI-compatible clients (which we’ll use to connect to Cerebras), voice activity detection (VAD), and text-to-speech capabilities.
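A typical install, assuming current package names from the LiveKit agents project (the extras pull in the OpenAI-compatible client and Silero VAD; verify the exact names against the LiveKit docs):

```shell
pip install "livekit-agents[openai,silero]~=1.0" python-dotenv
```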
The openai plugin allows LiveKit to work with any OpenAI-compatible API, including Cerebras Inference.
3. Configure environment variables
Create a .env file in your project directory with your API credentials. These credentials authenticate your application with Cerebras, OpenAI, and LiveKit services.
Get your LiveKit credentials from the LiveKit Cloud dashboard:
- LIVEKIT_URL: Your project URL (starts with wss://)
- LIVEKIT_API_KEY and LIVEKIT_API_SECRET: Generate these in Settings → Keys
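A .env file following the variable names above might look like this (all values are placeholders):

```
LIVEKIT_URL=wss://your-project.livekit.cloud
LIVEKIT_API_KEY=your_livekit_api_key
LIVEKIT_API_SECRET=your_livekit_api_secret
CEREBRAS_API_KEY=your_cerebras_api_key
OPENAI_API_KEY=your_openai_api_key
```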
4. Create a basic voice agent
Build a complete voice AI agent that uses OpenAI Whisper for speech-to-text and Cerebras for ultra-fast LLM responses.
Create a file named voice_agent.py:
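A minimal sketch of voice_agent.py, based on the LiveKit Agents 1.x API (plugin constructor arguments and defaults may differ between versions; treat this as a starting point, not the definitive implementation):

```python
import os

from dotenv import load_dotenv
from livekit import agents
from livekit.agents import Agent, AgentSession
from livekit.plugins import openai, silero

load_dotenv()


async def entrypoint(ctx: agents.JobContext):
    await ctx.connect()

    session = AgentSession(
        vad=silero.VAD.load(),  # voice activity detection
        stt=openai.STT(model="whisper-1"),  # speech-to-text via OpenAI Whisper
        # Cerebras exposes an OpenAI-compatible API, so the openai plugin
        # can target it with a base_url override and your CEREBRAS_API_KEY
        llm=openai.LLM(
            model="llama-3.3-70b",
            base_url="https://api.cerebras.ai/v1",
            api_key=os.environ["CEREBRAS_API_KEY"],
        ),
        tts=openai.TTS(),  # text-to-speech via OpenAI
    )

    await session.start(
        room=ctx.room,
        agent=Agent(instructions="You are a helpful, concise voice assistant."),
    )


if __name__ == "__main__":
    agents.cli.run_app(agents.WorkerOptions(entrypoint_fnc=entrypoint))
```

The key detail is base_url: the openai plugin speaks to any OpenAI-compatible endpoint, so pointing it at https://api.cerebras.ai/v1 routes all LLM traffic to Cerebras while STT and TTS continue to use OpenAI.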
This example uses OpenAI's Whisper for speech-to-text and OpenAI for text-to-speech, with Cerebras's llama-3.3-70b providing the ultra-fast intelligence.
5. Run and test your voice agent
Start your voice agent with the LiveKit CLI. The agent will connect to your LiveKit room and wait for a user to join.
To test your agent:
- Go to the LiveKit Agents Playground
- If authenticated, you'll see available rooms and can join directly; otherwise, use manual connection by entering the URL and token from your terminal (displayed in blue when you run the agent)
- Paste the token generated when running python voice_agent.py dev
- Click Connect
- Approve microphone access when Chrome prompts you (required for voice interaction)
- Speak into your microphone - the agent should respond!
Example Use Cases
Combining LiveKit with Cerebras enables powerful real-time AI applications:
- Multimodal Assistants - Support text, voice, and screen sharing with an AI assistant that responds instantly.
- Telehealth - Enable real-time AI support during virtual medical consultations with HIPAA-compliant infrastructure.
- Call Centers - Automate inbound and outbound customer support with AI voice agents that handle multiple conversations simultaneously.
- Real-time Translation - Translate conversations instantly across languages with minimal latency.
- Interactive Education - Create voice-enabled tutoring systems that provide immediate feedback.
- Voice Commerce - Build conversational shopping experiences with natural voice interactions.
Advanced Configuration
Using Different Cerebras Models
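Swapping models is a one-line change to the LLM configuration. A sketch assuming the OpenAI-compatible plugin used earlier (model names from Cerebras's catalog):

```python
import os

from livekit.plugins import openai

# Smaller, faster model for latency-sensitive interactions
fast_llm = openai.LLM(
    model="llama3.1-8b",
    base_url="https://api.cerebras.ai/v1",
    api_key=os.environ["CEREBRAS_API_KEY"],
)

# Larger model for complex reasoning tasks
capable_llm = openai.LLM(
    model="llama-3.3-70b",
    base_url="https://api.cerebras.ai/v1",
    api_key=os.environ["CEREBRAS_API_KEY"],
)
```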
You can easily swap models based on your needs: choose faster models for lower latency, or more capable models for complex reasoning tasks.
Troubleshooting
Agent not responding to voice input
- Check your microphone permissions - Ensure your browser or application has access to your microphone.
- Verify VAD settings - The Silero VAD may need tuning for your audio environment. Try adjusting the min_speech_duration and min_silence_duration parameters.
- Test STT independently - Make a direct API call to OpenAI Whisper to verify your audio is being transcribed correctly.
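For instance, a sketch of loading the Silero VAD with explicit thresholds (parameter names from the silero plugin; the values are illustrative, in seconds):

```python
from livekit.plugins import silero

vad = silero.VAD.load(
    min_speech_duration=0.1,   # require this much speech before activating
    min_silence_duration=0.6,  # wait this long in silence before ending a turn
)
```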
High latency in responses
- Use a smaller model - Try llama3.1-8b instead of llama-3.3-70b for faster responses. The 8B model typically responds 2-3x faster.
- Check network connectivity - Ensure stable connections to both LiveKit and Cerebras endpoints. Use ping and traceroute to diagnose network issues.
- Optimize instructions - Shorter, more focused system instructions lead to faster generation. Aim for instructions under 200 words.
- Monitor token usage - Longer conversations accumulate context. Consider implementing context window management to keep prompts concise.
Connection errors
- Verify API keys - Double-check that your CEREBRAS_API_KEY and LiveKit credentials are correct and not expired.
- Check base URL - Ensure you're using https://api.cerebras.ai/v1 for the Cerebras endpoint (note the /v1 suffix).
- Review firewall settings - LiveKit requires WebRTC connections, which may be blocked by some firewalls. Ensure UDP ports 50000-60000 are open.
- Test connectivity - Verify you can reach both services.
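A quick connectivity check using curl (replace the LiveKit host with the https:// form of your own wss:// project URL; a 401 from Cerebras without a key still confirms the endpoint is reachable):

```shell
# Cerebras API endpoint
curl -s -o /dev/null -w "%{http_code}\n" https://api.cerebras.ai/v1/models
# Your LiveKit project URL
curl -s -o /dev/null -w "%{http_code}\n" https://your-project.livekit.cloud
```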
Audio quality issues
- Enable noise cancellation - Configure noise cancellation in your RoomInputOptions if needed for noisy environments.
- Check sample rates - Ensure your audio input matches the expected sample rate for Whisper (16 kHz). Mismatched sample rates can cause quality degradation.
- Test TTS provider - Try different TTS providers if your current one isn't meeting your quality needs. LiveKit supports multiple TTS engines, including ElevenLabs and Deepgram.
- Monitor bandwidth - Poor audio quality can result from insufficient bandwidth. LiveKit automatically adjusts quality, but ensure you have at least 1 Mbps available.
Python version compatibility
- Verify Python version - LiveKit agents require Python 3.11 - 3.13. Check your version with python --version.
- Use pyenv for version management - If you need multiple Python versions, pyenv lets you install and switch between interpreters per project.
- Check async compatibility - Ensure you're using async/await syntax correctly. LiveKit agents are fully asynchronous.
Next Steps
Now that you have a working voice agent, explore these advanced features:
- Model Selection - Try different Cerebras models to optimize for speed vs. capability based on your use case.
- Custom Frontend - Build a custom client using the LiveKit client SDKs for web, iOS, or Android.
- Production Deployment - Deploy your agent to production using LiveKit Cloud or self-hosted infrastructure.
- Monitoring and Analytics - Implement logging and monitoring to track agent performance and user interactions.
- GLM4.6 Migration - See how to use the latest GLM4.6 with Cerebras in the GLM4.6 migration guide
Additional Resources
- LiveKit Documentation - Complete guide to LiveKit features and APIs
- LiveKit Agents GitHub - Source code and examples
- Cerebras API Reference - Detailed API documentation
- LiveKit Community - Get help from the LiveKit community
- Example Applications - Sample projects and use cases

