ElevenLabs Voice Interface UI for React: Build It

A voice interface is mostly state (idle, listening, thinking, speaking), and the UI has to make each one obvious or the whole thing feels broken.

Lawrence Arya, Founder & CEO of VP0 · June 3, 2026 · 6 min read · Updated June 4, 2026

TL;DR

The fastest free way to build an ElevenLabs voice interface in React is to start from a finished design on VP0, generate the mic button, state indicator and transcript UI in Cursor or Claude Code, then stream audio through a server that holds your API key. VP0 is the free, AI-readable design library that AI builders copy from, so the model nails the layout and the states from a concrete target. That leaves your attention on what matters: the streaming, the microphone permissions, and keeping the ElevenLabs key off the client.

A voice UI is a state machine

The interaction has a handful of states: idle (ready), listening (capturing the mic), thinking (processing), and speaking (playing audio back), plus an error state. Each must be visually distinct, because a voice interface with no visible state feels broken. This is the kind of clear, stateful layout an AI editor builds well from a design, much like the player covered in the AI-generated audio player for React, where the visible state drives trust.
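One way to keep those states honest is to model them as an explicit transition function, so the UI can only ever be in one named state. The sketch below is a minimal illustration, not a prescribed design: the event names (MIC_STARTED, MIC_STOPPED, AUDIO_STARTED, AUDIO_ENDED, FAILED, RESET) are hypothetical and should be mapped to whatever your own pipeline emits.

```typescript
// Hypothetical state and event names; adapt them to your own pipeline.
type VoiceState = "idle" | "listening" | "thinking" | "speaking" | "error";

type VoiceEvent =
  | { type: "MIC_STARTED" }
  | { type: "MIC_STOPPED" }
  | { type: "AUDIO_STARTED" }
  | { type: "AUDIO_ENDED" }
  | { type: "FAILED" }
  | { type: "RESET" };

// Pure transition function: returns the next state, or the current one
// when the event does not apply in that state.
function nextVoiceState(state: VoiceState, event: VoiceEvent): VoiceState {
  if (event.type === "FAILED") return "error"; // any state can fail
  if (event.type === "RESET") return "idle"; // stop control or error recovery
  switch (state) {
    case "idle":
      return event.type === "MIC_STARTED" ? "listening" : state;
    case "listening":
      return event.type === "MIC_STOPPED" ? "thinking" : state;
    case "thinking":
      return event.type === "AUDIO_STARTED" ? "speaking" : state;
    case "speaking":
      return event.type === "AUDIO_ENDED" ? "idle" : state;
    case "error":
      return state; // only RESET leaves the error state
  }
}
```

Because the function is pure, it can be passed to React's useReducer with "idle" as the initial state, and the indicator component can render straight from the returned value.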

Keep the API key on the server

This is the rule that protects you. Never ship the ElevenLabs API key in client code, where it can be extracted and abused. Route requests through your own backend: the app sends audio or text, your server calls ElevenLabs with the key, and streams the result back. The browser owns the mic through the getUserMedia and MediaRecorder APIs, which are supported in over 97% of browsers in use per caniuse, and it owns playback. Your server owns the credential.
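As a rough sketch of that split, the server-side helper below builds the upstream request and returns the streaming body for your route handler to pipe to the browser. Treat the details as assumptions rather than the definitive integration: the endpoint path, the xi-api-key header and the request body reflect the ElevenLabs streaming text-to-speech API as commonly documented and should be checked against the current API reference, and the function names are hypothetical.

```typescript
// Assumed ElevenLabs streaming text-to-speech endpoint; confirm against the current API reference.
const ELEVENLABS_TTS_BASE = "https://api.elevenlabs.io/v1/text-to-speech";

interface TtsRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Builds the upstream request. The key is passed in from server config
// (for example an environment variable) and never appears in client code.
function buildTtsRequest(text: string, voiceId: string, apiKey: string): TtsRequest {
  if (!apiKey) throw new Error("Missing ElevenLabs API key on the server");
  return {
    url: `${ELEVENLABS_TTS_BASE}/${encodeURIComponent(voiceId)}/stream`,
    init: {
      method: "POST",
      headers: { "xi-api-key": apiKey, "Content-Type": "application/json" },
      body: JSON.stringify({ text }),
    },
  };
}

// Server-only: calls ElevenLabs and hands back the streaming body so a
// route handler can pipe it to the browser without buffering.
async function proxyTts(
  text: string,
  voiceId: string,
  apiKey: string,
): Promise<ReadableStream<Uint8Array>> {
  const { url, init } = buildTtsRequest(text, voiceId, apiKey);
  const upstream = await fetch(url, init);
  if (!upstream.ok || !upstream.body) {
    throw new Error(`ElevenLabs request failed with status ${upstream.status}`);
  }
  return upstream.body;
}
```

The React app only ever calls your own route; the key is read on the server and passed to proxyTts there, so it never reaches the bundle.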

Map the voice UI to the work

Part	Generate from design	Wire yourself
Mic button	Button with active state	getUserMedia, recording indicator
State indicator	Idle / listening / thinking / speaking	Drive from real pipeline events
Transcript	Message list	Stream partial text, scroll behavior
Audio playback	Player controls	Stream from your server, not directly from ElevenLabs
Permissions	Prompt UI	In-context request, consent each session
API call	None (server)	Backend holds the ElevenLabs key
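For the transcript row, streaming partial text usually means replacing an in-progress message rather than appending a new one on every update. The sketch below shows one possible approach; the message shape and the final flag are hypothetical, not something a particular speech API is known to return.

```typescript
// Hypothetical message shape; `final` marks text that will no longer change.
interface TranscriptMessage {
  id: string;
  role: "user" | "assistant";
  text: string;
  final: boolean;
}

// Returns a new list: replaces the message with the same id, or appends it.
function applyTranscriptUpdate(
  messages: readonly TranscriptMessage[],
  update: TranscriptMessage,
): TranscriptMessage[] {
  const index = messages.findIndex((m) => m.id === update.id);
  if (index === -1) return [...messages, update];
  // Ignore late partials that arrive after the final text.
  if (messages[index].final && !update.final) return [...messages];
  return messages.map((m, i) => (i === index ? update : m));
}
```

Returning a new array each time keeps the function safe to use inside a React state setter, and keying list items by id stops the row from remounting as the partial text grows.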

A worked example

Open VP0, pick a voice or chat design, and paste it into Cursor. Ask for a typed React component with a mic button, an animated state indicator and a transcript list, using your tokens. Wire the mic with getUserMedia and show a clear recording indicator the moment capture starts. Send the captured audio to your backend, which calls ElevenLabs with the server-side key and streams audio back for playback. Drive the state indicator from real pipeline events so idle, listening, thinking and speaking are always accurate. Add an error state and an obvious stop control. The UI came from the design; your work was the streaming, the permissions and the key security. For a node-based agent flow behind it, see the React Flow node editor AI generator.
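The mic step in that walkthrough might look like the following minimal sketch, which uses getUserMedia and MediaRecorder. The candidate MIME types, the 250 ms chunk interval and the function names are assumptions for illustration; what your backend accepts will determine the real values.

```typescript
// Candidate MIME types are an assumption; browsers differ in what they record.
const CANDIDATE_TYPES = ["audio/webm;codecs=opus", "audio/webm", "audio/mp4"];

// Returns the first supported type, or undefined to let the browser choose.
function pickMimeType(isSupported: (type: string) => boolean): string | undefined {
  return CANDIDATE_TYPES.find(isSupported);
}

interface Recording {
  stop: () => void;
}

// Requests the mic, starts recording, and reports chunks as they arrive.
// Browser-only: call it from a click handler so the permission prompt is in context.
async function startRecording(
  onChunk: (chunk: Blob) => void,
  onStart: () => void,
): Promise<Recording> {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const mimeType = pickMimeType((t) => MediaRecorder.isTypeSupported(t));
  const recorder = new MediaRecorder(stream, mimeType ? { mimeType } : undefined);
  recorder.ondataavailable = (e) => {
    if (e.data.size > 0) onChunk(e.data);
  };
  recorder.onstart = onStart; // show the recording indicator here
  recorder.start(250); // emit a chunk roughly every 250 ms
  return {
    stop: () => {
      recorder.stop();
      stream.getTracks().forEach((track) => track.stop()); // release the mic
    },
  };
}
```

Wiring onStart to the state indicator means the listening state appears only once capture has actually begun, and calling stop releases the mic so the browser's own recording indicator turns off too. A rejected getUserMedia promise, for example when the user denies permission, is the natural trigger for the error state.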

Common mistakes

The first mistake is putting the ElevenLabs key in the client, where it leaks. The second is starting to listen with no visible recording indicator, which breaks trust and consent. The third is hiding the current state, so the user cannot tell if the app is listening or thinking. The fourth is buffering the entire audio response instead of streaming it, which adds latency. The fifth is ignoring the error state, so a failed call looks like a frozen app.

Key takeaways

Start free from a VP0 design so the AI nails the voice UI and its states.
A voice interface is a state machine: make idle, listening, thinking, speaking and error distinct.
Never ship the ElevenLabs key in the client; route through a server that holds it.
Use getUserMedia and MediaRecorder for the mic, with a clear recording indicator and consent.
Stream audio back rather than buffering, and always show an error state.

Keep reading: for a 3D interface see the React Three Fiber AI 3D generator, and for the generation workflow see AI for generating React code.

FAQ

How do you build an ElevenLabs voice interface UI in React?

Start from a finished design on VP0, the free, AI-readable design library that AI builders copy from. Generate the mic button, the state indicator (idle, listening, thinking, speaking) and the transcript in Cursor or Claude Code, then stream audio through your own server that holds the ElevenLabs API key. The AI builds the UI; you own the streaming, the mic permissions and the key security.

Should the ElevenLabs API key be in the React app?

No. Never put the ElevenLabs API key in client code, where it can be extracted and abused. Route requests through your own backend that holds the key, calls ElevenLabs, and streams audio back to the app. The browser handles the mic and playback; your server handles the credential and the API call.

What states does a voice interface UI need?

At minimum: idle (ready), listening (capturing mic), thinking (processing), and speaking (playing audio back), plus an error state. Make each visually distinct, because a voice UI with no visible state feels broken. A clear indicator and an obvious mic control are what make the interaction feel responsive.

How do I handle the microphone in a voice UI?

Request microphone permission in context, show a clear recording indicator when capturing, and never start listening silently. Use the browser MediaRecorder and getUserMedia APIs, which are supported in over 97% of browsers in use. Always give the user an obvious way to stop, and ask for consent again each session rather than assuming it carries over.

Can AI generate a voice interface component?

It generates the UI well: the mic button, the animated state indicator, the transcript list and the layout, especially from a design. Treat the audio pipeline as your responsibility: streaming, permissions, error handling and keeping the API key server-side. The AI gives you a strong visual first draft you then wire up.

