
Realtime API Beta

Featherless's real-time API is now available in closed beta.

In October 2024, OpenAI launched a real-time audio protocol. This protocol, which runs over either a WebSocket or a WebRTC connection, is a stateful, real-time-native interface intended to support the development of speech-to-speech applications with low latency requirements. Like their Assistants API, this API is still in beta today.

Today, February 23rd, 2025, Featherless is pleased to announce that our real-time audio API is now available as a private beta. This allows any of the 3,800+ models in our catalogue to be used via the OpenAI Realtime audio protocol, with a Speech-to-Text (STT) model as the input processor and a Text-to-Speech (TTS) model as the output processor. A real-time session consumes 2 concurrency units for the STT and TTS layer, plus whatever concurrency is needed for the underlying model. This means that Premium subscribers can use one real-time connection with a model up to 34B in size.

At this time, our API implements a subset of OpenAI's specification and will be of interest to developers who are building client applications and are familiar with OpenAI's real-time API. As our list of supported applications grows, we will expand the beta to end users; we plan to add support first in our assistant app, Phoenix, and later in Wyvern.

To join the private beta, please email us at [email protected].

Support

Our API follows OpenAI’s real-time API definition closely (official docs here and here). We follow the same session establishment mechanism and support both WebSockets and WebRTC. We do not currently support voice activity detection (VAD), which means a client application must allow the user to take an action outside of the audio channel to trigger an interaction (see the example after the event tables below). We support a subset of the events in the spec, specifically:

Client Events

Event                      | Currently supported? | Reference
session.update             |                      | ref
input_audio_buffer.append  |                      | ref
input_audio_buffer.commit  |                      | ref
input_audio_buffer.clear   |                      | ref
conversation.item.create   |                      | ref
conversation.item.truncate |                      | ref
conversation.item.delete   |                      | ref
response.create            |                      | ref
response.cancel            |                      | ref

Server Events

Event                                                 | Generated? | Reference
error                                                 |            | ref
session.created                                       |            | ref
session.updated                                       |            | ref
conversation.created                                  |            | ref
conversation.item.created                             |            | ref
conversation.item.input_audio_transcription.completed |            | ref
conversation.item.input_audio_transcription.failed    |            | ref
conversation.item.truncated                           |            | ref
conversation.item.deleted                             |            | ref
input_audio_buffer.committed                          |            | ref
input_audio_buffer.cleared                            |            | ref
input_audio_buffer.speech_started                     |            | ref
input_audio_buffer.speech_stopped                     |            | ref
response.created                                      |            | ref
response.done                                         |            | ref
response.output_item.added                            |            | ref
response.output_item.done                             |            | ref
response.content_part.added                           |            | ref
response.content_part.done                            |            | ref
response.text.delta                                   |            | ref
response.text.done                                    |            | ref
response.audio_transcript.delta                       |            | ref
response.audio_transcript.done                        |            | ref
response.audio.delta                                  |            | ref
response.audio.done                                   |            | ref
response.function_call_arguments.delta                |            | ref
response.function_call_arguments.done                 |            | ref
rate_limits.updated                                   |            | ref
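
Event payloads follow OpenAI’s published schemas. Because voice activity detection is not supported, the client drives each turn explicitly using the client events above. The sketch below shows one audio turn; the field names are assumptions based on OpenAI’s spec, and `send` stands in for whichever transport from the next section you use.

Example client-driven audio turn (no VAD)
// A minimal, client-driven audio turn. `send` is any helper that serializes
// an event onto the WebRTC data channel or websocket established below.
function sendAudioTurn(send, base64AudioChunks) {
  // Stream captured audio (base64-encoded chunks) into the input buffer
  for (const chunk of base64AudioChunks) {
    send({ type: "input_audio_buffer.append", audio: chunk });
  }

  // Commit the buffer, turning it into a user message item
  send({ type: "input_audio_buffer.commit" });

  // Explicitly request a model response - this step stands in for server-side VAD
  send({ type: "response.create" });
}

The server then replies with input_audio_buffer.committed, response.created, a stream of response.audio.delta events, and finally response.done.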

Session Establishment

WebRTC

WebRTC is the intended form of connection for browser-based applications. WebRTC session establishment happens in three parts:

  1. the server generates an ephemeral key for the current session

  2. the client uses the ephemeral key to obtain a WebRTC session description

  3. the client connects an RTCPeerConnection object via that session description

The endpoints needed for steps 1 and 2 above are https://api.featherless.ai/v1/realtime/sessions and https://api.featherless.ai/v1/realtime, respectively.

Server-side code for generating the ephemeral key for the current session (step 1) might look as follows:

WebRTC Session Establishment, Part I
import express from "express";

const app = express();

// An endpoint which works with the client code below - it proxies a request
// to the protected Featherless sessions endpoint and returns the result
app.get("/session", async (req, res) => {
  const r = await fetch("https://api.featherless.ai/v1/realtime/sessions", {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${process.env.FEATHERLESS_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: "recursal/QRWKV6-32B-Instruct-Preview-v0.1",
      voice: "Darok/tommy",
    }),
  });
  const data = await r.json();

  // Send back the JSON we received from the Featherless REST API
  res.send(data);
});

app.listen(3000);

The client-side code for connecting via WebRTC (steps 2 and 3) might look like the following:

Example client-side WebRTC connection establishment code
async function init() {
  // Get an ephemeral key from your server - see server code above
  const tokenResponse = await fetch("/session");
  const data = await tokenResponse.json();
  const EPHEMERAL_KEY = data.client_secret.value;

  // Create a peer connection
  const pc = new RTCPeerConnection();

  // Set up to play remote audio from the model
  const audioEl = document.createElement("audio");
  audioEl.autoplay = true;
  pc.ontrack = e => audioEl.srcObject = e.streams[0];

  // Add local audio track for microphone input in the browser
  const ms = await navigator.mediaDevices.getUserMedia({
    audio: true
  });
  pc.addTrack(ms.getTracks()[0]);

  // Set up data channel for sending and receiving events
  const dc = pc.createDataChannel("oai-events");
  dc.addEventListener("message", (e) => {
    // Realtime server events appear here!
    console.log(e);
  });

  // Start the session using the Session Description Protocol (SDP)
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);

  const baseUrl = "https://api.featherless.ai/v1/realtime";
  const model = "recursal/QRWKV6-32B-Instruct-Preview-v0.1";
  const sdpResponse = await fetch(`${baseUrl}?model=${model}`, {
    method: "POST",
    body: offer.sdp,
    headers: {
      Authorization: `Bearer ${EPHEMERAL_KEY}`,
      "Content-Type": "application/sdp"
    },
  });

  const answer = {
    type: "answer",
    sdp: await sdpResponse.text(),
  };
  await pc.setRemoteDescription(answer);
}

init();
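
Once connected, client events from the table above are sent over the data channel as JSON strings. The sketch below shows one way to configure the session and request a first response; the helper name, the instructions string, and the session.update fields are illustrative and assume OpenAI’s event schema.

Example sending client events over the data channel
// Call this from init() right after the data channel is created.
function configureSession(dc) {
  dc.addEventListener("open", () => {
    // Set instructions for the session
    dc.send(JSON.stringify({
      type: "session.update",
      session: { instructions: "You are a concise, friendly voice assistant." },
    }));

    // Explicitly request a spoken response (there is no server-side VAD)
    dc.send(JSON.stringify({ type: "response.create" }));
  });
}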

Websockets

The websocket transport is intended for server-to-server communication scenarios.

Example (server-side) websocket connection code
import WebSocket from "ws";

const url = "wss://api.featherless.ai/v1/realtime?model=recursal/QRWKV6-32B-Instruct-Preview-v0.1";
const ws = new WebSocket(url, {
  headers: {
    "Authorization": "Bearer " + process.env.FEATHERLESS_API_KEY,
    "OpenAI-Beta": "realtime=v1",
  },
});

ws.on("open", function open() {
  console.log("Connected to server.");
});

ws.on("message", function incoming(message) {
  console.log(JSON.parse(message.toString()));
});
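
Once the socket is open, the same client events can be sent as JSON strings. The sketch below extends the handlers above with a text-only exchange; the item and response payload shapes are assumptions based on OpenAI’s spec.

Example sending client events over the websocket
ws.on("open", function open() {
  // Create a user message item, then explicitly request a response
  ws.send(JSON.stringify({
    type: "conversation.item.create",
    item: {
      type: "message",
      role: "user",
      content: [{ type: "input_text", text: "Give me a one-sentence welcome." }],
    },
  }));
  ws.send(JSON.stringify({ type: "response.create" }));
});

ws.on("message", function incoming(message) {
  const event = JSON.parse(message.toString());
  if (event.type === "response.audio_transcript.delta") {
    // The transcript of the generated audio streams in as text deltas
    process.stdout.write(event.delta);
  } else if (event.type === "response.done") {
    console.log("\nresponse complete");
  }
});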

Supported Voices

  • Darok/america

  • Darok/joshua

  • Darok/paola

  • Darok/jessica

  • Darok/grace

  • Darok/maya

  • Darok/knightley

  • Darok/myriam

  • Darok/tommy