Overview

Use this endpoint for realtime sessions such as streaming speech recognition, speech synthesis, speech translation, or realtime multimodal models. The endpoint behaves differently depending on the request: a plain (non-WebSocket) GET request returns endpoint metadata, while a WebSocket upgrade request opens a session that is proxied to the upstream realtime provider.
Agents should first discover realtime-capable models with /v1/models, then choose a model whose supported endpoints include realtime before opening the socket.
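
The discovery step could be sketched as follows. The response shape is an assumption: this page does not document /v1/models, so the `data` array, the `supported_endpoints` field, and the `'realtime'` value are hypothetical names used for illustration, as are the https://api.tokenlab.sh base URL (inferred from the WebSocket host) and the use of the same Bearer key. Adjust them to the actual /v1/models contract.

```javascript
// ASSUMPTION: /v1/models returns { data: [{ id, supported_endpoints: [...] }] }.
// `supported_endpoints` and the value 'realtime' are hypothetical field names.
function pickRealtimeModels(payload) {
  const models = Array.isArray(payload?.data) ? payload.data : [];
  return models
    .filter(
      (m) =>
        Array.isArray(m.supported_endpoints) &&
        m.supported_endpoints.includes('realtime')
    )
    .map((m) => m.id);
}

// ASSUMPTION: the REST base URL mirrors the WebSocket host and accepts the
// same Bearer API key. Requires a runtime with global fetch (Node 18+).
async function listRealtimeModels(apiKey) {
  const res = await fetch('https://api.tokenlab.sh/v1/models', {
    headers: { Authorization: `Bearer ${apiKey}` },
  });
  if (!res.ok) {
    throw new Error(`GET /v1/models failed with status ${res.status}`);
  }
  return pickRealtimeModels(await res.json());
}
```

Keeping the filter separate from the network call makes it easy to adapt the field names once the real response shape is known.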

Connection

model (query parameter, string, required)
Realtime model ID. Use a model whose public contract lists realtime support.

Authorization (header, string, required)
Bearer API key. WebSocket clients should send Authorization: Bearer sk-your-api-key during the upgrade request.

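// Node.js example using the ws package.
import WebSocket from 'ws';

// The model goes in the query string; the API key goes in the upgrade request headers.
const socket = new WebSocket('wss://api.tokenlab.sh/v1/realtime?model=qwen-tts-realtime', {
  headers: { Authorization: 'Bearer sk-your-api-key' }
});

socket.on('open', () => {
  // The event shape depends on the selected model's provider.
  socket.send(JSON.stringify({ type: 'session.start' }));
});

socket.on('message', (data) => {
  console.log('realtime event', data.toString());
});

// Without an 'error' listener, a failed upgrade (for example, a rejected API key) throws.
socket.on('error', (err) => {
  console.error('realtime error', err.message);
});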

Messages

TokenLab forwards WebSocket messages between your client and the routed realtime provider. Use the event shapes defined by the provider of the selected realtime model, and pass model in the query string rather than in each event.
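
A minimal sketch of both points, assuming the Node.js ws package, whose 'message' listener receives `(data, isBinary)`. The helper names `buildRealtimeUrl` and `decodeEvent` are hypothetical, and whether a given provider sends JSON text frames, binary audio frames, or both is an assumption to check against that provider's documentation.

```javascript
// Put the model in the query string once, instead of in each event.
function buildRealtimeUrl(model) {
  const url = new URL('wss://api.tokenlab.sh/v1/realtime');
  url.searchParams.set('model', model);
  return url.toString();
}

// Classify an incoming frame without assuming a provider-specific schema.
// Binary frames are passed through untouched; text frames are parsed as JSON
// when possible and returned as plain text otherwise.
function decodeEvent(data, isBinary = false) {
  if (isBinary) {
    return { kind: 'binary', data };
  }
  const text = typeof data === 'string' ? data : data.toString('utf8');
  try {
    return { kind: 'json', event: JSON.parse(text) };
  } catch {
    return { kind: 'text', text };
  }
}
```

With ws, this could be wired up as `socket.on('message', (data, isBinary) => handle(decodeEvent(data, isBinary)))`, where `handle` is your own provider-aware dispatcher.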

Billing and closing

Realtime sessions use the same API key balance as other endpoints. TokenLab pre-deducts a small estimate when the socket opens, then settles or refunds when the session closes. Close the client socket when the session is complete. If the upstream closes first, TokenLab relays that close code to your client when possible.
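
Since settlement happens when the session closes, it may help to close explicitly and wait for the close handshake to finish. The following is a sketch under stated assumptions, not a prescribed shutdown procedure: `closeWhenDone` is a hypothetical helper, the 5-second timeout is arbitrary, and the socket is expected to expose the ws package's `close(code, reason)`, `once('close', ...)`, and `terminate()` methods.

```javascript
// Close the socket with a normal-closure code and resolve once the close
// handshake completes. If the peer never answers, force the connection down
// so the session does not linger.
function closeWhenDone(socket, { timeoutMs = 5000 } = {}) {
  return new Promise((resolve) => {
    const timer = setTimeout(() => {
      // terminate() exists on ws sockets; optional call keeps other clients safe.
      socket.terminate?.();
      resolve({ code: 1006, reason: 'close handshake timed out', forced: true });
    }, timeoutMs);

    socket.once('close', (code, reason) => {
      clearTimeout(timer);
      // ws passes the reason as a Buffer; normalize it to a string.
      resolve({ code, reason: reason ? reason.toString() : '', forced: false });
    });

    socket.close(1000, 'session complete');
  });
}
```

The same 'close' listener also fires when the upstream closes first, so logging the received code is a simple way to see what TokenLab relayed.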