Route Mapping
| Existing workload | TokenLab base URL | Primary endpoint | Migration note |
|---|---|---|---|
| OpenAI Chat Completions | https://api.tokenlab.sh/v1 | /chat/completions | Smallest change for OpenAI-compatible chat and function calling |
| OpenAI Responses | https://api.tokenlab.sh/v1 | /responses | Use when your app depends on Responses-specific input, tools, or output handling |
| Anthropic SDK | https://api.tokenlab.sh | /v1/messages | Do not append /v1 to the SDK base URL |
| Gemini REST | https://api.tokenlab.sh | /v1beta/models/:model:generateContent | Keep Gemini-native fields on the Gemini route |
| Media generation | https://api.tokenlab.sh/v1 | /images, /videos, /music, /3d | Discover models with recommended_for and expect async polling where documented |
| Management and billing | https://api.tokenlab.sh/v1 | /management/... | Use management tokens for server-side usage and billing reconciliation |
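
As a rough illustration of the table above, the sketch below joins each base URL with its primary endpoint. The dictionary keys and the `endpoint_url` helper are hypothetical names chosen for this example, and the `:model` path segment in the Gemini row is assumed to be a placeholder for the model name.

```python
# Base URL and primary endpoint per workload, copied from the route mapping table.
# The keys ("openai_chat", "anthropic", ...) are illustrative labels only.
ROUTES = {
    "openai_chat": ("https://api.tokenlab.sh/v1", "/chat/completions"),
    "openai_responses": ("https://api.tokenlab.sh/v1", "/responses"),
    "anthropic": ("https://api.tokenlab.sh", "/v1/messages"),
    "gemini": ("https://api.tokenlab.sh", "/v1beta/models/{model}:generateContent"),
}


def endpoint_url(workload: str, model: str = "") -> str:
    """Join base URL and endpoint, filling in the model segment where the route has one."""
    base, path = ROUTES[workload]
    if "{model}" in path:
        if not model:
            raise ValueError("this route needs a model name in the path")
        path = path.format(model=model)
    return base.rstrip("/") + path
```

Note that the Anthropic entry keeps `/v1` in the endpoint rather than in the base URL, which mirrors the migration note about not appending `/v1` to the SDK base URL.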
OpenAI-Compatible Migration
Check available models with `GET /v1/models` before sending production traffic. For image generation, send `model` explicitly and read the image guide, because image models differ more than chat models.
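
A pre-flight check along these lines might look like the sketch below. It assumes the `GET /v1/models` body follows the OpenAI-style list shape (`{"data": [{"id": ...}]}`), which this guide does not spell out; `missing_models` is a hypothetical helper name.

```python
def missing_models(models_response: dict, required: list[str]) -> list[str]:
    """Return the required model IDs that do not appear in a models list response."""
    # Assumption: the response body looks like {"data": [{"id": "..."}, ...]}.
    available = {item.get("id") for item in models_response.get("data", [])}
    return [model_id for model_id in required if model_id not in available]
```

Running this against the parsed response before a deploy gives a simple gate: an empty list means every model the app depends on was listed.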
Anthropic Migration
Use `/v1/messages` for Claude-native tool use, thinking flows, and Anthropic message semantics. Do not translate Anthropic-only fields through Chat Completions unless you intentionally want an OpenAI-compatible behavior change.
Gemini Migration
Use `/v1beta` when your app depends on Gemini-native behavior.
Media Migration
- Query `GET /v1/models?recommended_for=image|video|music|3d`.
- Read `tokenlab.public_contract_summary` in list responses and the full `tokenlab.public_contract` where available.
- Send an explicit `model`, especially for image endpoints.
- Store `task_id`, `poll_url`, endpoint, model, and your own job ID for async jobs.
- Reconcile costs through usage records and `billing_transaction_id`, not provider task IDs.
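
The checklist above can be sketched as follows. The sketch assumes that `recommended_for` takes one media kind per request and that an async create response exposes `task_id` and `poll_url` as top-level keys; `discovery_url`, `AsyncJobRecord`, and `record_from_create` are hypothetical names for this example.

```python
from dataclasses import dataclass
from urllib.parse import urlencode

BASE_URL = "https://api.tokenlab.sh/v1"
MEDIA_KINDS = {"image", "video", "music", "3d"}


def discovery_url(kind: str) -> str:
    """Build the model discovery URL for one media kind."""
    if kind not in MEDIA_KINDS:
        raise ValueError(f"unknown media kind: {kind}")
    return f"{BASE_URL}/models?{urlencode({'recommended_for': kind})}"


@dataclass(frozen=True)
class AsyncJobRecord:
    """The fields the checklist says to store for each async job."""
    job_id: str    # your own job ID
    task_id: str
    poll_url: str
    endpoint: str
    model: str


def record_from_create(job_id: str, endpoint: str, model: str, create_response: dict) -> AsyncJobRecord:
    # Assumption: task_id and poll_url are top-level keys in the create response.
    return AsyncJobRecord(
        job_id=job_id,
        task_id=create_response["task_id"],
        poll_url=create_response["poll_url"],
        endpoint=endpoint,
        model=model,
    )
```

Keeping your own job ID next to `task_id` is what lets a later retry or reconciliation step look up the job without relying on provider identifiers.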
Production Rollout Plan
| Phase | Goal | Checks |
|---|---|---|
| 1. Inventory | List endpoints, models, request fields, streaming/async behavior, and billing owner | No hidden provider-only fields are assumed public |
| 2. Single-route pilot | Move one endpoint and one model family | Response shape, cost, and logs match expectations |
| 3. Shadow or sample | Compare selected outputs against the previous provider | User-visible quality and latency are acceptable |
| 4. Gradual rollout | Increase traffic by key, org, or feature flag | Watch 4xx, 5xx, latency, balance, and duplicate async jobs |
| 5. Cleanup | Remove old provider path only after stable usage | Rollback path and support playbook are documented |
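
For phase 4, one common way to increase traffic by key or org is deterministic percentage bucketing. The sketch below is a generic technique, not something TokenLab prescribes; the `in_rollout` name and the salt value are assumptions made for illustration.

```python
import hashlib


def in_rollout(unit_id: str, percent: int, salt: str = "tokenlab-migration") -> bool:
    """Place a key or org ID in one of 100 stable buckets and admit the first `percent`."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    digest = hashlib.sha256(f"{salt}:{unit_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    return bucket < percent
```

Because the bucket depends only on the salt and the ID, a unit admitted at a low percentage stays admitted as the percentage grows, and setting the percentage back to 0 acts as the rollback path.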
Migration Pitfalls
- Do not put every model behind one OpenAI Chat Completions path if your app needs native Anthropic, Gemini, or Responses behavior.
- Do not assume old image defaults. Send `model` explicitly.
- Do not retry async create requests without checking whether a task was already created.
- Do not expose provider routing metadata in your logs or UI.
- Do not compare billing with provider task IDs. Use TokenLab usage records.
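
The async retry pitfall can be guarded against with a lookup keyed by your own job ID, as in the minimal sketch below. The in-memory dictionary stands in for whatever durable store you use, and `create_once` is a hypothetical helper; the sketch does not cover a create call whose response was lost in transit, which needs a check against the API itself.

```python
from typing import Callable, Dict


def create_once(job_id: str, tasks: Dict[str, dict], create: Callable[[], dict]) -> dict:
    """Return the stored task for job_id, calling create() only when none exists yet."""
    existing = tasks.get(job_id)
    if existing is not None:
        return existing  # a task was already created; do not submit a duplicate
    task = create()
    tasks[job_id] = task
    return task
```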
API Reference
| Topic | Reference |
|---|---|
| Multi-Format API | Multi-Format API |
| OpenAI SDK | OpenAI SDK |
| Anthropic SDK | Anthropic SDK |
| Gemini Native | Gemini Native API |
| Image Generation | Image Generation |
| Async Jobs & Polling | Async Jobs & Polling |