NEW · Claude Opus 4.8 is live on TokenLayer

One API for every model your stack will need.

The unified inference layer for GPT, Claude, Gemini, DeepSeek, Llama and 50+ models. Transparent pricing, predictable latency, one key.

Get started — free

View documentation

API live · OpenAI & Anthropic compatible

OpenAI

GPT-5.4

Input · per 1M tokens

Official: $1.25

TokenLayer: $0.25 (−80%)

Output · per 1M tokens

Official: $10.00

TokenLayer: $2.00 (−80%)

Effective savings: up to 80% across all models

Max savings vs. going direct

Models in catalog

Capabilities

The infrastructure layer between you and every model.

Everything you need to ship reliable AI products — without managing six provider contracts.

01 / Unified

One API. Every model.

Swap between GPT, Claude, Gemini, DeepSeek, Llama and 50+ models without rewriting a single line. OpenAI-compatible by default.

02 / Cost

before: $2,840/mo

with tokenlayer: $568/mo

−80% on identical workloads

Pay less for the same answer.

Top up and your credits are worth multiples of what you paid. Same models, fraction of the bill.

03 / Speed

Streaming, straight through.

Requests proxy directly to providers with a single thin hop. Tokens stream the moment they're generated.

04 / Billing

OpenAI

$412

Anthropic

$298

Google

$140

DeepSeek

$54

One invoice, all providers.

No more juggling six vendor dashboards. Set budgets, allocate by team.

05 / Analytics

See every token.

Per-model latency, error rates and cost breakdowns. Built-in tracing.

06 / DX

$ npm i openai
$ export OPENAI_BASE_URL=https://api.tokenlayer.net/v1
$ node app.js

Built for developers.

Keep the official OpenAI or Anthropic SDK — just change the base URL. Streaming, tools, structured output.

How it works

Three steps from signup to scale.

est. setup time

≈ 4 minutes

Create your account.

› signup → tokenlayer.net/signup

Grab your API key.

Generate a key, set a monthly cap, and we'll handle quota across every provider.

› TOKENLAYER_KEY=tl_live_a1b2c3d4...

Start building.

Drop-in compatible with the OpenAI SDK. Change baseURL — keep your existing code.

› client.chat.completions.create({ model: "gpt-5.4" })

The model catalog

Every frontier model. Always up to date.

New models land on TokenLayer shortly after release — same SDK call, same billing, same observability.

OpenAI

GPT

Flagship reasoning and multimodal models.

5.4 · 5.1 · o3

model.gpt

Anthropic

Claude

Long-context and coding-grade quality.

Opus 4.8 · Sonnet 4.6 · Haiku 4.5

model.claude

Google

Gemini

Native multimodal with massive context windows.

3 Pro · 2.5 Pro · 2.5 Flash

model.gemini

DeepSeek

Frontier-grade reasoning at a fraction of the cost.

V3.2 · R1 · V3

model.deepseek

catalog updated continuously

See how much you'll save.

Adjust the sliders to estimate your monthly costs based on real provider pricing.

Select Model

Monthly requests

100 to 1.0M

Avg tokens per request

100 to 32,000

Total tokens / mo

20.0M

OpenAI

$1.25 input / $10 output per 1M tokens

Your monthly cost

Direct from provider

$95.00

With TokenLayer (−80%)

$19.00

You save / month

$76.00

You save / year

$912

Start saving today

Estimates assume a 60/40 input/output token split. Prices from pricepertoken.com.
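The arithmetic behind the estimate can be sketched as follows. This is a minimal illustration, not TokenLayer's actual calculator: `estimateMonthlyCost` and its parameter names are hypothetical, and it assumes the 60/40 input/output split and the flat 80% discount quoted on this page.

```typescript
// Hypothetical helper; the function and field names are illustrative only.
interface Estimate {
  direct: number;         // USD per month at provider list prices
  discounted: number;     // USD per month after the assumed discount
  monthlySavings: number; // USD
  yearlySavings: number;  // USD
}

function estimateMonthlyCost(
  totalTokens: number,      // total tokens per month
  inputPricePer1M: number,  // list price, USD per 1M input tokens
  outputPricePer1M: number, // list price, USD per 1M output tokens
  discount = 0.8,           // assumed flat discount (the page says "up to 80%")
  inputShare = 0.6,         // assumed 60/40 input/output split
): Estimate {
  const inputTokens = totalTokens * inputShare;
  const outputTokens = totalTokens - inputTokens;

  // Work in whole cents so the results are not affected by float rounding.
  const directCents = Math.round(
    ((inputTokens / 1e6) * inputPricePer1M +
      (outputTokens / 1e6) * outputPricePer1M) * 100,
  );
  const discountedCents = Math.round(directCents * (1 - discount));
  const savedCents = directCents - discountedCents;

  return {
    direct: directCents / 100,
    discounted: discountedCents / 100,
    monthlySavings: savedCents / 100,
    yearlySavings: (savedCents * 12) / 100,
  };
}

// The example shown above: 20.0M tokens/month at $1.25 / $10 per 1M tokens.
const estimate = estimateMonthlyCost(20_000_000, 1.25, 10);
console.log(estimate);
```

With those inputs the sketch reproduces the figures in the calculator: $95.00 direct, $19.00 with the discount, $76.00 saved per month and $912 per year.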

Developer experience

Five minutes to your first token.

OpenAI and Anthropic compatible API with native streaming, structured outputs and tools. Keep your SDK — change one line.

Works with the official OpenAI SDK
Anthropic Messages API supported too
Streaming + structured outputs out of the box
Per-request usage & cost in your dashboard

Read the quickstart

app.ts

// Official OpenAI SDK — one line changed.
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.TOKENLAYER_KEY,
  baseURL: "https://api.tokenlayer.net/v1",
});

const res = await client.chat.completions.create({
  model: "claude-sonnet-4-6",
  messages: [{ role: "user", content: "Summarize Q3." }],
  stream: true,
});

for await (const chunk of res) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}

● 200 OK → claude-sonnet-4-6 · $0.0021

642ms · 1,284 tok

↳ one key

Any model id in the catalog works here — GPT, Claude, Gemini and more.
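To make the "one key, any model" point concrete, here is a small sketch that builds the same request for two model ids without sending it. It assumes the endpoint follows the usual OpenAI-compatible layout (`/chat/completions` under the base URL, bearer-token auth). `buildChatRequest` is a hypothetical helper, and the key string is a placeholder rather than a real credential.

```typescript
// Base URL as shown on this page.
const BASE_URL = "https://api.tokenlayer.net/v1";

interface ChatRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Hypothetical helper: builds, but does not send, a Chat Completions request.
function buildChatRequest(model: string, prompt: string, apiKey: string): ChatRequest {
  return {
    // Assumption: standard OpenAI-compatible path under the base URL.
    url: `${BASE_URL}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({
        model,
        messages: [{ role: "user", content: prompt }],
      }),
    },
  };
}

// Only the model id changes between requests; both ids appear on this page.
const models = ["gpt-5.4", "claude-sonnet-4-6"];
const requests = models.map((m) =>
  buildChatRequest(m, "Summarize Q3.", "YOUR_TOKENLAYER_KEY"),
);

for (const req of requests) {
  console.log(req.url, JSON.parse(req.init.body).model);
}
```

Each entry can then be sent with `fetch(req.url, req.init)`. In practice the official SDK shown above does the same work for you.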

Pricing

Transparent. No surprises.

No subscriptions, no seat fees. Top up, get multiplied credits, and pay per token.

Pay-as-you-go · no lock-in

Free

Start without a card.

$0 to start

Claim $20 in credits

$20 of free credits on signup
Access to every model in the catalog
OpenAI & Anthropic compatible API
Usage dashboard included

POPULAR

Pay as you go

Top up when you need more.

−80% vs. going direct

Start building

Pay only for the tokens you use
Top up by card, UPI or crypto
Credits worth multiples of what you pay
Per-key spend caps
No subscription, no seat fees

High volume

For heavy, sustained workloads.

Custom

Chat with us

Volume top-up deals
Help with migration & integration
Direct line to the team on Telegram

Models are metered at provider list prices — your topped-up credits are simply worth more. Estimate your savings →

What you can do

Build more. Spend less. Sleep better.

Concrete outcomes you can expect after pointing your stack at TokenLayer.

Up to 80% below provider list prices

Save

before

$2,840

with tokenlayer

$568

−80%

Cut your AI spend by up to 80%.

When you top up, your balance is multiplied — so every request effectively costs a fraction of the provider's list price.
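The link between a discount and a credit multiplier can be worked through in a few lines. This is a sketch under one assumption: that the full 80% figure applies, which the page only states as "up to". Both function names are hypothetical.

```typescript
// Hypothetical helpers illustrating the "multiplied balance" arithmetic.

// A discount of d% off list price is equivalent to credits worth 100 / (100 - d) times.
function creditMultiplier(discountPercent: number): number {
  if (discountPercent < 0 || discountPercent >= 100) {
    throw new RangeError("discountPercent must be in [0, 100)");
  }
  return 100 / (100 - discountPercent);
}

// Top-up (USD) needed to cover usage that is metered at provider list prices.
function topUpNeeded(listPriceUsageUsd: number, discountPercent: number): number {
  return listPriceUsageUsd / creditMultiplier(discountPercent);
}

console.log(creditMultiplier(80));  // 5
console.log(topUpNeeded(2840, 80)); // 568
```

At an 80% discount each topped-up dollar covers $5 of list-price usage, so the $2,840 workload above needs a $568 top-up.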

Compatible

api.openai.com

→ api.tokenlayer.net

// every other line stays the same

OpenAI & Anthropic SDK compatible.

Drop-in replacement. Change one line — your base URL — and keep every line of code you already shipped.

Works with

Cursor · Claude Code · Continue · Aider

Cursor, Claude Code, Continue, Aider.

Use TokenLayer as the inference layer for your favorite AI coding tools. One key powers your whole stack.

Topups

Visa / Mastercard · UPI · USDT · BTC

Pay with card, UPI or crypto.

Top up your balance however suits you — cards and UPI via Razorpay, or USDT, BTC and more via OxaPay.

Ship faster

GPT-5.4

78%

Sonnet 4.6

92%

Gemini 3

71%

Try models before you wire them in.

Run the same prompt across GPT, Claude and Gemini in the built-in chat, then ship the one that wins.

Predictable

spend cap · sk-prod-… · $642 / $1,000

live usage · hard stop at the cap

Per-key spend caps.

Cap what each API key can spend and watch usage live on the dashboard. No surprise bills.
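The page does not describe how the cap is enforced, so the following is only an illustration of hard-stop semantics, not TokenLayer's implementation. `SpendCap` is a hypothetical class, and rejecting any request that would push spend past the cap is an assumed policy.

```typescript
// Hypothetical model of a per-key hard spend cap. Amounts are in cents
// to avoid floating-point drift when adding many small charges.
class SpendCap {
  private readonly capCents: number;
  private spentCents: number;

  constructor(capCents: number, spentCents = 0) {
    this.capCents = capCents;
    this.spentCents = spentCents;
  }

  // Records the charge and returns true, or returns false (hard stop)
  // if the charge would take total spend above the cap.
  tryCharge(costCents: number): boolean {
    if (this.spentCents + costCents > this.capCents) return false;
    this.spentCents += costCents;
    return true;
  }

  get remainingCents(): number {
    return this.capCents - this.spentCents;
  }
}

// Mirrors the example above: $642 already spent against a $1,000 cap.
const prodKey = new SpendCap(100_000, 64_200);
console.log(prodKey.tryCharge(30_000)); // true: spend becomes $942
console.log(prodKey.tryCharge(10_000)); // false: would exceed the $1,000 cap
console.log(prodKey.remainingCents);    // 5800
```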

Questions

Common questions, honest answers.

Still curious? Ping us on Telegram — it's the fastest way to reach the team.

Yes. Point your base URL at api.tokenlayer.net/v1 and use any model in the catalog. Our endpoints are compatible with the OpenAI Chat Completions API and the Anthropic Messages API — including streaming, tool calling, and structured outputs.

Ready when you are

Build with every model before lunch.

Start building free

Chat with us on Telegram

No credit card required · $20 free credits · OpenAI & Anthropic compatible · Pay-as-you-go