Top open-source LLMs · 20% below resellers

Open-source inference,
20% below the rest.

The most popular open-source models — DeepSeek V3, DeepSeek R1, Qwen3.5-35B-A3B — through an OpenAI-compatible API. Cheaper than every other reseller. Change one line of code.

Get API Key View Pricing

or try all three models live on HuggingFace — no signup required.

No subscription

OpenAI compatible

Pay as you go

python

# One line change. That's it.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.quicksilverpro.io/v1",
    api_key="your-api-key",
)

Pricing

Cheapest open-source inference

Per million tokens · compared against OpenRouter, Together AI, Fireworks.

Model	Context	Input	Output	Savings
Qwen3.5-35B-A3B qwen3.5-35b Best for long-context RAG, summarization	262K	$0.13 $0.16	$1.00 $1.25	−20%
DeepSeek V3 Default deepseek-v3 Best for chat, coding, structured output	128K	$0.24 $0.30	$0.70 $0.88	−20%
DeepSeek R1 Reasoning deepseek-r1 Best for math, multi-step reasoning, logic	128K	$0.40 $0.50	$1.70 $2.15	−20%

Compared against OpenRouter, Together AI, and Fireworks AI. Prices as of April 2026.

Thinking models (Qwen3.5-35B-A3B, DeepSeek R1): every request spends "reasoning tokens" internally (typical 500–8,000) that are billed at the output rate. max_tokens only caps visible content, not reasoning. DeepSeek R1 returns its reasoning as message.reasoning_content; Qwen3.5-35B-A3B hides it (matching OpenRouter default). For cheap one-shot tasks, prefer deepseek-v3.

Side-by-side pricing vs every competitor

DeepSeek V3 for tool-calling agents →

Reasoning

DeepSeek R1 for math & algorithms →

Long context

Qwen3.5-35B-A3B for 262K RAG →

See all comparisons →

Calculator

How much would you save?

Plug in your monthly usage — see the cost on QSP vs every competitor.

Embed this calculator on your own site — get the iframe snippet.

CLI qsp

Built for terminals and AI agents. --json output with stable exit codes — Claude Code, Cursor, Aider can call it without parsing HTML.

PyPI GitHub Quickstart →

FAQ

Common questions

What is QuickSilver Pro?

An OpenAI-compatible HTTP API for the top open-source LLMs — DeepSeek V3, DeepSeek R1, and Qwen3.5-35B-A3B. Point the official OpenAI SDK at our base URL and get the same chat-completions interface, 20% below competing resellers.

How much cheaper than OpenRouter / OpenAI?

20% below the public per-token rates at OpenRouter, Together AI, Fireworks AI, and DeepInfra on the same open-source models. DeepSeek V3: $0.24 / $0.70. DeepSeek R1: $0.40 / $1.70. Qwen3.5-35B-A3B: $0.13 / $1.00. We don't serve closed models (GPT-4, Claude).

Is it really a drop-in OpenAI replacement?

Yes. Change base_url to https://api.quicksilverpro.io/v1 in the official openai Python / Node / Swift SDKs. Streaming, tool calling, json_schema strict mode, and usage.cost accounting all work out of the box.

Is there a free tier?

New accounts get $1 in free credits on registration — enough for ~500-700 real DeepSeek V3 calls to evaluate the service. After that it's pay-as-you-go starting at $5, no subscription.

See all 8 questions

Start saving on inference today

Create an account, buy credits, get your API key in 30 seconds.

Get API Key