What is DeepSeek R1 good for?

DeepSeek R1 is a reasoning model trained with reinforcement learning to produce explicit chain-of-thought before answering. It excels at math (AIME, MATH), competitive programming (Codeforces), logic puzzles, formal proofs, and multi-step planning. For tasks where the answer's quality depends on the reasoning process, R1 outperforms non-reasoning models like DeepSeek V3 at the cost of 3-5x more output tokens.

How does DeepSeek R1 pricing compare to OpenAI o1?

OpenAI o1 costs $15 per million input tokens and $60 per million output tokens. DeepSeek R1 on QuickSilver Pro costs $0.40 input and $1.70 output per million tokens. For the same workload, R1 is ~37x cheaper on input and ~35x cheaper on output — with comparable math and coding benchmark performance.

How do I access the reasoning trace?

DeepSeek R1 returns a reasoning_content field in the message object alongside content. reasoning_content holds the chain-of-thought trace; content holds the final answer. Both are billed as output tokens. If you only need the answer, you can discard reasoning_content, but the cost stays the same because the reasoning tokens have already been generated.

Is R1 overkill for simple questions?

Yes. R1 generates a long chain-of-thought even for trivial questions, which wastes output tokens. For factual Q&A, simple summarization, or casual chat, use DeepSeek V3 ($0.70 per 1M output) instead of R1 ($1.70 per 1M output). Reserve R1 for problems where the reasoning step materially changes the answer quality.

DeepSeek R1 for reasoning

DeepSeek R1 is an open-source reasoning model trained with RL to emit explicit chain-of-thought. It's competitive with OpenAI o1 on AIME and MATH benchmarks, while costing ~35x less: $0.40 input / $1.70 output per 1M tokens on QuickSilver Pro vs o1's $15 / $60. For math, code challenges, and logic-heavy agent loops, R1 is the open-source default.

What R1 is good at

Math

Strong on AIME-2024, MATH-500, and Olympiad-level problems. The reasoning trace walks through derivations; final answer appears in content.

Algorithms

Competitive-programming-grade code generation. LiveCodeBench and Codeforces benchmark scores rival o1. Better than V3 for novel-algorithm tasks; slower because of CoT.

Multi-step planning

Useful in agent loops where the planner needs to decompose before acting. Each planning call has explicit reasoning, which improves tool-use decisions.

Quickstart: solve a math problem

Python · openai SDK

from openai import OpenAI

client = OpenAI(
    base_url="https://api.quicksilverpro.io/v1",
    api_key="sk-qsp-...",
)

resp = client.chat.completions.create(
    model="deepseek-r1",
    messages=[{
        "role": "user",
        "content": "A box has 12 red and 8 blue balls. Three drawn without replacement. Probability exactly two are red?",
    }],
)

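# Chain-of-thought reasoning. reasoning_content is a provider extension,
# not a standard OpenAI field, so read it defensively:
message = resp.choices[0].message
print(getattr(message, "reasoning_content", None))

# Final answer:
print(message.content)

print(f"Output tokens: {resp.usage.completion_tokens}")
# usage.cost is likewise a provider extension; skip it if absent.
cost = getattr(resp.usage, "cost", None)
if cost is not None:
    print(f"Cost: ${cost:.6f}")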

R1 returns reasoning_content (the thinking trace) separately from content (the final answer). Both are billed as output tokens. Typical reasoning traces are 500–3000 tokens.

Pricing

Provider	Input / 1M	Output / 1M	Output vs QSP
QuickSilver Pro	$0.40	$1.70	—
OpenRouter	$0.50	$2.15	+26%
DeepInfra	$0.55	$2.19	+29%
Together AI	$3.00	$7.00	4.1x
Fireworks AI	$3.00	$8.00	4.7x
OpenAI o1	$15.00	$60.00	35x

Because R1 generates long reasoning traces (often 1000-3000 extra output tokens), output cost dominates. The 79% savings on output vs Fireworks add up: at 10M R1 output tokens per month, you pay $17/month on QSP vs $80/month on Fireworks.
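The monthly figures can be reproduced with a few lines of arithmetic. The sketch below copies the output prices from the table; the function names are illustrative, not part of any SDK.

```python
# Output prices in USD per 1M tokens, copied from the pricing table.
OUTPUT_PRICE_PER_M = {
    "QuickSilver Pro": 1.70,
    "OpenRouter": 2.15,
    "DeepInfra": 2.19,
    "Together AI": 7.00,
    "Fireworks AI": 8.00,
    "OpenAI o1": 60.00,
}


def monthly_output_cost(provider: str, output_tokens: int) -> float:
    """Return the output cost in USD for a given monthly token volume."""
    return OUTPUT_PRICE_PER_M[provider] * output_tokens / 1_000_000


def savings_vs(provider: str, baseline: str = "QuickSilver Pro") -> float:
    """Fraction saved on output by using `baseline` instead of `provider`."""
    return 1 - OUTPUT_PRICE_PER_M[baseline] / OUTPUT_PRICE_PER_M[provider]


if __name__ == "__main__":
    tokens = 10_000_000  # the 10M-token monthly workload from the text
    for name in OUTPUT_PRICE_PER_M:
        print(f"{name:16s} ${monthly_output_cost(name, tokens):8.2f}")
    print(f"Savings vs Fireworks AI: {savings_vs('Fireworks AI'):.0%}")
```

Swap in your own monthly volume for `tokens` to see how the gap scales; input tokens are ignored here because output dominates for R1 workloads.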

When R1 is worth the extra tokens

Use R1 for: math word problems, novel algorithm design, logic puzzles, theorem proving, multi-step tool planning, hard debugging. Tasks where the reasoning step is where the model earns its keep.

Skip R1 for: factual Q&A, code completion, summarization, entity extraction, simple classification, translation. V3 is cheaper and faster, and its quality is equivalent on non-reasoning tasks.
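One way to apply the two lists is a small router that picks a model per task type. This is a minimal sketch: the task labels are made up for illustration, and "deepseek-v3" is an assumed model id (only "deepseek-r1" appears in the quickstart), so check your provider's model list.

```python
# Task labels are hypothetical names for the "Use R1 for" categories.
REASONING_TASKS = {
    "math_word_problem",
    "algorithm_design",
    "logic_puzzle",
    "theorem_proving",
    "tool_planning",
    "hard_debugging",
}

R1_MODEL = "deepseek-r1"  # model id used in the quickstart
V3_MODEL = "deepseek-v3"  # ASSUMPTION: hypothetical id, verify with your provider


def pick_model(task_type: str) -> str:
    """Route reasoning-heavy tasks to R1 and everything else to V3."""
    if task_type in REASONING_TASKS:
        return R1_MODEL
    # Unknown or simple task types default to the cheaper model.
    return V3_MODEL
```

Defaulting unknown task types to V3 keeps the expensive path opt-in; flip the default if missed reasoning costs you more than extra tokens.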

Cost calibration: an essay task that takes V3 ~600 output tokens costs $0.42 per 1000 essays. R1 on the same essay takes ~2500 output tokens including the reasoning trace, or $4.25 per 1000 essays. That is a 10x premium. Reserve R1 for when that premium buys something.
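The calibration is straightforward to rerun for your own workload. The sketch below uses the token counts and output prices quoted on this page; the helper name is illustrative.

```python
def cost_per_1000_requests(output_tokens_per_request: int, price_per_m: float) -> float:
    """USD cost of 1000 requests, counting output tokens only."""
    return output_tokens_per_request * 1000 * price_per_m / 1_000_000


# V3: ~600 output tokens per essay at $0.70 per 1M output tokens.
v3_cost = cost_per_1000_requests(600, 0.70)
# R1: ~2500 output tokens per essay (answer plus trace) at $1.70 per 1M.
r1_cost = cost_per_1000_requests(2500, 1.70)
premium = r1_cost / v3_cost

print(f"V3: ${v3_cost:.2f} per 1000 essays")
print(f"R1: ${r1_cost:.2f} per 1000 essays")
print(f"R1 premium: {premium:.1f}x")
```

Replace the token counts with measured `completion_tokens` averages from your own traffic for a more realistic premium.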

FAQ

Is DeepSeek R1 as good as o1?

On published math (AIME-2024, MATH-500), coding (LiveCodeBench, Codeforces), and reasoning (GPQA Diamond) benchmarks, DeepSeek R1 is within a few points of o1 and exceeds o1-mini on most. At roughly 35x lower cost, it is the closest open-source equivalent for production use.

How long are the reasoning traces?

Typical range is 500-3000 tokens. For hard problems (IMO-grade math), traces can exceed 5000 tokens. All reasoning tokens are billed as output tokens — account for this in cost projections.

Does R1 support tool calling?

R1 accepts the OpenAI tools array but is less reliable at tool calling than V3. For agent loops, use V3 as the tool-calling executor and invoke R1 only for hard planning sub-problems. This hybrid pattern gets the best of both.
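The hybrid pattern might be sketched as follows. The `complete` callable stands in for whatever function sends messages to a model and returns the reply text (for example, a thin wrapper around `client.chat.completions.create`); it, the prompt wording, and the "deepseek-v3" model id are assumptions for illustration.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]
# Any function that sends `messages` to `model` and returns the reply text.
Complete = Callable[[str, List[Message]], str]

PLANNER_MODEL = "deepseek-r1"   # model id from the quickstart
EXECUTOR_MODEL = "deepseek-v3"  # ASSUMPTION: hypothetical id, verify with your provider


def run_step(task: str, is_hard: bool, complete: Complete) -> str:
    """Run one agent step, asking R1 for a plan first only when the task is hard."""
    messages: List[Message] = []
    if is_hard:
        # Pay for R1's reasoning tokens only on hard planning sub-problems.
        plan = complete(PLANNER_MODEL, [
            {"role": "user", "content": f"Write a step-by-step plan for: {task}"},
        ])
        messages.append({"role": "system", "content": f"Follow this plan:\n{plan}"})
    messages.append({"role": "user", "content": task})
    # The executor is where tool calls would be issued; tools are omitted here.
    return complete(EXECUTOR_MODEL, messages)
```

Injecting `complete` keeps the routing logic testable without network calls; how you decide `is_hard` (a heuristic, a classifier, a retry after failure) is left open.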

Can I hide the reasoning trace from users?

Yes. Ignore reasoning_content server-side and return only content. You still pay for reasoning tokens because R1 has to generate them to reach the answer — there's no cheap "skip thinking" mode.
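A minimal sketch of that server-side filtering, assuming the assistant message is available as a plain dict (with the openai SDK, `resp.choices[0].message.model_dump()` should produce one); the helper name is illustrative.

```python
from typing import Any, Dict


def public_message(message: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of an assistant message without the reasoning trace."""
    return {key: value for key, value in message.items() if key != "reasoning_content"}


if __name__ == "__main__":
    raw = {
        "role": "assistant",
        "content": "The final answer goes here.",
        "reasoning_content": "Step-by-step trace that users should not see.",
    }
    print(public_message(raw))
```

The function returns a new dict and leaves the original untouched, so you can still log the trace internally for debugging while sending only the filtered copy to clients.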
