DeepSeek OpenAI-Compatible API: SDK Integration & Migration
Configure DeepSeek's OpenAI-compatible API and Anthropic API format. SDK integration patterns, LangChain support, migration from OpenAI/Anthropic, streaming differences, and model mapping behavior.
DeepSeek's API supports two formats: an OpenAI-compatible endpoint and an Anthropic-compatible endpoint. Migrating from either ecosystem usually comes down to changing base_url and api_key (plus the model name when moving from OpenAI). The two formats differ in thinking mode, parameter support, and streaming behavior, so review those differences before migrating production workloads.
OpenAI-Compatible Format
from openai import OpenAI
client = OpenAI(
    api_key="<DeepSeek API Key>",
    base_url="https://api.deepseek.com"
)
response = client.chat.completions.create(
    model="deepseek-v4-pro",
    messages=[{"role": "user", "content": "Hello"}],
    reasoning_effort="high",
    extra_body={"thinking": {"type": "enabled"}},
    stream=False
)
import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://api.deepseek.com",
  apiKey: "<DeepSeek API Key>",
});
const completion = await client.chat.completions.create({
  model: "deepseek-v4-pro",
  messages: [{ role: "user", content: "Hello" }],
});
Anthropic API Format
import anthropic
client = anthropic.Anthropic(
    base_url="https://api.deepseek.com/anthropic",
    api_key="<DeepSeek API Key>"
)
response = client.messages.create(
    model="deepseek-v4-pro",
    max_tokens=4096,
    system="You are a helpful assistant.",
    messages=[{"role": "user", "content": "Hello"}]
)
Migration from OpenAI
Step 1: Change base URL
# Before
client = OpenAI(api_key=openai_key)
# After
client = OpenAI(
    api_key=deepseek_key,
    base_url="https://api.deepseek.com"
)
Step 2: Update model name
# Before
model="gpt-4o"
# After
model="deepseek-v4-pro"
Step 3: Move thinking parameter (if using reasoning)
# gpt-4o has no native reasoning mode; chain-of-thought is prompt-based
# DeepSeek: enable thinking mode explicitly
response = client.chat.completions.create(
    model="deepseek-v4-pro",
    messages=messages,
    reasoning_effort="high",
    extra_body={"thinking": {"type": "enabled"}}
)
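Because the three steps above are mostly configuration, one way to keep the migration reversible is to resolve client settings from a provider name. The following is a minimal sketch, not an official pattern: the helper `client_settings`, the `PROVIDERS` table, and the environment variable names `OPENAI_API_KEY` and `DEEPSEEK_API_KEY` are assumptions made for illustration. Only the base URL and model names come from the steps above.

```python
import os

# Per-provider settings. The DeepSeek base URL and the model names come from
# the migration steps; the environment variable names are assumptions.
PROVIDERS = {
    "openai": {"key_env": "OPENAI_API_KEY", "base_url": None, "model": "gpt-4o"},
    "deepseek": {
        "key_env": "DEEPSEEK_API_KEY",
        "base_url": "https://api.deepseek.com",
        "model": "deepseek-v4-pro",
    },
}


def client_settings(provider: str) -> dict:
    """Return constructor kwargs and a default model for the given provider."""
    try:
        cfg = PROVIDERS[provider]
    except KeyError:
        raise ValueError(f"unknown provider: {provider!r}") from None
    kwargs = {"api_key": os.environ.get(cfg["key_env"], "")}
    if cfg["base_url"] is not None:
        # Only override base_url when the provider needs a non-default endpoint.
        kwargs["base_url"] = cfg["base_url"]
    return {"client_kwargs": kwargs, "model": cfg["model"]}
```

With this in place, `OpenAI(**client_settings("deepseek")["client_kwargs"])` builds the DeepSeek client, and switching the argument back to `"openai"` restores the original configuration.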
Migration Checklist
| Parameter | OpenAI | DeepSeek |
|---|---|---|
| temperature | Supported | Ignored in thinking mode |
| top_p | Supported | Ignored in thinking mode |
| response_format: json_object | Supported | Supported (known empty-content issue) |
| tools | Supported | Supported (+ strict mode beta) |
| stream | Supported | Supported (keep-alive lines in response) |
| max_tokens | Supported | Supported (max 384K output) |
| stop | Supported | Supported |
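Since the checklist lists temperature and top_p as ignored in thinking mode, it can help to drop them explicitly so that nobody assumes they are taking effect. The sketch below is one possible way to do that. `prepare_params` and `IGNORED_IN_THINKING` are hypothetical names, and the `extra_body` shape is the one used in the examples earlier in this page.

```python
# Sampling parameters the checklist lists as ignored in thinking mode.
IGNORED_IN_THINKING = ("temperature", "top_p")


def prepare_params(params: dict, thinking: bool) -> tuple[dict, list[str]]:
    """Return (request kwargs, names of parameters dropped for thinking mode)."""
    cleaned = dict(params)  # never mutate the caller's dict
    dropped = []
    if thinking:
        for name in IGNORED_IN_THINKING:
            if name in cleaned:
                del cleaned[name]
                dropped.append(name)
        # Merge the thinking switch into any extra_body the caller already set.
        extra = dict(cleaned.get("extra_body") or {})
        extra["thinking"] = {"type": "enabled"}
        cleaned["extra_body"] = extra
    return cleaned, dropped
```

The returned kwargs can be passed as `client.chat.completions.create(**kwargs)`, and the list of dropped names can be logged so the behavior change is visible during migration.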
Migration from Anthropic/Claude
Step 1: Change base URL
# Before
client = anthropic.Anthropic(api_key=anthropic_key)
# After
client = anthropic.Anthropic(
    base_url="https://api.deepseek.com/anthropic",
    api_key=deepseek_key
)
Step 2: Auto Model Mapping
DeepSeek automatically maps Claude model names — you don't even need to change the model parameter:
- claude-opus-* → deepseek-v4-pro
- claude-sonnet-* or claude-haiku-* → deepseek-v4-flash
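The mapping happens on DeepSeek's side, but a client-side mirror of the rule can be useful for logging or cost estimation, so you know which DeepSeek model a request will actually hit. The sketch below only restates the patterns listed above and is not DeepSeek's implementation. `resolve_model` and `MODEL_MAP` are hypothetical names, and passing unmatched names through unchanged is an assumption.

```python
from fnmatch import fnmatchcase

# Patterns copied from the mapping above; the first match wins.
MODEL_MAP = [
    ("claude-opus-*", "deepseek-v4-pro"),
    ("claude-sonnet-*", "deepseek-v4-flash"),
    ("claude-haiku-*", "deepseek-v4-flash"),
]


def resolve_model(name: str) -> str:
    """Return the DeepSeek model a Claude-style name maps to, per the list above."""
    for pattern, target in MODEL_MAP:
        if fnmatchcase(name, pattern):
            return target
    # Assumption: names that match no pattern are passed through unchanged.
    return name
```

For example, any name beginning with `claude-opus-` resolves to `deepseek-v4-pro`, while a native name such as `deepseek-v4-flash` is returned as-is.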
Migration Checklist
| Claude Feature | DeepSeek Support |
|---|---|
| thinking with budget_tokens | budget_tokens IGNORED; use output_config.effort |
| temperature | Supported (0.0-2.0) in non-thinking mode |
| tools | Fully supported |
| system prompt | Fully supported |
| stop_sequences | Fully supported |
| cache_control | IGNORED; use DeepSeek's automatic context caching |
| Image input | NOT supported |
| Document input | NOT supported |
| MCP servers | NOT supported through Anthropic format |
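Because ignored features fail silently, a pre-flight check over existing request payloads can surface them before the switch. The following is a rough sketch that walks an Anthropic-format request dict and reports the items flagged in the checklist. `audit_request` is a hypothetical helper, the `mcp_servers` key is assumed to be how MCP servers appear in the request, and only message content blocks are inspected (system blocks are not).

```python
def audit_request(request: dict) -> list[str]:
    """List features in an Anthropic-format request that the checklist flags."""
    findings = []
    thinking = request.get("thinking") or {}
    if "budget_tokens" in thinking:
        findings.append("thinking.budget_tokens is ignored")
    if request.get("mcp_servers"):  # assumed request key for MCP servers
        findings.append("MCP servers are not supported")
    for message in request.get("messages", []):
        content = message.get("content")
        if isinstance(content, str):
            continue  # plain-text content has no blocks to inspect
        for block in content or []:
            kind = block.get("type")
            if kind in ("image", "document"):
                findings.append(f"{kind} input is not supported")
            if "cache_control" in block:
                findings.append("cache_control is ignored")
    return findings
```

Running this over a sample of production requests gives a quick estimate of how much of a Claude workload relies on features that behave differently after migration.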
LangChain Integration
# ChatOpenAI now lives in the langchain-openai package (pip install langchain-openai)
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(
    model="deepseek-v4-pro",
    api_key="<DeepSeek API Key>",
    base_url="https://api.deepseek.com"
)
Streaming Differences
Non-streaming (stream=false, default): While a request is being processed, DeepSeek may send empty lines in the response body to keep the connection alive. If you parse HTTP responses directly, skip these empty lines before decoding the JSON.
Streaming (stream=true): DeepSeek sends SSE keep-alive comments (: keep-alive) between chunks. The official OpenAI SDK handles these automatically. If you parse SSE yourself, filter out lines starting with a colon (:).
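# Robust streaming pattern (wrapped in a generator function so that yield is valid)
def stream_events(client, messages):
    stream = client.chat.completions.create(
        model="deepseek-v4-pro",
        messages=messages,
        stream=True,
        extra_body={"thinking": {"type": "enabled"}}
    )
    for chunk in stream:
        if not chunk.choices:
            continue  # skip chunks that carry no choices
        delta = chunk.choices[0].delta
        # reasoning_content is a DeepSeek extension field, so read it defensively
        reasoning = getattr(delta, "reasoning_content", None)
        if reasoning:
            # Thinking tokens arrive first
            yield {"type": "reasoning", "content": reasoning}
        elif delta.content:
            # Answer tokens arrive after reasoning
            yield {"type": "content", "content": delta.content}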
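For the raw-parsing cases described above, the filtering can be sketched without any SDK. This is an illustrative example rather than a complete SSE client. `iter_sse_events` and `parse_json_body` are hypothetical helpers, and the `data: [DONE]` end-of-stream sentinel is assumed to follow the OpenAI streaming convention.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield decoded JSON payloads from SSE lines, skipping keep-alive traffic."""
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith(":"):
            continue  # blank separator or SSE comment such as ": keep-alive"
        if not line.startswith("data:"):
            continue  # other SSE fields (event:, id:) are not needed here
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # assumed OpenAI-style end-of-stream sentinel
        yield json.loads(payload)


def parse_json_body(body: str) -> dict:
    """Decode a non-streaming body that may be padded with empty keep-alive lines."""
    return json.loads(body.strip())
```

`iter_sse_events` accepts any iterable of text lines, so it can be fed from a line iterator of whichever HTTP client you use. `parse_json_body` relies on the fact that the padding consists only of whitespace around a single JSON document.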
Error Handling
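import time
from openai import APIStatusError, RateLimitError

def create_with_retry(client, messages, max_retries=3):
    for attempt in range(max_retries + 1):
        try:
            return client.chat.completions.create(
                model="deepseek-v4-pro",
                messages=messages,
            )
        except RateLimitError:
            # 429: concurrent request limit exceeded (Flash: 2500, Pro: 500)
            if attempt == max_retries:
                raise
            time.sleep(1)
        except APIStatusError as e:
            # APIStatusError (unlike the APIError base class) always has status_code
            if e.status_code == 400 and "reasoning_content" in str(e):
                # Missing reasoning_content in tool-call loop
                # Fix: always pass the full assistant message object back
                raise RuntimeError("reasoning_content missing from message history") from e
            if e.status_code == 402:
                # Insufficient balance: retrying will not help
                raise RuntimeError("insufficient DeepSeek balance") from e
            raise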
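A fixed one-second sleep can keep hitting the concurrency limit when many workers retry at once. A common alternative is exponential backoff with jitter, sketched below as a generic wrapper. `call_with_backoff` and its parameters are hypothetical names, and the delay values are illustrative defaults rather than DeepSeek recommendations.

```python
import random
import time


def call_with_backoff(fn, retry_on, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on a retry_on exception wait base_delay * 2**attempt (plus jitter) and retry."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            delay = base_delay * (2 ** attempt)
            # Up to 10% jitter so parallel workers do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
```

With the OpenAI SDK this could be used as `call_with_backoff(lambda: client.chat.completions.create(model="deepseek-v4-pro", messages=messages), RateLimitError)`. The `sleep` parameter exists so the wrapper can be tested without real waiting.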
Note:
Pro Move: When migrating from Claude to DeepSeek via the Anthropic API format, you can keep your existing Claude model names in code because DeepSeek auto-maps them. For the features the migration checklist marks as supported, the only required changes are base_url and api_key, so an existing Claude toolchain can run with DeepSeek as the backend without further code changes.
Note:
Migration gotcha: If your Claude code uses cache_control for prompt caching, it will be silently ignored by DeepSeek. DeepSeek uses automatic prefix-match caching instead. You don't need to add cache markers — the system handles it automatically.
Related Pages
- Tool Calls with Thinking: Combine tools and reasoning with the mandatory reasoning_content passback rule.
- DeepSeek for Coding: Practical example of Claude Code integration using the Anthropic API format.