Cache repeated context for faster responses and significant cost savings. Pay once to cache, reuse at a fraction of the cost.
Static context → stored for reuse → skip reprocessing.
When you send a request, we check if the prompt prefix matches a recent request. If it does, we skip reprocessing those tokens — you get faster responses and pay less.
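Conceptually, the lookup behaves like a cache keyed on the prompt prefix. This is a simplified sketch of that idea, not the actual server implementation:

```python
import hashlib

# Toy model of prompt-prefix caching: an identical prefix is processed
# once, then matched and skipped on later requests.
_cache = {}

def process_prompt(prefix: str, user_message: str) -> bool:
    """Return True on a cache hit for the prefix, False on a miss."""
    key = hashlib.sha256(prefix.encode()).hexdigest()
    if key in _cache:
        return True       # prefix tokens already processed; skip them
    _cache[key] = True    # first request: pay full processing cost, then store
    return False

system_prompt = "You are a helpful assistant for Acme Corp. ..."
first = process_prompt(system_prompt, "What's your return policy?")   # miss
second = process_prompt(system_prompt, "How do I track my order?")    # hit
```

The real system matches at the token level and expires entries over time, but the cost profile is the same: the first request pays full price, repeats pay only for the unmatched suffix.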
Caching is automatic for prompts over 1,024 tokens. Structure your prompts with static content first:
```python
from openai import OpenAI  # assuming an OpenAI-compatible client

client = OpenAI()

# Long system prompt gets cached automatically
system_prompt = """You are a helpful assistant for Acme Corp.
Company background: [... 5,000 tokens of context ...]
Product catalog: [... 3,000 tokens ...]
Support policies: [... 2,000 tokens ...]
"""

# First request: full processing, creates the cache
response1 = client.chat.completions.create(
    model="mythic-4",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": "What's your return policy?"},
    ],
)

# Second request: cache hit! Faster and cheaper
response2 = client.chat.completions.create(
    model="mythic-4",
    messages=[
        {"role": "system", "content": system_prompt},  # Same prefix
        {"role": "user", "content": "How do I track my order?"},
    ],
)
```
Cache system prompts and company context across user sessions.
Cache embedded documents for repeated questions about the same content.
Cache codebase context for faster code completion and review.
Process multiple items with the same instructions efficiently.
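For batch workloads, keep the shared instructions in a byte-identical prefix so every request after the first hits the cache. A sketch of the message-building step (the ticket texts and classification task are illustrative):

```python
# Batching many items behind one shared, cacheable prefix.
# The shared instructions come first and are identical across requests,
# so every request after the first reuses the cached prefix.

shared_instructions = (
    "Classify each support ticket as billing, technical, or other."
)

def build_messages(ticket: str) -> list:
    # Static content first, per-item content last
    return [
        {"role": "system", "content": shared_instructions},
        {"role": "user", "content": ticket},
    ]

tickets = ["I was charged twice.", "The app crashes on login.", "Love it!"]
requests = [build_messages(t) for t in tickets]

# Every request shares the same first message, so caching applies
# to all but the first call:
all_same_prefix = all(r[0] == requests[0][0] for r in requests)
```

Each entry in `requests` can then be passed as `messages=` to `client.chat.completions.create(...)`; only the first call pays to process the shared instructions.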
Take advantage of automatic caching in your applications today.