Understand usage tiers, request limits, and how to optimize your API usage for the best performance.
## Usage Tiers

Rate limits are measured along three dimensions:

- **Tokens Per Minute (TPM)**
- **Requests Per Minute (RPM)**
- **Requests Per Day (RPD)**
Rate limits are determined by your usage tier, which is based on your spending history and account age. You can view your current tier in the dashboard.
| Tier | Qualification | Monthly Usage Limit | RPM | TPM |
|---|---|---|---|---|
| Free | New accounts | $100 | 20 | 40,000 |
| Tier 1 | $5 paid | $100 | 500 | 200,000 |
| Tier 2 | $50 paid + 7 days | $500 | 2,000 | 1,000,000 |
| Tier 3 | $100 paid + 14 days | $2,000 | 5,000 | 5,000,000 |
| Tier 4 | $500 paid + 30 days | $50,000 | 10,000 | 30,000,000 |
Enterprise customers can request custom rate limits. Contact our sales team to discuss your requirements.
Different models have different rate limits based on their compute requirements.
Every API response includes headers that help you track your usage.
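The specific header names are not listed here, so the names below are illustrative placeholders modeled on common LLM-API conventions, not confirmed MythicDot headers; inspect a real response to see which ones your account receives. A minimal helper that collects whichever rate-limit headers are present:

```python
# NOTE: these header names are assumptions (common LLM-API conventions),
# not confirmed MythicDot header names -- verify against a real response.
RATE_LIMIT_HEADERS = (
    "x-ratelimit-limit-requests",
    "x-ratelimit-remaining-requests",
    "x-ratelimit-limit-tokens",
    "x-ratelimit-remaining-tokens",
    "retry-after",
)

def extract_rate_limit_headers(headers: dict) -> dict:
    """Return only the rate-limit-related headers present in a response.

    Header lookup is case-insensitive, since HTTP header casing varies.
    """
    lowered = {key.lower(): value for key, value in headers.items()}
    return {name: lowered[name] for name in RATE_LIMIT_HEADERS if name in lowered}
```

Logging these values alongside your own request counter makes it easy to see how close you are to your tier's RPM and TPM ceilings.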
When you exceed rate limits, the API returns a 429 status code. Implement exponential backoff:
```python
import time

from mythicdot import MythicDot, RateLimitError

client = MythicDot()

def make_request_with_retry(prompt, max_retries=5):
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(
                model="mythic-4",
                messages=[{"role": "user", "content": prompt}],
            )
        except RateLimitError:
            if attempt == max_retries - 1:
                raise
            # Exponential backoff: 1s, 2s, 4s, 8s, 16s
            wait_time = 2 ** attempt
            print(f"Rate limited. Waiting {wait_time}s...")
            time.sleep(wait_time)

# Or use the built-in retry (recommended)
client = MythicDot(max_retries=5)
```
- **Batch requests:** Group multiple requests together using the batch endpoints to reduce RPM usage and save 50% on costs.
- **Back off on 429s:** Use exponential backoff with jitter to handle 429 errors gracefully without hammering the API.
- **Cache responses:** Cache responses for identical requests to avoid unnecessary API calls and reduce token usage.
- **Monitor usage:** Track rate limit headers and monitor your dashboard to stay within limits and plan scaling.
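The backoff-with-jitter tip can be sketched as a small helper. "Full jitter" (sleeping a uniform random time up to the exponential ceiling) spreads retries out so many clients hitting the same limit don't all retry in lockstep; the `base` and `cap` values below are illustrative defaults, not documented MythicDot settings.

```python
import random

def backoff_with_jitter(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Full-jitter backoff: a uniform random wait in [0, min(cap, base * 2**attempt)].

    attempt 0 -> up to 1s, attempt 1 -> up to 2s, attempt 2 -> up to 4s, ...
    capped at `cap` seconds so waits never grow unbounded.
    """
    ceiling = min(cap, base * (2 ** attempt))
    return random.uniform(0, ceiling)
```

In the retry loop shown earlier, you would call `time.sleep(backoff_with_jitter(attempt))` instead of sleeping a fixed `2 ** attempt` seconds.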