Best Practices

Expert recommendations for building reliable, secure, and efficient AI applications in production.

🔒 Security

🔑 Secure API Key Management

Essential
  • Never embed API keys in client-side code or version control
  • Use environment variables or secure vaults (AWS Secrets Manager, Azure Key Vault)
  • Rotate keys periodically and immediately after any potential exposure
  • Use separate keys for development, staging, and production
import os

# Good: load the key from the environment
api_key = os.environ.get("MYTHIC_API_KEY")

# Bad: hardcoded key
api_key = "sk-abc123..."  # Never do this!
🛡️ Input Validation & Sanitization

Essential
  • Validate and sanitize all user inputs before sending to the API
  • Set maximum input length limits to prevent abuse
  • Use content moderation for user-generated prompts
  • Implement rate limiting on your application layer

⚠️ Prompt Injection Warning

Never directly concatenate user input into system prompts. Always validate and sanitize inputs, and consider using structured formats like JSON for user data.
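As a sketch of the points above — length limits, sanitization, and structured user data — the limit and field name here are illustrative, not part of the API:

```python
import json

MAX_INPUT_CHARS = 4000  # illustrative cap; tune for your application

def prepare_user_input(raw: str) -> str:
    """Validate user input and wrap it in JSON so it cannot be
    mistaken for instructions when placed near the system prompt."""
    if not raw or not raw.strip():
        raise ValueError("Empty input")
    if len(raw) > MAX_INPUT_CHARS:
        raise ValueError(f"Input exceeds {MAX_INPUT_CHARS} characters")
    # Drop control characters that are sometimes used to smuggle formatting
    cleaned = "".join(ch for ch in raw if ch.isprintable() or ch in "\n\t")
    # Pass user data as a JSON field, never concatenated into the prompt text
    return json.dumps({"user_input": cleaned})
```

Content moderation and application-layer rate limiting would sit in front of this check in a full pipeline.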

⚡ Reliability

🔄 Implement Retry Logic

Essential

Handle transient errors gracefully with exponential backoff:

import time
import random

def retry_with_backoff(func, max_retries=3):
    for attempt in range(max_retries):
        try:
            return func()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise  # out of retries; surface the error to the caller
            wait = (2 ** attempt) + random.random()  # exponential backoff + jitter
            time.sleep(wait)
📊 Monitor & Log Everything

Recommended
  • Log all API requests with request IDs for debugging
  • Track latency, error rates, and token usage
  • Set up alerts for anomalies (latency spikes, error rate increases)
  • Use structured logging for easier analysis
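One way to combine these points — request IDs, latency, token counts, and structured output — is a single JSON log line per call; the field names here are illustrative:

```python
import json
import logging
import time
import uuid

logger = logging.getLogger("mythic-client")

def log_api_call(model: str, latency_ms: float, tokens: int, status: str) -> str:
    """Emit one structured log line per API call; returns the request ID
    so it can be attached to downstream records for debugging."""
    request_id = str(uuid.uuid4())
    logger.info(json.dumps({
        "request_id": request_id,
        "model": model,
        "latency_ms": round(latency_ms, 1),
        "tokens": tokens,
        "status": status,
        "ts": time.time(),
    }))
    return request_id
```

Because each line is valid JSON, log aggregators can filter and alert on `latency_ms` or `status` without custom parsing.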

🚀 Performance

🌊 Use Streaming for Long Responses

Recommended

Streaming reduces time-to-first-token and improves perceived performance:

  • Enable streaming for user-facing applications
  • Process tokens as they arrive for real-time display
  • Handle stream interruptions gracefully
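The SDK's actual streaming interface may differ; this sketch shows only the consumption pattern — process tokens as they arrive and keep partial output if the stream breaks — using a plain generator in place of a real stream:

```python
def consume_stream(stream, on_token, partial=""):
    """Append tokens as they arrive; return the text accumulated so far
    even if the stream is interrupted, so the caller can display or resume."""
    try:
        for token in stream:
            partial += token
            on_token(token)   # e.g. flush each token to the UI immediately
    except ConnectionError:
        pass                  # keep what we have; caller decides whether to retry
    return partial

# Usage with a stand-in stream:
chunks = iter(["Hel", "lo, ", "world"])
text = consume_stream(chunks, on_token=lambda t: None)
```

Accumulating into `partial` means an interrupted response is never lost outright; the caller can show it and offer a retry.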

Batch Requests When Possible

Advanced
  • Use the Batch API for non-time-sensitive workloads
  • Group related embeddings requests
  • Process multiple items in parallel (respecting rate limits)
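Parallel processing with a concurrency cap can be sketched with the standard library; `MAX_IN_FLIGHT` is an illustrative value to align with your rate limits, and `call` stands in for whatever API wrapper you use:

```python
from concurrent.futures import ThreadPoolExecutor

MAX_IN_FLIGHT = 4  # illustrative cap; match your account's rate limits

def process_batch(items, call):
    """Run `call` over items in parallel, never exceeding MAX_IN_FLIGHT
    concurrent requests; results come back in input order."""
    with ThreadPoolExecutor(max_workers=MAX_IN_FLIGHT) as pool:
        return list(pool.map(call, items))
```

For truly non-urgent work, the Batch API avoids holding connections open at all; this pattern suits interactive workloads that still need throughput.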

💰 Cost Optimization

📦 Use Context Caching

Recommended
  • Cache system prompts and common context (75% token savings)
  • Structure prompts with static content at the beginning
  • Use appropriate TTL for your use case
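Cache hits depend on a byte-for-byte stable prefix, so the structuring advice above amounts to: static content first, variable content last. A minimal sketch, assuming a chat-style messages format (the role/content field names are an assumption, not confirmed Mythic API shapes):

```python
SYSTEM_PROMPT = "You are a support assistant for Acme Co."  # static, cacheable
POLICY_DOC = "Refund policy: items may be returned within 30 days."  # static, cacheable

def build_messages(user_question: str) -> list[dict]:
    """Keep the static system prompt and reference docs first so the
    provider can reuse a cached prefix; only the last message varies."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT + "\n\n" + POLICY_DOC},
        {"role": "user", "content": user_question},
    ]
```

Anything that changes per request (timestamps, user names) belongs after the cached prefix, or it will invalidate the cache on every call.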
🎯 Choose the Right Model

Essential
  • Use mythic-4-mini for simple tasks (10x cheaper)
  • Reserve mythic-4 for complex reasoning
  • Use mythic-embed-3-small when full precision isn't needed
  • Consider fine-tuning for specialized, high-volume use cases
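These selection rules can live in one routing function so model choice is centralized and auditable; the heuristic below is purely illustrative:

```python
def pick_model(kind: str, complex_reasoning: bool = False) -> str:
    """Illustrative routing: default to the cheaper model and escalate
    only when a task genuinely needs deeper reasoning."""
    if kind == "embedding":
        return "mythic-embed-3-small"   # when full precision isn't needed
    return "mythic-4" if complex_reasoning else "mythic-4-mini"
```

Centralizing the choice also makes it easy to log which model served each request and compare cost against quality.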

📝 Prompt Engineering

✍️ Write Clear, Specific Prompts

Essential
  • Be explicit about the desired output format
  • Provide examples (few-shot prompting) for complex tasks
  • Break complex tasks into steps
  • Use JSON mode for structured outputs
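The first three points combine naturally in how the messages are assembled: an explicit format instruction, a few-shot example, then the new input. A sketch, assuming a chat-style messages format (the classification task and field names are illustrative):

```python
FEW_SHOT = [  # one worked example showing the exact output format we want
    {"role": "user", "content": "Sentiment: 'Great product!'"},
    {"role": "assistant", "content": '{"sentiment": "positive"}'},
]

def sentiment_messages(text: str) -> list[dict]:
    """Explicit format instruction + few-shot example + the new input."""
    system = ('Classify sentiment. Reply only with JSON: '
              '{"sentiment": "positive" | "negative" | "neutral"}')
    return [{"role": "system", "content": system},
            *FEW_SHOT,
            {"role": "user", "content": f"Sentiment: {text!r}"}]
```

Pairing a prompt like this with JSON mode (where available) makes the output directly parseable instead of best-effort.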

🚀 Production Readiness Checklist

☐ API keys stored securely (not in code)
☐ Retry logic with exponential backoff
☐ Error handling for all API responses
☐ Logging and monitoring configured
☐ Rate limiting on application layer
☐ Input validation and sanitization
☐ Content moderation enabled
☐ Streaming implemented for UX