Vision API

Analyze and understand images with state-of-the-art multimodal AI. Extract text, detect objects, describe scenes, and answer questions about visual content.

🏙️

A modern city skyline at sunset with tall glass skyscrapers reflecting the orange and pink hues of the evening sky.

What Can Vision Do?

📝

Text Extraction (OCR)

Extract text from images, documents, receipts, signs, and handwritten notes with high accuracy.

🎯

Object Detection

Identify and locate objects within images. Get bounding boxes and confidence scores.

💬

Image Q&A

Ask natural language questions about images and get detailed, contextual answers.

🏷️

Classification

Sort images into custom categories for content moderation, product tagging, and more.

📊

Chart Understanding

Extract data and insights from charts, graphs, and data visualizations.

📄

Document Analysis

Process invoices, forms, IDs, and structured documents. Extract key fields automatically.

Industry Applications

🛒

E-Commerce

Auto-tag products & visual search

🏥

Healthcare

Medical image analysis

🏭

Manufacturing

Quality control inspection

🏦

Finance

Document processing & ID verification

🚗

Automotive

Damage assessment

🏠

Real Estate

Property image analysis

🛡️

Security

Content moderation

Accessibility

Alt-text generation

Simple Integration

Python
from mythicdot import MythicDot

client = MythicDot()

# Analyze an image from URL
response = client.chat.completions.create(
    model="mythic-4-vision",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image?"},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/image.jpg"}
                }
            ]
        }
    ]
)

print(response.choices[0].message.content)

# Or send a local file as a base64-encoded data URL
import base64

with open("photo.jpg", "rb") as f:
    image_data = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="mythic-4-vision",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this image in detail"},
            {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{image_data}"}}
        ]
    }]
)

print(response.choices[0].message.content)

Key Features

Multiple Image Support

Analyze up to 10 images in a single request
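As a rough sketch of a multi-image request, the helper below builds a single user message in the same shape as the integration example on this page, with one text part followed by several `image_url` parts. The helper name `build_multi_image_message` and the example URLs are illustrative assumptions, not part of the SDK. Only the 10-image limit comes from the text above.

```python
MAX_IMAGES_PER_REQUEST = 10  # limit stated in the feature description above


def build_multi_image_message(prompt, image_urls):
    """Return one user message holding a text prompt plus several images.

    Hypothetical helper: it only assembles the message dict and does not
    call the API.
    """
    if not image_urls:
        raise ValueError("at least one image URL is required")
    if len(image_urls) > MAX_IMAGES_PER_REQUEST:
        raise ValueError(
            f"got {len(image_urls)} images, limit is {MAX_IMAGES_PER_REQUEST}"
        )
    content = [{"type": "text", "text": prompt}]
    content.extend(
        {"type": "image_url", "image_url": {"url": url}} for url in image_urls
    )
    return {"role": "user", "content": content}


message = build_multi_image_message(
    "Compare these two product photos.",
    ["https://example.com/front.jpg", "https://example.com/back.jpg"],
)
```

The resulting dict can then be passed as `messages=[message]` to `client.chat.completions.create(...)`.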

High Resolution Mode

Process images up to 4096x4096 pixels
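This page does not say what happens to images larger than 4096x4096, so it may be safest to downscale them before upload. The sketch below computes target dimensions that fit the limit while keeping the aspect ratio. The function name `fit_within_limit` is an assumption for illustration, and the actual resizing is left to whatever imaging library you use.

```python
MAX_SIDE = 4096  # maximum width and height stated above


def fit_within_limit(width, height, max_side=MAX_SIDE):
    """Return (width, height) scaled down to fit max_side x max_side.

    Images already within the limit are returned unchanged; larger ones
    are shrunk proportionally so the aspect ratio is preserved.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    scale = min(1.0, max_side / width, max_side / height)
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_within_limit(8192, 4096))  # (4096, 2048)
print(fit_within_limit(1024, 768))   # (1024, 768), already small enough
```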

URL or Base64

Send images by URL or as base64-encoded data
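Both input styles end up in the same `image_url` field, so one small helper can cover them. The sketch below is an assumption-laden convenience, not part of the SDK: `image_part` is a hypothetical name, and it guesses the MIME type from the file extension using the standard library.

```python
import base64
import mimetypes
from pathlib import Path


def image_part(source):
    """Build an image_url content part from an http(s) URL or a local path.

    Hypothetical helper: URLs are passed through as-is, local files are
    read and embedded as a base64 data URL.
    """
    if source.startswith(("http://", "https://")):
        url = source
    else:
        path = Path(source)
        mime_type, _ = mimetypes.guess_type(path.name)
        if mime_type is None or not mime_type.startswith("image/"):
            raise ValueError(f"cannot determine image type for {source!r}")
        encoded = base64.b64encode(path.read_bytes()).decode("ascii")
        url = f"data:{mime_type};base64,{encoded}"
    return {"type": "image_url", "image_url": {"url": url}}


remote = image_part("https://example.com/image.jpg")
```

A local file works the same way, for example `image_part("photo.jpg")`, and either result can be appended to a message's `content` list.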

All Image Formats

PNG, JPEG, GIF, WebP, and more

Streaming Support

Stream responses for real-time UX

JSON Mode

Structured output for easy parsing
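How JSON mode is switched on is not shown on this page, so the sketch below covers only the receiving side: it assumes the reply text (for example `response.choices[0].message.content`) is a JSON object and validates it before use. The function name, the sample reply, and the `merchant` and `total` field names are all hypothetical.

```python
import json


def parse_json_reply(reply_text, required_keys):
    """Parse a JSON reply and check that the expected fields are present."""
    try:
        data = json.loads(reply_text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    missing = [key for key in required_keys if key not in data]
    if missing:
        raise ValueError(f"reply is missing fields: {missing}")
    return data


# Hypothetical reply text standing in for a real model response.
sample_reply = '{"merchant": "Example Cafe", "total": 12.5}'
receipt = parse_json_reply(sample_reply, ["merchant", "total"])
print(receipt["total"])  # 12.5
```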

Simple Pricing

Resolution          Cost per Image   Best For
Low (512px)         $0.00065         Quick classification
Standard (1024px)   $0.00255         General analysis
High (2048px)       $0.00765         Detailed extraction
Ultra (4096px)      $0.01530         Document OCR
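To see how the per-image prices add up for a mixed workload, here is a minimal estimator using the rates from the table above. The tier keys and the function name `estimate_cost` are illustrative assumptions. The sketch ignores any free allowance, since how that is applied across tiers is not specified here.

```python
from decimal import Decimal

# Per-image prices in USD, copied from the pricing table above.
PRICE_PER_IMAGE = {
    "low": Decimal("0.00065"),
    "standard": Decimal("0.00255"),
    "high": Decimal("0.00765"),
    "ultra": Decimal("0.01530"),
}


def estimate_cost(image_counts):
    """Return the total cost in USD for a {tier: image_count} mapping."""
    total = Decimal("0")
    for tier, count in image_counts.items():
        if tier not in PRICE_PER_IMAGE:
            raise ValueError(f"unknown resolution tier: {tier!r}")
        if count < 0:
            raise ValueError("image counts must be non-negative")
        total += PRICE_PER_IMAGE[tier] * count
    return total


estimate = estimate_cost({"standard": 10_000, "ultra": 500})
print(f"${estimate:.2f}")  # $33.15
```

Decimal is used instead of float so that the very small per-image prices sum without rounding drift.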

See the World Through AI

Start analyzing images in minutes. Free tier includes 1,000 images per month.
