Understand how text is converted to tokens, manage context windows, and optimize for cost efficiency.
Tokens are the basic units that our models process. They can be whole words, parts of words, or individual characters. On average, 1 token ≈ 4 characters of English text, or about 0.75 words.
The context window is the maximum number of tokens a model can process in a single request, including both input and output.
| Model | Context Window | Max Output |
|---|---|---|
| mythic-4 | 128,000 tokens | 16,384 tokens |
| mythic-4-mini | 128,000 tokens | 16,384 tokens |
| mythic-3.5-turbo | 16,000 tokens | 4,096 tokens |
| mythic-embed | 8,192 tokens | N/A |
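Because input and output share the context window, the room left for a response is capped by both the model's max output and whatever the prompt hasn't already used. A minimal sketch, using the mythic-4 limits from the table above (the helper name `available_output_tokens` is illustrative, not part of any API):

```python
CONTEXT_WINDOW = 128_000  # mythic-4 context window (from the table above)
MAX_OUTPUT = 16_384       # mythic-4 max output (from the table above)

def available_output_tokens(prompt_tokens,
                            context_window=CONTEXT_WINDOW,
                            max_output=MAX_OUTPUT):
    """Output is limited by both the model's max output and the
    space remaining in the context window after the prompt."""
    return max(0, min(max_output, context_window - prompt_tokens))

print(available_output_tokens(120_000))  # 8000 — the window, not max output, is the limit
print(available_output_tokens(50_000))   # 16384 — capped by max output
```

Note that with a 120,000-token prompt, only 8,000 tokens of output fit even though the model can emit up to 16,384.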
```python
import tiktoken

# Load the encoding for our models
encoding = tiktoken.get_encoding("cl100k_base")

# Count tokens in a string
text = "Hello, how are you doing today?"
tokens = encoding.encode(text)
print(f"Token count: {len(tokens)}")  # Output: 8

# Decode tokens back to text
decoded = encoding.decode(tokens)
print(decoded)  # "Hello, how are you doing today?"

# Count tokens for chat messages
def count_chat_tokens(messages):
    num_tokens = 0
    for message in messages:
        num_tokens += 4  # overhead per message
        for key, value in message.items():
            num_tokens += len(encoding.encode(value))
    num_tokens += 2  # priming
    return num_tokens
```
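When a tokenizer library isn't available, the rule of thumb above (≈4 characters per token in English) gives a quick estimate. A minimal sketch, with a hypothetical `estimate_tokens` helper not tied to any API:

```python
def estimate_tokens(text: str) -> int:
    """Rough estimate using the ~4 characters-per-token heuristic.
    Use a real tokenizer (e.g. tiktoken) when accuracy matters."""
    return max(1, len(text) // 4)

print(estimate_tokens("Hello, how are you doing today?"))  # 7 (actual count: 8)
```

The heuristic undercounts here by one token; treat it as a budgeting estimate, not an exact count.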
When your content exceeds the context window, consider these strategies:
- **Chunking:** Split long documents into overlapping chunks, process each chunk separately, then combine the results.
- **Summarization:** Summarize earlier content and include the summary instead of the full text.
- **Retrieval:** Use embeddings to retrieve only the most relevant chunks for each query.
- **External memory:** Store key facts outside the conversation and inject them as needed for each request.
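The chunking strategy above can be sketched in a few lines. This is a minimal illustration, assuming the input is already tokenized to a list; the `chunk_size` and `overlap` values are arbitrary examples:

```python
def chunk_tokens(tokens, chunk_size=1000, overlap=100):
    """Split a token list into chunks of up to chunk_size tokens,
    with each chunk repeating the last `overlap` tokens of the
    previous one so context isn't lost at the boundaries."""
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # last chunk already covers the tail
    return chunks

chunks = chunk_tokens(list(range(2500)))
print(len(chunks))  # 3 chunks: tokens 0-999, 900-1899, 1800-2499
```

Each chunk can then be processed in its own request, and the per-chunk results merged (for example, by concatenating or summarizing them).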