A simple trick can cut your LLM costs by up to 50%!
Just stop using JSON for tabular data and use this instead:
TOON (Token-Oriented Object Notation) can roughly halve your LLM token usage while keeping the data perfectly readable.
Here's why it works:
TOON's sweet spot: uniform arrays with consistent fields per row. It merges YAML's indentation and CSV's tabular structure, optimized for minimal tokens.
Look at the example below.
JSON:
{
  "users": [
    { "id": 1, "name": "Alice", "role": "admin" },
    { "id": 2, "name": "Bob", "role": "user" }
  ]
}
TOON:
users[2]{id,name,role}:
  1,Alice,admin
  2,Bob,user
Same information, far fewer tokens: the keys, quotes, and braces aren't repeated for every row.
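If you want to play with this without pulling in a dependency, here's a minimal sketch of the encoding step. This is not the official TOON library, just an illustration of the flat, uniform case (it skips quoting/escaping of values that contain commas):

# Minimal sketch, not the official TOON encoder: serialize a uniform
# list of dicts into a TOON-style block (header with row count and
# field names, then one comma-separated row per record).
# Assumes every row has the same fields and no value needs quoting.
def to_toon(key, rows):
    fields = list(rows[0].keys())
    header = f"{key}[{len(rows)}]{{{','.join(fields)}}}:"
    body = ["  " + ",".join(str(row[f]) for f in fields) for row in rows]
    return "\n".join([header] + body)

users = [
    {"id": 1, "name": "Alice", "role": "admin"},
    {"id": 2, "name": "Bob", "role": "user"},
]
print(to_toon("users", users))
# users[2]{id,name,role}:
#   1,Alice,admin
#   2,Bob,user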
To summarise, here are the key features:
- 30–60% fewer tokens than JSON (easy to measure yourself; see the sketch below)
- Borrows the best from YAML & CSV
- Built-in validation with explicit lengths & fields
- Minimal syntax (no redundant braces, brackets, or repeated keys)
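The 30–60% figure is easy to check on your own payloads by counting tokens directly. A quick sketch using OpenAI's tiktoken, assuming the cl100k_base tokenizer (your model's tokenizer may differ, so treat the numbers as indicative):

import json
import tiktoken  # pip install tiktoken

enc = tiktoken.get_encoding("cl100k_base")

users = [
    {"id": 1, "name": "Alice", "role": "admin"},
    {"id": 2, "name": "Bob", "role": "user"},
]
json_text = json.dumps({"users": users}, indent=2)
toon_text = "users[2]{id,name,role}:\n  1,Alice,admin\n  2,Bob,user"

j = len(enc.encode(json_text))
t = len(enc.encode(toon_text))
print(f"JSON: {j} tokens | TOON: {t} tokens | saved: {1 - t / j:.0%}")

The savings grow with row count, since JSON repeats every key per row while TOON pays for the field names only once in the header.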
IMPORTANT!!
That said, for deeply nested or non-uniform data, JSON might be more efficient.
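For flat, uniform data, though, the explicit [N] count and {…} field list make payloads cheap to validate on the way back in. Here's a hypothetical strict reader for that case (values come back as strings; a real decoder would also restore types and handle quoting):

import re

# Sketch of a strict TOON reader for the flat case: reject payloads
# whose row count or row width doesn't match the declared header.
def from_toon(text):
    lines = text.strip().splitlines()
    m = re.match(r"(\w+)\[(\d+)\]\{([^}]+)\}:$", lines[0])
    if not m:
        raise ValueError("bad TOON header")
    key, count, fields = m[1], int(m[2]), m[3].split(",")
    rows = [line.strip().split(",") for line in lines[1:]]
    if len(rows) != count:
        raise ValueError(f"declared {count} rows, got {len(rows)}")
    if any(len(r) != len(fields) for r in rows):
        raise ValueError("row width doesn't match field list")
    return {key: [dict(zip(fields, r)) for r in rows]}

print(from_toon("users[2]{id,name,role}:\n  1,Alice,admin\n  2,Bob,user"))
# {'users': [{'id': '1', 'name': 'Alice', 'role': 'admin'}, ...]}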
In the next tweet, I've shared benchmark results showing how this format cuts token counts and improves retrieval accuracy across popular LLM providers.
Where do you think this could be effective in your existing workflows?
Find the relevant links in the next tweet!
Here are the benchmarks for token efficiency and retrieval accuracy, as published by the TOON team.
You can find the same information in their GitHub repo: github.com/toon-format/toon
If you found it insightful, reshare with your network. Find me → @akshay_pachaar for more insights and tutorials on LLMs, AI Agents, and Machine Learning!
Nov 7, 2025 · 7:20 PM UTC