Stop Using JSON for LLMs: The Case for Token Efficiency
Why JSON is costing you money and performance in AI applications, and how switching to TOON can reduce token usage by up to 60%.
Reviewing your API bill at the end of the month? That sinking feeling might be caused by JSON bloat. Large Language Models (LLMs) charge by the token, and JSON is notoriously inefficient with tokens.
Every quote, every bracket, and every repeated key name is a token you pay for, yet it adds zero semantic value to your prompt.
In this post, we'll explain why JSON is the silent killer of AI budgets and introduce TOON, the token-optimized alternative that cuts costs by up to 60%. For a deeper look at how data formats affect AI reasoning and alignment, see our post on why LLMs agree with you.
The JSON Tax: Why You're Overpaying
JSON was designed for JavaScript engines, not tokenizers. Let's look at a simple example.
Here is a list of users in JSON:
```json
[
  { "id": 1, "name": "Alice", "role": "admin" },
  { "id": 2, "name": "Bob", "role": "user" },
  { "id": 3, "name": "Charlie", "role": "user" }
]
```
Notice the repetition? `"id"`, `"name"`, and `"role"` are repeated for every single record. In a list of 1,000 users, you are sending those keys 1,000 times.
For an LLM tokenizer (like OpenAI's cl100k_base), this repetition is disastrous. You are burning context window space on syntax, leaving less room for actual data and reasoning.
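You can measure this overhead yourself with OpenAI's tiktoken library. The sketch below counts cl100k_base tokens for the two payloads from this post; the exact numbers depend on the tokenizer version, but the JSON version consistently comes out larger because of the repeated keys and quoting.

```python
import tiktoken

# cl100k_base is the tokenizer used by GPT-3.5/GPT-4 family models
enc = tiktoken.get_encoding("cl100k_base")

json_text = """[
  { "id": 1, "name": "Alice", "role": "admin" },
  { "id": 2, "name": "Bob", "role": "user" },
  { "id": 3, "name": "Charlie", "role": "user" }
]"""

toon_text = """[3]{id,name,role}:
1,Alice,admin
2,Bob,user
3,Charlie,user"""

# Compare how many tokens each representation of the same data consumes
print("JSON tokens:", len(enc.encode(json_text)))
print("TOON tokens:", len(enc.encode(toon_text)))
```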
Enter TOON: The Token-Optimized Object Notation
TOON solves this by using a header-based schema definition, similar to how CSV works but significantly more powerful and structured.
Here is that same data in TOON:
```
[3]{id,name,role}:
1,Alice,admin
2,Bob,user
3,Charlie,user
```
Key Differences:
- No Repeated Keys: Headers are defined once at the top.
- Minimal Syntax: No trailing commas, no quoted keys, no nested heavy brackets.
- Whitespace Efficiency: Designed to look like a clean table that LLMs natively understand.
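To make the transformation concrete, here is a minimal sketch of a serializer that produces the exact layout shown above. It is an illustration, not the official TOON library: it assumes a flat list of records that all share the same keys and whose values contain no commas or newlines.

```python
def to_toon(records: list[dict]) -> str:
    """Serialize a flat, uniform list of dicts into the TOON layout shown above.

    Minimal sketch only: assumes every record has the same keys and that
    values contain no commas or newlines (no quoting/escaping is handled).
    """
    if not records:
        return "[0]{}:"
    keys = list(records[0].keys())
    # Header declares the row count and the column names once
    header = f"[{len(records)}]{{{','.join(keys)}}}:"
    # Each record becomes a single comma-separated row
    rows = [",".join(str(record[k]) for k in keys) for record in records]
    return "\n".join([header, *rows])


users = [
    {"id": 1, "name": "Alice", "role": "admin"},
    {"id": 2, "name": "Bob", "role": "user"},
    {"id": 3, "name": "Charlie", "role": "user"},
]
print(to_toon(users))
# [3]{id,name,role}:
# 1,Alice,admin
# 2,Bob,user
# 3,Charlie,user
```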
Real-World Savings
We ran a benchmark converting a 5MB JSON file of e-commerce product data into TOON.
- JSON Token Count: ~1.2 Million Tokens
- TOON Token Count: ~0.5 Million Tokens
- Cost Reduction: ~58%
If you are running a RAG (Retrieval-Augmented Generation) pipeline processing millions of documents, switching formats can roughly halve the token portion of your bill.
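The arithmetic is straightforward. The snippet below applies the benchmark's token counts to a hypothetical input price of $2.50 per million tokens (substitute your provider's current rates); only the token counts come from the benchmark above.

```python
# Hypothetical input price; check your provider's current pricing.
PRICE_PER_M_TOKENS = 2.50

json_tokens = 1_200_000  # ~1.2M tokens for the JSON payload
toon_tokens = 500_000    # ~0.5M tokens for the same data in TOON

json_cost = json_tokens / 1_000_000 * PRICE_PER_M_TOKENS
toon_cost = toon_tokens / 1_000_000 * PRICE_PER_M_TOKENS

print(f"JSON: ${json_cost:.2f}  TOON: ${toon_cost:.2f}  "
      f"saved: {1 - toon_cost / json_cost:.0%}")
# JSON: $3.00  TOON: $1.25  saved: 58%
```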
Recommended Reading
Why LLMs Agree With You (And How TOON Helps)
Explore why LLMs favor agreement over correctness due to reward hacking, and how using TOON in your evaluation pipeline can help detect sycophancy.
Why LLMs Hallucinate and How TOON Optimizes Reasoning
Explore the fundamental causes of LLM hallucinations and learn how the TOON format reduces noise to improve accuracy and reasoning in AI applications.
Claude 3.5 and TOON: Maximizing Efficiency in AI Agentic Tasks
Explore how Claude 3.5 Sonnet and Haiku's new capabilities combine with TOON/TONL for optimal token efficiency, reducing API costs by 40-60% in agentic workflows.