What is TOON and Why It Matters for AI Developers
Discover TOON (Token-Oriented Object Notation), the revolutionary data format designed to reduce LLM token consumption by up to 60% and slash your AI API costs.
TOON (Token-Oriented Object Notation) is a new data serialization format designed for the AI era. It is built on a simple premise: Data should be concise. Learn more at the official TOON website.
In the last 20 years, we optimized data for two things:
1. Network Bandwidth (Binary formats like Protobuf).
2. Browser Compatibility (JSON).
But today, the bottleneck has shifted. We are no longer constrained by bandwidth (5G is fast) or browsers (V8 is fast). We are constrained by Context Windows.
The "Context Crisis"
Every time you send a request to an LLM (GPT-4, Claude, Gemini), you are renting "attention." This attention is measured in Tokens.
A token is roughly 0.75 words.
{"id": 1}is 6 tokens.id: 1is 3 tokens.
The difference seems small until you realize that modern RAG (Retrieval Augmented Generation) applications inject megabytes of data into the prompt.
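A rough back-of-the-envelope using the numbers above (an illustration, not a benchmark): if a single `{"id": 1}` field costs 6 tokens in JSON but only 3 tokens as `id: 1`, then 100,000 such fields cost roughly 600,000 tokens versus 300,000. The structural overhead alone doubles what you pay before a single value changes.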
Anatomy of TOON
TOON looks like a blend of YAML and CSV. It is designed to be readable by both humans and LLMs.
1. The Object Block
Key-Value pairs are separated by newlines. No braces. Key and value are separated by a colon.
model: gpt-4
temperature: 0.7
max_tokens: 100

Unquoted Strings: If a string doesn't contain special characters, it doesn't need quotes. This saves 2 tokens per string.
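For comparison, the same settings as typical pretty-printed JSON:

```json
{
  "model": "gpt-4",
  "temperature": 0.7,
  "max_tokens": 100
}
```

Every brace, quote, and comma in that JSON is structure the model has to pay for; the TOON version above carries the same three values with none of it.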
2. The List Block
Simple lists use dashes.
stops:
- user
- assistant
- system

3. The Table Block (The "Killer Feature")
This is where TOON shines. Arrays of objects are represented as tables.
messages[3]{role, content}:
system, You are a helper.
user, What is the capital of France?
assistant, Paris.

The Header: messages[3]{role, content} tells the LLM:
- The data is in `messages`.
- There are `3` items.
- The schema is `role` and `content`.
The Body: Comma-separated values. No repeated keys.
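To make the mechanics concrete, here is a minimal sketch of how you might emit this table form yourself. It is a hypothetical helper, not the official encoder: it assumes flat objects with simple scalar values and no commas inside them, and it skips the quoting and nesting rules the real SDK handles.

```typescript
// toonTable: serialize an array of flat objects into TOON's table block.
// Illustrative sketch only — not the official @toon/sdk encoder.
type Row = Record<string, string | number | boolean | null>;

function toonTable(name: string, rows: Row[]): string {
  if (rows.length === 0) return `${name}[0]{}:`;

  // Derive the schema from the first row's keys.
  const keys = Object.keys(rows[0]);

  // Header: array name, item count, and field names.
  const header = `${name}[${rows.length}]{${keys.join(", ")}}:`;

  // Body: one comma-separated line per object; keys are never repeated.
  const body = rows.map((row) => keys.map((k) => String(row[k])).join(", "));

  return [header, ...body].join("\n");
}

// Reproduces the messages table from above.
console.log(
  toonTable("messages", [
    { role: "system", content: "You are a helper." },
    { role: "user", content: "What is the capital of France?" },
    { role: "assistant", content: "Paris." },
  ])
);
// messages[3]{role, content}:
// system, You are a helper.
// user, What is the capital of France?
// assistant, Paris.
```

The key design point is visible in the last line of the function: keys appear once in the header, so the per-row cost is just the values and a few commas.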
Why LLMs Love TOON
You might think: "Can the LLM understand this new format?"
Yes. In fact, LLMs are surprisingly good at this because they were trained on massive amounts of code, CSVs, and logs. They intuitively understand tabular structures.
By removing the syntactic noise (brackets, quotes), you allow the LLM's "Attention Mechanism" to focus on the Signal (the actual data values) rather than the Noise (the structure).
How to Adopt TOON
You don't need to rewrite your database.
Step 1: Install the SDK
npm install @toon/sdk

Step 2: Convert at the Edge
In your API route (Next.js/Express), convert your JSON data to TOON before constructing the prompt.
import { toToon } from "@toon/sdk";
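// db is assumed to be an ORM client (e.g. Prisma-style); any array of plain objects works.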
const users = await db.users.findMany();
const prompt = `
Analyze these users:
${toToon(users)}
`;

Step 3: Enjoy the Savings
You will see an immediate drop in token usage (typically 30-50%).
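One way to check the drop for yourself is to read the usage field the API returns. Below is a minimal sketch using the official OpenAI Node SDK; the model name is arbitrary, and toToon is assumed to have the same signature as in the snippet above.

```typescript
import OpenAI from "openai";
import { toToon } from "@toon/sdk";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

export async function analyzeUsers(users: Record<string, unknown>[]): Promise<string> {
  // Serialize the rows as TOON instead of JSON.stringify to shrink the prompt.
  const prompt = `Analyze these users:\n${toToon(users)}`;

  const completion = await openai.chat.completions.create({
    model: "gpt-4o",
    messages: [{ role: "user", content: prompt }],
  });

  // usage.prompt_tokens shows exactly what the TOON payload cost;
  // compare it against a JSON.stringify version of the same data.
  console.log("Prompt tokens:", completion.usage?.prompt_tokens);

  return completion.choices[0].message.content ?? "";
}
```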
FAQ
Is TOON a binary format?
No. TOON is pure UTF-8 text. It is designed to be read by LLMs, which consume text.
Does it support nested objects?
Yes. TOON supports arbitrary nesting using indentation, just like YAML.
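For example, a nested configuration might look like this (an illustrative sketch based on the indentation rule, not output from the official encoder):

```
server:
  host: localhost
  port: 8080
  retry:
    attempts: 3
    backoff: exponential
```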
Is it standardized?
Yes, there is a formal spec. See the Specification.
Recommended Reading
Stop Using JSON for LLMs: The Case for Token Efficiency
Why JSON is costing you money and performance in AI applications, and how switching to TOON can reduce token usage by up to 60%.
Why LLMs Agree With You (And How TOON Helps)
Explore why LLMs favor agreement over correctness due to reward hacking, and how using TOON in your evaluation pipeline can help detect sycophancy.
Why LLMs Hallucinate and How TOON Optimizes Reasoning
Explore the fundamental causes of LLM hallucinations and learn how the TOON format reduces noise to improve accuracy and reasoning in AI applications.