TOON in Python and JavaScript: A Hands-On SDK Guide

TOON has official implementations in both TypeScript (the reference implementation by Johann Schopplich, MIT-licensed) and Python. This guide walks through install, encode, decode, and a full round-trip in both ecosystems, then gives you a JSON-to-TOON migration checklist.

Why add a TOON dependency at all?

Before writing any code it is worth knowing what you are optimizing for. The official TOON benchmark ran 5,016 LLM calls across four frontier models and found that TOON uses 39.9% fewer tokens than JSON on average while achieving slightly higher retrieval accuracy (76.4% vs 75.0%). On flat, uniform tables — the exact shape you get from a database query or an API list endpoint — savings reach 58–59%.

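The savings compound when you pay per token (OpenAI, Anthropic, Google) and your prompt injects large datasets. If nearly all of a pipeline's $1,000/month token spend goes to serialized data in the prompt, the 39.9% average reduction would bring it to roughly $600; the more of the bill that comes from instructions or model output, the smaller the drop. That is why a one-line install is worth understanding.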
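
The estimate above can be sketched as a small calculation. This is a back-of-the-envelope model, not a billing tool: the function name projected_monthly_cost and the data_share parameter are hypothetical, and the model assumes the benchmark's average reduction applies evenly to the data portion of your bill.

```python
def projected_monthly_cost(current_cost: float, data_share: float, token_savings: float) -> float:
    """Estimate monthly spend after swapping the serializer.

    current_cost  -- today's monthly token spend
    data_share    -- assumed fraction of billed tokens that is serialized data (0..1)
    token_savings -- fractional token reduction on that data (0.399 = benchmark average)
    """
    if not 0.0 <= data_share <= 1.0 or not 0.0 <= token_savings <= 1.0:
        raise ValueError("data_share and token_savings must be between 0 and 1")
    # Only the data portion of the bill shrinks; the rest is unchanged.
    return current_cost * (1.0 - data_share * token_savings)


# Whole bill is serialized data: the best case.
print(round(projected_monthly_cost(1000, 1.0, 0.399), 2))  # 601.0
# Half the bill is serialized data (hypothetical split).
print(round(projected_monthly_cost(1000, 0.5, 0.399), 2))  # 800.5
```

The second call shows why it pays to measure how much of a prompt is actually data before projecting savings.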

One honest caveat: an independent arXiv paper (arXiv 2603.03306) found that TOON's edge shrinks for generation tasks (asking a model to output TOON) because of the upfront "prompt tax" — the instructional tokens needed to describe the format to the model. TOON shines on large, repetitive payloads being fed into the LLM as context, not as output schema.
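
The "prompt tax" can be put into rough numbers. The sketch below assumes a fixed instruction overhead and a constant savings rate, which is a simplification; the 50-token overhead in the example is a made-up figure, and break_even_json_tokens is a hypothetical helper name.

```python
import math


def break_even_json_tokens(instruction_tokens: int, token_savings: float) -> int:
    """Smallest JSON payload size (in tokens) at which the tokens saved
    exceed the tokens spent describing the format to the model."""
    if instruction_tokens < 0:
        raise ValueError("instruction_tokens must not be negative")
    if not 0.0 < token_savings <= 1.0:
        raise ValueError("token_savings must be in (0, 1]")
    # Saved tokens = token_savings * payload; we need saved > instruction_tokens.
    return math.floor(instruction_tokens / token_savings) + 1


# Hypothetical 50-token format description at the 39.9% benchmark average.
print(break_even_json_tokens(50, 0.399))  # 126
```

Below that payload size, the description costs more than the format saves, under these assumptions.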

If you want a no-code path first, try the free browser converter — paste JSON, get TOON back instantly, all in-browser.

TOON by language: packages at a glance

Language	Package	Install	Status
TypeScript / JavaScript	@toon-format/toon	`npm i @toon-format/toon`	Official reference, MIT
Python	toon-format (repo: toon-python)	`pip install toon-format`	Official, MIT
Go	community port	see GitHub org	Community / check status
Rust	community port	see GitHub org	Community / check status

Always check the toon-format GitHub organisation for the latest list of official and community ports.

How to encode and decode TOON in TypeScript / JavaScript

The TypeScript reference package was authored by Johann Schopplich and released under the MIT licence around November 2025. Install it from npm:

npm i @toon-format/toon

Encode: JSON array of objects to TOON

The package exposes encode/decode functions (check the package README for exact exported names — the API may evolve). The pattern below illustrates the intended usage:

import { encode, decode } from "@toon-format/toon";

const orders = [
  { id: 1, customer: "Alice", total: 49.99, status: "shipped" },
  { id: 2, customer: "Bob",   total: 12.50, status: "pending" },
  { id: 3, customer: "Carol", total: 199.00, status: "delivered" },
];

// Encode JavaScript value → TOON string
const toonString = encode(orders);
console.log(toonString);
/* Output (approximate):
[3]{id,customer,total,status}:
  1,Alice,49.99,shipped
  2,Bob,12.5,pending
  3,Carol,199,delivered
*/

Notice the header line, [3]{id,customer,total,status}:, which tells the LLM the item count and field schema before a single data row appears. Because encode() receives only the value, a top-level array carries no name; wrap it in an object, as in encode({ orders }), and the key prefixes the header as orders[3]{id,customer,total,status}:. That explicit contract helps explain why LLM retrieval accuracy stays high even as token count drops.

Decode: TOON string back to JavaScript

// Decode TOON string → JavaScript value
const restored = decode(toonString);

console.log(restored[0].customer); // "Alice"
console.log(restored[1].total);    // 12.5

// Round-trip check
console.log(JSON.stringify(restored) === JSON.stringify(orders)); // true

The round-trip is lossless for all primitives JSON supports. If your original data contains binary, encode it as a base64 string before passing to encode().

Using TOON inside a prompt (Node.js / Next.js example)

import { encode } from "@toon-format/toon";

function buildPrompt(rows: Record<string, unknown>[]): string {
  const toon = encode(rows);

  return `You are a data analyst. Answer the question using only the data below.

DATA:
${toon}

QUESTION: Which customer has the highest total?`;
}

// Compared to JSON.stringify(rows), the official benchmark reports ~40% fewer
// tokens on average, and up to ~59% on uniform arrays.

For a full TypeScript integration walkthrough, see the TypeScript install guide.

How to encode and decode TOON in Python

The official Python implementation lives at github.com/toon-format/toon-python. Install it from PyPI:

pip install toon-format

As with the TypeScript package, verify the exact function names in the package README — the snippets below follow the intended encode/decode pattern:

Encode a list of dicts to TOON

from toon_format import encode, decode

orders = [
    {"id": 1, "customer": "Alice", "total": 49.99, "status": "shipped"},
    {"id": 2, "customer": "Bob",   "total": 12.50, "status": "pending"},
    {"id": 3, "customer": "Carol", "total": 199.00, "status": "delivered"},
]

toon_string = encode(orders)
print(toon_string)
# Output (approximate):
# [3]{id,customer,total,status}:
#   1,Alice,49.99,shipped
#   2,Bob,12.5,pending
#   3,Carol,199,delivered
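
To make the tabular layout concrete, here is a minimal hand-rolled sketch of how a header and rows could be built for a list of flat dicts. It is not the toon_format implementation: encode_uniform_rows and format_scalar are hypothetical names, and string quoting, nesting, and alternative delimiters are deliberately left out.

```python
import math


def format_scalar(value) -> str:
    """Render one primitive cell. Strings that would need quoting are rejected."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # test bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, float):
        if not math.isfinite(value):
            raise ValueError("NaN and infinity are not handled in this sketch")
        # Whole-number floats are written without a fractional part (199.0 -> 199).
        return str(int(value)) if value.is_integer() else repr(value)
    if isinstance(value, int):
        return str(value)
    if isinstance(value, str):
        # Strings that look like numbers or keywords would also need quoting;
        # this sketch does not detect those cases.
        if not value or value != value.strip() or any(ch in value for ch in ',:"\n'):
            raise ValueError(f"string needs quoting, not handled here: {value!r}")
        return value
    raise TypeError(f"unsupported cell type: {type(value).__name__}")


def encode_uniform_rows(rows: list[dict]) -> str:
    """Build '[N]{fields}:' followed by one indented, comma-separated line per row."""
    if not rows:
        raise ValueError("this sketch only handles non-empty lists")
    fields = list(rows[0])
    for row in rows:
        if list(row) != fields:
            raise ValueError("rows must share the same keys in the same order")
    field_list = ",".join(fields)
    header = f"[{len(rows)}]{{{field_list}}}:"
    lines = ["  " + ",".join(format_scalar(row[name]) for name in fields) for row in rows]
    return "\n".join([header, *lines])


sample = [
    {"id": 1, "customer": "Alice", "total": 49.99, "status": "shipped"},
    {"id": 2, "customer": "Bob", "total": 12.50, "status": "pending"},
    {"id": 3, "customer": "Carol", "total": 199.00, "status": "delivered"},
]
print(encode_uniform_rows(sample))
```

The field names appear once in the header instead of once per row, which is where most of the token savings on uniform data come from.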

Decode TOON back to Python

restored = decode(toon_string)

print(restored[0]["customer"])  # Alice
print(restored[1]["total"])     # 12.5

# Round-trip assertion
assert restored == orders
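
For CI, the same assertion can be wrapped in a reusable helper. The sketch below takes the encoder and decoder as arguments, so it makes no assumption about the TOON package's exported names; json.dumps and json.loads stand in here only so the example runs without third-party installs. The helper name assert_round_trip is hypothetical.

```python
import json
from typing import Any, Callable


def assert_round_trip(data: Any,
                      encode: Callable[[Any], str],
                      decode: Callable[[str], Any]) -> str:
    """Encode, decode, and compare. Returns the encoded text for inspection."""
    text = encode(data)
    restored = decode(text)
    if restored != data:
        raise AssertionError(
            f"round-trip mismatch:\n  original={data!r}\n  restored={restored!r}"
        )
    return text


# Representative fixtures; replace with samples of your own production data.
fixtures = [
    [{"id": 1, "customer": "Alice", "total": 49.99, "status": "shipped"}],
    {"meta": {"page": 1}, "rows": [{"id": 2, "ok": True, "note": None}]},
]

for fixture in fixtures:
    # Stand-in codec: pass the TOON package's encode/decode here instead.
    assert_round_trip(fixture, json.dumps, json.loads)
print("all fixtures round-tripped")
```

Python's == treats 199 and 199.0 as equal, so this check tolerates an encoder that writes whole-number floats without a fractional part.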

Injecting TOON into an LLM prompt (Python / OpenAI SDK example)

from toon_format import encode
import openai

def build_messages(rows: list[dict]) -> list[dict]:
    toon = encode(rows)
    return [
        {"role": "system", "content": "Answer using only the provided data."},
        {"role": "user",   "content": f"DATA:\n{toon}\n\nQUESTION: List the shipped orders."},
    ]

client = openai.OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=build_messages(orders),
)
print(response.choices[0].message.content)

Benchmarks on toonformat.dev/guide/benchmarks show TOON achieves 27.7 accuracy-points per 1,000 tokens vs JSON's 16.4 — a 69% improvement in efficiency. For a Python data pipeline injecting thousands of rows per request, the cumulative cost difference is significant.

JSON vs TOON side by side

The same three-order dataset, serialized both ways, makes the token difference concrete:

/* JSON — ~60 tokens */
[
  {"id":1,"customer":"Alice","total":49.99,"status":"shipped"},
  {"id":2,"customer":"Bob","total":12.50,"status":"pending"},
  {"id":3,"customer":"Carol","total":199.00,"status":"delivered"}
]

/* TOON — ~32 tokens */
[3]{id,customer,total,status}:
  1,Alice,49.99,shipped
  2,Bob,12.5,pending
  3,Carol,199,delivered

On this toy example the saving is roughly 47%. On production-scale datasets — the benchmark used e-commerce orders with 109,599 JSON tokens — TOON's representation consumed 33.3% fewer tokens (73,126 tokens). Flat uniform tables and time-series data see savings of up to 59%.
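
The percentages quoted above follow directly from the token counts. A short sketch, with token_savings as a hypothetical helper name, reproduces them and can be reused with counts logged from your own API responses:

```python
def token_savings(json_tokens: int, toon_tokens: int) -> float:
    """Fraction of tokens saved by the TOON version relative to the JSON version."""
    if json_tokens <= 0:
        raise ValueError("json_tokens must be positive")
    return (json_tokens - toon_tokens) / json_tokens


# Toy example from this section (approximate token counts).
print(f"{token_savings(60, 32):.1%}")            # 46.7%
# E-commerce orders dataset from the benchmark.
print(f"{token_savings(109_599, 73_126):.1%}")   # 33.3%
```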

For a deeper look at when these numbers hold and when they do not, read JSON vs TOON: a full comparison.

JSON-to-TOON migration checklist

Use this list before shipping TOON in a production pipeline. See also the detailed migration guide.

Identify hot paths. Profile which prompts consume the most tokens. Prioritise those — converting cold paths adds complexity for minimal gain.
Confirm your data is uniform. TOON's biggest wins come from arrays of objects with consistent keys. Heterogeneous, deeply nested structures save less (~22%). If data is purely flat with no nesting, also consider plain CSV.
Install and pin the package. Use npm i @toon-format/toon or pip install toon-format. Pin to a specific version in your lockfile until the API stabilises.
Replace the serializer at prompt-build time. Swap your JSON.stringify(data) (or json.dumps(data)) call for encode(data) in the one place that builds the LLM context string.
Test the round-trip. Add a unit test asserting decode(encode(data)) deep-equals the original object. Run it in CI on representative fixtures.
Validate LLM output if you generate TOON. If you ask the model to output TOON (not just read it), add a decoding step with error handling. The arXiv study (2603.03306) found generation accuracy is lower than retrieval accuracy — plan for malformed output.
Measure token counts before and after. Log usage.prompt_tokens from your LLM API response. Confirm the savings match expectations before rolling out broadly.
Update prompt instructions if needed. Most frontier models understand TOON without explanation, but on smaller or specialised models you may need a one-sentence description of the table header syntax. Keep it short — each instruction line is tokens too.
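
For the output-validation step in the checklist, a defensive wrapper might look like the sketch below. It is an illustration under assumptions: the decoder is passed in as a callable (json.loads stands in so the example runs on its own), and decode_model_output, strip_code_fence, and the expected_rows check are hypothetical names and choices, not part of any TOON package.

```python
import json
from typing import Any, Callable, Optional, Tuple

FENCE = "`" * 3  # models often wrap structured output in a Markdown code fence


def strip_code_fence(text: str) -> str:
    """Remove one surrounding Markdown code fence, if present."""
    lines = text.strip().splitlines()
    if len(lines) >= 2 and lines[0].startswith(FENCE) and lines[-1].strip() == FENCE:
        lines = lines[1:-1]
    return "\n".join(lines)


def decode_model_output(text: str,
                        decode: Callable[[str], Any],
                        expected_rows: Optional[int] = None) -> Tuple[Any, Optional[str]]:
    """Return (value, None) on success or (None, error message) on failure."""
    try:
        value = decode(strip_code_fence(text))
    except Exception as exc:  # decoder error types vary by library, so catch broadly
        return None, f"decode failed: {exc}"
    if expected_rows is not None:
        if not isinstance(value, list) or len(value) != expected_rows:
            return None, f"expected a list of {expected_rows} rows"
    return value, None


# Stand-in decoder: pass the TOON package's decode function in real use.
print(decode_model_output("[1, 2]", json.loads))          # ([1, 2], None)
print(decode_model_output("not valid", json.loads)[0])    # None
```

Returning an error message instead of raising lets the caller decide whether to retry the request, fall back to JSON, or log and skip.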

When should you stick with JSON?

TOON is not a universal replacement. JSON still wins when:

Your data is highly non-uniform — mixed types per row, variable keys, deeply nested documents. TOON saves only ~22% here and adds parsing complexity.
You need schema validation or typed generation at scale — consider TONL, which adds type hints and a query API.
Your payload is small (a handful of objects). The "prompt tax" of describing TOON to the model may exceed the token savings on tiny payloads.
Constrained decoding is in play — the arXiv paper found constrained-decoding JSON can match TOON efficiency on simple structures.
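
The first criterion above can be checked mechanically before you commit to a serializer. The sketch below uses one possible definition of "uniform" (a non-empty list of dicts with identical key sets and only primitive values); is_uniform_table is a hypothetical helper, and the TOON packages may apply different rules when choosing the tabular form.

```python
from typing import Any

PRIMITIVES = (str, int, float, bool, type(None))


def is_uniform_table(data: Any) -> bool:
    """True for a non-empty list of dicts that share one key set and hold only primitives."""
    if not isinstance(data, list) or not data:
        return False
    if not all(isinstance(row, dict) for row in data):
        return False
    keys = set(data[0])
    return all(
        set(row) == keys and all(isinstance(value, PRIMITIVES) for value in row.values())
        for row in data
    )


uniform = [{"id": 1, "status": "shipped"}, {"id": 2, "status": "pending"}]
ragged = [{"id": 1, "status": "shipped"}, {"id": 2, "tags": ["gift"]}]

print(is_uniform_table(uniform))  # True
print(is_uniform_table(ragged))   # False
```

Running a check like this over a sample of real payloads gives a quick read on whether the larger savings figures are likely to apply.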

Understanding the format in detail helps make this call. Start with What is TOON? and the TOON specification if you want to read the grammar before committing.

Frequently Asked Questions

Is there a TOON library for Python?

Yes. The official Python implementation is available at github.com/toon-format/toon-python and can be installed with pip. It mirrors the TypeScript reference implementation in functionality, providing encode and decode equivalents for converting JSON data to TOON format and back.

How do I convert JSON to TOON in JavaScript?

Install the official @toon-format/toon package from npm (npm i @toon-format/toon), then import the encode and decode functions. Pass your JavaScript object or array to encode() to get a TOON string. Pass that string to decode() to get your data back. Check the package README for the exact exported function names.

Do I need a library or can I convert online?

No library required for one-off conversions. The free browser-based converter at json2toon.co handles JSON-to-TOON and back entirely in your browser — no data ever leaves your machine. Use the SDK when you need to automate conversions inside your application or CI pipeline.

Is TOON reversible (lossless)?

Yes, for all data types TOON supports. A round-trip encode → decode preserves values faithfully. TOON is a text format designed for readability and LLM consumption, so binary blobs should be base64-encoded before conversion, just as they would be in JSON.

How many tokens does TOON save compared to JSON?

According to official benchmarks, TOON uses 39.9% fewer tokens than JSON on average across 5,016 LLM calls. Savings peak at ~59% for uniform flat tables and time-series data; mixed or highly nested structures save around 22–33%.
