NDJSON vs TOON: Streaming JSON for LLMs Compared

NDJSON puts one JSON object per line for easy streaming and append-only writes, but it still repeats every key on every line. TOON declares fields once in a header and lists bare comma-separated rows, so on uniform arrays TOON is materially smaller — up to 58.8% fewer tokens on flat tables — while keeping similar record-level readability.

What Is NDJSON (JSON Lines / JSONL)?

NDJSON stands for Newline-Delimited JSON. The format is also marketed as JSON Lines and files commonly carry the extensions .ndjson or .jsonl. The rule is simple: each line of the file is a complete, valid JSON value — typically an object. Lines are separated by a single newline character (); there is no enclosing array and no trailing comma.

That simplicity is the format's superpower. You can stream NDJSON record-by-record without buffering the whole payload. You can append a new record with a single write(line + "\n"). Any tool that processes text line by line — grep, awk, wc, Spark, Flink — can handle NDJSON natively. This is why the format dominates:

Application logs — structured log lines emitted by services like Fluentd, Logstash, or Cloud Logging.
BigQuery load jobs — GCP's preferred import format for streaming inserts and table loads.
OpenAI batch API and fine-tuning files — each request or training example is one JSON Lines record.
ML dataset pipelines — Hugging Face datasets, LangChain document loaders, and JSONL-backed vector-store ingestion all rely on the format.
Event sourcing and CDC streams — Debezium, Kafka JSON serializer output, and change-data-capture exports all produce NDJSON-compatible streams.

The format has no official RFC, but the ndjson.org specification and the JSON Lines project at jsonlines.org define the de-facto standard. Parser support is universal: every language ecosystem has at least one mature library, and most cloud data warehouses ingest it directly.

The Key Structural Difference: Keys Everywhere vs. Header Once

Both NDJSON and TOON are text formats that represent arrays of objects. Their divergence is a single design choice: where do the field names live?

In NDJSON, every row carries its own keys. A file with ten thousand user records repeats "id", "name", "email", and "role" ten thousand times each. That repetition is what makes line-level independence possible — each line is self-describing — but it is pure overhead when you are paying per token.

TOON takes the opposite bet. A header line declares the array length and the field list once, e.g. users[3]{id,name,email,role}:. Every subsequent row is bare comma-separated values with no keys, no braces, and no quotes around plain strings. The LLM reads the header to know the schema, then reads rows to get values. One token investment buys schema knowledge for the entire table.

That single design difference is responsible for most of the token gap between the two formats.

Side-by-Side: The Same Three Records

Here is an identical dataset — three user records — written as NDJSON and as TOON. Both are correct representations of the same data.

NDJSON (JSON Lines)

{"id":1,"name":"Alice Chen","email":"alice@example.com","role":"admin"}
{"id":2,"name":"Bob Patel","email":"bob@example.com","role":"editor"}
{"id":3,"name":"Carol Kim","email":"carol@example.com","role":"viewer"}

TOON

users[3]{id,name,email,role}:
  1, Alice Chen, alice@example.com, admin
  2, Bob Patel, bob@example.com, editor
  3, Carol Kim, carol@example.com, viewer

The NDJSON version repeats "id", "name", "email", and "role" three times each, plus all the surrounding JSON syntax. The TOON version states those four field names exactly once. At three records the difference is modest; at three thousand records the NDJSON overhead becomes substantial. According to toonformat.dev/guide/benchmarks — which ran 5,016 LLM calls across 209 question sets and four models — flat uniform tables in TOON used 67,778 tokens versus 164,452 for JSON, a 58.8% reduction. NDJSON and JSON share the same per-row key overhead, so NDJSON savings relative to TOON are in the same ballpark.

You can test any real dataset with the free converter to see the token count for your specific data.

How Much Smaller Is TOON? Token Benchmarks

The benchmark at toonformat.dev/guide/benchmarks uses the GPT-5 o200k_base tokenizer and tests across four production models: Claude Haiku 4.5, Gemini 3 Flash Preview, GPT-5 Nano, and Grok 4.1 Fast. Key figures:

Overall token reduction vs. JSON: 39.9% across all data shapes.
Flat / uniform tables: 58.8% — the shape closest to typical NDJSON workloads.
Time-series (60-day rows): 59.0%.
E-commerce orders (nested): 33.3%.
Mixed structures: 21.9% — TOON's weakest case.
Retrieval accuracy: TOON 76.4% vs JSON 75.0% — essentially a tie, meaning the token savings come with no accuracy penalty at scale.

An independent arXiv paper (arXiv 2603.03306, February 2026) confirms TOON's strong accuracy-per-token ratio for comprehension and retrieval but adds an important caveat: the efficiency is non-linear. TOON pays off only beyond a threshold where per-row syntax savings amortize the upfront prompt overhead needed to describe the format. On small payloads — a handful of records — the format-instruction overhead can cancel the savings. For very large, repetitive datasets the savings compound aggressively.

The practical implication: if you are injecting a 20-row NDJSON file into a prompt, TOON likely wins. If you are injecting three lines of log data, the difference is negligible and NDJSON's universality makes it the simpler choice. See the format comparison guide for a broader breakdown across all common formats.

NDJSON vs TOON: Full Comparison Table

Property	NDJSON / JSON Lines	TOON
Token efficiency (uniform arrays)	Baseline (same overhead as JSON)	Up to 58.8% fewer tokens
Key repetition	Repeated on every line	Declared once in header
Line-by-line streaming	Native — each line is self-contained	Requires full block; header needed for row context
Append writes	Trivial — write one line, done	Requires rewriting the header count
Tooling support	Universal (BigQuery, Spark, OpenAI, grep, jq)	Growing — TypeScript/JS SDK; converter at json2toon.co
LLM prompt fit	Works, but verbose; keys repeated waste context	Designed for prompts; accurate at 76.4% retrieval
Heterogeneous records	Handles naturally — each line is independent	Weaker; table block assumes uniform schema
Nested objects	Full JSON nesting per line	Supported via YAML-style indentation
Primary use case	Logs, exports, ML datasets, API pipelines	LLM prompt serialization, RAG context packing

When to Use NDJSON

NDJSON is the right tool whenever infrastructure interoperability or append-only streaming is the priority:

Log aggregation pipelines. Fluentd, Vector, and Logstash all emit and consume NDJSON natively. Switching to TOON here would break every downstream consumer.
BigQuery and data warehouse loads. GCP, Snowflake, and Databricks all accept JSONL as a first-class import format. There is no TOON option in the upload dialog.
OpenAI fine-tuning and batch files. The API requires valid JSONL with a specific schema per line. TOON cannot substitute here.
Event-driven microservices. When producers and consumers are decoupled and records arrive asynchronously, NDJSON's line independence is genuinely valuable. A crashed producer can resume from the last newline; a TOON block with a partial write is unreadable.
Grep-ability. Ops teams regularly run grep "error" app.log on production systems. NDJSON is transparent to standard UNIX tooling. TOON rows without their header are opaque.

When to Use TOON Instead

TOON wins when token budget is the binding constraint and the data is uniform and self-contained:

Injecting tabular datasets into LLM prompts. Product catalogs, user lists, time-series readings, order histories — any uniform array injected into a context window saves 30–59% tokens with TOON versus the equivalent NDJSON.
RAG context windows. When a retrieval step returns dozens or hundreds of documents and you need to fit them into a single prompt, TOON's header-once compression compounds across every row. See the guide on optimizing RAG pipelines with TOON for implementation details.
Batch inference where you control the format. If you are building the JSONL batch file yourself and the model-readable content inside each request is a large data payload, convert that inner payload to TOON before serializing the outer JSONL wrapper.
Cost-sensitive GPT-4/Claude/Gemini applications. At scale, 40% fewer tokens translates directly to 40% lower inference spend. For comparison numbers and a cost calculator, see the CSV vs TOON breakdown.

The two formats serve different layers of the same pipeline. NDJSON is typically the transport format between services; TOON is the prompt serialization format inside the LLM call. They are not in competition so much as they occupy different stages of the same data flow: raw events arrive as NDJSON, get filtered and shaped, and the result is serialized to TOON before being inserted into the prompt context.

Streaming: Where NDJSON Has a Genuine Advantage

It is worth being precise about what "streaming" means for each format, because the term is used differently in different contexts.

NDJSON streaming means line-level independence: a consumer can process record 1 before record 2 has been written. Each line is fully self-describing. This is how Kafka consumers, Logstash pipelines, and BigQuery streaming inserts work. It requires no lookahead, no buffering of the whole dataset, and no structural knowledge beyond "split on newline, parse each chunk as JSON."

TOON does not support this model. A TOON table block consists of a header line followed by N data rows. The data rows are only interpretable after the header has been read. A partial TOON file — header plus the first half of the rows — is valid and readable, but you cannot process row 50 without having first consumed the header. This is a real limitation for infrastructure streaming scenarios.

What TOON does support is document-level streaming for large files, which is a different concern. The TONL format (the production-grade cousin of TOON) ships with a streaming API capable of processing files over 50 GB in under 100 MB of memory, with indexed query lookups under 0.1 ms. But that is a memory-efficiency story, not an append-streaming story. For the LLM use case — where you assemble a prompt, send it, and read the response — the distinction rarely matters. The full TOON block is assembled once and injected once; there is no incremental appending.

For a deeper comparison of where TOON sits relative to other formats in terms of structure and nesting, see the JSON vs TOON comparison.

A Note on Heterogeneous Records

NDJSON handles heterogeneous records gracefully. Line 1 can have five fields, line 2 can have twelve, and line 3 can omit entirely the fields that the first two had. Each line is its own JSON object with its own schema. This is common in log files where different event types carry different payloads.

TOON's table block assumes a uniform schema. The header declares the fields once; every row must supply values in that column order. Sparse or heterogeneous records are awkward — you either pad missing fields with empty values or fall back to TOON's YAML-style object blocks for the non-uniform portion of the data.

If your data is a mix of uniform tables and irregular objects, a pragmatic approach is to use TOON for the uniform sections and standard key-value TOON syntax (or even JSON) for the irregular ones. The benchmark data from toonformat.dev/guide/benchmarks shows that mixed structures still achieve 21.9% token savings with TOON — meaningful, but less dramatic than the 58.8% on pure flat tables.

The arXiv study also notes that TOON's efficiency is non-linear: the format overhead from the header and format-description tokens in the system prompt is amortized across rows. For very small datasets (under ~10 rows), the overhead can exceed the per-row savings. This matches practical experience and is worth factoring into any profiling you do. See the format comparison overview for a fuller picture including CSV, YAML, and XML.

Frequently Asked Questions

What is NDJSON?

NDJSON (Newline-Delimited JSON), also called JSON Lines or JSONL, stores one self-contained JSON value per line. Each line is valid JSON on its own, making it easy to stream, append, and process with line-by-line tools. It is widely used for logs, BigQuery exports, OpenAI batch files, and ML training datasets.

Is TOON smaller than NDJSON?

Yes, for uniform arrays of objects TOON is materially smaller. NDJSON repeats every key on every line; TOON declares fields once in a header and writes bare comma-separated values per row. According to toonformat.dev/guide/benchmarks, TOON achieves 39.9% fewer tokens overall versus JSON, and up to 58.8% on flat uniform tables — the exact shape NDJSON is most often used for.

Should I use JSON Lines or TOON for LLM prompts?

Use TOON when you are packing a uniform array of records into an LLM prompt and token cost matters. Use NDJSON when you need append-only streaming, line-level grep, or interoperability with data-engineering tooling (BigQuery, Spark, OpenAI fine-tuning). The two use cases rarely overlap.

Can TOON be streamed like NDJSON?

Not in the same line-at-a-time sense. TOON's tabular block requires the header to be read before individual rows make sense, so a TOON table is processed as a whole unit. NDJSON is truly append-friendly: you can write a new line to the end of a file or pipe records through without any structural knowledge.

Does TOON work with OpenAI batch or fine-tuning files?

No. OpenAI's batch and fine-tuning APIs require valid JSONL files where each line is a well-formed JSON object matching a specific schema. TOON is a prompt-serialization format, not an API wire format. Keep JSONL for those pipelines and use TOON only inside the prompt content itself.

Try the free converter Read: JSON vs TOON