XML (eXtensible Markup Language) is a survivor. It outlived SGML, it coexists with JSON, and it still powers the world's biggest banking and healthcare systems. But XML was designed for a different era—an era of "Documents," not "Tokens."

In the era of Generative AI, where we pay for every token of context we feed to a model, XML's verbosity is a liability. TOON offers a modern alternative: it preserves the hierarchical structure that XML does so well, but strips away the "Markup" to leave only the "Meaning."

Documents vs Data: A Philosophical Split

XML's roots are in SGML (Standard Generalized Markup Language), the same parent as HTML. It was designed to mark up text documents.

<note>
  <to>Tove</to>
  <from>Jani</from>
  <body>Don't forget me this weekend!</body>
</note>

Its superpower is "Mixed Content":

<p>This is <b>bold</b> and this is <i>italic</i>.</p>

This is amazing for publishing. It is terrible for data serialization. When you are sending a list of users to an API, you never need mixed content. You need strict Keys and Values.

TOON is for Data. It assumes structure, not prose.

note:
  to: Tove
  from: Jani
  body: "Don't forget me this weekend!"
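
To make the mapping concrete, here is a minimal sketch that turns element-only XML into indented `key: value` lines using Python's standard `xml.etree.ElementTree`. The function name `to_toon_like` and the quoting rule (quote anything that is not a bare word) are assumptions chosen to reproduce the example above; TOON's actual quoting rules may differ. Mixed content is ignored.

```python
import xml.etree.ElementTree as ET

def to_toon_like(element, depth=0):
    """Render an element-only XML tree as indented `key: value` lines."""
    indent = "  " * depth
    children = list(element)
    if not children:
        text = (element.text or "").strip()
        # Hypothetical quoting rule: quote anything that is not a bare word.
        if not text.replace("_", "").isalnum():
            text = '"' + text.replace('"', '\\"') + '"'
        return [f"{indent}{element.tag}: {text}"]
    # An element with children becomes a bare key followed by nested lines.
    lines = [f"{indent}{element.tag}:"]
    for child in children:
        lines.extend(to_toon_like(child, depth + 1))
    return lines

xml_source = """<note>
  <to>Tove</to>
  <from>Jani</from>
  <body>Don't forget me this weekend!</body>
</note>"""

toon_text = "\n".join(to_toon_like(ET.fromstring(xml_source)))
print(toon_text)
```

Every closing tag disappears in the output; indentation alone carries the nesting.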

The "Closing Tag Tax"

The most obvious inefficiency in XML is the closing tag.

If you have a tag `<internationalization_enabled>`, you must also have `</internationalization_enabled>`.

Math:

Tag Name: 28 characters
Brackets + Slash: 5 characters (`<`, `>`, `</`, `>`)
Total Overhead: 61 characters (28 × 2 + 5) to wrap a boolean `true`.

TOON Math:

Key Name: 28 characters
Colon + Space: 2 characters
Total Overhead: 30 characters.

Result: For this field, TOON's structural overhead is about half of XML's (30 characters versus 61), just by removing redundancy. The saving on a whole payload depends on how much of it is markup rather than values, but for tag-heavy data it can come close to doubling what fits in a 128k Context Window.
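
The character counts are easy to get wrong by hand, so this short sketch measures the strings directly. The helper names `xml_overhead` and `toon_overhead` are illustrative, not part of any library.

```python
def xml_overhead(name: str) -> int:
    # Opening tag plus closing tag: the name appears twice,
    # wrapped in "<", ">" and "</", ">".
    return len(f"<{name}>") + len(f"</{name}>")

def toon_overhead(name: str) -> int:
    # The key appears once, followed by a colon and a space.
    return len(f"{name}: ")

name = "internationalization_enabled"
xml_chars = xml_overhead(name)
toon_chars = toon_overhead(name)
ratio = toon_chars / xml_chars

print(f"name={len(name)} xml={xml_chars} toon={toon_chars} ratio={ratio:.2f}")
```

For a name of length n, the overhead is 2n + 5 characters in XML and n + 2 in TOON, so the ratio stays between 0.4 and 0.5 whatever the name length. Characters are only a proxy for tokens, which depend on the model's tokenizer.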

Attributes vs Elements: The Confusion

XML developers have argued for 20 years about how to represent data.

Option A (Attributes):

<user id="123" name="Alice" />

This is concise, but attributes cannot contain nested structures (like a list of addresses).

Option B (Elements):

<user>
  <id>123</id>
  <name>Alice</name>
</user>

This is flexible, but verbose.

The TOON Solution: Unification

TOON abolishes this distinction. Everything is a key-value pair.

user:
  id: 123
  name: Alice

This matches the mental model of modern programming languages (JSON Objects, Python Dicts, Java Maps).
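
As a rough illustration, the sketch below uses Python's standard `xml.etree.ElementTree` to show that both XML styles collapse into the same mapping once attributes and child elements are treated alike. The helper `element_to_dict` is hypothetical and handles only flat records like the ones above.

```python
import xml.etree.ElementTree as ET

def element_to_dict(element):
    """Merge attributes and leaf child elements into one flat mapping."""
    record = dict(element.attrib)      # Option A: attributes
    for child in element:              # Option B: child elements
        record[child.tag] = (child.text or "").strip()
    return record

as_attributes = ET.fromstring('<user id="123" name="Alice" />')
as_elements = ET.fromstring("<user><id>123</id><name>Alice</name></user>")

from_attributes = element_to_dict(as_attributes)
from_elements = element_to_dict(as_elements)

# Both spellings carry the same data once the distinction is dropped.
print(from_attributes == from_elements)
```

All values come back as strings here, because XML itself carries no type information for them.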

Parsing Complexity: DOM vs Stream

XML parsers are legendary for their complexity.

DOM Parsing: Loads the entire tree into memory. Explodes RAM for large files.
SAX Parsing: Event-based (`startElement`, `endElement`). Extremely fast but painfully hard to write code for ("Callback Hell").

TOON is designed for Linear Streaming.

It reads line by line.
Indentation tells it the depth.
It emits objects as they complete.

Writing a TOON parser takes an afternoon. Writing a compliant XML parser takes a year.
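
The line-by-line approach can be sketched in a few lines of Python. This is not a TOON implementation: it assumes two-space indentation and handles only the nested `key: value` subset shown in this article. The name `parse_indented` is made up for illustration, and for brevity it builds one tree rather than emitting objects as they complete.

```python
def parse_indented(lines, indent_width=2):
    """Parse nested `key: value` lines into dicts, one line at a time."""
    root = {}
    stack = [(-1, root)]  # (depth, container) for every branch still open
    for raw in lines:
        if not raw.strip():
            continue  # skip blank lines
        depth = (len(raw) - len(raw.lstrip(" "))) // indent_width
        key, _, value = raw.strip().partition(":")
        value = value.strip()
        # Close every branch that sits at this depth or deeper.
        while stack[-1][0] >= depth:
            stack.pop()
        parent = stack[-1][1]
        if value == "":
            # A bare key opens a new nested object.
            child = {}
            parent[key] = child
            stack.append((depth, child))
        else:
            parent[key] = value.strip('"')
    return root

document = """user:
  id: 123
  name: Alice
"""
parsed = parse_indented(document.splitlines())
print(parsed)
```

The only state is a stack of open branches, so memory grows with nesting depth rather than with the number of lines read.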

Token Economics: The 69% Savings

Let's look at a realistic "RAG Chunk" payload.

Format	Content (50 chunks)	Tokens (GPT-4)	Cost
XML	Verbose Markup	6,500	$0.20
TOON	Tabular Data	2,015	$0.06

The savings are not just money. They are Speed. A 2,015-token payload is less than a third the size of a 6,500-token one, so the model has far less to read before it can answer. For a chatbot, this can be the difference between "Snappy" and "Sluggish."
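
The headline percentage follows directly from the table. The sketch below only redoes that arithmetic with the figures given above. It does not measure anything, and the per-token rate it prints is implied by the table rather than taken from a price list.

```python
# Figures copied from the table above.
xml_tokens, toon_tokens = 6500, 2015
xml_cost, toon_cost = 0.20, 0.06

token_savings = 1 - toon_tokens / xml_tokens     # fraction of tokens removed
cost_savings = 1 - toon_cost / xml_cost          # fraction of cost removed
size_ratio = xml_tokens / toon_tokens            # how many times larger XML is
implied_rate = xml_cost / xml_tokens * 1000      # dollars per 1,000 tokens

print(f"token savings: {token_savings:.0%}")
print(f"cost savings:  {cost_savings:.0%}")
print(f"size ratio:    {size_ratio:.1f}x")
print(f"implied rate:  ${implied_rate:.3f} per 1K tokens")
```

To reproduce the token counts on your own payloads, run both serializations through the tokenizer of the model you actually call, since counts vary between tokenizers.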

Use Cases

When XML Wins

The Document Web.

If you are writing a book (`DocBook`), a technical manual (`DITA`), or a rich text document (XHTML), XML is superior. Mixed content such as `<p>text <b>bold</b> text</p>` is a feature TOON does not attempt to replicate.

When TOON Wins

The Data Web.

If you are building an API, feeding a vector database, or configuring an AI Agent, use TOON. The model doesn't care about angle brackets. It cares about relationships, values, and types.

Conclusion

XML feels like a "heavy" format because it carries the weight of 30 years of history.

TOON feels "light" because it carries only what is necessary for the task at hand: transferring structured data to intelligence.

If you are still sending XML to OpenAI, you are paying a "Legacy Tax." It's time to upgrade.
