TONL Format Specification for Maximum LLM Token Efficiency

TONL (Token-Optimized Notation Language) is a production-ready data platform that combines compact serialization with powerful query, modification, indexing, and streaming capabilities. Designed specifically for LLM token efficiency, TONL can reduce token usage by up to 45% compared to standard JSON while providing a rich API for data access and manipulation. Learn more in our TONL introduction or see how it compares to TOON.

Introduction

TONL goes beyond simple data serialization. Where formats like TOON focus on compact representation alone, TONL layers querying, indexing, schema validation, and optimization strategies on top of the serialized data. Read our introduction to TONL for more details.

Key Features

Token Efficient: 32-45% smaller than JSON in both bytes and tokens
Query API: JSONPath-like queries with filtering, aggregation, and fuzzy matching
Modification API: Full CRUD operations with change tracking
Indexing: Hash, BTree, and compound indexes for fast lookups
Streaming: Handle multi-GB files with minimal memory
Schema Validation: TSL (TONL Schema Language) with 13 built-in constraints
Advanced Optimization: Dictionary encoding, delta encoding, bit packing, and more

Format Syntax

TONL uses a version header and tabular format for arrays of objects, with type hints for validation.

Arrays of Objects (Tabular Format)

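TONL's tabular format suits structured data such as user lists, product catalogs, or log entries. The header line gives the key, the row count in brackets, and the column names with type hints in braces; each indented line that follows is one row of comma-separated values. A value that contains a comma, such as "Bob, Jr.", is wrapped in double quotes.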

#version 1.0
users[3]{id:u32,name:str,role:str}:
  1, Alice, admin
  2, "Bob, Jr.", user
  3, Carol, editor
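
To make the layout concrete, here is a minimal sketch of how an array of flat objects could be rendered in this tabular form. The helpers `formatCell` and `encodeTable` are hypothetical names written for this illustration and are not part of the tonl package; the sketch only quotes values that contain a comma and does not cover escaping of embedded quotes.

```typescript
// Hypothetical helpers for illustration only; not the tonl package API.
type Cell = string | number | boolean;

function formatCell(value: Cell): string {
  const text = String(value);
  // Quote values that contain the delimiter so a row can be split unambiguously.
  // Escaping of embedded double quotes is out of scope for this sketch.
  return text.indexOf(",") >= 0 ? `"${text}"` : text;
}

function encodeTable(
  key: string,
  columns: Array<[field: string, typeHint: string]>,
  rows: Array<Record<string, Cell>>,
): string {
  // Header: key[rowCount]{field:type,...}:
  const header = columns.map(([field, typeHint]) => `${field}:${typeHint}`).join(",");
  // Body: one indented, comma-separated line per object, in column order.
  const lines = rows.map(
    (row) => "  " + columns.map(([field]) => formatCell(row[field] ?? "")).join(", "),
  );
  return [`${key}[${rows.length}]{${header}}:`, ...lines].join("\n");
}

const usersTable = encodeTable(
  "users",
  [["id", "u32"], ["name", "str"], ["role", "str"]],
  [
    { id: 1, name: "Alice", role: "admin" },
    { id: 2, name: "Bob, Jr.", role: "user" },
    { id: 3, name: "Carol", role: "editor" },
  ],
);
console.log(usersTable);
```

Because field names appear once in the header instead of once per row, the saving grows with the number of rows.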

Nested Objects

TONL supports nested structures with clear indentation:

#version 1.0
user{id:u32,name:str,contact:obj,roles:list}:
  id: 1
  name: Alice
  contact{email:str,phone:str}:
    email: alice@example.com
    phone: +123456789
  roles[2]: admin, editor
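
The nested layout can be produced by a short recursive walk. The sketch below is an illustration under stated assumptions, not the tonl package's encoder: `typeHint` and `encodeObject` are hypothetical names, and the type inference (numbers become `u32`, strings `str`, objects `obj`, arrays `list`) only covers the hints that appear in the example above.

```typescript
// Hypothetical recursive encoder for illustration; not the tonl package API.
type Scalar = string | number;
type Value = Scalar | Scalar[] | { [field: string]: Value };

// Assumed mapping from JavaScript values to the type hints used in the example.
function typeHint(value: Value): string {
  if (Array.isArray(value)) return "list";
  if (typeof value === "object") return "obj";
  return typeof value === "number" ? "u32" : "str";
}

function encodeObject(key: string, obj: { [field: string]: Value }, depth = 0): string[] {
  const pad = "  ".repeat(depth);
  const fields = Object.entries(obj);
  const header = fields.map(([field, value]) => `${field}:${typeHint(value)}`).join(",");
  const lines = [`${pad}${key}{${header}}:`];
  for (const [field, value] of fields) {
    if (Array.isArray(value)) {
      // Lists of scalars are written inline with their length.
      lines.push(`${pad}  ${field}[${value.length}]: ${value.join(", ")}`);
    } else if (typeof value === "object") {
      // Nested objects get their own header, one indentation level deeper.
      lines.push(...encodeObject(field, value, depth + 1));
    } else {
      lines.push(`${pad}  ${field}: ${value}`);
    }
  }
  return lines;
}

const nestedUser = encodeObject("user", {
  id: 1,
  name: "Alice",
  contact: { email: "alice@example.com", phone: "+123456789" },
  roles: ["admin", "editor"],
}).join("\n");
console.log(nestedUser);
```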

Example Comparison

Here is a comparison of JSON and TONL:

JSON (245 bytes, 89 tokens):

{
  "users": [
    { "id": 1, "name": "Alice", "role": "admin" },
    { "id": 2, "name": "Bob, Jr.", "role": "user" },
    { "id": 3, "name": "Carol", "role": "editor" }
  ]
}

TONL (158 bytes, 49 tokens; 45% fewer tokens):

#version 1.0
users[3]{id:u32,name:str,role:str}:
  1, Alice, admin
  2, "Bob, Jr.", user
  3, Carol, editor
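
Reading the tabular form back means splitting each row on commas while keeping quoted values such as "Bob, Jr." intact. The following sketch shows one way to do that; `splitRow` and `parseTable` are hypothetical names for this illustration, only the `u32` and `str` hints are interpreted, and lines starting with `#` (such as the version header) are simply skipped. The real decoder in the tonl package may behave differently.

```typescript
// Hypothetical decoder sketch for the tabular layout; not the tonl package API.
function splitRow(line: string): string[] {
  const cells: string[] = [];
  let current = "";
  let inQuotes = false;
  const text = line.trim();
  for (let i = 0; i < text.length; i++) {
    const ch = text.charAt(i);
    if (ch === '"') {
      inQuotes = !inQuotes;            // toggle on opening and closing quotes
    } else if (ch === "," && !inQuotes) {
      cells.push(current.trim());      // an unquoted comma ends the current cell
      current = "";
    } else {
      current += ch;
    }
  }
  cells.push(current.trim());
  return cells;
}

function parseTable(text: string): { key: string; rows: Array<Record<string, string | number>> } {
  const [headerLine = "", ...rowLines] = text
    .split("\n")
    .filter((l) => l.trim() !== "" && l.trim().charAt(0) !== "#");
  // Header shape: key[count]{field:type,...}:
  const match = /^(\w+)\[(\d+)\]\{([^}]*)\}:$/.exec(headerLine.trim());
  if (!match) throw new Error("not a tabular header: " + headerLine);
  const columns = (match[3] ?? "").split(",").map((c) => c.split(":"));
  const rows = rowLines.map((line) => {
    const cells = splitRow(line);
    const row: Record<string, string | number> = {};
    columns.forEach(([field = "", hint = "str"], i) => {
      const cell = cells[i] ?? "";
      row[field] = hint === "u32" ? Number(cell) : cell;
    });
    return row;
  });
  if (rows.length !== Number(match[2])) throw new Error("row count does not match declared length");
  return { key: match[1] ?? "", rows };
}

const decoded = parseTable(
  '#version 1.0\nusers[3]{id:u32,name:str,role:str}:\n  1, Alice, admin\n  2, "Bob, Jr.", user\n  3, Carol, editor',
);
console.log(decoded.rows);
```

The declared row count in the header gives the decoder a cheap integrity check, which the sketch uses to reject truncated input.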

Query API

TONL includes a powerful query API that supports JSONPath-like syntax:

import { fuzzySearch } from 'tonl/query';

// Basic queries (doc is a loaded TONL document holding the users table above)
doc.get('users[0].name');                          // 'Alice'
doc.query('users[*].name');                        // ['Alice', 'Bob, Jr.', 'Carol']
doc.query('users[?(@.role == "admin")]');          // Filter by role

// Aggregation (sum and avg assume each user also has a numeric age field)
doc.count('users[*]');                             // 3
doc.sum('users[*]', 'age');                        // Sum of ages
doc.avg('users[*]', 'age');                        // Average age
doc.groupBy('users[*]', 'role');                   // Group by role

// Fuzzy matching
fuzzySearch('Jon', ['John', 'Jane', 'Bob']);       // Fuzzy match
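
For readers who want to see what these operations compute, the sketch below reproduces the filter, the aggregations, and a fuzzy match on a plain array, without using the tonl package. The `age` values are made up for the example, and the fuzzy match uses Levenshtein edit distance as one common approach; the source does not say which algorithm `fuzzySearch` actually uses. All function names here are hypothetical.

```typescript
// Plain-array equivalents for illustration; not the tonl package implementation.
interface User { id: number; name: string; role: string; age: number }

// The age values are invented for this example.
const sampleUsers: User[] = [
  { id: 1, name: "Alice", role: "admin", age: 34 },
  { id: 2, name: "Bob, Jr.", role: "user", age: 27 },
  { id: 3, name: "Carol", role: "editor", age: 41 },
];

// Equivalent of the filter users[?(@.role == "admin")]
const admins = sampleUsers.filter((u) => u.role === "admin");

// Equivalents of sum and avg over the age field
const totalAge = sampleUsers.reduce((sum, u) => sum + u.age, 0);
const averageAge = totalAge / sampleUsers.length;

// Equivalent of groupBy on role
function groupBy<T>(items: T[], keyOf: (item: T) => string): Record<string, T[]> {
  const groups: Record<string, T[]> = {};
  for (const item of items) {
    const k = keyOf(item);
    const bucket = groups[k] ?? [];
    bucket.push(item);
    groups[k] = bucket;
  }
  return groups;
}
const byRole = groupBy(sampleUsers, (u) => u.role);

// Levenshtein distance: minimum number of single-character edits between a and b.
function editDistance(a: string, b: string): number {
  let prev: number[] = [];
  for (let j = 0; j <= b.length; j++) prev.push(j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a.charAt(i - 1) === b.charAt(j - 1) ? 0 : 1;
      curr.push(Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost));
    }
    prev = curr;
  }
  return prev[b.length];
}

// Return the candidate with the smallest edit distance to the query.
function closestMatch(query: string, candidates: string[]): string | undefined {
  let best: string | undefined;
  let bestDistance = Infinity;
  for (const candidate of candidates) {
    const d = editDistance(query, candidate);
    if (d < bestDistance) {
      best = candidate;
      bestDistance = d;
    }
  }
  return best;
}

console.log(admins.length, totalAge, averageAge, Object.keys(byRole));
console.log(closestMatch("Jon", ["John", "Jane", "Bob"]));
```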

Schema Validation

TONL includes a powerful schema language (TSL) with 13 built-in constraints:

@schema v1
@strict true

User: obj
  id: u32 required
  username: str required min:3 max:20 pattern:^[a-zA-Z0-9_]+$
  email: str required pattern:email lowercase:true
  age: u32? min:13 max:150
  roles: list<str> required min:1 unique:true

users: list<User> required min:1
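
The sketch below illustrates how a few of these constraints could be checked in code. It is not the TSL validator: `FieldRule` and `validateField` are hypothetical, and the reading of `min`/`max` as string length for `str` fields and as value bounds for `u32` fields is an assumption based on the schema above.

```typescript
// Hypothetical constraint checker for illustration; not the TSL implementation.
interface FieldRule {
  type: "str" | "u32";
  required?: boolean;
  min?: number;      // assumed: minimum length for str, minimum value for u32
  max?: number;      // assumed: maximum length for str, maximum value for u32
  pattern?: RegExp;  // only applied to str fields
}

function validateField(field: string, value: unknown, rule: FieldRule): string[] {
  if (value === undefined || value === null) {
    return rule.required ? [`${field}: required`] : [];
  }
  const errors: string[] = [];
  if (rule.type === "str") {
    if (typeof value !== "string") return [`${field}: expected str`];
    if (rule.min !== undefined && value.length < rule.min) errors.push(`${field}: shorter than ${rule.min}`);
    if (rule.max !== undefined && value.length > rule.max) errors.push(`${field}: longer than ${rule.max}`);
    if (rule.pattern && !rule.pattern.test(value)) errors.push(`${field}: does not match pattern`);
  } else {
    // u32: a non-negative integer that fits in 32 bits
    if (typeof value !== "number" || !Number.isInteger(value) || value < 0 || value > 0xffffffff) {
      return [`${field}: expected u32`];
    }
    if (rule.min !== undefined && value < rule.min) errors.push(`${field}: below ${rule.min}`);
    if (rule.max !== undefined && value > rule.max) errors.push(`${field}: above ${rule.max}`);
  }
  return errors;
}

// Rules mirroring the username and age lines of the User schema above.
const usernameRule: FieldRule = { type: "str", required: true, min: 3, max: 20, pattern: /^[a-zA-Z0-9_]+$/ };
const ageRule: FieldRule = { type: "u32", min: 13, max: 150 };

console.log(validateField("username", "al", usernameRule));
console.log(validateField("age", 12, ageRule));
```

Returning a list of messages rather than throwing on the first failure lets a caller report every violated constraint in one pass.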

Advanced Optimization

TONL includes multiple optimization strategies that can achieve up to 60% additional compression:

Dictionary Encoding: String deduplication with reference encoding (30-50% savings)
Delta Encoding: Sequential value compression (40-60% savings)
Bit Packing: Boolean and small integer bit-level compression (87.5% savings)
Run-Length Encoding: Repetitive value compression (50-80% savings)
Adaptive Optimizer: Automatically selects the best strategy
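
Three of these strategies are simple enough to sketch in a few lines each. The functions below are generic textbook versions written for illustration, with hypothetical names; they are not TONL's implementation and say nothing about its on-disk encoding. The bit-packing case shows where a figure like 87.5% can come from: eight booleans stored one per byte take 8 bytes, while packed they take 1.

```typescript
// Generic sketches of the techniques named above; not TONL's implementation.

// Delta encoding: keep the first value, then store each difference from the previous one.
function deltaEncode(values: number[]): number[] {
  return values.map((v, i) => (i === 0 ? v : v - values[i - 1]));
}

function deltaDecode(deltas: number[]): number[] {
  let running = 0;
  return deltas.map((d) => (running += d));
}

// Run-length encoding: collapse consecutive repeats into [value, count] pairs.
function runLengthEncode<T>(values: T[]): Array<[T, number]> {
  const runs: Array<[T, number]> = [];
  for (const v of values) {
    const last = runs[runs.length - 1];
    if (last !== undefined && last[0] === v) last[1] += 1;
    else runs.push([v, 1]);
  }
  return runs;
}

// Bit packing: store eight booleans per byte, least significant bit first.
function packBits(flags: boolean[]): Uint8Array {
  const bytes = new Uint8Array(Math.ceil(flags.length / 8));
  flags.forEach((flag, i) => {
    if (flag) bytes[i >> 3] |= 1 << (i & 7);
  });
  return bytes;
}

const timestamps = [1000, 1001, 1003, 1006];
console.log(deltaEncode(timestamps));
console.log(runLengthEncode(["ok", "ok", "ok", "error"]));
console.log(packBits([true, false, true, true, false, false, false, false]));
```

Delta encoding pays off when neighbouring values are close together (timestamps, sequential ids), because the small differences need fewer characters or bits than the full values.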

Use Cases

LLM Prompts: Reducing token costs by up to 45% compared to JSON. See our cost optimization guide.
API Responses: Efficient data transmission with schema validation
Data Pipelines: Stream processing for large datasets
Configuration Files: Human-readable configs with validation
Log Aggregation: Compact structured logs

Performance

Based on typical e-commerce product catalog data:

Metric           JSON   TONL     Improvement
Size (bytes)     245    158      36% smaller
Tokens (GPT-5)   89     49       45% fewer
Encoding Speed   1.0x   12-15x   12-15x faster
Decoding Speed   1.0x   10x      10x faster
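
The percentages in the Improvement column for size and tokens follow directly from the two measured values. As a quick check, using a small hypothetical helper `reductionPercent`:

```typescript
// Relative reduction from a baseline value to a smaller value, in percent.
function reductionPercent(before: number, after: number): number {
  return ((before - after) / before) * 100;
}

const sizeReduction = reductionPercent(245, 158);   // bytes: JSON vs TONL
const tokenReduction = reductionPercent(89, 49);    // tokens: JSON vs TONL

console.log(sizeReduction.toFixed(1));   // "35.5", reported as 36% in the table
console.log(tokenReduction.toFixed(1));  // "44.9", reported as 45% in the table
```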

Getting Started

You can start using TONL right now with our online converter: paste your JSON data and select TONL as the target format. The TypeScript/JavaScript installation guide and the TONL introduction cover setup in more detail.

For programmatic use, install the TONL package:

npm install tonl

import { encodeTONL, decodeTONL } from 'tonl';

const data = { users: [{ id: 1, name: "Alice" }] };
const tonlText = encodeTONL(data);
const restored = decodeTONL(tonlText);

TONL vs TOON

Both TONL and TOON are designed for token efficiency, but they serve different use cases. For a detailed comparison, see our complete TOON vs TONL guide.

Use TOON when you need a simple, lightweight format for basic data conversion.
Use TONL when you need advanced features like querying, indexing, schema validation, or when working with large datasets.