TONL Format Specification for Maximum LLM Token Efficiency
TONL (Token-Optimized Notation Language) is a production-ready data platform that combines compact serialization with powerful query, modification, indexing, and streaming capabilities. Designed specifically for LLM token efficiency, TONL can reduce token usage by up to 45% compared to standard JSON while providing a rich API for data access and manipulation. Learn more in our TONL introduction or see how it compares to TOON.
Introduction
TONL goes beyond simple data serialization. It provides a complete data platform with advanced features such as querying, indexing, schema validation, and optimization strategies. Where formats like TOON focus on compact representation, TONL adds data manipulation capabilities on top.
Key Features
- Token Efficient: 32-45% smaller than JSON in both bytes and tokens
- Query API: JSONPath-like queries with filtering, aggregation, and fuzzy matching
- Modification API: Full CRUD operations with change tracking
- Indexing: Hash, BTree, and compound indexes for fast lookups
- Streaming: Handle multi-GB files with minimal memory
- Schema Validation: TSL (TONL Schema Language) with 13 built-in constraints
- Advanced Optimization: Dictionary encoding, delta encoding, bit packing, and more
Format Syntax
TONL uses a version header and tabular format for arrays of objects, with type hints for validation.
Arrays of Objects (Tabular Format)
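TONL's tabular format is well suited to structured data such as user lists, product catalogs, or log entries. Field names and type hints are declared once in the header, and each row lists only the values.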
#version 1.0
users[3]{id:u32,name:str,role:str}:
  1, Alice, admin
  2, "Bob, Jr.", user
  3, Carol, editor
Nested Objects
TONL supports nested structures with clear indentation:
#version 1.0
user{id:u32,name:str,contact:obj,roles:list}:
  id: 1
  name: Alice
  contact{email:str,phone:str}:
    email: alice@example.com
    phone: +123456789
  roles[2]: admin, editor
Example Comparison
Here is a comparison of JSON and TONL:
JSON (245 bytes, 89 tokens):
{
  "users": [
    { "id": 1, "name": "Alice", "role": "admin" },
    { "id": 2, "name": "Bob, Jr.", "role": "user" },
    { "id": 3, "name": "Carol", "role": "editor" }
  ]
}
TONL (158 bytes, 49 tokens - 45% reduction):
#version 1.0
users[3]{id:u32,name:str,role:str}:
  1, Alice, admin
  2, "Bob, Jr.", user
  3, Carol, editor
Query API
TONL includes a powerful query API that supports JSONPath-like syntax:
// Assumes `doc` is an already-loaded TONL document holding the users data shown above.
import { fuzzySearch } from 'tonl/query';
// Basic queries
doc.get('users[0].name'); // 'Alice'
doc.query('users[*].name'); // ['Alice', 'Bob, Jr.', 'Carol']
doc.query('users[?(@.role == "admin")]'); // Filter by role
// Aggregation
doc.count('users[*]'); // 3
doc.sum('users[*]', 'age'); // Sum of ages (assumes each user has a numeric age field)
doc.avg('users[*]', 'age'); // Average age
doc.groupBy('users[*]', 'role'); // Group by role
// Fuzzy matching
fuzzySearch('Jon', ['John', 'Jane', 'Bob']); // Fuzzy match
Schema Validation
TONL includes a powerful schema language (TSL) with 13 built-in constraints:
@schema v1
@strict true
User: obj
  id: u32 required
  username: str required min:3 max:20 pattern:^[a-zA-Z0-9_]+$
  email: str required pattern:email lowercase:true
  age: u32? min:13 max:150
  roles: list<str> required min:1 unique:true
users: list<User> required min:1
Advanced Optimization
TONL includes multiple optimization strategies that can achieve up to 60% additional compression:
- Dictionary Encoding: String deduplication with reference encoding (30-50% savings)
- Delta Encoding: Sequential value compression (40-60% savings)
- Bit Packing: Boolean and small integer bit-level compression (87.5% savings for booleans stored as one bit instead of one byte)
- Run-Length Encoding: Repetitive value compression (50-80% savings)
- Adaptive Optimizer: Automatically selects the best strategy
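To make three of the strategies above more concrete, here is a minimal TypeScript sketch of delta encoding, run-length encoding, and bit packing. The function names (`deltaEncode`, `deltaDecode`, `runLengthEncode`, `packBooleans`) are hypothetical helpers written for this illustration; they are not part of the TONL API and say nothing about how TONL implements these strategies internally.

```typescript
// Hypothetical helpers for illustration only; not part of the TONL API.

// Delta encoding: keep the first value, then store successive differences.
function deltaEncode(values: number[]): number[] {
  const out: number[] = [];
  let prev = 0;
  for (const v of values) {
    out.push(v - prev);
    prev = v;
  }
  return out;
}

// Inverse of deltaEncode: a running sum restores the original values.
function deltaDecode(deltas: number[]): number[] {
  const out: number[] = [];
  let running = 0;
  for (const d of deltas) {
    running += d;
    out.push(running);
  }
  return out;
}

// Run-length encoding: collapse runs of equal values into [value, count] pairs.
function runLengthEncode<T>(values: T[]): Array<[T, number]> {
  const runs: Array<[T, number]> = [];
  for (const v of values) {
    const last = runs[runs.length - 1];
    if (last !== undefined && last[0] === v) {
      last[1] += 1;
    } else {
      runs.push([v, 1]);
    }
  }
  return runs;
}

// Bit packing: store eight booleans per byte instead of one byte each.
function packBooleans(flags: boolean[]): Uint8Array {
  const bytes: number[] = [];
  for (let i = 0; i < flags.length; i += 8) {
    let byte = 0;
    for (let bit = 0; bit < 8 && i + bit < flags.length; bit++) {
      if (flags[i + bit]) byte |= 1 << bit;
    }
    bytes.push(byte);
  }
  return Uint8Array.from(bytes);
}
```

For example, `deltaEncode([1000, 1001, 1003, 1006])` yields `[1000, 1, 2, 3]`, which is cheaper to write out than the full values, and `packBooleans` turns eight flags into a single byte, which is where the 87.5% figure (1 - 1/8) comes from.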
Use Cases
- LLM Prompts: Reducing token costs by up to 45% compared to JSON. See our cost optimization guide.
- API Responses: Efficient data transmission with schema validation
- Data Pipelines: Stream processing for large datasets
- Configuration Files: Human-readable configs with validation
- Log Aggregation: Compact structured logs
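For the validation-oriented use cases above, the sketch below shows roughly what checking a record against constraints like those in the TSL example (required, min, max, pattern) could look like. `FieldRule`, `Rules`, `validate`, and `userRules` are hypothetical names invented for this illustration, and the semantics (min and max apply to string length or numeric value) are an assumption; the actual TSL validator may behave differently.

```typescript
// Hypothetical validator for illustration only; not the TSL implementation.
interface FieldRule {
  required?: boolean;
  min?: number; // assumed: minimum string length or minimum numeric value
  max?: number; // assumed: maximum string length or maximum numeric value
  pattern?: RegExp;
}

type Rules = Record<string, FieldRule>;

// Returns a list of human-readable errors; an empty list means the record passed.
function validate(record: Record<string, unknown>, rules: Rules): string[] {
  const errors: string[] = [];
  for (const [field, rule] of Object.entries(rules)) {
    const value = record[field];
    if (value === undefined || value === null) {
      if (rule.required) errors.push(`${field}: required`);
      continue;
    }
    // For strings, min/max constrain the length; for numbers, the value itself.
    let size: number | undefined;
    if (typeof value === "string") size = value.length;
    else if (typeof value === "number") size = value;
    if (size !== undefined) {
      if (rule.min !== undefined && size < rule.min) errors.push(`${field}: below min ${rule.min}`);
      if (rule.max !== undefined && size > rule.max) errors.push(`${field}: above max ${rule.max}`);
    }
    if (rule.pattern && typeof value === "string" && !rule.pattern.test(value)) {
      errors.push(`${field}: pattern mismatch`);
    }
  }
  return errors;
}

// Rules mirroring two fields of the User schema shown earlier.
const userRules: Rules = {
  username: { required: true, min: 3, max: 20, pattern: /^[a-zA-Z0-9_]+$/ },
  age: { min: 13, max: 150 },
};
```

Calling `validate({ username: "al", age: 12 }, userRules)` reports both the short username and the out-of-range age, while a record with a valid username and no age passes because `age` is optional.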
Performance
Based on typical e-commerce product catalog data:
| Metric | JSON | TONL | Improvement |
|---|---|---|---|
| Size (bytes) | 245 | 158 | 36% smaller |
| Tokens (GPT-5) | 89 | 49 | 45% fewer |
| Encoding Speed | 1.0x | 12-15x | 12-15x faster |
| Decoding Speed | 1.0x | 10x | 10x faster |
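The size and token percentages in the table follow directly from the raw figures. As a quick check, this short TypeScript sketch recomputes them; `reductionPercent` is a helper name chosen for this example, not part of TONL.

```typescript
// Percentage reduction relative to a baseline, rounded to the nearest whole percent.
function reductionPercent(baseline: number, reduced: number): number {
  return Math.round(((baseline - reduced) / baseline) * 100);
}

const byteSavings = reductionPercent(245, 158); // 36
const tokenSavings = reductionPercent(89, 49); // 45
```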
Getting Started
You can try TONL right away with our online converter: paste your JSON data and select TONL as the target format. For background, see our TypeScript/JavaScript installation guide or the TONL introduction.
For programmatic use, install the TONL package:
npm install tonl
import { encodeTONL, decodeTONL } from 'tonl';
const data = { users: [{ id: 1, name: "Alice" }] };
const tonlText = encodeTONL(data);
const restored = decodeTONL(tonlText);
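To give a feel for what the encoder produces, the sketch below builds the tabular form from the Format Syntax section by hand. `encodeTable` and `quoteIfNeeded` are hypothetical helpers written for this illustration, not part of the tonl package. The two-space row indentation and the comma-only quoting rule are assumptions, and the real `encodeTONL` covers many more cases (nesting, type inference, escaping).

```typescript
// Hypothetical helpers for illustration only; the real encodeTONL may differ.
type Cell = string | number | boolean;

// Assumed rule: wrap a value in double quotes only when it contains the
// comma delimiter, as with "Bob, Jr." in the example above.
function quoteIfNeeded(value: Cell): string {
  const text = String(value);
  return text.includes(",") ? `"${text}"` : text;
}

// Builds a version header, a table header with type hints, and one row per object.
function encodeTable(
  name: string,
  columns: Array<[string, string]>, // [field name, type hint] pairs
  rows: Array<Record<string, Cell>>,
): string {
  const hints = columns.map(([field, type]) => `${field}:${type}`).join(",");
  const header = `${name}[${rows.length}]{${hints}}:`;
  const lines = rows.map(
    (row) => "  " + columns.map(([field]) => quoteIfNeeded(row[field] ?? "")).join(", "),
  );
  return ["#version 1.0", header, ...lines].join("\n");
}
```

Passing the three users from the comparison example with the columns `id:u32`, `name:str`, and `role:str` reproduces the TONL text shown there, with the field names written once in the header instead of once per row.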
TONL vs TOON
Both TONL and TOON are designed for token efficiency, but they serve different use cases. For a detailed comparison, see our complete TOON vs TONL guide.
- Use TOON when you need a simple, lightweight format for basic data conversion.
- Use TONL when you need advanced features like querying, indexing, schema validation, or when working with large datasets.