Anthropic just released upgraded versions of Claude 3.5 Sonnet and the new Claude 3.5 Haiku, pushing the boundaries of AI capabilities with significant improvements in coding, agentic tasks, and tool use. But here's the thing: more powerful models process more tokens, which means higher costs. In this post, we'll explore how TOON and TONL can help you maximize efficiency when working with these cutting-edge models.

Claude 3.5: A Leap Forward in AI Capabilities

The updated Claude 3.5 Sonnet represents a major advancement in AI-powered coding and agentic workflows. On the SWE-bench Verified benchmark, it improved from 33.4% to an impressive 49.0%—outperforming all publicly available models, including reasoning models like OpenAI's o1-preview.

For tool use tasks, Claude 3.5 Sonnet now scores 69.2% on TAU-bench in the retail domain (up from 62.6%) and 46.0% in the more challenging airline domain (up from 36.0%). This makes it exceptionally well-suited for complex, multi-step automation workflows.

Claude 3.5 Haiku: Compact Yet Powerful

Claude 3.5 Haiku is the next generation of Anthropic's fastest model. Despite its smaller size, it surpasses Claude 3 Opus on many intelligence benchmarks while maintaining low latency. On SWE-bench Verified, Haiku scores 40.6%—outperforming many agents built with state-of-the-art models, including the original Claude 3.5 Sonnet and GPT-4o.

This makes Haiku ideal for user-facing products, specialized sub-agent tasks, and processing large volumes of structured data like purchase history, pricing, or inventory records.

The Token Efficiency Challenge

As Claude's capabilities expand, developers are feeding it more complex data—product catalogs, database records, API responses, configuration files. The problem? Most of this data is sent as JSON, which is notoriously token-inefficient.

With Claude 3.5 Haiku priced at $0.80 per million input tokens and $4.00 per million output tokens, and Sonnet at even higher rates, those extra JSON tokens add up fast. If you're building agentic workflows that make hundreds or thousands of API calls per session, token bloat can quickly inflate your costs.

The JSON "Syntax Tax"

Consider this common scenario: you're building a coding assistant that uses Claude 3.5 Sonnet with tool calling. You need to send it a list of 100 function signatures with their parameters. Here's what that looks like in JSON:

Standard JSON (More Tokens)

[
  {
    "name": "createUser",
    "params": ["name", "email"],
    "returns": "User"
  },
  {
    "name": "deleteUser",
    "params": ["userId"],
    "returns": "boolean"
  },
  {
    "name": "updateUser",
    "params": ["userId", "data"],
    "returns": "User"
  }
  // ... 97 more functions
]

TOON (Fewer Tokens)

[100]{name,params,returns}:
  createUser,[name,email],User
  deleteUser,[userId],boolean
  updateUser,[userId,data],User
  # ... 97 more functions

In the JSON version, you're repeating "name":, "params":, and "returns": 100 times. You're also paying for all those curly braces, quotes, and commas. With TOON, you declare the schema once in a header and then provide just the values—40-60% fewer tokens for array-heavy data.

TOON for Claude Tool Calling

Claude 3.5's improved tool use capabilities make it perfect for agentic workflows—but these workflows often require passing large amounts of structured data back and forth. This is where TOON shines.

Example: E-commerce Product Search Agent

Let's say you're building an e-commerce assistant that helps users find products. When a user asks "Show me wireless headphones under $100," your agent needs to:

Call a searchProducts tool
Receive a list of matching products
Process and present the results to the user

If your tool returns 50 products in JSON format, you might be sending 10,000+ tokens just for the product data. With TOON, you can cut that to 4,000-6,000 tokens—same semantic information, lower cost.

Tool Response in TOON

[50]{id,name,price,rating,inStock}:
  p1,Sony WH-1000XM5,299.99,4.8,true
  p2,Bose QuietComfort 45,279.95,4.7,true
  p3,Apple AirPods Max,549.00,4.6,false
  p4,Sennheiser Momentum 4,349.95,4.9,true
  # ... 46 more products

TONL for Advanced Workflows

For production applications that need schema validation, query capabilities, and streaming support, TONL (Token-Optimized Notation Language) takes efficiency even further.

TONL is an AI-native data platform that provides:

Schema validation to ensure data integrity
Query API for filtering and transforming data before sending to Claude
Streaming support for handling large datasets efficiently
45% token reduction vs JSON on average

When to Use TONL

TONL is particularly valuable when you're working with Claude 3.5 for:

Multi-agent systems where agents need to exchange structured data
RAG pipelines that retrieve and process large context chunks
Database-backed applications where you're frequently querying and passing records to Claude
Real-time data processing with streaming requirements

Real-World Cost Savings

Let's do the math on a real scenario. Suppose you're building a coding assistant powered by Claude 3.5 Sonnet that helps developers understand and refactor codebases:

Scenario: Daily Coding Assistant Usage

• 1,000 users making requests daily
• Average 20 requests per user per day
• Each request includes 5,000 tokens of structured data (JSON)
• Claude 3.5 Sonnet input price: $3.00/MTok

With JSON:

1,000 users × 20 requests × 5,000 tokens = 100M tokens/day

100M tokens × $3.00/MTok = $300/day

With TOON (50% reduction):

1,000 users × 20 requests × 2,500 tokens = 50M tokens/day

50M tokens × $3.00/MTok = $150/day

Monthly Savings: $4,500

Annual Savings: $54,750

Getting Started with TOON and Claude 3.5

Ready to optimize your Claude 3.5 integration? Here are the steps to get started:

Audit your data payloads: Identify where you're sending large JSON structures to Claude. Look for arrays of objects, repeated structures, and deeply nested data.
Convert to TOON: Use our JSON to TOON converter to see immediate token savings. The converter shows you the exact token count difference.
Update your prompts: Claude 3.5 models understand TOON natively. Simply include TOON-formatted data in your prompts—no special instructions needed.
Consider TONL for production: If you need schema validation and advanced features, explore TONL for enterprise-grade efficiency.

The Future of Efficient AI

As AI models like Claude 3.5 become more capable, they'll handle increasingly complex agentic workflows. The ability to efficiently encode and transmit structured data will become a competitive advantage—not just in terms of cost, but also in terms of latency and environmental sustainability.

TOON and TONL represent a fundamental shift in how we think about data interchange in the AI era. Instead of treating LLMs like traditional APIs that parse binary formats, we acknowledge their tokenized nature and optimize accordingly.

"The most efficient code is the code you don't write. The most efficient tokens are the tokens you don't send."

Conclusion

Claude 3.5 Sonnet and Haiku are remarkable achievements in AI, offering state-of-the-art performance in coding, tool use, and agentic tasks. But to truly harness their power at scale, you need to think about token efficiency.

By adopting TOON for your structured data and TONL for production workflows, you can reduce your API costs by 40-60% while maintaining—or even improving—the quality of your AI interactions. It's a win-win: lower costs, faster responses, and more sustainable AI applications.

Try TOON Converter Explore TONL

Claude 3.5 and TOON: Maximizing Efficiency in AI Agentic Tasks