Claude 3.5 and TOON: Maximizing Efficiency in AI Agentic Tasks
Explore how Claude 3.5 Sonnet and Haiku's new capabilities combine with TOON/TONL for optimal token efficiency, reducing API costs by 40-60% in agentic workflows.
Anthropic just released upgraded versions of Claude 3.5 Sonnet and the new Claude 3.5 Haiku, pushing the boundaries of AI capabilities with significant improvements in coding, agentic tasks, and tool use. But here's the thing: more powerful models process more tokens, which means higher costs. In this post, we'll explore how TOON and TONL can help you maximize efficiency when working with these cutting-edge models.
Claude 3.5: A Leap Forward in AI Capabilities
The updated Claude 3.5 Sonnet represents a major advancement in AI-powered coding and agentic workflows. On the SWE-bench Verified benchmark, it improved from 33.4% to an impressive 49.0%—outperforming all publicly available models, including reasoning models like OpenAI's o1-preview.
For tool use tasks, Claude 3.5 Sonnet now scores 69.2% on TAU-bench in the retail domain (up from 62.6%) and 46.0% in the more challenging airline domain (up from 36.0%). This makes it exceptionally well-suited for complex, multi-step automation workflows.
Claude 3.5 Haiku: Compact Yet Powerful
Claude 3.5 Haiku is the next generation of Anthropic's fastest model. Despite its smaller size, it surpasses Claude 3 Opus on many intelligence benchmarks while maintaining low latency. On SWE-bench Verified, Haiku scores 40.6%—outperforming many agents built with state-of-the-art models, including the original Claude 3.5 Sonnet and GPT-4o.
This makes Haiku ideal for user-facing products, specialized sub-agent tasks, and processing large volumes of structured data like purchase history, pricing, or inventory records.
The Token Efficiency Challenge
As Claude's capabilities expand, developers are feeding it more complex data—product catalogs, database records, API responses, configuration files. The problem? Most of this data is sent as JSON, which is notoriously token-inefficient.
With Claude 3.5 Haiku priced at $0.80 per million input tokens and $4.00 per million output tokens, and Sonnet at even higher rates, those extra JSON tokens add up fast. If you're building agentic workflows that make hundreds or thousands of API calls per session, token bloat can quickly inflate your costs.
The JSON "Syntax Tax"
Consider this common scenario: you're building a coding assistant that uses Claude 3.5 Sonnet with tool calling. You need to send it a list of 100 function signatures with their parameters. Here's what that looks like in JSON:
[
{
"name": "createUser",
"params": ["name", "email"],
"returns": "User"
},
{
"name": "deleteUser",
"params": ["userId"],
"returns": "boolean"
},
{
"name": "updateUser",
"params": ["userId", "data"],
"returns": "User"
}
// ... 97 more functions
][100]{name,params,returns}:
createUser,[name,email],User
deleteUser,[userId],boolean
updateUser,[userId,data],User
# ... 97 more functionsIn the JSON version, you're repeating "name":, "params":, and "returns": 100 times. You're also paying for all those curly braces, quotes, and commas. With TOON, you declare the schema once in a header and then provide just the values—40-60% fewer tokens for array-heavy data.
TOON for Claude Tool Calling
Claude 3.5's improved tool use capabilities make it perfect for agentic workflows—but these workflows often require passing large amounts of structured data back and forth. This is where TOON shines.
Example: E-commerce Product Search Agent
Let's say you're building an e-commerce assistant that helps users find products. When a user asks "Show me wireless headphones under $100," your agent needs to:
- Call a
searchProductstool - Receive a list of matching products
- Process and present the results to the user
If your tool returns 50 products in JSON format, you might be sending 10,000+ tokens just for the product data. With TOON, you can cut that to 4,000-6,000 tokens—same semantic information, lower cost.
Tool Response in TOON
[50]{id,name,price,rating,inStock}:
p1,Sony WH-1000XM5,299.99,4.8,true
p2,Bose QuietComfort 45,279.95,4.7,true
p3,Apple AirPods Max,549.00,4.6,false
p4,Sennheiser Momentum 4,349.95,4.9,true
# ... 46 more productsTONL for Advanced Workflows
For production applications that need schema validation, query capabilities, and streaming support, TONL (Token-Optimized Notation Language) takes efficiency even further.
TONL is an AI-native data platform that provides:
- Schema validation to ensure data integrity
- Query API for filtering and transforming data before sending to Claude
- Streaming support for handling large datasets efficiently
- 45% token reduction vs JSON on average
When to Use TONL
TONL is particularly valuable when you're working with Claude 3.5 for:
- Multi-agent systems where agents need to exchange structured data
- RAG pipelines that retrieve and process large context chunks
- Database-backed applications where you're frequently querying and passing records to Claude
- Real-time data processing with streaming requirements
Real-World Cost Savings
Let's do the math on a real scenario. Suppose you're building a coding assistant powered by Claude 3.5 Sonnet that helps developers understand and refactor codebases:
Scenario: Daily Coding Assistant Usage
- • 1,000 users making requests daily
- • Average 20 requests per user per day
- • Each request includes 5,000 tokens of structured data (JSON)
- • Claude 3.5 Sonnet input price: $3.00/MTok
With JSON:
1,000 users × 20 requests × 5,000 tokens = 100M tokens/day
100M tokens × $3.00/MTok = $300/day
With TOON (50% reduction):
1,000 users × 20 requests × 2,500 tokens = 50M tokens/day
50M tokens × $3.00/MTok = $150/day
Monthly Savings: $4,500
Annual Savings: $54,750
Getting Started with TOON and Claude 3.5
Ready to optimize your Claude 3.5 integration? Here are the steps to get started:
- Audit your data payloads: Identify where you're sending large JSON structures to Claude. Look for arrays of objects, repeated structures, and deeply nested data.
- Convert to TOON: Use our JSON to TOON converter to see immediate token savings. The converter shows you the exact token count difference.
- Update your prompts: Claude 3.5 models understand TOON natively. Simply include TOON-formatted data in your prompts—no special instructions needed.
- Consider TONL for production: If you need schema validation and advanced features, explore TONL for enterprise-grade efficiency.
The Future of Efficient AI
As AI models like Claude 3.5 become more capable, they'll handle increasingly complex agentic workflows. The ability to efficiently encode and transmit structured data will become a competitive advantage—not just in terms of cost, but also in terms of latency and environmental sustainability.
TOON and TONL represent a fundamental shift in how we think about data interchange in the AI era. Instead of treating LLMs like traditional APIs that parse binary formats, we acknowledge their tokenized nature and optimize accordingly.
"The most efficient code is the code you don't write. The most efficient tokens are the tokens you don't send."
Conclusion
Claude 3.5 Sonnet and Haiku are remarkable achievements in AI, offering state-of-the-art performance in coding, tool use, and agentic tasks. But to truly harness their power at scale, you need to think about token efficiency.
By adopting TOON for your structured data and TONL for production workflows, you can reduce your API costs by 40-60% while maintaining—or even improving—the quality of your AI interactions. It's a win-win: lower costs, faster responses, and more sustainable AI applications.
Recommended Reading
Stop Using JSON for LLMs: The Case for Token Efficiency
Why JSON is costing you money and performance in AI applications, and how switching to TOON can reduce token usage by up to 60%.
TOON vs TONL: A Complete Comparison of Token-Optimized Data Formats
Compare TOON vs TONL for LLMs: key differences, performance benchmarks, and when to use each format to optimize AI API costs.
Optimize OpenAI and Claude API Costs with TOON
Practical guide to reducing OpenAI GPT and Anthropic Claude API costs by 30-60% using TOON format. Includes code examples and implementation strategies.