Building a Cost-Efficient Chatbot with Next.js & Vercel AI SDK
A step-by-step tutorial on building a chatbot using Next.js, Vercel AI SDK, and TOON for efficient tool calling and data passing.
Building an AI chatbot today is easier than ever thanks to tools like the Vercel AI SDK and Next.js. However, once you move past the "Hello World" phase and start retrieving data, calling tools, or handling complex context, you face a new problem: Cost. Today, we're going to build a chatbot that is not just functional, but fiscally responsible.
The Goal
We will build a simple chatbot that can "search" a mock product catalog. Crucially, we will use TOON (Token-Oriented Object Notation) for the data passed from tool execution back into the LLM context, so that even as our catalog grows, our token usage stays minimal.
Tech Stack
- Framework: Next.js 14+ (App Router)
- AI Integration: Vercel AI SDK (Core + React)
- Data Format: TOON (`@toon-format/toon`)
- Model: OpenAI GPT-4o (or any compatible model)
Step 1: Setup and Installation
First, create a new Next.js project if you haven't already:
```bash
npx create-next-app@latest my-toon-bot
cd my-toon-bot
npm install ai @ai-sdk/openai @toon-format/toon zod lucide-react
```

The chat UI in Step 4 uses icons from lucide-react, so we install it up front. You will also need an OpenAI API key: the @ai-sdk/openai provider reads it from the OPENAI_API_KEY environment variable, so add it to .env.local.

Step 2: The Data Layer (Mock)

Create lib/products.ts with a mock catalog and a simple search helper:
```typescript
// lib/products.ts
export const products = [
  { id: 1, name: "Eco Tumbler", price: 25.00, inStock: true },
  { id: 2, name: "Wool Beanie", price: 18.50, inStock: true },
  { id: 3, name: "Graphic Tee", price: 30.00, inStock: false },
  // ... imagine 50 more items
];

// Case-insensitive substring match on product names.
export async function searchProducts(query: string) {
  return products.filter((p) => p.name.toLowerCase().includes(query.toLowerCase()));
}
```

Step 3: The API Route (Server Side)
This is where the magic happens. We will use streamText from the Vercel AI SDK in a route handler at app/api/chat/route.ts (the default endpoint that useChat posts to). When the model invokes the getProducts tool, we won't return JSON. We will convert the data to TOON before sending it back to the model.
```typescript
// app/api/chat/route.ts
import { openai } from "@ai-sdk/openai";
import { streamText, tool } from "ai";
import { z } from "zod";
import { encode } from "@toon-format/toon";
import { searchProducts } from "@/lib/products";

// Allow streaming responses up to 30 seconds.
export const maxDuration = 30;

export async function POST(req: Request) {
  const { messages } = await req.json();

  const result = streamText({
    model: openai("gpt-4o"),
    messages,
    // Let the model keep going after a tool result so it can
    // answer the user in natural language.
    maxSteps: 5,
    system: `You are a helpful assistant.
Note: underlying tools return data in TOON format (Token-Oriented Object Notation).
It uses indentation for nesting and header rows like [count]{columns} for lists.
Parse it naturally to answer user questions.`,
    tools: {
      getProducts: tool({
        description: "Search for products by name",
        parameters: z.object({
          query: z.string().describe("The search term"),
        }),
        execute: async ({ query }) => {
          const items = await searchProducts(query);
          // HERE IS THE OPTIMIZATION:
          // Instead of returning JSON.stringify(items), we return TOON.
          // 'headerRow: true' is perfect for arrays of objects.
          const formatted = await encode(items, {
            headerRow: true,
            indent: 2,
          });
          return formatted;
        },
      }),
    },
  });

  return result.toDataStreamResponse();
}
```

Why do this?
By transforming the tool output to TOON, the text injected back into the conversation history (context window) is compressed. If getProducts returns 50 items, the TOON version might be ~800 tokens, whereas the JSON version could be ~1500 tokens. You just saved nearly 50% on that turn's cost and every subsequent turn that includes this history.
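To make this concrete, here is roughly what each format looks like for the three-item catalog from Step 2 (the exact TOON output depends on library version and options):

```text
JSON (keys repeated on every row):
[{"id":1,"name":"Eco Tumbler","price":25,"inStock":true},{"id":2,"name":"Wool Beanie","price":18.5,"inStock":true},{"id":3,"name":"Graphic Tee","price":30,"inStock":false}]

TOON (columns declared once in a header row):
[3]{id,name,price,inStock}:
  1,Eco Tumbler,25,true
  2,Wool Beanie,18.5,true
  3,Graphic Tee,30,false
```

If you want to check the savings on your own data, here is a quick sketch. It assumes the js-tiktoken package and the Next.js "@/" path alias, neither of which is part of the tutorial setup:

```typescript
// measure-tokens.ts: a rough sanity check of JSON vs. TOON token counts.
import { getEncoding } from "js-tiktoken";
import { encode } from "@toon-format/toon";
import { products } from "@/lib/products";

// cl100k_base is a reasonable stand-in tokenizer for this estimate.
const enc = getEncoding("cl100k_base");

const asJson = JSON.stringify(products, null, 2);
const asToon = await encode(products, { headerRow: true, indent: 2 });

console.log("JSON tokens:", enc.encode(asJson).length);
console.log("TOON tokens:", enc.encode(asToon).length);
```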
Step 4: The Frontend (Client Side)
The frontend remains standard. The Vercel AI SDK handles the streaming and state management. The user never sees the TOON format; they just see the chatbot's natural language response. Below are the root layout (app/layout.tsx) and the chat page (app/page.tsx):
```tsx
// app/layout.tsx
import { Inter } from "next/font/google";
import "./globals.css";

const inter = Inter({ subsets: ["latin"] });

export default function RootLayout({
  children,
}: {
  children: React.ReactNode;
}) {
  return (
    <html lang="en">
      <body className={inter.className}>
        <main className="min-h-screen bg-slate-50">
          {children}
        </main>
      </body>
    </html>
  );
}
```

```tsx
// app/page.tsx
"use client";

import { useChat } from "ai/react";
import { Send, Bot, User } from "lucide-react";

export default function Chat() {
  const { messages, input, handleInputChange, handleSubmit } = useChat();

  return (
    <div className="flex flex-col w-full max-w-2xl mx-auto min-h-screen pb-32">
      <header className="sticky top-0 z-10 bg-white/80 backdrop-blur-md border-b p-4 text-center">
        <h1 className="text-xl font-bold bg-gradient-to-r from-blue-600 to-teal-500 bg-clip-text text-transparent">
          TOON-Powered Assistant
        </h1>
      </header>

      <div className="flex-1 p-4 space-y-6 mt-4">
        {messages.length === 0 && (
          <div className="text-center text-slate-400 mt-20">
            <Bot className="w-12 h-12 mx-auto mb-4 opacity-20" />
            <p>Ask me about our product catalog!</p>
          </div>
        )}

        {messages.map((m) => (
          <div
            key={m.id}
            className={`flex ${m.role === "user" ? "justify-end" : "justify-start"}`}
          >
            <div className={`flex max-w-[80%] gap-3 ${m.role === "user" ? "flex-row-reverse" : "flex-row"}`}>
              <div className={`w-8 h-8 rounded-full flex items-center justify-center shrink-0 ${
                m.role === "user" ? "bg-blue-600 text-white" : "bg-slate-200 text-slate-600"
              }`}>
                {m.role === "user" ? <User size={16} /> : <Bot size={16} />}
              </div>
              <div className={`rounded-2xl px-4 py-2.5 shadow-sm ${
                m.role === "user"
                  ? "bg-blue-600 text-white rounded-tr-none"
                  : "bg-white border rounded-tl-none text-slate-800"
              }`}>
                <p className="text-sm leading-relaxed whitespace-pre-wrap">{m.content}</p>
              </div>
            </div>
          </div>
        ))}
      </div>

      <div className="fixed bottom-0 left-0 right-0 p-4 bg-gradient-to-t from-slate-50 via-slate-50 to-transparent">
        <form
          onSubmit={handleSubmit}
          className="max-w-2xl mx-auto relative group"
        >
          <input
            className="w-full pl-5 pr-12 py-4 bg-white border border-slate-200 rounded-2xl shadow-xl focus:outline-none focus:ring-2 focus:ring-blue-500/20 focus:border-blue-500 transition-all text-slate-900"
            value={input}
            placeholder="Type your message..."
            onChange={handleInputChange}
          />
          <button
            type="submit"
            className="absolute right-2 top-2 p-2.5 bg-blue-600 text-white rounded-xl hover:bg-blue-700 transition-colors disabled:opacity-50"
            disabled={!input.trim()}
          >
            <Send size={18} />
          </button>
        </form>
        <p className="text-[10px] text-center text-slate-400 mt-3">
          Powered by Vercel AI SDK & TOON
        </p>
      </div>
    </div>
  );
}
```

Conclusion & Next Steps
Congratulations! You have just built a "Chat with your Data" application that is optimized for the token economy. Here is what we accomplished:
- Seamless Integration: We slotted `encode()` right into the tool's `execute` function. No complex re-architecture required.
- Transparent Optimization: The LLM understands the format naturally (with a tiny system prompt nudge), and the user is none the wiser.
- Scalable Savings: As your product catalog grows or your user base expands, your token savings scale linearly with them.
This pattern applies to any structured data: user profiles, transaction histories, analytics reports, or documentation chunks. Wherever you have lists of objects, TOON is your wallet's best friend.
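To reuse the pattern across every tool in your app, you can factor the encoding into a tiny wrapper. A minimal sketch (withToon is a hypothetical helper of my own, not part of the AI SDK or the TOON library):

```typescript
import { encode } from "@toon-format/toon";

// Hypothetical helper: wraps any tool's execute function so its result
// is TOON-encoded before it re-enters the model's context window.
function withToon<Args, Result>(run: (args: Args) => Promise<Result>) {
  return async (args: Args): Promise<string> => {
    const data = await run(args);
    // Same options as in Step 3: compact header rows for object arrays.
    return encode(data, { headerRow: true, indent: 2 });
  };
}

// Usage inside a tool definition:
// execute: withToon(async ({ query }: { query: string }) => searchProducts(query)),
```

Every tool wrapped this way gets the same compression with no per-tool changes.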
Recommended Reading
Optimizing RAG Pipelines with TOON
Learn how replacing JSON with TOON in your RAG context chunks can significantly reduce token usage, lower latency, and cut API costs.
Stop Using JSON for LLMs: The Case for Token Efficiency
Why JSON is costing you money and performance in AI applications, and how switching to TOON can reduce token usage by up to 60%.
Niche Developer Tools You Probably Aren't Using (But Absolutely Should) - TOON Edition
Discover how Warp, Ray, and HTTPie can supercharge your development cycle, and learn how the TOON format makes sharing tool outputs with AI more efficient.