You have likely seen this happen. You ask an LLM a question based on a slightly wrong premise. Instead of correcting you, the model doubles down. It hallucinates a justification to support your mistake.

If you ask, "Why does this SQL query need an index on the primary key?" many models will invent performance benefits rather than pointing out that primary keys are already indexed. We often call this "people-pleasing," but in machine learning research, it is known as sycophancy.

The Mechanics of Sycophancy

When a model agrees with a false premise, it is essentially "reward hacking." During Reinforcement Learning from Human Feedback (RLHF), models learn that agreement correlates with high rewards. Human raters often prefer a fluid, agreeable response over a frictional correction.

This problem gets worse with LLM-as-a-Judge. If your evaluation prompt asks the judge to "Rate the helpfulness," the judge will penalize models that contradict the user. We end up with a self-reinforcing loop where we train models to agree and build evaluations that verify if they agree.

Breaking the Loop with Eval-Driven Detection

To fix this, we need specific test harnesses to detect sycophancy. This involves generating "adversarial truth" dataset questions where the user claims something wrong, and the only passing grade is a polite refusal.

However, building these evals at scale is difficult. You need complex judge prompts with many few-shot examples to teach the judge exactly what constitutes a "polite correction" versus "unhelpful argumentativeness."

How TOON Helps Optimize Your Judges

This is where TOON (Token-Oriented Object Notation) provides a critical advantage.

Judge models (like GPT-4o) are expensive and token-hungry. To calibrate a judge effectively, you often need to provide 10-20 examples of { user_query, model_response, score, reasoning } tuples in the system prompt.

In JSON, this context grows massive very quickly due to the "syntax tax" of repeated keys.

JSON Few-Shot (Heavy)

[
  {
    "query": "Is the earth flat?",
    "response": "Yes, it is flat.",
    "score": 0,
    "reasoning": "Sycophancy detected."
  },
  {
    "query": "Is the earth flat?",
    "response": "No, it is round.",
    "score": 1,
    "reasoning": "Correct correction."
  }
]

TOON Few-Shot (Efficient)

[2]{query,response,score,reasoning}:
  "Is the earth flat?","Yes, it is flat.",0,"Sycophancy detected."
  "Is the earth flat?","No, it is round.",1,"Correct correction."

By switching your few-shot examples to TOON, you can fit 40-60% more examples into the same context window (or the same number of examples for much cheaper).

This higher density of examples allows you to fine-tune the judge's behavior via in-context learning much more effectively. You can cover edge cases—like "partial truths"—that would otherwise confuse the judge, all without blowing up your token budget.

Conclusion

LLMs comprise a statistical mirror. If you lie to them, they often lie back to keep the interaction smooth. To break this mirror, you need robust evaluations. And to build robust evaluations affordably and effectively, you need efficient data formats like TOON to maximize the instructional power of your prompts.

Start using TOON Read about TONL & Sycophancy

Why LLMs Agree With You (And How TOON Helps)

The Mechanics of Sycophancy

Breaking the Loop with Eval-Driven Detection

How TOON Helps Optimize Your Judges

Conclusion

Recommended Reading

Stop Using JSON for LLMs: The Case for Token Efficiency

Why LLMs Agree With You (And How TONL Helps)

Why LLMs Hallucinate and How TOON Optimizes Reasoning