Extended Thinking's First Feature Is Billing

你原本只是来看看模型是不是又变强了，结果发现真正有戏的是没说出来的那部分取舍。

最容易做错的，是把 Claude Code 和 Claude 当成同一种工具，以为谁分高谁就适合自己。；代价往往是如果只看宣传，你会以为自己买到的是更强版本，实际却可能先撞到更严格的限制。；我先给一个保守判断：Extended Thinking 的第一属性是计费。。

You came in just to see whether the model got stronger. Then you notice the real story is the tradeoff that never gets said out loud. If you only read the promo, you think you bought a stronger version. In practice, you may hit tighter limits first.

My conservative read is simple: Extended Thinking's first property is billing.

That is why The text in Claude Code’s “Extended Thinking” output matters more than it looks. In the cost docs, that extra reasoning text is billed like output, and the default thinking budget can run into tens of thousands of tokens. For simple tasks, the docs point you to /effort, /config, and MAX_THINKING_TOKENS to cut spend.

The fast-mode docs split the tradeoff cleanly: fast mode means lower latency at higher cost; lower effort means less thinking, faster replies, and possible quality loss on harder tasks. Extra thinking can help. It just is not free transparency. It is a priced resource.

The biggest conversation trigger is never 'the model got stronger.' It is why the strongest version was not served directly. My split: Claude Code is better at helping you see the problem clearly first. Claude is better at wrapping up the downstream work. Boundary: this read comes from the Claude 4 / Claude Code cost and fast-mode docs in the brief, not a live billing export or benchmark run. If your team uses Extended Thinking for routine coding, do you cap it by default or leave it open?

真正该讨论的是：Claude Code 更适合先帮你看清问题，Claude 更适合把后面的活收完整。