Test-time compute is eating training-time compute

The dominant paradigm of “scale up training” is being joined — not replaced, but joined — by a new one: “scale up inference.” Models that think longer at runtime are pulling ahead on hard tasks.

Leave a Comment

Your email address will not be published. Required fields are marked *

Scroll to Top