The dominant paradigm of “scale up training” is being joined — not replaced, but joined — by a new one: “scale up inference.” Models that think longer at runtime are pulling ahead on hard tasks.
The dominant paradigm of “scale up training” is being joined — not replaced, but joined — by a new one: “scale up inference.” Models that think longer at runtime are pulling ahead on hard tasks.