Measuring Thinking Efficiency in Reasoning Models: The Missing Benchmark