Oxford pretends AI benchmarks are science, not marketing

Chatbot vendors routinely make up a new benchmark, then brag how well their hot new chatbot does on it. Like that time OpenAI’s o3 model trounced the FrontierMath benchmark, and it’s just a coincid…
