OpenAI LLM achieves gold medal-level score at the IMO
In a sun-drenched convention center on Australia’s Sunshine Coast, the 66th International Mathematical Olympiad (IMO) unfolded this month. It brought together 635 of the world’s brightest young minds from 114 countries. Amid the flurry of pencils and the tension of geopolitical debates, an unexpected contender emerged, not a teenager with a calculator, but an artificial intelligence model developed by OpenAI. On July 19, 2025, OpenAI research scientist Alexander Wei announced that their experimental reasoning large language model (LLM) had achieved a gold medal-level performance, solving five of the six grueling problems in the 2025 IMO. This milestone, validated under the same stringent conditions as human contestants – two 4.5-hour sessions, no tools or internet – marks a seismic shift in the landscape of artificial intelligence.
The IMO, widely regarded as the pinnacle of pre-college mathematical competition since its inception in 1959, is known for challenging problems that demand creative, sustained reasoning. This year, the spotlight shifted to a machine. Wei’s thread on X detailed the LLM’s journey: on problems like 2025 IMO Problem 1, which required an intricate proof about non-negative integers and lines in the plane, the model crafted multi-page, watertight arguments that earned it 35 out of 42 points, a score sufficient for gold.
What sets this achievement apart is not just the result but the approach. Unlike previous AI successes in narrow domains, OpenAI’s LLM leveraged general-purpose reinforcement learning and test-time compute scaling. This method, which recent studies in Nature Machine Intelligence (2024) suggest can boost multi-step reasoning accuracy by 40%, allowed the model to think creatively over extended horizons – up to 100 minutes per problem – far surpassing earlier benchmarks like the MATH dataset, on which 30% accuracy was an optimistic forecast back in 2021. “We’ve progressed from quick calculations to sustained, human-like reasoning,” Wei wrote, a sentiment echoed by OpenAI CEO Sam Altman, who called it a “significant marker of how far AI has come.”
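OpenAI has not published the details of its method, but test-time compute scaling is commonly illustrated with self-consistency sampling: draw many candidate solutions and keep the majority answer, so that spending more compute at inference time tends to raise accuracy. A minimal, generic sketch of that idea – in which `sample_solution` is a hypothetical stand-in for one stochastic pass of a reasoning model, not anything OpenAI has disclosed – looks like this:

```python
from collections import Counter

def best_of_n(sample_solution, problem, n=16):
    """Self-consistency voting: draw n candidate answers for `problem`
    and return the most frequent one. Increasing n spends more
    test-time compute and tends to improve accuracy.

    `sample_solution` is a hypothetical callable standing in for one
    stochastic pass of a reasoning model; this is a generic sketch,
    not OpenAI's (unpublished) technique.
    """
    answers = [sample_solution(problem) for _ in range(n)]
    return Counter(answers).most_common(1)[0][0]
```

With a toy sampler that answers "42" two times out of three, `best_of_n` reliably recovers the majority answer even though any single sample might be wrong.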
The LLM’s solutions, released online, reveal a distinct style, sometimes sassy, often meticulous. For Problem 1, it navigated a labyrinth of conditions about “sunny” lines (those not parallel to the x-axis, the y-axis, or the line x+y=0) with a proof spanning several pages, culminating in a playful “No citation necessary” after computing a mystery number. This flair reflects the model’s experimental nature: it is a research prototype distinct from the forthcoming GPT-5, which OpenAI plans to release soon, though a model with this level of mathematical prowess will not ship for months.
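The “sunny” condition itself is simple to state in code. Writing a line as a*x + b*y = c, it is parallel to the x-axis when a = 0, parallel to the y-axis when b = 0, and parallel to x + y = 0 (slope −1) when a = b. A small illustrative helper – not taken from the model’s solution – checks the condition directly:

```python
def is_sunny(a: float, b: float) -> bool:
    """Return True if the line a*x + b*y = c is 'sunny' in the sense of
    2025 IMO Problem 1: not parallel to the x-axis (a == 0), the y-axis
    (b == 0), or the line x + y = 0 (a == b, i.e. slope -1).

    The constant c does not affect a line's direction, so it is omitted.
    """
    if a == 0 and b == 0:
        raise ValueError("a and b cannot both be zero: not a line")
    return a != 0 and b != 0 and a != b
```

For example, the line x + 2y = 3 is sunny, while x + y = 5 (parallel to x + y = 0) and y = 1 (parallel to the x-axis) are not.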
The implications are profound. This breakthrough challenges the notion that AI lacks true understanding, suggesting a leap toward general intelligence. Experts point to recent advancements in reinforcement learning, which allow models to adapt and reason without task-specific training, as a key driver. For the human contestants, the AI’s presence is both an inspiration and a call to elevate their own skills. As the Sunshine Coast event concluded, the LLM’s gold medal-level score stood as a testament to AI’s evolving capabilities, proof that when it comes to pure reason, the boundary between human and machine is blurring. Whether this heralds a new era of intelligence or a redefinition of competition, the 2025 IMO will be remembered not just for its equations, but for the code that cracked them.