How AI truly advanced in 2025: Andrej Karpathy highlights 3 key points
Reinforcement learning with verifiable rewards drove major LLM leaps in 2025
LLM intelligence looks “jagged,” more ghost than animal mind
“Vibe coding” empowered natural-language software creation for anyone
When a blog post by Andrej Karpathy lands in your feed, you pay close attention, simply because few voices in the field of artificial intelligence carry as much conceptual weight. For those who don’t know, Karpathy isn’t a fringe commentator; he’s one of the key architects of modern deep learning – a former OpenAI co-founder, ex-Director of AI at Tesla, and author of some of the most influential tutorials and research on neural networks and generative models. His perspective on where large language models (LLMs) have been and where they’re headed is backed by foundational, nuts-and-bolts experience.
In his 2025 Year in Review blog, Karpathy distilled the year’s AI progress into a series of key level-ups he saw LLMs make. His insights reveal a GenAI field that is maturing fast, and in surprising directions we don’t really think about when we use ChatGPT, Gemini or Grok on a daily basis.
While Karpathy mentions quite a few milestones, I’ve highlighted three in particular that I think show us not only what changed in GenAI and LLMs this year, but why it matters going into 2026 and beyond.
1) RLVR: A new LLM training reality
For several years, high-performance LLMs were built something like this: large-scale pretraining on a massive dataset, followed by supervised fine-tuning and reinforcement learning from human feedback. That approach gave us remarkable results and birthed the modern GenAI use cases. But it also had familiar limitations – according to Karpathy, it created models that excelled superficially but struggled with any kind of deeper reasoning.
That’s where Reinforcement Learning with Verifiable Rewards (RLVR) comes in – Karpathy’s term for a shift in 2025’s LLM training stack. Instead of relying purely on human signals, models are now trained against objective, verifiable tasks (think math puzzles or code challenges with definitive correct answers). This isn’t just an incremental improvement – it fundamentally changes how models learn to reason. According to Karpathy, RLVR lets LLMs develop strategies that look like reasoning to people, because the models must optimize for verifiable success, not just mimic patterns in text.
What this means in practice is that LLMs trained under RLVR are better at planning and problem solving because they have to be. The reward functions used in training give them a stake in actually getting things right, not merely spitting out something that sounds plausible. This should help reduce hallucinations in GenAI chatbots going forward.
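To make the idea concrete, here is a minimal, hypothetical sketch of what a verifiable reward might look like for a math task. The function name and the binary scoring scheme are my own illustration of the principle, not code from Karpathy or any specific RLVR system:

```python
# A minimal sketch of the idea behind a "verifiable reward". Unlike a
# human preference score, the reward is computed mechanically against a
# known-correct answer, so the model is optimized for being right rather
# than for sounding plausible. All names are illustrative.

def verifiable_reward(model_answer: str, expected: str) -> float:
    """Binary reward: 1.0 if the model's final answer matches the
    verified solution (ignoring surrounding whitespace), else 0.0."""
    return 1.0 if model_answer.strip() == expected.strip() else 0.0

# Grading a math problem with one definitive answer:
print(verifiable_reward("391", "391"))              # correct -> 1.0
print(verifiable_reward("It is about 390", "391"))  # wrong   -> 0.0
```

In real systems the checker is usually richer – running generated code against unit tests, or symbolically verifying a math answer – but the key property is the same: the reward is objective and automatically checkable, so it can be applied at massive scale without human raters.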
2) LLMs are like ghosts in terms of intelligence
Karpathy’s second big insight wrestles with a deceptively simple question: when we see ChatGPT or Gemini in action, what kind of intelligence is this? The instinctive answer – that LLMs are proto-animals or proto-humans – is misleading. Instead, Karpathy suggests LLMs are more like “summoned ghosts” of intelligence: systems shaped by optimization pressures and training data unlike anything an organic mind experiences. That’s why their capability spikes unpredictably – astonishing at one task, hilariously fragile at the next.
It’s a reminder that the flashes of brilliance we see in LLMs are not the same thing as cognitive maturity. This “jagged intelligence” challenges how we benchmark progress, because traditional performance scores can reward narrow optimization rather than genuine understanding. Karpathy’s observation demands new ways to think about AI safety, evaluation, and real-world readiness.
3) Rise of LLM apps and vibe coding
Finally, Karpathy highlighted how the application layer around LLMs matured in 2025. Tools like Cursor illustrate a new class of “LLM app” – not just big models you query, but orchestrated LLM systems tailored for specific vertical tasks. This is where AI begins to feel practical, not just impressive, Karpathy notes.
But the real game-changer – and the concept that might resonate most with everyday developers – is what Karpathy cheekily calls “vibe coding”: the threshold moment when programming doesn’t feel like coding in the old sense anymore, where you just describe what you want in English and the model builds it.

Karpathy sees vibe coding as democratizing software creation, lowering the barrier to entry and releasing an enormous wave of creativity. It’s a bit like saying “Linux turned everyone into a sysadmin” – but this time the translator is a model that speaks human language.
In a year of explosive hype and glittering demos, Karpathy’s review is a grounding force – a signal in a very noisy space. As we carry forward into 2026, those three insights from Karpathy – RLVR, ghostly intelligence, and vibe coding – are not just milestones. They’re the landmarks by which we’ll remember 2025.
Jayesh Shinde
Executive Editor at Digit. Technology journalist since Jan 2008, with stints at Indiatimes.com and PCWorld.in. Enthusiastic dad, reluctant traveler, weekend gamer, LOTR nerd, pseudo bon vivant.