How AI truly advanced in 2025: Andrej Karpathy highlights 3 key points
Reinforcement learning with verifiable rewards drove major LLM leaps in 2025
LLM intelligence looks “jagged,” more ghost than animal mind
“Vibe coding” empowered natural-language software creation for anyone
When a blog post by Andrej Karpathy lands in your feed, you pay close attention, simply because few voices in the field of artificial intelligence carry as much conceptual weight. For those who don’t know, Karpathy isn’t a fringe commentator; he’s one of the key architects of modern deep learning – a former OpenAI co-founder, ex-Director of AI at Tesla, and author of some of the most influential tutorials and research on neural networks and generative models. His perspective on where large language models (LLMs) have been and where they’re headed is backed by foundational, nuts-and-bolts experience.
In his 2025 Year in Review blog, Karpathy distilled the year’s AI progress into a series of key level-ups he saw LLMs make. His insights reveal a GenAI field that is maturing fast, and in surprising directions we don’t really think about when we use ChatGPT, Gemini or Grok on a daily basis.
While Karpathy mentions quite a few milestones, I’ve highlighted three in particular that I think show us not only what changed in GenAI and LLMs this year, but why it matters going into 2026 and beyond.
1) RLVR: A new LLM training reality
For several years, high-performance LLMs were built something like this: large-scale pretraining on a massive dataset, followed by supervised fine-tuning and reinforcement learning from human feedback. That approach gave us remarkable results and birthed the modern GenAI use cases. But it also had familiar limitations – according to Karpathy, it created models that excelled superficially but struggled with any kind of deeper reasoning.
That’s where Reinforcement Learning with Verifiable Rewards (RLVR) comes in – Karpathy’s term for a shift in 2025’s LLM training stack. Instead of relying purely on human signals, models are now trained against objective, verifiable tasks (think math puzzles or code challenges with definitive correct answers). This isn’t just an incremental improvement – it fundamentally changes how models learn to reason. According to Karpathy, RLVR lets LLMs develop strategies that look like reasoning to people, because the models must optimize for verifiable success, not just mimic patterns in text.
What this means in practice is that LLMs trained under RLVR are better at planning and problem solving because they have to be. The reward functions used in training give them a stake in actually getting things right, not merely spitting out something that sounds plausible. This should help reduce hallucinations in GenAI chatbots going forward.
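To make the idea concrete, here is a minimal, hypothetical sketch of what a verifiable reward might look like for a math task. The function name and the binary scoring scheme are my own illustration of the principle, not code from Karpathy or any specific RLVR system:

```python
# A minimal sketch of the idea behind a "verifiable reward". Unlike a
# human preference score, the reward is computed mechanically against a
# known-correct answer, so the model is optimized for being right rather
# than for sounding plausible. All names are illustrative.

def verifiable_reward(model_answer: str, expected: str) -> float:
    """Binary reward: 1.0 if the model's final answer matches the
    verified solution (ignoring surrounding whitespace), else 0.0."""
    return 1.0 if model_answer.strip() == expected.strip() else 0.0

# Grading a math problem with one definitive answer:
print(verifiable_reward("391", "391"))              # correct -> 1.0
print(verifiable_reward("It is about 390", "391"))  # wrong   -> 0.0
```

In real systems the checker is usually richer – running generated code against unit tests, or symbolically verifying a math answer – but the key property is the same: the reward is objective and automatically checkable, so it can be applied at massive scale without human raters.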
2) LLMs are like ghosts in terms of intelligence
Karpathy’s second big insight wrestles with a deceptively simple question: when we see ChatGPT or Gemini in action, what kind of intelligence is this? The instinctive answer – that LLMs are proto-animals or proto-humans – is misleading. Instead, Karpathy suggests LLMs are more like “summoned ghosts” of intelligence: systems shaped by optimization pressures and training data unlike anything an organic mind experiences. That’s why their capability spikes unpredictably – astonishing at one task, hilariously fragile at the next.
It’s a reminder that the flashes of brilliance we see in LLMs are not the same thing as cognitive maturity. This “jagged intelligence” challenges how we benchmark progress, because traditional performance scores can reward narrow optimization rather than genuine understanding. Karpathy’s observation demands new ways to think about AI safety, evaluation, and real-world readiness.
3) Rise of LLM apps and vibe coding
Finally, Karpathy highlighted how the application layer around LLMs matured in 2025. Tools like Cursor illustrate a new class of “LLM app” – not just big models you query, but orchestrated LLM systems tailored for specific vertical tasks. This is where AI begins to feel practical, not just impressive, Karpathy notes.
But the real game-changer – and the concept that might resonate most with everyday developers – is what Karpathy cheekily calls “vibe coding”: the threshold moment when programming doesn’t feel like coding in the old sense anymore, where you just describe what you want in English and the model builds it.

Karpathy sees vibe coding as democratizing software creation, lowering the barrier to entry and releasing an enormous wave of creativity. It’s a bit like saying “Linux turned everyone into a sysadmin” – but this time the translator is a model that speaks human language.
In a year of explosive hype and glittering demos, Karpathy’s review is a grounding force – a signal in a very noisy space. As we carry forward into 2026, those three insights from Karpathy – RLVR, ghostly intelligence, and vibe coding – are not just milestones. They’re the landmarks by which we’ll remember 2025.
Jayesh Shinde
Executive Editor at Digit. Technology journalist since Jan 2008, with stints at Indiatimes.com and PCWorld.in. Enthusiastic dad, reluctant traveler, weekend gamer, LOTR nerd, pseudo bon vivant.