LLMs are worse than babies at understanding the world: 'Godfather of AI' Yann LeCun explains why
LLMs mimic intelligence but lack grounded understanding of reality
Yann LeCun argues true AI needs world models, not words
Next-generation AI learns through prediction, action, and intuition
From the looks of it, all of us are endlessly enamoured of the generative AI prowess of ChatGPT and Gemini. But they don’t impress one of the Godfathers of AI, Meta’s Chief AI Scientist Yann LeCun. He has spent the better part of the last two years telling anyone who’ll listen that today’s large language models are, cognitively speaking, closer to infants than intellects.
As shocking as it may sound, Yann LeCun’s assertion isn’t without substance, given that he has been an AI researcher for decades.
LeCun’s provocation is simple but unsettling. Modern LLMs have inhaled something like 30 trillion words – the rough equivalent of half a million years of human reading. Despite that, according to LeCun, these models still don’t understand the world. A four-year-old child, armed with just 16,000 waking hours, can do much more, he argues.
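A back-of-the-envelope check shows where those figures come from. The reading speed and waking hours below are illustrative assumptions, not numbers from the article:

```python
# Sanity-check of the scale comparison (assumed reading speed and
# waking hours are illustrative, not taken from LeCun's talks).
words_llm = 30e12                  # ~30 trillion training words
words_per_minute = 250             # assumed adult reading speed
words_per_year = words_per_minute * 60 * 8 * 365  # reading 8 h/day
years_of_reading = words_llm / words_per_year
print(round(years_of_reading))     # on the order of half a million years

waking_hours_per_day = 11          # assumed for a young child
child_hours = 4 * 365 * waking_hours_per_day
print(child_hours)                 # roughly 16,000 waking hours
```

Under these assumptions the model's training diet works out to several hundred thousand years of non-stop human reading, against a child's roughly 16,000 hours of lived experience.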
Unlike LLMs, which need to be fed information, a human child learns through observation and curiosity alone – picking up gravity, object permanence, cause and effect not by reading, but by bumping into reality. Babies don’t predict the next word; they build internal models of how the world works. And that, says LeCun, is the fundamental difference between humans and AI.
Also read: Meta chief AI scientist Yann LeCun thinks LLMs are a waste of time

LLMs, by contrast, are statistical savants. They pass exams, write passable poetry, and explain quantum mechanics with unnerving confidence. But ask them anything about the real world and cracks appear: their architecture excels only at discrete token prediction, LeCun says, and it collapses when confronted with the continuous, high-dimensional messiness of the physical world – video, motion, action, consequence. Even a cat, in this framing, has better common sense.
This isn’t academic nitpicking. For LeCun, it’s an existential warning. If artificial general intelligence is the destination, LLMs are a scenic detour – impressive, lucrative, and ultimately misleading.
LeCun wants to build world models
After the recent media buzz over his decision to leave Meta, LeCun has been quietly doubling down on a very different vision of AI – one that looks less like ChatGPT and more like how brains actually work. His focus is on “world models”: systems that learn abstract representations of reality and can predict the outcomes of actions before taking them. Think intuition, not autocomplete.
Also read: Who is Demis Hassabis: CEO of DeepMind, AI career, Nobel laureate, tech visionary
At the heart of this idea is something LeCun calls Joint Embedding Predictive Architecture, or JEPA. Unlike generative models that obsess over pixel-perfect or token-perfect reconstruction, JEPA systems learn by prediction in an abstract space. They don’t try to recreate the world; they try to understand it well enough to anticipate what comes next.
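The distinction can be sketched in a few lines of code. This is a minimal toy illustration of the JEPA idea, not LeCun's actual architecture: the encoder, predictor, and dimensions are all made up for the example, and the point is only that the prediction error is measured in a small embedding space rather than over raw pixels.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(x, W):
    # Toy encoder: a linear map into an abstract embedding space.
    return np.tanh(x @ W)

# Hypothetical setup: `context` is the current video frame and
# `target` is the next frame, both flattened into pixel vectors.
d_pixels, d_emb = 64, 8
W_enc = rng.normal(scale=0.1, size=(d_pixels, d_emb))
W_pred = rng.normal(scale=0.1, size=(d_emb, d_emb))

context = rng.normal(size=d_pixels)  # frame at time t
target = rng.normal(size=d_pixels)   # frame at time t+1

# A generative model would try to reconstruct all 64 pixels of `target`.
# A JEPA-style model instead predicts only its 8-dimensional embedding:
z_context = encode(context, W_enc)
z_target = encode(target, W_enc)
z_predicted = z_context @ W_pred

# Training would minimise this error in embedding space –
# never a pixel-level reconstruction error.
loss = np.mean((z_predicted - z_target) ** 2)
```

The design choice this illustrates: by predicting abstract representations instead of raw pixels, the model is free to ignore unpredictable surface detail and focus on what actually matters for anticipating the next state.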
I said that reaching Human-Level AI "will take several years if not a decade." Sam Altman says "several thousand days" which is at least 2000 days (6 years) or perhaps 3000 days (9 years). So we're not in disagreement. But I think the distribution has a long tail: it could take… https://t.co/EZmuuWyeWz
— Yann LeCun (@ylecun) October 16, 2024
LeCun’s latest startup work – still intentionally low-profile – builds directly on this philosophy. Rather than scaling ever-larger language models, the effort is aimed at training AI systems that learn from video, interaction, and environment-level feedback. Language still matters, but it’s no longer the center of the universe. The real prize is an AI that can plan, reason, and act because it has an internal sense of how the world behaves.
There’s a quiet humility to this approach that feels almost radical in an era obsessed with scale and spectacle. LeCun isn’t chasing the next viral demo. He’s trying to give machines something closer to intuition – a grounded understanding that emerges from experience, not text alone.
Also read: Beyond ChatGPT: ‘Godmother of AI’s bold bet on spatial intelligence with World Labs
Jayesh Shinde
Executive Editor at Digit. Technology journalist since Jan 2008, with stints at Indiatimes.com and PCWorld.in. Enthusiastic dad, reluctant traveler, weekend gamer, LOTR nerd, pseudo bon vivant.