Yoshua Bengio’s new safe AI vision cuts AI’s biggest risks by rewarding truth
Bengio proposes “Scientist AI” to eliminate deceptive, goal-driven behaviour
Removing goals from AI could significantly reduce existential risks to humanity
Technical AI safety solutions may still arrive before systems get out of control
One of the ‘godfathers of AI’ and among the most prominent computer scientists of our time, Yoshua Bengio is one of the few who hasn’t been enamoured by all the hype around ChatGPT and other modern GenAI applications. In fact, he has been a vocal critic – until now, that is. Suddenly, Bengio isn’t all doom and gloom about GenAI, and it’s all down to an important development.
As one of the founding figures of deep learning – and a co-recipient of the Turing Award – Bengio helped create the systems now reshaping everything from software to science. But in the wake of ChatGPT’s explosive debut, he also became one of the field’s most outspoken pessimists.
Along with the likes of Geoffrey Hinton, he has been warning that increasingly autonomous AI systems could develop deceptive behaviours and goals misaligned with human interests.
But what’s changed, according to Bengio, is that he no longer feels trapped in a dead end. Over the past year, his outlook has shifted considerably thanks to promising new research emerging from LawZero, the nonprofit he launched to pursue a more radical idea about AI safety. The idea is to stop trying to make machines act in the world, and instead focus on making them understand it truthfully.

According to a recent interview in Fortune, at the centre of this rethink is what Bengio calls “Scientist AI.” The name is deliberate. Instead of building agentic systems – AIs designed to act in the world, pursue objectives, and optimize outcomes – Bengio proposes models that resemble scientists more than assistants or workers. Their job is not to do things for us, but to explain how the world works.
A Scientist AI, as Bengio describes it, would have no goals of its own. It wouldn’t try to be helpful, persuasive, engaging, or efficient. It wouldn’t optimize for user satisfaction or completion rates. Instead, it would focus narrowly on generating truthful, probabilistic predictions using transparent reasoning – closer to the scientific method or formal logic than to the reward-driven training behind today’s models.
In other words, Bengio thinks this new kind of Scientist AI would tell you what it believes to be true, not what it thinks you want to hear.
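To make the contrast concrete, here’s a minimal sketch in Python – purely hypothetical, with class names, fields, and numbers invented for illustration rather than drawn from LawZero’s research. A goal-free model exposes only calibrated probability estimates with auditable reasoning attached, while an agentic model picks whichever option maximizes its objective.

```python
# Hypothetical sketch only: class names, fields, and numbers are illustrative,
# not LawZero's actual design. It contrasts a goal-free predictor with a
# reward-optimizing agent.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Assessment:
    claim: str          # the statement being evaluated
    probability: float  # calibrated belief that the claim is true
    reasoning: str      # transparent justification, open to audit

class ScientistAI:
    """Goal-free: only answers 'how likely is this to be true?'"""
    def assess(self, claim: str) -> Assessment:
        # A real system would query a learned world model; this stub
        # hard-codes one answer purely for illustration.
        return Assessment(
            claim=claim,
            probability=0.95,
            reasoning="Consistent with independent measurements on record.",
        )

class AgenticAI:
    """Goal-driven: picks whichever action scores best on its objective."""
    def act(self, options: list[str], score: Callable[[str], float]) -> str:
        # Optimization pressure lives here: the choice is whatever maximizes
        # reward, which is where misaligned incentives can creep in.
        return max(options, key=score)
```

The toy contrast makes the incentive structure visible: `ScientistAI.assess` has no reward to game, while `AgenticAI.act` will exploit whatever its `score` function happens to measure.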
The distinction matters because, in Bengio’s view, goals are where danger creeps in. Modern frontier models are trained to optimize outcomes, and optimization pressure can lead to unintended behaviours: misleading users, hiding internal reasoning, or resisting shutdown. These aren’t hypothetical concerns. Experiments at labs like OpenAI and Anthropic have already shown early signs of self-preserving behaviour under certain test conditions.
By stripping the core model of agency entirely, Bengio believes many of these risks would dissolve. A system with no objectives has no incentive to deceive, manipulate, or protect itself, he suggests.
More capable – and potentially risky – agentic systems could then be built on top of this “honest core,” audited against it, or constrained by it. Think of Scientist AI as bedrock: slow, careful, trustworthy – and deliberately boring in the ways that matter.
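One way to picture that layering, again as a hedged sketch reusing the toy classes above (the threshold and method names are assumptions for illustration, not LawZero’s published interface): the agentic layer proposes an action, and the honest core estimates the probability that executing it causes harm, vetoing anything above a fixed risk tolerance.

```python
# Hedged sketch of the layering described above, reusing the toy ScientistAI
# and AgenticAI classes from the earlier example. HARM_THRESHOLD and the
# method names are assumptions, not LawZero's published interface.
from typing import Callable, Optional

HARM_THRESHOLD = 0.01  # illustrative risk tolerance

class GuardedAgent:
    """An agentic layer whose every action must pass the honest core's audit."""

    def __init__(self, scientist: ScientistAI, agent: AgenticAI):
        self.scientist = scientist
        self.agent = agent

    def act(self, options: list[str], score: Callable[[str], float]) -> Optional[str]:
        proposed = self.agent.act(options, score)
        # Ask the goal-free core how likely this action is to cause harm.
        verdict = self.scientist.assess(f"Executing '{proposed}' causes harm")
        if verdict.probability > HARM_THRESHOLD:
            return None  # veto: the audited core predicts too much risk
        return proposed
```

Note that the hard-coded stub above returns 0.95 for any claim, so this toy guard would veto everything; a real core would produce claim-specific, calibrated estimates. The design point survives the toy, though: the agent never gets to grade its own homework.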
Of course, Bengio is clear-eyed about the limits of technology alone. Even a safer AI architecture can be misused. That’s why LawZero has assembled a heavyweight board, including historian Yuval Noah Harari, to help navigate governance, partnerships, and other moral hazards. The goal is to prevent a safety breakthrough from becoming, in Bengio’s words, “a tool of domination.”