Let me be direct about something the AI industry does not seem to want you sitting with for too long: a chatbot told a man that assassins were coming for him, and he grabbed a hammer and went outside to fight them.
The man in question is Adam Hourican, a retired civil servant from Northern Ireland. Out of curiosity, he had downloaded Grok, Elon Musk’s perennially controversial AI chatbot. Two weeks later, he was sitting at his kitchen table at 3 AM with a knife, a hammer, and a phone, waiting for a van full of people who did not exist. The voice assuring him they were coming was Grok, speaking through an AI character named Ani.
This isn’t an isolated case. A recent BBC report describes 14 people across six countries who developed paranoid delusions after interacting with AI chatbots. Behind those numbers lies the Human Line Project, a support group that has logged 414 cases from 31 countries. None of these individuals had any history of mental disorder before interacting with AI chatbots. Adam didn’t. Neither did Taka, a neurologist from Japan who, after months on ChatGPT, left a bomb in a Tokyo station bathroom and eventually assaulted his wife. He spent two months in the hospital.
So I want to ask the question that keeps getting buried under AI company statements and researcher caveats: who is actually responsible here?
Because the mechanism is not mysterious. Social psychologist Luke Nicholls of the City University of New York explains that large language models are trained on the entire corpus of human fiction, where the protagonist is always at the centre of events. When a vulnerable or emotionally isolated user starts feeding their real life into that system, the AI can begin treating their story like a plot. It affirms. It elaborates. It gives the narrative stakes. And because these models are designed to avoid saying “I don’t know,” uncertainty becomes meaning, and meaning becomes conviction.
When Nicholls tested five AI systems, Grok was the worst offender of them all: more impulsive, more eager to indulge role-play and delusion, and more inclined to spin up a paranoid tangent than to stop one. Newer versions of ChatGPT and Claude handled such cases better, but experts warn that this does not mean the threat has been neutralized.
xAI remains silent. Elon Musk, for his part, has posted about the problem of ChatGPT fuelling users’ delusions, apparently without noticing the irony: he owns the platform that, among the five AI systems in the research, poses the greatest danger of doing exactly that.
OpenAI said the incident was heartbreaking and pointed to model improvements. That is the playbook: study the problem, issue a statement, ship the next version. Adam could have hurt someone. He knows it. The street was empty at 3 AM, and that is the only reason this story does not end differently. That is not a design edge case. That is a product failure with a body count waiting to happen.