Most research papers assume a human will read them first. Google DeepMind’s latest sprawling 60-page look at the road from AGI to superintelligence doesn’t bother with that pretense. Section 1, titled “Summary Instructions”, has two paragraphs: one for you, and another for the AI assistant you’re probably going to ask to summarise it, telling it what to cover, what to keep intact, and what to flag as having changed since publication. It’s a small detail. It’s also a pretty good preview of the paper’s entire argument: AI is no longer just the subject of these conversations. It’s increasingly the one having them.
The paper, titled “From AGI to ASI”, is written by an elite DeepMind team that includes Shane Legg, co-founder of the organization and one of the people who popularised the term AGI in the first place. It is not a product pitch or a capability demo. It reads more like a memo: an effort to predict how things will play out once human-level AI appears, regardless of exactly when that happens. The authors draw the line between the two terms clearly. AGI is intelligence that matches the median human on most cognitive tasks, comparable to a reasonably competent individual. ASI, by contrast, is a superintelligence that outperforms not just individual experts but large, well-coordinated teams of them working for several years, on almost any task.
The paper maps out four pathways from AGI to ASI. The first and most popular is scaling: continuing to add more compute, more data, and more parameters, which has been the primary driver of AI progress over the last decade. It is also the only approach backed by historical data, which explains why it continues to dominate the discussion. The authors cite effective compute growing by around 10x per year over the last decade, but they also acknowledge the approach’s obvious limit: the internet is running out of unique, high-quality text to learn from, and that wall looks likely to arrive within this decade.
The second pathway is a paradigm shift: a radical architectural break from the current formula of transformers plus fine-tuning, in the same way that transformers were a break from what came before. Refreshingly, the authors acknowledge upfront that this pathway is hard to predict, which is almost true by definition, since no one sees a paradigm shift coming until it arrives. It does mean, however, that this part reads more as an exercise in guesswork than as analysis.
Third is recursive self-improvement – AI systems doing AI research themselves, building better versions of themselves, which build even better versions, and so on. This is the pathway with the most sci-fi weight to it, the one underpinning “intelligence explosion” scenarios. But the paper hedges this one harder than any other, repeatedly pointing out there’s no historical precedent to model it against, and that recursive loops have a habit of plateauing rather than exploding (AlphaZero-style training is offered as an example of both outcomes being possible).
The fourth pathway, and the one we would argue deserves more attention than it typically gets, is superintelligence arising from multi-agent collectives. The idea is that no single model needs to be a genius; it is enough for many instances of AGI-level models to work in concert, in much the same way that humanity as a whole is more intelligent than any individual member. Given how cheap it is to run parallel instances of a model, this looks like the most immediate and feasible path, even though the authors point out that we have no empirical evidence about how such collectives behave.
The paper identifies six major bottlenecks that could slow or even halt any of these pathways: the data wall, runaway economic and resource demands, the possibility that the current neural network paradigm is simply not sufficient, the increasing difficulty of research as a field matures, an “abstraction barrier” that limits AI to abstractions conceived by humans, and deliberate slowdowns imposed by humans.
Three of these stand out. The data wall is the most tangible and immediate: the supply of high-quality human-generated text is limited, and synthetic data is risky because it can degrade the models trained on it. Whether the neural network paradigm will hit a ceiling is less certain but more fundamental; as evidence, the paper cites problems with existing architectures such as hallucinations, vulnerability to prompt injection, and brittle reasoning. Finally, a deliberate slowdown is not a technological barrier but a political one. It could come from regulation, public backlash, or even accidents, although the authors admit that international competition makes it improbable.
What impresses about the paper is not any of its predictions but how few it actually makes. Nearly every section seems to conclude the same way: “this is an open research question.” For a 60-page document from one of the world’s leading AI labs, that’s either disappointing or exactly the point, depending on how you read it. We’d lean toward the latter. What the paper contributes is not a guide to superintelligence but an admission, from the people working towards it, that no one even knows which road we are on yet.