The Future of Large Language Models: Are We on the Verge of AGI or Are We Getting Ahead of Ourselves?

These days, it feels like everyone’s got an opinion on AI––specifically, the kind powered by large language models (LLMs). Will they usher in a utopia? End the world as we know it? Honestly, scrolling through Twitter can feel like falling down a rabbit hole of wild predictions and heated debates.

As a senior writer for Future Perfect, a Vox section dedicated to exploring the most effective ways to make the world a better place, I’ve spent countless hours wading through the hype surrounding AI. I’ve talked to starry-eyed optimists who believe we’re on the cusp of creating a superintelligence that will solve all our problems. I’ve also spoken with deeply concerned experts who warn of existential risks if we’re not careful. And let’s be real, sometimes it feels like we’re all just collectively holding our breath to see what these LLMs will do next.

So, are we on the brink of artificial general intelligence (AGI)––the holy grail of AI that would see machines achieving human-level cognitive abilities? Or are we, as some argue, getting a little ahead of ourselves?

Are LLMs Just Baby AGIs?

One of the most vocal proponents of the “LLMs are proto-AGI” camp is Leopold Aschenbrenner, a former OpenAI employee whose lengthy analysis on the topic has become a key text for those who believe we’re closer to AGI than we think. (Full disclosure: Vox Media, Future Perfect’s parent organization, has a partnership with OpenAI, but our reporting remains editorially independent.)

Aschenbrenner’s argument boils down to this: scale is all you need. By feeding LLMs increasingly massive datasets and giving them more computing power to churn through it all, we’ll eventually smooth out their rough edges and achieve AGI. It’s like that old saying––if at first you don’t succeed, just get a bigger computer.

And hey, there’s no denying that the “bigger is better” approach has yielded some seriously impressive results. Just look at the evolution of OpenAI’s own GPT models. GPT-2 (that’s “Generative Pre-trained Transformer,” for the uninitiated) burst onto the scene in 2019 with its uncanny ability to generate human-quality text. But then came GPT-3, which boasted a whopping 175 billion parameters (those are the values that the model adjusts as it learns) and blew its predecessor out of the water, demonstrating a remarkable capacity for tasks like writing different kinds of creative content, translating languages, and even coding.

And now, we have GPT-4, with even more parameters and even more impressive abilities. It’s enough to make your head spin, really.
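If “parameters” still sounds abstract, here’s a rough sense of what’s actually being counted. The toy model below is my own illustration in Python (using PyTorch), not anything OpenAI has published: it wires together an embedding table, a single transformer layer, and an output projection, then tallies every value the optimizer adjusts during training.

```python
# A minimal sketch (illustrative only): counting the learnable "parameters"
# of a toy language-model-shaped network, i.e. the values it adjusts as it learns.
import torch.nn as nn

toy_model = nn.Sequential(
    nn.Embedding(num_embeddings=50_000, embedding_dim=512),          # token embeddings
    nn.TransformerEncoderLayer(d_model=512, nhead=8, batch_first=True),  # one transformer block
    nn.Linear(512, 50_000),                                           # output projection
)

# Every weight and bias tensor the optimizer can update counts as a parameter.
total = sum(p.numel() for p in toy_model.parameters() if p.requires_grad)
print(f"Learnable parameters in the toy model: {total:,}")  # tens of millions
```

This toy lands in the tens of millions of parameters; the frontier models being argued over here are thousands of times larger than that.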

Not So Fast, Say the Skeptics

But hold on a sec––not everyone’s buying into the “scale is all you need” hype. Some pretty big names in the AI world, like Yann LeCun (Meta’s chief AI scientist) and Gary Marcus (a professor emeritus at NYU and, it’s fair to say, a very vocal LLM critic) are waving red flags.

They point out that LLMs, for all their flashy abilities, still struggle with some pretty fundamental stuff. Logical reasoning, for example, remains a major weakness. Ask an LLM to solve a simple logic puzzle, and it’s likely to come up with an answer that’s…well, let’s just say “creative.” They’re also prone to what’s known as “hallucinations”––generating completely fabricated information with an air of absolute confidence. And these, they argue, are not problems you can just throw more data at and expect them to magically disappear.

LeCun and Marcus believe that while scaling up LLMs might lead to incremental improvements, we’re likely to hit a wall at some point. To achieve true AGI, they argue, we need to explore fundamentally different approaches.

Straddling the Line Between Hype and Hysteria

Personally, I tend to fall somewhere in between the wide-eyed optimists and the doom-and-gloom naysayers. Like, yeah, the idea of superintelligent machines taking over the world is straight out of a sci-fi flick (and not necessarily one of the good ones). But on the flip side, dismissing the potential of LLMs altogether seems just as premature.

Here’s the thing: LLMs have consistently defied expectations, achieving things that experts once thought were impossible. Remember when people said deep learning would never be able to master programming? Well, guess what? LLMs are now spitting out code like it’s nobody’s business. So, while I’m not ready to bet the farm on AGI arriving next Tuesday, I’m also hesitant to say “never.”

The problem, as I see it, is that we’re constantly trying to predict the future based on the past. We look at the rate of progress in LLMs so far and extrapolate that out, assuming it’ll continue on the same trajectory. But what if it doesn’t? What if there’s a sudden breakthrough, a paradigm shift that throws all our predictions out the window? Or what if, as LeCun and Marcus suggest, we hit a wall and progress slows to a crawl?

A Graph Is Not a Crystal Ball

Aschenbrenner, in his prediction of AGI by 2027, relies heavily on a graph that charts the progress of LLMs over time. It’s a compelling visual, no doubt, with a line that shoots up and to the right like a rocket. But there’s a catch: the y-axis, which purports to measure “progress towards AGI,” is entirely arbitrary.

How do you even quantify something as nebulous as “progress towards AGI”? It’s like trying to measure happiness in inches or love in pounds––it’s just not a thing. We can say that one LLM performs better than another on a specific task, sure. But to compare those capabilities to human-level intelligence, to say that one is X% closer to AGI than the other? That’s just speculation dressed up as data.
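To see how much work that y-axis is doing, consider a toy example. The numbers below are completely made up (they are not Aschenbrenner’s data or anyone else’s); the point is only that fitting a straight line through an invented “progress toward AGI” score and extrapolating it forward hands you whatever arrival date your choice of scale implies.

```python
# A toy illustration with invented numbers: the same "trend," scored two
# different ways, extrapolated to a 100-point "AGI" threshold.
import numpy as np

years = np.array([2019, 2020, 2021, 2022, 2023, 2024], dtype=float)

# Two made-up ways of scoring the very same systems on a 0-to-100 scale.
# Neither is a real measurement of anything.
generous_scale = np.array([10, 22, 35, 50, 68, 82], dtype=float)
stingy_scale = np.array([5, 8, 12, 15, 19, 22], dtype=float)

def projected_arrival(scores: np.ndarray, threshold: float = 100.0) -> float:
    """Fit a straight line through the scores and solve for the year it crosses the threshold."""
    slope, intercept = np.polyfit(years, scores, deg=1)
    return (threshold - intercept) / slope

print(f"Generous scale puts 'AGI' around {projected_arrival(generous_scale):.0f}")
print(f"Stingy scale puts 'AGI' around {projected_arrival(stingy_scale):.0f}")
```

Same systems, same years; the only thing that changed was the made-up scale, and the “forecast” moves by decades. The units, not the data, are doing the predicting.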

As AI safety researcher Eliezer Yudkowsky put it in his response to Aschenbrenner’s analysis, “We do not presently understand the nature of the gap between current artificial intelligence and human-level artificial general intelligence well enough to construct numerical estimates of how far away the latter milestone is.” In other words, we don’t know what we don’t know.

Preparing for the Unknown

So, where does that leave us? Up the creek without a paddle, desperately trying to predict the unpredictable? Not necessarily. Just because we can’t pinpoint the exact arrival date of AGI (or even if it’ll ever truly arrive) doesn’t mean we’re powerless. In fact, I’d argue that embracing uncertainty is the most responsible approach we can take.

Think about it this way: LLMs could continue to improve at an astonishing rate, developing more advanced reasoning abilities and exceeding our wildest expectations. Or maybe we’ll find more effective ways to utilize the LLMs we already have, integrating them into larger systems that amplify their strengths and mitigate their weaknesses. Maybe GPT-5 will be the game-changer everyone’s hoping for. Or maybe it’ll be a total flop, forcing us back to the drawing board.

The point is, we don’t know. And that’s okay. What matters is that we’re prepared for any of these possibilities. As Aschenbrenner himself acknowledges, the potential impact of AGI-level systems––whether they arrive in five years or fifty––is so profound that it demands serious consideration. We need to be thinking now about the ethical implications, the societal consequences, and the potential risks.

Gary Marcus, despite being a vocal critic of the “scale is all you need” approach, doesn’t disagree with the need for preparedness. In fact, he argues that focusing on the potential dangers of AGI isn’t about fear-mongering; it’s about responsibility. “It’s not about saying the sky is falling,” he writes. “It’s about saying, ‘Hey, there’s a chance the sky could fall, so maybe we should invest in some umbrellas just in case.’”

The Takeaway: Humility, Not Hubris

The debate surrounding LLMs and AGI is often framed in absolutes: we’re either on the verge of a technological utopia or a dystopian nightmare. But the reality, as always, is far more nuanced. We’re dealing with a technology that’s still in its infancy, with the potential to reshape our world in ways we can barely imagine. The only responsible way forward is to proceed with a healthy dose of humility, acknowledging the limits of our understanding and embracing the uncertainty that lies ahead.

Instead of getting bogged down in predictions about when or if AGI will arrive, let’s focus on developing more accurate ways to measure and assess progress in the field. Let’s have open and honest conversations about the potential benefits and risks of LLMs, informed by evidence and grounded in reality. And above all, let’s remember that the future of AI is not predetermined. It’s something we have the power to shape––if we’re willing to approach it with the care, caution, and critical thinking it deserves.