The Browser Might Be Our Best Shot at AGI
AGI has always felt like one of those impossible-to-define concepts to me. It's kind of like asking about the meaning of life. The best answer I've heard for that question is simple: you have to find your own meaning. I think AGI works the same way. What complicates things further is that we're trying to get there with LLMs, which are the state of the art in AI right now. LLMs are a one-trick pony if you think about it, so using them to get to AGI is a very tall order without a cheat code.
I'm not saying we've reached AGI yet. But at least now I have a clearer picture of how we might actually get there.
Why the Browser Makes Perfect Sense
Here's my (controversial) hypothesis: if we're limited to LLMs, we are going to achieve AGI through the browser before anywhere else. Think about it. There's no other general-purpose platform that can handle so many different tasks for so many different types of people. Yet it still has an underlying structure and set of protocols that make it a perfect fit for LLMs.
Almost everything we do digitally, for work and personal life, happens in a browser. Even the apps we use daily like Outlook, Teams, and Excel have browser versions. Most popular mobile apps do too. The web is also incredibly well-structured. Unlike the messy real world, browsers run on clear protocols. The whole system has evolved over decades and now powers trillions of dollars in value.
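To make the "well-structured" point a bit more concrete, here's a minimal sketch in Python of how a page's markup can be boiled down to a short list of things an agent could act on. This is purely illustrative and not how Comet or any real browser agent works: the names ActionableElementParser, list_actions, and SAMPLE_PAGE are made up for this example, and the set of "actionable" tags is my own assumption rather than any standard.

```python
from html.parser import HTMLParser

# Tags an agent could plausibly act on (an illustrative choice, not a standard list).
ACTIONABLE_TAGS = {"a", "button", "input", "select", "textarea"}


class ActionableElementParser(HTMLParser):
    """Collects interactive elements from an HTML document (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.elements = []  # one record per actionable element, in page order
        self._open = None   # the <a>/<button> whose visible text is being collected

    def handle_starttag(self, tag, attrs):
        if tag not in ACTIONABLE_TAGS:
            return
        record = {"tag": tag, "attrs": dict(attrs), "text": ""}
        self.elements.append(record)
        # Only links and buttons carry visible text between their tags.
        if tag in ("a", "button"):
            self._open = record

    def handle_data(self, data):
        if self._open is not None:
            self._open["text"] += data

    def handle_endtag(self, tag):
        if self._open is not None and tag == self._open["tag"]:
            self._open["text"] = self._open["text"].strip()
            self._open = None


def list_actions(html):
    """Return one short text description per actionable element."""
    parser = ActionableElementParser()
    parser.feed(html)
    parser.close()
    lines = []
    for index, el in enumerate(parser.elements):
        attrs = el["attrs"]
        # Prefer visible text, then fall back to placeholder or name.
        label = el["text"] or attrs.get("placeholder") or attrs.get("name") or ""
        target = attrs.get("href") or ""
        lines.append(f"[{index}] {el['tag']} {label!r} {target}".rstrip())
    return lines


# A tiny made-up page: a search form and a link to the rules.
SAMPLE_PAGE = """
<form action="/search">
  <input name="q" placeholder="Search puzzles">
  <button type="submit">Go</button>
</form>
<a href="/rules">How to play</a>
"""

for line in list_actions(SAMPLE_PAGE):
    print(line)
```

Running it prints one line per element: the search box, the button, and the link. The point is only that a page already describes its own controls in a machine-readable way, which is something the physical world doesn't offer.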
If you can build a system smart enough to handle browser tasks the way humans do, that's a pretty solid claim to AGI in my book.
My First Real Taste of Browser AGI
Perplexity Comet changed everything for me. It's the first truly general-purpose agent I've actually used. This feels like the first real step toward AGI on the browser track.
I decided to test it on LinkedIn's Pinpoint game, and I was blown away. The navigation was quick. It followed my instructions perfectly. I'd already read reports that Comet crushes word games like Wordle, so I knew it had potential.
I also tried it on Queens, which is more of a chess-like strategy game. It struggled there, but I'm sure that'll improve as these models get better.
What Really Impressed Me
The voice mode had some quirks during testing, so I ended up using MacWhisper for speech-to-text instead. But here's the cool part: even when the transcription made errors (it heard "clues" as "close"), Comet didn't get thrown off track.
The best moment was watching it read and understand the game rules first, just like a human would when facing a new puzzle. That felt incredibly natural.
For me, watching Comet work through the game was like experiencing reasoning models for the first time. It was genuinely exciting.
Why This Path to AGI?
Since most of our digital lives happen in browsers, conquering the browser becomes a shortcut to AGI. The web environment is structured and predictable in ways the physical world isn't.
Comet has all the familiar elements since it's built on Chromium, the same foundation as Chrome. Plus it brings a fully functional AI assistant right alongside your normal browsing. I honestly can't think of anywhere else where the path to AGI feels this clear and achievable.
The more I think about it, the more convinced I become. Better models are coming. Research keeps improving. Websites are getting more accessible to AI agents. All the pieces are lining up.
What do you think? Does the AI-powered browser feel like our best shot at AGI to you? Do you have a different take?