Generative AI is not the new Internet
Investors often draw the parallel between the two. That's a mistake.
“AI might be in a bubble, but so was the internet. It didn’t stop it from becoming the most transformative technology of the 21st century.”
So people say. And hearing this over and over again makes me want to punch some faces. I have even started downvoting stuff on Reddit, something I had never done before.
The “hype cycle”, as it is called, is contaminated by survivorship bias. We tend to forget that the usual trajectory after the “trough of disillusionment” is “the abyss of ridicule”, not “the slope of enlightenment”.
The internet is one of the few technologies whose impact has matched its hype, despite the bumps in the road. We forget the Segway, the IoT, cold fusion and space travel, which all greatly underdelivered, whether out of technological infeasibility or poor product/market fit (no one wants a “connected dishwasher”).
Even though I am no Geoffrey Hinton, I think I have a decent enough grasp of machine learning and NLP to know what I’m talking about. This post isn’t another one of those “a machine cannot think, it just outputs an answer from its database” or “what you call AI is just an unintelligent program that predicts the next token. It’s a stochastic (whatever that means, I heard about it on instagram) parrot 😁” pieces.
Here are 5 major differences between the AI and dot-com bubbles.
1. The internet’s development was problem-driven. AI is a nice solution looking for a problem.
In the 1960s, the American government was concerned that if the Soviets destroyed a central command-and-control hub, the entire US communications network would collapse. They thought it would be a good idea to build a decentralized (and thus more resilient) network. Universities loved the idea, too.
The minimum viable product, called the ARPANET, basically did what the current version of the internet does: it let you send bytes from one computer to another through a decentralized network. You couldn’t yet send dick pics to your hot coworkers (for some reason it was not considered a priority at the time), but you sure could share the source code of a program or “email” people on the network.
It solved the problem of long-range communication at the byte level.
Of course, there have been a lot of improvements made to the original protocol (TCP/IP, WWW and so on), but the need for a common protocol to send bytes from one computer to another over a decentralized network was clear from the start. And the internet delivered.
Generative AI, on the other hand, was created by people who wanted to make something intelligent, attained some success, and thought “well, what can we do with this?”. It turned out GPTs were useful as chatbots, so they went for chatbots.
In a sense, it’s similar to the 1997+ part of the internet bubble, where most companies were like “We have to do something with that ‘internet’ thing. Any idea?”. But the development of the underlying technology went through a completely different process.
2. The adoption of the internet was slow because you had to sell a few kidneys to buy a computer. AI today is already dirt cheap.
If you wanted to get connected to the internet in the 1990s, you had to buy an internet-capable computer (the equivalent of $3,000 today), then pay a $100 (in today’s dollars) monthly subscription. So the entry ticket was on the order of a month’s salary.
No wonder it took time to take off. First you had to hear about it from your nerdy friend, and then you had to convince your wife that getting a computer with internet access was more important than getting your septic tank drained and your garage door fixed.
In the late 1990s, only about a third of the developed world’s population had internet access. There was plenty of room to grow, which allowed for high expectations and provided a good excuse for the internet’s limited economic impact.
Today, there is no way you can spend a month’s salary on generative AI without deliberately trying to. And nearly every working-age person has directly or indirectly used a cutting-edge AI chatbot. So the reason AI doesn’t have a significant economic impact is not incomplete adoption. It’s that the technology is not yet capable of producing one. But more on that in the 5th point.
3. The internet has a positive network effect. AI has a negative one.
It’s cool if you sell books online or have a brand new ‘@hotmail.com’ address. But if no one browses the web or checks their email, you are just a clanging cymbal that no one gives a shit about.
The internet had (and still has) a huge positive network effect, meaning that its usefulness grows with the number of users.
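One classic way to put a number on this is Metcalfe’s law, a rule of thumb (not a law of nature) that values a network by the number of possible connections between its users:

$$V(n) \propto \binom{n}{2} = \frac{n(n-1)}{2} \approx \frac{n^2}{2}$$

Double the user base and you roughly quadruple the value.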
No such thing for AI yet. Quite the opposite in fact.
First of all, AI-generated slop tends to contaminate datasets and cause model collapse. On a more technical note, I am not exactly sure why that happens: I once trained pre-trained CNNs on their own outputs¹, basically trying to make them more sure about their guesses, and it didn’t cause them to go astray (it was just for fun, btw). But I guess things are different for massive Transformers.
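For the curious, here is roughly what that experiment looked like. This is a minimal PyTorch sketch reconstructed from memory (see footnote 1); the model, data and hyperparameters are placeholders, not the original code.

```python
import torch
import torchvision.models as models

# Reconstruction of the "train a CNN on its own outputs" experiment.
# The model and data below are stand-ins, not the originals.
model = models.resnet18(weights="IMAGENET1K_V1")
model.train()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4)

def confidence_loss(logits):
    # "Make the model more sure about its guesses":
    # minimize -log of the max softmax probability (cf. footnote 1).
    probs = torch.softmax(logits, dim=1)
    return -torch.log(probs.max(dim=1).values).mean()

images = torch.randn(8, 3, 224, 224)  # stand-in for real images
for step in range(10):
    loss = confidence_loss(model(images))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Note that there are no labels anywhere: the network only reinforces its own current guesses, which is the CNN analogue of training on AI-generated data.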
Anyway, the more AI slop there is on the internet, the worse the quality of the training datasets.
Second, and even worse, more people using AI brings down the value of AI. Generative AI is mainly used for creative tasks², where users compete for other people’s attention. And the more those people use AI or come into contact with AI-generated stuff, the less impactful it becomes. Cool images or marketing clips lose value when everyone can generate them. Same goes for “personalized emails”.
Whenever I get an email from someone I don’t know who says “I’ve checked your {blog post or github repo} and found it fascinating! We are also into {vaguely related stuff}, so please join our {AI product} waiting list.”, I write it off as spam by default. So does everyone. Before the advent of generative AI, I would have thought “woah, someone actually checked this repo! I’m not used to this much appreciation, should I answer by sending them a dick pic?”
If a single person had had access to GPT-4 in the 2010s, he would have made millions of dollars from it, because no one was able to spot fishy AI-generated slop at the time. Now, spotting it has become a sixth sense for almost everyone.
The positive network effect could justify the exponential growth of internet companies’ market valuations.
AI companies benefit from no such network effect.³
4. The scaling laws of the internet are linear. The scaling laws of AI are worse than logarithmic.
If you double the number of cables connecting 2 countries, you double their connection speed. If you double the number of hard drives in a server, you double its storage capacity.
Double an LLM’s training compute and you get a barely noticeable improvement. But you’ve burned twice as much money. AI scaling laws are worse than logarithmic.⁴
GPT-4.5 was trained using 100x more compute than GPT-4, but the difference from its predecessor is marginal (and arguably in the wrong direction). It is nowhere near the difference between GPT-4 and GPT-3, despite a similar training compute factor. Strictly speaking, if the performance gap had been the same, the scaling laws would have been called logarithmic, which is already quite bad. But they are far worse.
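To make “worse than logarithmic” concrete, here is a toy comparison. The numbers and formulas are invented for illustration, not fitted to any benchmark.

```python
import math

# Toy scaling regimes (made-up units, not real benchmark data).
def linear(compute):           # internet-style: 2x cables -> 2x bandwidth
    return compute

def logarithmic(compute):      # one quality "step" per 10x compute
    return math.log10(compute)

def sub_logarithmic(compute):  # diminishing returns even on a log scale
    return math.log10(1 + math.log10(compute))

for compute in (1, 10, 100, 10_000, 1_000_000):
    print(f"{compute:>9}x compute | linear: {linear(compute):>9} | "
          f"log: {logarithmic(compute):4.1f} | sub-log: {sub_logarithmic(compute):.2f}")
```

Going from 100x to 10,000x compute barely moves the sub-log curve, which is roughly what the GPT-4 to GPT-4.5 jump felt like.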
When scaling laws are nice enough, big companies or governments have a strong incentive to invest in a technology and develop it for their own use. That’s why some companies have supercomputers. That’s why Microsoft employees already had connection speeds acceptable by today’s standards as far back as the 1990s. Then, as they work hard to make the technology less expensive, it becomes affordable to the general public. This is how we went from Enigma to the iPhone and from super expensive internet broadband to 5G.
But there is no such thing with AI. The best AIs currently being tested by Google, Anthropic and OpenAI are only marginally better than those available to the general public. There is no “spend 10x more to get a 10x better product” path that big companies can pursue.
I wonder if, in normal economic conditions, Google et al would have any financial incentive to train big models when 90% of the maximum economic value can be obtained from 0.1% of the maximum compute. I don’t think so, though I could be wrong.
5. The internet needed few technological breakthroughs to become what it is today. AI needs major ones to take off.
I will write a full technical blog post on that matter, but for now, let’s just state the facts.
The internet has evolved quite a bit from ARPANET. But few technological breakthroughs were needed for this evolution: the TCP/IP and WWW protocols, HTML, CSS and JavaScript, and fiber optics were about all it took. And at any point in the history of the internet, engineers and scientists knew the direction of the next step. “Problem: disconnected networks can’t talk to each other. => Solution: TCP/IP, a universal translator for data. Problem: web pages are boring. => Solution: JavaScript, to make them come alive.”
Current LLMs have a (real but) limited economic impact. We are promised superintelligence. The thing is, no one I know of has the slightest fucking idea how to get there.
Benchmarks are getting maxed out by new reasoning models every other day, yet real-world usefulness seems to be plateauing. Although I am an LLM power user, I don’t think I would lose much productivity if you forced me to use GPT-4 Turbo instead of the latest models.
Despite enormous investments and efforts, no one has been able to use LLMs for anything other than chatbots and coding assistants. Current LLMs need constant guidance from humans to work.
One thing that I’ve understood only recently is that most economic value comes from navigating the messiness of the world. Very few people are paid to do a fully documented, streamlined job.
You may think that accountants just line up numbers in spreadsheets, but they constantly make important and implicit decisions about where to put those numbers. Few of these micro-decisions can be found on the internet and thus in the training data of LLMs.
Despite code being one of the most abundant forms of data on the internet, I find myself not using AI much when coding. The interesting thing is that I’m not even able to single out cases where AI fails. There are just too many low-probability failure modes to account for.
I’m not even talking about “hardware” engineers, who are closer to the material messiness of the world. You will have a hard time finding online documentation for the “Shit, I need to redesign this part because our historical supplier went bankrupt and the new one can’t machine Al 2024 alloy to the required tolerances. Should I figure out if we can use 7075 instead or redesign the part altogether?” problem.
Despite acing math and code benchmarks, LLMs have made little progress in that “messiness handling” skill. That’s why you can’t trust OpenAI’s Operator to fill your shopping cart or Claude 3.7 to run a small shop (the latter post is genuinely funny; Anthropic’s engineers have a good sense of humor).
For some reason, Anthropic has an edge in this domain, but I’ve seen no progress since Sonnet 3.5.
Fine if Elon Musk (whom I like, out of pure provocation) calls Grok 4 a Ph.D.-level AI because it never fails math tests and trick questions. Fine if it can solve a 5th-order PDE and output the result in alexandrine verse.
But no one is paid for that.
It is only marginally closer to being autonomous than GPT-3.5 was.
The current path of AI development will not bring economically meaningful superintelligence in the foreseeable future. I’m not saying superintelligence will never happen, just that it’s unlikely to happen by scaling current approaches. We need a few breakthroughs, and as far as I know, no one knows what they will consist of.
Conclusion
In 1969, if you had told the average American “you will never live to see human settlements on the Moon, nor humans on Mars”, he would have answered “what? Am I going to get cancer or something? Are you saying I have less than 10 years left to live?”. The idea that space exploration was already at its apex was unimaginable. But it was.
In 1999, if you had told him “the internet is not going to be a big deal”, he would have called you a fool. Rightly so.
Today, no one knows where AI is going, but there seem to be hard technical problems to solve. The forward trajectory looks more like that of space exploration and cold fusion than that of the internet.
¹ I don’t remember the exact code, but the loss function was probably something like loss(logits) = -log(max(softmax(logits)))
² And code.
³ I’ve heard the argument that, as more people interact with chatbots, conversation data becomes more abundant and allows AI companies to train better models, which would amount to a network effect. But I am not sure about it, because unlabelled conversation data is notoriously difficult to work with. So much so that most AI companies offer their models for free on https://lmarena.ai just to collect a bit of human feedback, because the poorly labelled conversation data they get there is still more valuable than the enormous amount of raw data they already have at home.