I really think it’s mostly about getting a big enough data set to effectively train an LLM.
I mean, yes, of course. But I don't think there's any way in which it is just about that, because the business model for building and providing services around LLMs is to supplant the data they were trained on and the services that created that data. What other business model could there be?
In the case of Google's AI alongside its search engine, and even ChatGPT itself, this is clearly one of the use cases that has emerged and is actually working relatively well: replacing the internet search engine and giving users "answers" directly.
Users like it because it feels more comfortable, natural and useful, and probably quicker too. And in some cases it is actually better. But it's important to appreciate how we got here: by the internet becoming shitter and by search engines becoming shitter, all in the pursuit of ad revenue and the corresponding tolerance of SEO slop.
IMO, to ignore the "carnivorous" dynamics here, which I think clearly go beyond ordinary capitalism and innovation, is to miss the forest for the trees. Somewhat sadly, this tech era (approx. Windows 95 to now) has taught people that the latest thing must be a good idea and that we should all get on board before it's too late.