Human writing and LLM output can both be creative, original, informative, or useful, depending on the context and purpose. An LLM is a tool used by humans; we are in control of the input and the output. What we say goes: no one ever has to see LLM output without a person having made that decision. Restricting LLMs is restricting the people who use them. Mega-corporations will have their own models, no matter the price. What we say and do here will only affect our ability to catch up and stay competitive.
You also seem to be making a slippery slope argument: you imply that if LLMs are allowed to use copyrighted books as data, it will lead to negative consequences for creators and society, without explaining how or why this would happen or providing any evidence. It's a one-sided look at the issue that ignores the positive outcomes of LLMs, like increasing the accessibility, diversity, and quality of literature and thought, as well as inspiring new forms of expression and creativity.
Finally, you seem to be making a moralistic fallacy. You claim that there is a perfectly reasonable way of doing this ethically, by using content that people have provided. However, you don’t support this claim or address its challenges. How would you ensure that the content providers are the original authors or have the rights to the content? How would you compensate them for their contribution? Is this a good way to get content that is diverse and representative of different perspectives and cultures? What about bias or manipulation in the data collection and processing?
I don't think we need any more expansions to copyright; we need a better understanding of LLMs’ capabilities and responsibilities. I think we need to be open-minded and critical about the potential and challenges of LLMs, but also on guard against fallacious arguments and emotional appeals.