Copyright law is incredibly far-reaching and only enforced up to a point. This is a bad thing overall.
When you actually learn what companies could do with copyright law, you realise what a mess it is.
In the UK, for example, you need permission from a composer to rearrange a piece of music for another ensemble. Without that permission it's illegal to write the music down, even just the melody as a single line.
In the US it's standard practice to write the arrangement first and then ask the composer to license it. Then you sell it and both collect and pay royalties.
If you want to arrange a piece of music in the UK by a composer with an American publisher, you essentially start by breaking the law.
This all gives massive power to corporations over individual artists. It becomes a legal fight the corporation can always win due to costs.
Corporations get the power of selective enforcement, which they use whenever they think they will make a profit.
AI is creating an image based on someone else's property. The difference is that the AI is owned by a corporation.
It's not legitimate to claim the creation is solely that of the one giving the instructions. Those instructions are not in themselves creating the work.
The act of creating this work includes building the model, training the model, maintaining the model, and giving it that instruction.
So everyone involved in that process is liable for the results to differing degrees.
Ultimately the most infringing part of the process is the input of the original image in the first place.
So we now get to see whether a massive corporation or two can claim that an AI can be trained on, and output, anything publicly available (not just public domain) without infringing copyright. An individual human can't.
I suspect the work of training a model solely on public domain material will be complete about the time all these cases get settled in a few years.
Then controls will be put on training data.
Then barriers to entry to AI will get higher.
Then corporations will be able to own intellectual property and AI models.
The other way this can go is that AI is allowed to break copyright. That sets a precedent that breaks a lot of copyright, and the corporations lose a lot of power and control.
The only reason we see this as a fight is because corporations are fighting each other.
If AI needs data and can't simply take it from publicly available published works, licensing that data becomes a new source of value for the copyright holder.
The New York Times has a lot to gain.
There are explicit, limited exceptions to copyright law. Education is one. Academia and research are another.
All of them tip into infringement the moment the use becomes commercial.
AI being educated and trained isn't infringement until someone gains from published works or prevents the copyright holder from gaining from them.
This is why writers are at the forefront. Writing is the first area where AI can successfully undermine the need to read the New York Times directly, reducing the income from the intellectual property it's been trained on.