"AI" models are, essentially, solvers for mathematical system that we, humans, cannot describe and create solvers for ourselves.
For example, a calculator for pure numbers is a pretty simple device all the logic of which can be designed by a human directly. A language, thought? Or an image classifier? That is not possible to create by hand.
With "AI" instead of designing all the logic manually, we create a system which can end up in a number of finite, yet still near infinite states, each of which defines behavior different from the other. By slowly tuning the model using existing data and checking its performance we (ideally) end up with a solver for some incredibly complex system.
If we were to try to make a regular calculator that way and all we were giving the model was "2+2=4" it would memorize the equation without understanding it. That's called "overfitting" and that's something people being AI are trying their best to prevent from happening. It happens if the training data contains too many repeats of the same thing.
However, if there is no repetition in the training set, the model is forced to actually learn the patterns in the data, instead of data itself.
Essentially: if you're training a model on single copyrighted work, you're making a copy of that work via overfitting. If you're using terabytes of diverse data, overfitting is minimized. Instead, the resulting model has actual understanding of the system you're training it on.