"how much of the data is the original data"?
Even if you could reverse the process perfectly, what you would prove is that something fed into the AI was identical to a copyrighted image. But the image's license isn't part of that data. The question is: did the license cover use as training data?
In the case of watermarked images, the answer is clearly no, so the AI companies have to argue that only tiny parts of any given image come from any given source image, and that it therefore still doesn't violate the license. That's pretty questionable when watermarks are visible.
In these examples, it's clear that all parts of the image come directly or indirectly (perhaps some source images were memes based on the original) from the original, so there goes the second line of defence.
The fact that the quality is poor is neither here nor there. You can't run an image through a filter that adds noise and then say it's no longer copyrighted.