My concern is that billions of works are being used for training with no consent and no regard for the license, and the claim that the model "learns" is not an excuse. If someone saved some of my content for personal use, sure, I don't mind that at all, but a huge-scale, for-profit scraping operation downloading all the content it physically can? Fuck off. I just blocked all the crawlers from ever accessing my websites (well, Google and Bing literally refuse to index my stuff properly anyway, so fuck them too; neither of them even managed to read the sitemap properly, and it was definitely valid).
Most models are trained unethically, relying on weird statements about how humans learn the "same way" (looking at a few references when drawing a specific thing, because you need to know how it looks to draw it lol) as large models (more or less averaging and weighting billions of images stolen from the internet with no regard for the licenses).
The thing is, you can't get that rich by playing fair; it's only possible at the expense of others. And I can assure you, most rich people are actively making our lives worse.
They do. https://ads.microsoft.com/