The judgment in the article I linked goes into detail, but essentially you're asking for the law to let you control something that has never been yours to control before.
If an AI generates something that does indeed provably contain a sample of a piece of music in a song you recorded, then yes, that output may be something you can challenge as a copyright violation. But if the AI's output doesn't contain an identifiable sample, then no, it's not yours. That's how copyright works: it's about the actual tangible expression.
It's not about the analysis of copyrighted works, which is what AI training is doing. That's never been something that copyright holders have any say over.
Funny, for me it was quite heartening. If it had gone the other way it could have been disastrous for freedom of information and culture and learning in general. This decision prevents big publishers like Disney from claiming shares of the pie - their published works are free for anyone with access to them to train on, they don't need special permission or to pay special licensing fees.
There was actually just a big ruling on a case involving this, here's an article about it. In short: a judge granted summary judgment establishing that training an AI does not require a license or any other permission from the copyright holder, that training an AI is not a copyright violation, and that the copyright holder doesn't hold any rights over the resulting model.
I'm assuming this case is why we have this news about Anthropic scanning books coming out right now too.
And I read that the US used more than half of its stock of these bunker-buster bombs in this attack, the largest conventional bunker-busters in existence. So they can't simply try again.
It made the ruling stronger, not weaker. The judge was accepting the most extreme claims that the Authors were making and still finding no copyright violation from training. Pushing back on those claims won't help their case; it's already as strong as it's ever going to get.
As far as the judge was concerned, it didn't matter whether the AI did or did not "memorize" its training data. He said it didn't violate copyright either way.
I don't see what distinction you're trying to draw here. It previously had trouble generating full glasses of wine, they made some changes, now it can. As a result, AIs are capable of generating an image of a full wine glass.
This is just another goalpost that's been blown past, like the "AI will never be able to draw hands correctly" thing that was so popular back in the day. Now AIs are quite good at drawing hands, and so new "but they can't do X!" standards have been invented. I see no fundamental reason why any of those standards won't ultimately be surpassed.
The judge writes that the Authors told him that LLMs memorized the content and could recite it. He then said "for purposes of argument I'll assume that's true," and even despite that he went ahead and ruled that LLM training does not violate copyright.
It was perhaps a bit daring of Anthropic not to contest what the Authors claimed in that case, but as it turns out the result is an even stronger ruling. The judge gave the Authors every benefit of the doubt and still found that they had no case when it came to training.
For the purposes of this ruling it doesn't actually matter. The Authors claimed that this was the case and the judge said "sure, for purposes of argument I'll assume that this is indeed the case." It didn't change the outcome.
That's not at all what this ruling says, or what LLMs do.
Copyright covers a specific concrete expression. It doesn't cover the information that the expression conveys. So if I paint a portrait of myself, that portrait is covered by copyright. If someone looks at the portrait and says "this is a portrait of a tall, dark, handsome deer-creature of some sort with awesome antlers" they haven't violated that copyright even if they're accurately conveying the same information that the portrait is conveying.
The ruling does cover the assumption that the LLM "contains" the training text, which was asserted by the Authors and was not contested by Anthropic. The judge ruled that even if this assertion is true it doesn't matter. The LLM is sufficiently transformative to count as a new work.
If you have an LLM reproduce a copyrighted text, the text is still copyrighted. That doesn't change. Just like if a human re-wrote it word-for-word from memory.
That part is not what this preliminary judgment is about. The torrenting part is going to go to an actual trial. This part was about the Authors' claim that the act of training AI itself violated copyright, and this is what the judge has found to be incorrect.
Again, you should read the ruling. The judge explicitly addresses this. The Authors claim that this is how LLMs work, and the judge says "okay, let's assume that their claim is true."
Fourth, each fully trained LLM itself retained “compressed” copies of the works it had trained upon, or so Authors contend and this order takes for granted.
Even on that basis he still finds that it's not violating copyright to train an LLM.
And I don't think the Authors' claim would hold up if challenged, for that matter. Anthropic chose not to challenge it because it didn't make a difference to their case, but in actuality an LLM doesn't store the training data verbatim within itself. It's physically impossible to compress text that much.
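To put rough numbers on that, here's a back-of-envelope sketch. The parameter count, weight precision, and corpus size are made-up round figures chosen purely for illustration; they are not from the ruling or from Anthropic. All it shows is how you'd compare a model's raw storage capacity against the size of the text it was trained on.

```python
def bits_per_training_char(n_params: float, bits_per_param: float, corpus_chars: float) -> float:
    """Upper bound on model storage available per character of training text."""
    return (n_params * bits_per_param) / corpus_chars


# Hypothetical round numbers, assumed only for the sake of the arithmetic.
N_PARAMS = 10e9        # assumed: 10 billion parameters
BITS_PER_PARAM = 16    # assumed: 16-bit weights
CORPUS_CHARS = 10e12   # assumed: 10 trillion characters of training text

budget = bits_per_training_char(N_PARAMS, BITS_PER_PARAM, CORPUS_CHARS)
print(f"{budget:.3f} bits of model capacity per training character")
```

With those assumed inputs the budget is a small fraction of a bit per character, while Shannon's classic estimate for English text is roughly one bit per character. Plug in whatever real figures you trust; as long as the corpus is vastly bigger than the model, wholesale verbatim storage of the training set doesn't fit.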
First, Authors argue that using works to train Claude’s underlying LLMs was like using works to train any person to read and write, so Authors should be able to exclude Anthropic from this use (Opp. 16).
That's the judge addressing an argument that the Authors made. If anyone made a "false equivalence" here it's the plaintiffs; the judge is simply saying "okay, let's assume their claim is true," as is usual for a preliminary judgment like this.
That's a work of fiction. You might as well suggest dropping lightsabres on the bunker.