Chicken x Rabbit
Even_Adder (@Even_Adder@lemmy.dbzer0.com) | Posts: 15 | Comments: 825 | Joined 2 yr. ago
When people say that the "model is learning from its training data", it means just that, not that it is human, and not that it learns exactly like humans. It doesn't make sense to judge boats on how well they simulate human swimming patterns, just on how well they perform their task.
Every human has the benefit, as a baby, of training on the things around them and being trained by those around them, building a foundation for all later skills. Generative models rely on many text and image pairs to describe things to them because they lack the ability to poke, prod, rotate, and disassemble for themselves.
For example, when a model takes in a thousand images of circles, it doesn't "learn" a thousand circles. It learns what a circle GENERALLY is like, the concept of it. That representation, along with random noise, is how these models create images. The same happens for every concept the model trains on, everything from "cat" to more complex things like color relationships, reflections, or lighting. Machines are not human, but they can learn despite that.
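As a very rough toy illustration of that idea (not how a diffusion model actually works), the sketch below "trains" on a thousand circles and keeps only a small summary of them, then combines that summary with fresh random noise to produce a circle that was never in the training set. Every name here (`make_training_circles`, `learn_circle_concept`, `generate_circle`) is made up for the example, and the "model" is just a mean and standard deviation per parameter.

```python
import random
import statistics

def make_training_circles(count, rng):
    """Create `count` circles as (center_x, center_y, radius) tuples."""
    return [(rng.uniform(0, 100), rng.uniform(0, 100), rng.uniform(5, 20))
            for _ in range(count)]

def learn_circle_concept(circles):
    """Summarize the training set as (mean, standard deviation) per parameter."""
    columns = list(zip(*circles))
    return [(statistics.mean(col), statistics.stdev(col)) for col in columns]

def generate_circle(concept, rng):
    """Combine the learned summary with fresh random noise to get a new circle."""
    return tuple(mean + std * rng.gauss(0, 1) for mean, std in concept)

rng = random.Random(0)
training_circles = make_training_circles(1000, rng)

# The "model" is six numbers, not a thousand stored circles.
concept = learn_circle_concept(training_circles)

# A new circle drawn from the general idea plus noise.
new_circle = generate_circle(concept, rng)
print(concept)
print(new_circle)
```

The point of the toy is only the shape of the process: many examples go in, a compact general description comes out, and new outputs come from that description plus randomness.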
You're jumping to conclusions. The image was a mislabeled stock image they bought. This is just a case of poor quality control.
You should check out this article by Kit Walsh, a senior staff attorney at the EFF. The EFF is a digital rights group who recently won a historic case: border guards now need a warrant to search your phone.
You're very welcome. My head still hurts.
They're not giving up though; what they're doing is getting ahead of it. Assuming their deal is favorable for their members, they're making it so that anyone who wants SAG-AFTRA synth voices has to go through their contracted company, with which they have collective bargaining power, or strike an equal or better deal. That comes along with blacklisting companies that use non-union synth voices from SAG-AFTRA work.
This is way better than leaving actors on their own to bargain with companies, which would have definitely happened. Rather than have companies wear individuals down and drive pay down, they get to dictate the terms, together.
The one I kind of remembered (even though only partially) was the Reuters article, which contains this quote I was referring to:
The office reiterated Wednesday that copyright protection depends on the amount of human creativity involved, and that the most popular AI systems likely do not create copyrightable work.
This was likely in reference to Midjourney, which was the system in question in its ruling. Even for its time, Midjourney had very rudimentary user controls, well behind the open standards, which likely didn't impress the registrar.
There's also a spectrum of involvement depending on what tool you're using. I know web-based interfaces don't allow for a lot of freedom because they want to keep users from generating things outside their terms of use, but with open source models based on Stable Diffusion you can get a lot more involved and get a lot more freedom. We're in a completely different world from March 2023 as far as generative tools go.
Take a look at the difference between a Midjourney prompt and a Stable Diffusion prompt.
a 80s hollywood sci-fi movie poster of a gigantic lemming attacking a city, with the title "Attack of the Lemmy!!" --ar 3:5 --v 6.0
sarasf, 1girl, solo, robe, long sleeves, white footwear, smile, wide sleeves, closed mouth, blush, looking at viewer, sitting, tree stump, forest, tree, sky, traditional media, 1990s \(style\), <lora:sarasf_V2-10:0.7>
Negative prompt: (worst quality, low quality:1.4), FastNegativeV2
Steps: 21, VAE: kl-f8-anime2.ckpt, Size: 512x768, Seed: 2303584416, Model: Based64mix-V3-Pruned, Version: v1.6.0, Sampler: DPM++ 2M Karras, VAE hash: df3c506e51, CFG scale: 6, Clip skip: 2, Model hash: 98a1428d4c, Hires steps: 16, "sarasf_V2-10: 1ca692d73fb1", Hires upscale: 2, Hires upscaler: 4x_foolhardy_Remacri, "FastNegativeV2: a7465e7cc2a2",
ADetailer model: face_yolov8n.pt, ADetailer version: 23.11.1, Denoising strength: 0.38, ADetailer mask blur: 4, ADetailer model 2nd: Eyes.pt, ADetailer confidence: 0.3, ADetailer dilate erode: 4, ADetailer mask blur 2nd: 4, ADetailer confidence 2nd: 0.3, ADetailer inpaint padding: 32, ADetailer dilate erode 2nd: 4, ADetailer denoising strength: 0.42, ADetailer inpaint only masked: True, ADetailer inpaint padding 2nd: 32, ADetailer denoising strength 2nd: 0.43, ADetailer inpaint only masked 2nd: True
To break down a bit of what's going on here, I'd like to explain some of the elements in that prompt.
sarasf is the token for the LoRA of the character in this image, and <lora:sarasf_V2-10:0.7> is the character LoRA for Sarah from Shining Force II. LoRA are like supplementary models you use on top of a base model to capture a style or concept, like a patch. Some LoRA don't have activation tokens, and some with them can be used without their token to get different results.
The 0.7 in <lora:sarasf_V2-10:0.7> refers to the strength at which the weights from the LoRA are applied to the output. Lowering the number causes the concept to manifest more weakly in the output. You can blend styles this way with just the base model or multiple LoRA at the same time at different strengths. You can even take a monochrome LoRA and take the weight into the negative to get some crazy colors.
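To make the strength number a bit more concrete, here is a minimal sketch of the usual LoRA idea: the LoRA stores two small matrices whose product is a change to a base weight matrix, and the strength scales that change before it is added. The matrices and the function names (`matmul`, `apply_lora`) are made up for illustration, and real implementations also apply an alpha/rank scaling that is left out here.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def apply_lora(base_weight, lora_down, lora_up, strength):
    """Return base_weight + strength * (lora_up @ lora_down)."""
    delta = matmul(lora_up, lora_down)
    return [[w + strength * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(base_weight, delta)]

# Tiny made-up example: a 2x2 base weight and a rank-1 LoRA.
base_weight = [[1.0, 0.0], [0.0, 1.0]]
lora_up = [[1.0], [2.0]]      # shape 2x1
lora_down = [[0.5, -0.5]]     # shape 1x2

full = apply_lora(base_weight, lora_down, lora_up, 1.0)       # full effect
partial = apply_lora(base_weight, lora_down, lora_up, 0.7)    # like :0.7 in the prompt
inverted = apply_lora(base_weight, lora_down, lora_up, -0.7)  # negative strength
print(full, partial, inverted)
```

A strength of 0 leaves the base model untouched, and a negative strength pushes the weights in the opposite direction, which is why a negative monochrome LoRA can exaggerate color instead of removing it.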
The Negative Prompt is where you include things you don't want in your image. The tokens in (worst quality, low quality:1.4) here have their attention set to 1.4. Attention is sort of like weight, but for tokens. LoRA bring their own weights to add onto the model, whereas attention on tokens works completely inside the weights they're given. In this negative prompt, FastNegativeV2 is an embedding known as a Textual Inversion. It's sort of like a crystallized collection of tokens that tell the model something precise you want without having to enter the tokens yourself or mess around with the attention manually. Embeddings you put in the negative prompt are known as Negative Embeddings.
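As a rough sketch of how the (text:1.4) syntax can be read, the snippet below splits a prompt into pieces of text with an attention value attached. It is a simplified parser written for this example, not the one any particular UI uses: it only handles the explicit, non-nested (text:number) form and treats everything else as attention 1.0.

```python
import re

# Matches "(some text:1.4)" with no nested parentheses.
WEIGHTED_GROUP = re.compile(r"\(([^():]+):([0-9]*\.?[0-9]+)\)")

def parse_attention(prompt):
    """Split a prompt into (text, attention) pairs; unmarked text gets 1.0."""
    segments = []
    position = 0
    for match in WEIGHTED_GROUP.finditer(prompt):
        plain = prompt[position:match.start()].strip(" ,")
        if plain:
            segments.append((plain, 1.0))
        segments.append((match.group(1).strip(), float(match.group(2))))
        position = match.end()
    tail = prompt[position:].strip(" ,")
    if tail:
        segments.append((tail, 1.0))
    return segments

negative_prompt = "(worst quality, low quality:1.4), FastNegativeV2"
segments = parse_attention(negative_prompt)
print(segments)
```

Here the quality tags come out with attention 1.4, while FastNegativeV2 stays at the default because the embedding name is just another piece of prompt text as far as this syntax is concerned.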
In the next part, Steps stands for how many steps you want the model to take to solve the starting noise into an image. More steps take longer.
VAE is the name of the Variational Autoencoder used in this generation. The VAE is responsible for translating between the compressed latent space the model works in and the final pixels of the image. A mismatch of VAE and model can yield blurry and desaturated images, so some models opt to have their VAE baked in.
Size is the dimensions in pixels the image will be generated at.
Seed is the number representation of the starting noise for the image. You need this to be able to reproduce a specific image.
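The link between a seed and the starting noise can be sketched with Python's standard random number generator. This is only an illustration of the principle: real tools use their own generators and noise shapes, and `starting_noise` is a made-up helper.

```python
import random

def starting_noise(seed, width, height):
    """Return a height x width grid of Gaussian noise fixed entirely by the seed."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(width)] for _ in range(height)]

first = starting_noise(2303584416, 8, 8)
second = starting_noise(2303584416, 8, 8)   # same seed as the image above
other = starting_noise(2303584417, 8, 8)    # seed changed by one

print(first == second)  # same seed, same noise
print(first == other)   # different seed, different noise
```

Because the rest of the process is deterministic for a given model and settings, starting from the same noise is what lets you land on the same image again.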
Model is the name of the model used, and Sampler is the name of the algorithm that solves the noise into an image. There are a few different samplers, each with their own trade-offs for speed, quality, and memory usage.
CFG is basically how closely you want the model to follow your prompt. Some models can't handle high CFG values and flip out, giving over-exposed or nonsense output.
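Under the hood, CFG stands for classifier-free guidance, and the scale is usually applied with a formula along these lines: the model makes one prediction without your prompt (or with the negative prompt) and one with it, and the scale decides how far to push from the first toward and past the second. The numbers below are made up, and `apply_cfg` is a name chosen for this sketch.

```python
def apply_cfg(unconditional, conditional, cfg_scale):
    """Push the prediction away from the unconditional one toward the prompt."""
    return [u + cfg_scale * (c - u) for u, c in zip(unconditional, conditional)]

# Made-up model predictions for two values.
unconditional = [0.0, 1.0]   # prediction without the prompt
conditional = [1.0, 3.0]     # prediction with the prompt

guided = apply_cfg(unconditional, conditional, 6)  # CFG scale: 6, as above
print(guided)
```

With a scale of 1 you get the plain prompted prediction back, and larger scales exaggerate the difference, which fits with high values producing over-exposed or nonsense output.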
Hires steps represents the number of steps you want to take on the second pass to upscale the output. This is necessary to get higher resolution images without visual artifacts. Hires upscaler is the name of the model that was used during the upscaling step, and again there are a ton of those with their own trade-offs and use cases.
The entries starting with ADetailer are the parameters for ADetailer, an extension that does a post-process pass to fix things like broken anatomy, faces, and hands. We'll just leave it at that because I don't feel like explaining all the different settings found there.
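If you want to poke at a parameter line like the one above yourself, a small sketch like this turns it into a dictionary and works out the final resolution. It is a naive parser written for this example (it would break on values that contain a comma followed by a space), the sample line is a shortened copy of the one above, and it assumes Hires upscale multiplies both dimensions of Size.

```python
def parse_parameters(line):
    """Turn 'Key: value, Key: value' text into a dict of strings."""
    parameters = {}
    for entry in line.split(", "):
        # Drop stray quotes and commas around entries such as "name: hash".
        key, separator, value = entry.strip().strip('",').partition(": ")
        if separator:
            parameters[key] = value
    return parameters

def hires_size(parameters):
    """Return the output size after applying the hires upscale factor."""
    width, height = (int(n) for n in parameters["Size"].split("x"))
    scale = float(parameters.get("Hires upscale", "1"))
    return int(width * scale), int(height * scale)

line = ('Steps: 21, Size: 512x768, Seed: 2303584416, Sampler: DPM++ 2M Karras, '
        'CFG scale: 6, "sarasf_V2-10: 1ca692d73fb1", Hires upscale: 2')
parameters = parse_parameters(line)
print(parameters["Sampler"])
print(hires_size(parameters))
```

Under that assumption, the 512x768 first pass with an upscale of 2 ends up as a 1024x1536 image.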
I remember there being a lot of uncertainty about the legality of what and how can('t) be used in training models (especially when used for commercial purposes) - has that been settled in any way? I think there was also a case of not being able to copyright AI generated content due to lack of human authorship (I’d have to look for an article on this one as it’s been a while) - this obviously won’t be a problem if generated assets are used as a base to be worked upon.
In the United States, the Authors Guild v. Google case established that Google's use of copyrighted material in its books search constituted fair use. Most people agree this will apply to generative models as well since the nature of the use is highly transformative.
I recommend reading this article from April last year by Kit Walsh, a senior staff attorney at the EFF, if you haven't already. The EFF is a digital rights group that recently won a historic case: border guards now need a warrant to search your phone.
Works involving the use of AI are copyrightable, but just like everything else, it depends. It’s also important to remember the Copyright Office guidance isn’t law. Their guidance reflects only the office’s interpretation based on its experience; it isn’t binding on the courts or other parties. Guidance from the office is not a substitute for legal advice, and it does not create any rights or obligations for anyone. They are the lowest rung on the ladder for deciding what law means.
As for illegal content - Valve mentioned it in regards to live-generated stuff. I assume they’re worried about possibility of plagiarism and things going against their ToS, which is why they ask about guardrails used in such systems. On a more general note, there were also cases of AI articles coming up with fake stories with accusations of criminal behavior involving real people - this probably won’t be a problem with AI usage in games (I hope anyway) but it’s another sensitive topic devs using such tools have to keep in mind.
I agree live generated stuff could get developers in trouble. With pre-generated assets you can make sure ahead of time everything is above board, but that's not really possible when you have users influencing what content appears in your game. If they were going to ban anything, the original ban should have been limited to just this.
It's like a scene out of a renaissance painting.
I feel for the PAL region folks who had to suffer through those dark times.
AI generated content has a lot of unanswered legal questions around it which can lead to a lot of headache with moderation and possibility of illegal content showing up (remember that not only “well meaning” devs will use these tools). It seems reasonable for a company to try to minimize the risk.
There were never any unanswered legal questions that would prevent you from being able to use generated assets in a game. That's why Valve's old stance was so odd. I'm not sure what you mean by the possibility for illegal content, can you elaborate?
Yeah, I wonder if there's a maximum number of jobs you can take?
Is there a limit to how many jobs a person can take?
Pretty cool. I almost had to start liking Epic Store for not having such a dumb stance. The disclaimer on games using generative content is weird, but it's a solid step forward.
The activist who’s taking on artificial intelligence in the courts: ‘This is the fight of our lives’
Yeah, I just want to let you know where I'm coming from. I hope I've made that clear.
If the models were purely being used for research, I might buy the argument that fair use applies. But the fair use balancing act also incorporates an element of whether the usage is commercial in nature and is intended to compete with the rights holder in a way that affects their livelihood. Taking an artist’s work in order to mass produce pieces that replicate their style, in such a way that it prevents the artist from earning a living, definitely affects their livelihood, so there is a very solid argument that fair use ceased to apply when the generative AI entered commercial use.
Fair Use also protects commercial endeavors. Fair use is a flexible and context-dependent doctrine based on careful analysis of four factors: the purpose and character of the use, the nature of the copyrighted work, the amount and substantiality of the portion used, and the effect of the use upon the potential market. No one factor is more important than the others, and it is possible to have a fair use defense even if you do not meet all the criteria of fair use.
More importantly, I don't think more works in a style would prevent an artist from making a living. IMO, it could serve as an ad to point people where to get "the genuine article".
The people that made the AI models aren’t engaging in self-expression at this point. The users of the AI models may be, but they’re not the ones that used all the art without consent or compensation. The companies running the AI models are engaged purely in profit-seeking, making money from other people’s work. That’s not self-expression and it’s not discussion. It’s greed.
Agreed, but don't forget that there are plenty of regular people training their own models and offering them to everyone for free who don't have a company apparatus to defend them, and they are targeted and not spared this ire.
Although the courts ruled that reverse engineering software to make an emulator was fair use, it’s worth bearing in mind that the emulator is intended to allow people to continue using software they have purchased after the lifespan of the console has elapsed - so the existence of an emulator is preserving consumers’ rights to use the games they legally own. Taking artists’ work to create an AI so you no longer need the artist has more in common with pirating the games rather than creating an emulator. You’re not trying to preserve access to something you already have a licence to use. An AI isn’t replacing artwork that you have the right to use but that you can no longer access because of changing hardware. AI is allowing you to use an artist’s work in order to cut them out of the equation without you ever paying them for the work you have benefitted from.
Making novel works has nothing in common with reproducing and distributing someone else's creation. Allowing it preserves the public's right to self-expression, no matter the medium. No artist can insert themselves in the conversation over a style to try to collect payment. Imagine if every one of their inspirations did the same to them?
The AI models can combine concepts in new ways, but it still can’t create anything truly new. An AI could never have given us something like Cubism, for example, because visually nothing like it had ever existed before, so there would have been nothing in its training data that could have made anything like it.
Cubism is, in part, a combination of the simplified and angular forms of ancient Iberian sculptures, the stylized and abstract features of African tribal masks, and the flat and decorative compositions of Japanese prints. Nothing like it existed before because no one had combined those specific concepts quite yet.
What a human brings to the process is life experience and an emotional component that an AI lacks.
It is still a human making generative art, and they can use their emotions and learned experiences to guide the creation of works. The model is just a tool to be harnessed by people.
All an AI can do is combine existing concepts into new combinations (like combining fried eggs and flowers - both of those objects are existing concepts). It can’t create entirely new things that aren’t represented somewhere in its training data. If it didn’t know what fried eggs and flowers were, it would be unable to create them.
New tools have already made what you're talking about possible, and they will continue to improve.
50 FPS is a real dealbreaker.
Can't you just spoof different hardware though?
The activist who’s taking on artificial intelligence in the courts: ‘This is the fight of our lives’
I’m actually fine with generative AI that uses only public domain and creative commons content. I’m not threatened by AI as a creative, because AI can only iterate on its own training data. Only humans can create something genuinely new and original.
I don't like this kind of thought because it tries to minimize the role of the person at the controls. There is no reason why a person using a model trained on 1400s art, African art, anime, photography, cubism, sculpture, culinary art, impressionism, nature, ancient Greco-Roman art, etc. wouldn't be able to come up with novel concepts, executions, and styles, since it's very much a combination of styles that gives rise to new types of art in all other mediums. And that's before you even start fine-tuning on your own stuff.
My objection is solely on the basis of theft. If we agree that everybody has the basic right to control their own data and content, then that logically has to extend to artists: they must have the right to control their own work, and consenting to humans viewing it isn’t the same as consenting to having it fed into an AI.
It isn't like a human viewing it, but it is very like other protected uses of data. To quote the article:
Fair use protects reverse engineering, indexing for search engines, and other forms of analysis that create new knowledge about works or bodies of works. Here, the fact that the model is used to create new works weighs in favor of fair use as does the fact that the model consists of original analysis of the training images in comparison with one another.
This is just a way to analyze and reverse engineer concepts in images so you can make your own original works. Reverse engineering has been fair use since Sega Enterprises Ltd. v. Accolade, Inc. in 1992, and was then affirmed in Sony Computer Entertainment, Inc. v. Connectix Corporation.
In the US, fair use balances the interests of copyright holders with the public’s right to access and use information. There are rights people can maintain over their work, and the rights they do not maintain have always been to the benefit of self-expression and discussion. There are just some things you can't stop people from doing with things you've shared with them, and we shouldn’t be trying to change that.
Calling this stealing is self-serving, manipulative rhetoric that unjustly vilifies people and misrepresents the reality of how these models work and how creative the people who use them can be.
I hope it tells them to reduce the number of cars and expand public transport programs.
That depends on the sex of the parents.