Brian Eno: “The biggest problem about AI is not intrinsic to AI. It’s to do with the fact that it’s owned by the same few people”

The biggest problem with AI is that they're illegally harvesting everything they can possibly get their hands on to feed it, they're forcing it into places where people have explicitly said they don't want it, and they're sucking up massive amounts of energy and water to create it, undoing everyone else's progress in reducing energy use, and raising prices for everyone else at the same time.
Oh, and it also hallucinates.
Eh I’m fine with the illegal harvesting of data. It forces the courts to revisit the question of what copyright really is and hopefully erodes the stranglehold that copyright has on modern society.
Let the companies fight each other over whether it’s okay to pirate every video on YouTube. I’m waiting.
So far, the result seems to be "it's okay when they do it"
I would agree with you if the same companies challenging copyright (which protects the intellectual and creative work of "normies") weren't also aggressively wielding copyright against the same people they are stealing from.
With the amount of corporate power tightly integrated with governmental bodies in the US (and now with DOGE dismantling oversight), I fear that whatever comes out of this is that humans own nothing and corporations own everything. The death of free, independent thought and creativity.
Everything you do, say and create is instantly marketable, sellable by the major corporations and you get nothing in return.
The world needs something a lot more drastic than a copyright reform at this point.
AI scrapers illegally harvesting data are destroying smaller and open source projects. Copyright law is not the only victim
https://thelibre.news/foss-infrastructure-is-under-attack-by-ai-companies/
Oh, and people believe the hallucinations.
They're not illegally harvesting anything. Copyright law is all about distribution. As much as everyone loves to think that when you copy something without permission you're breaking the law the truth is that you're not. It's only when you distribute said copy that you're breaking the law (aka violating copyright).
All those old school notices (e.g. "FBI Warning") are 100% bullshit. Same for the warning the NFL spits out before games. You absolutely can record it! You just can't share it (or show it to more than a handful of people but that's a different set of laws regarding broadcasting).
I download AI (image generation) models all the time. They range in size from 2GB to 12GB. You cannot fit the petabytes of data they used to train the model into that space. No compression algorithm is that good.
The same is true for LLMs, RVC (audio models), and similar models/checkpoints. I mean, think about it: if AI models were illegally distributing millions of copyrighted works to end users, they'd have to be including them all in those files somehow.
Instead of thinking of an AI model like a collection of copyrighted works think of it more like a rough sketch of a mashup of copyrighted works. Like if you asked a person to make a Godzilla-themed My Little Pony and what you got was that person's interpretation of what Godzilla combined with MLP would look like. Every artist would draw it differently. Every author would describe it differently. Every voice actor would voice it differently.
Those differences are the equivalent of the random seed provided to AI models. If you throw something at a random number generator enough times you could--in theory--get the works of Shakespeare. Especially if you ask it to write something just like Shakespeare. However, that doesn't mean the AI model literally copied his works. It's just making its best guess (it's literally guessing! That's how it works!).
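The size argument above can be sanity-checked with back-of-envelope arithmetic. All the numbers below are illustrative assumptions (dataset size, average image size, checkpoint size), not figures from any actual model card:

```python
# Rough sanity check: could a model file literally contain its training set?
# Every number here is an illustrative assumption, not a real model spec.

training_images = 5_000_000_000    # assumed image count for a large dataset
avg_image_bytes = 500_000          # assumed ~500 KB per training image
model_file_bytes = 4 * 1024**3     # a typical ~4 GiB checkpoint

dataset_bytes = training_images * avg_image_bytes
bytes_per_image = model_file_bytes / training_images

print(f"dataset size: {dataset_bytes / 1024**5:.1f} PiB")
print(f"model size:   {model_file_bytes / 1024**3:.1f} GiB")
print(f"budget:       {bytes_per_image:.2f} bytes per training image")
```

Under those assumed numbers the checkpoint has less than one byte of "storage budget" per training image, which is why the file can't be a literal archive of the dataset (though it can still memorize and regurgitate some individual items).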
The problem with being like… super pedantic about definitions, is that you often miss the forest for the trees.
Illegal or not, seems pretty obvious to me that people saying illegal in this thread and others probably mean “unethically”… which is pretty clearly true.
The issue I see is that they are using the copyrighted data, then making money off that data.
This is an interesting argument that I've never heard before. Isn't the question more about whether ai generated art counts as a "derivative work" though? I don't use AI at all but from what I've read, they can generate work that includes watermarks from the source data, would that not strongly imply that these are derivative works?
This is arguably a feature depending on how you use it. I'm absolutely not an AI acolyte. It's highly problematic in every step. Resource usage. Training using illegally obtained information. This wouldn't necessarily be an issue if people who aren't tech broligarchs weren't routinely getting their lives destroyed for this, and if the people creating the material being used for training also weren't being fucked....just capitalism things I guess. Attempts by capitalists to cut workers out of the cost/profit equation.
If you're using AI to make music, images or video... you're depending on those hallucinations.
I run a Stable Diffusion model on my laptop. It's kinda neat. I don't make things for a profit, and now that I've played with it a bit I'll likely delete it soon. I think there's room for people to locally host their own models, preferably trained with legally acquired data, to be used as a tool to assist with the creative process. The current monetisation model for AI is fuckin criminal....
Tell that to the man who was accused by Gen AI of having murdered his children.
I see the "AI is using up massive amounts of water" being proclaimed everywhere lately, however I do not understand it, do you have a source?
My understanding is this probably stems from people misunderstanding data center cooling systems. Most of these systems are closed loop so everything will be reused. It makes no sense to "burn off" water for cooling.
data centers are mainly air-cooled, and two innovations contribute to the water waste.
the first one was "free cooling", where instead of using a heat exchanger loop you just blow (filtered) outside air directly over the servers and out again, meaning you don't have to "get rid" of waste heat, you just blow it right out.
the second one was increasing the moisture content of the air on the way in with what is basically giant carburettors in the air stream. the wetter the air, the more heat it can take from the servers.
so basically we now have data centers designed like cloud machines.
Edit: Also, apparently the water they use becomes contaminated and they use mainly potable water. here's a paper on it
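The evaporative-cooling point above can be sanity-checked with textbook physics. The latent heat of vaporization of water (~2.26 MJ/kg) is a real constant; the facility power figure is an assumed example, not a measurement of any actual data center:

```python
# How much water could evaporative cooling consume per hour?
# Assumed example: a 20 MW facility rejecting all its heat by evaporation.
# Physical constant: latent heat of vaporization of water ~= 2.26 MJ/kg.

LATENT_HEAT_J_PER_KG = 2.26e6   # energy needed to evaporate 1 kg of water
facility_power_w = 20e6          # assumed 20 MW of heat to remove

heat_per_hour_j = facility_power_w * 3600
water_kg_per_hour = heat_per_hour_j / LATENT_HEAT_J_PER_KG

print(f"water evaporated: {water_kg_per_hour:,.0f} kg/hour "
      f"(~{water_kg_per_hour / 1000:.1f} cubic metres/hour)")
```

Even as a crude upper bound (real designs only evaporate part of the heat load), it shows the scale involved: tens of cubic metres per hour, and once that water is evaporated or contaminated it doesn't go back into the loop.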
We spend energy on the most useless shit; why are people suddenly using it as an argument against AI? Have you ever seen someone complaining about Pixar wasting energy to render their movies? Or 3D studios rendering TV ads?
It varies massively depending on the ML.
For example, things like voice generation or object recognition can absolutely be done with entirely legit training datasets: literally pay a bunch of people to read some texts and you can train a voice generation engine with it, and the work in object recognition is mainly tagging what's in the images on top of a ton of easily made images of things. A researcher can literally go around taking photos to make their dataset.
Image generation, on the other hand, not so much. You can only go so far with plain photos a researcher can take on the street, and those models tend to rely a lot on artistic work from people who never authorized its use to train them. And LLMs clearly cannot be done without scraping billions of pieces of actual work from billions of people.
Of course, what we tend to talk about here when we say "AI" is LLMs, which are IMHO the worst of the bunch.
In a Venn Diagram, I think your “illegally harvesting” complaint is a circle fully inside the “owned by the same few people” circle. AI could have been an open, community-driven endeavor, but now it’s just mega-rich corporations stealing from everyone else. I guess that’s true of literally everything, not just AI, but you get my point.
Well, the harvesting isn’t illegal (yet), and I think it probably shouldn’t be.
It’s scraping, and it’s hard to make that part illegal without collateral damage.
But that doesn’t mean we should do nothing about these AI fuckers.
In the words of Cory Doctorow:
And also it's using machines to catch up to living creation and evolution, badly.
A bit similar to how the Soviet system was trying to catch up to the in-no-way-virtuous, but living and vibrant, Western societies.
That's expensive, and that's bad, and that's inefficient. The only subjective advantage is that power is all it requires.
I don't care much about them harvesting all that data, what I do care about is that despite essentially feeding all human knowledge into LLMs they are still basically useless.