Reddit Will License Its Data to Train LLMs, So We Made a Firefox Extension That Lets You Replace Your Comments With Any (Non-Copyrighted) Text - The Luddite
Reddit already has your comments. So does everyone else who might want to train an LLM, for that matter: there are archive dumps that anyone can torrent, and those aren't updated "live" every time you vandalize your old comments. The only people inconvenienced by replacing your comments with gibberish are humans who may find that thread later while looking for information.
I disagree.
The more people are disappointed with Reddit, the better.
Maybe, but we are losing a vast wealth of collected and archived information. Anything from resources for anyone who wanted to learn any hobby, places to go in cities for every niche interest you can think of, suggestions for what to do in various college situations tailored to every college in the US. The list could go on for a hundred more topics.
For a while it's been the only place in Google results where you could be reasonably sure you were getting multiple unsponsored human opinions and discussions in a thread. It's honestly tragic to lose that.
That's the point too tho. Having content on their platform only provides value to Reddit shareholders. Removing that content diminishes the platform's value as a whole.
Ik it's not much, but it might be a speck of sand in the cogs of capital. Also, if a person was on that platform for quite a while, the effect is quite a bit larger.
Most of my Reddit posting was advocating for policies that make sense (such as closing the wealth gap) and countering right-wing propaganda.
That has value no matter who has it.
I don't have a distaste for "slacktivism." I have a distaste for pointless performative "protest" that only serves to ruin useful resources that could benefit others.
That's what I said a while back, still ended up downvoted to hell lmao
I've already started running into this: (probably) good information and the answer I was looking for was now "Pizza Paper Piper Follow Bumble" or some shit. But I'm sure Reddit has versioning and still has the original, so it was pointless.
Right, but on the backend they capture deltas, then emit the newest version. Aside from explicit GDPR requests (lol) they never actually delete the originals (more lol).
Would it not have been smarter to subtly alter them, in order to not trigger database rollbacks? Plenty of ways to ruin intelligibility with minor changes.
Wow, you're the kind of person that makes every worker in IT hate the GDPR. It's good for consumers. Until the consumer is you. Think of the fact that a person has to actually fulfill that request, and you know that management never paid for tooling for that, they have to fuck around manually in the database every time.
I didn't post any useful information; all I did was shitpost during college sports game threads. Just lemme be spiteful against Reddit lol
Yes, correct. But also, let those people be inconvenienced. Reddit should not be convenient. The only thing it’s good for now is porn.
Where can I find those archive dumps? The usual (unmentionable) torrent sites or is there a specific place for archive dumps?
The place I know about off the top of my head is academictorrents.com where you can find lots of large data sets useful for academic research. The torrent files themselves are small, so I'm sure they can be found in other places too.
Which contributes to the death of the site, and the AI gets trained to treat untold reams of shitposts as truth.
I see that as a win-win.
Not only that, but it actually brings up the value of their dataset. It makes theirs unique compared to the dataset you can build by scraping for free. Every deleted comment literally adds worth to what they are selling.