jecxjo @ jecxjo @midwest.social

Posts

3
Comments

326
Joined

3 yr. ago

2y ago

2 authors say OpenAI 'ingested' their books to train ChatGPT. Now they're suing, and a 'wave' of similar court cases may follow.

I think it's very relevant because those laws were created at a time when there was no machine generated material. The law makes the assumption that one human being is creating material and another human being is stealing some material. In no part of these laws do they dictate rules on creating a non-human third party that would do the actual copying. There were specific rules added for things like photocopy machines and faxes where attempts are made to create exact facsimiles. But ChatGPT isn't doing what a photocopier does.

The current lawsuits, at least the ones I've read over, have not been explicitly about outputting copyrighted material. While ChatGPT could output the material just as I could recite a poem, the issue being brought up is that the training materials were copyrighted and that the AI system then "contains" said material. That is why I asked my initial question. My brain could contain your poem, and as long as I don't write it down as my own, what violation is occurring? OpenAI could go to the library, rent every book and scan them in and all would be ok, right? At least from the recent lawsuits.

2y ago

2 authors say OpenAI 'ingested' their books to train ChatGPT. Now they're suing, and a 'wave' of similar court cases may follow.

Would it be stalking if you signed a legal agreement that allowed them to track you? That is the reason the California law exists. Most of us have accepted a license agreement to use an app or service and in exchange we gave up privacy rights. And it may not have even been with the company consuming the data.

Sadly the law requires you to contact everyone to demand your data be deleted. Passing a law to make the default "never store my data" means most of social media goes away or goes behind a paywall. This also goes for any picture hosting company that charges you nothing for hosting because it uses your images.

This would also most likely mean that very explicit declarations must be made to allow anyone to use your material, causing a lot of businesses to say it's too big of a risk and ditch a lot of support.

Right now we kind of work on good faith which maybe doesn't work.

2y ago

2 authors say OpenAI 'ingested' their books to train ChatGPT. Now they're suing, and a 'wave' of similar court cases may follow.

Hmm that is an interesting take.

The movie summary question is interesting. I doubt most people have asked ChatGPT for its own personal views on the subject matter. Asking for a movie plot summary doesn't inherently require the one giving it to have experienced the movie. If it did, then pretty much all papers written in a history class would fall under this category. No high schooler today went to war, but they could write about it because they are synthesizing others' writings about the topic. Granted, we know this to be the case and the students are required to cite their sources even when not directly quoting them... would this resolve the first problem?

If we specifically asked ChatGPT "Can you give me your personal critique of the movie The Matrix?" and it returned something along the lines of "Well I can't view movies and only generate responses based on writings of others who have seen it," would that make the usage more clear? If it's required for someone to have the ability to have their own critical analysis, there would be a handful of kids from my high school who would fail at that task too and did so regularly.

I like your college example as that is getting better at a definition, but I think we need to find a very explicit way of describing what is happening. I agree current AI can't do any of this so we are very much talking about future tech.

With the idea of extending material, do we have a good enough understanding of how humans do it? I think it's interesting when we look at computer neural networks. One of the first ones we build in a programming class is an AI that can read single digit, handwritten numbers. What eventually happens is the system generates a crazy huge and unreadable equation to convert bits of an image into a statistically likely answer. When you dissect it you'd think, "Oh to see the number 9 the equation must see a round top and a straight part on the right side below it." And that assumption would be wrong. Instead we find it's dozens of specific areas of the image that you and I wouldn't necessarily associate with a "9".

But then if we start to think about our own brains, do we actually process reading the way we think we do? Maybe for individual characters. But we know when we read words we focus specifically on the first and last character, the length of the word and any variation of the height of the text. We can literally scramble up the letters in the middle and still read the text.
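To try the scrambled-letters thing yourself, here is a minimal Python sketch. The function names (`scramble_middle`, `scramble_text`) are just made up for illustration, and it splits on whitespace only, so punctuation attached to a word is treated as part of that word.

```python
import random

def scramble_middle(word, rng):
    """Shuffle the interior letters of a word, keeping first and last in place."""
    if len(word) <= 3:
        # Nothing to shuffle: at most one interior letter.
        return word
    middle = list(word[1:-1])
    rng.shuffle(middle)
    return word[0] + "".join(middle) + word[-1]

def scramble_text(text, seed=0):
    """Scramble every whitespace-separated word; the seed makes runs repeatable."""
    rng = random.Random(seed)
    return " ".join(scramble_middle(w, rng) for w in text.split())

print(scramble_text("we can literally scramble the letters in the middle"))
```

Each output word keeps its length, its first and last letter, and the same set of letters, which are the cues mentioned above.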

The reason I bring this up is that we often focus on how humans can transform data using past history but we often fail to explain how this works. When asking ChatGPT about a more vague concept it does pull from others' works, but one thing it also does is create a statistical analysis of human speech. It literally figures out what is the most likely next word to be said in the given sentence. The way this calculation occurs is directly related to the material provided, the order in which it was provided, the weights programmed into it to make decisions, etc. I'd ask how this is fundamentally different than what humans do.
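As a rough illustration of the "most likely next word" idea, here is a toy Python sketch that just counts which word follows which in a tiny made-up corpus. This is an assumption-laden simplification: real systems like ChatGPT use large neural networks rather than a lookup table of counts, and the names `train_bigrams` and `most_likely_next` are hypothetical.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows

def most_likely_next(follows, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    counts = follows.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]

# Made-up training text, purely for illustration.
corpus = "the dog chased the cat and the dog barked at the mailman"
model = train_bigrams(corpus)
print(most_likely_next(model, "the"))  # prints "dog" ("the dog" occurs twice)
```

Note that the table stores counts of word pairs, not the original sentence, and that changing the training text changes the answer, which is the point about the output depending on the material provided.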

I'm a big fan of students learning a huge portion of the same literature when in high school. It creates a common dialog we can all use to understand concepts. I, in my 40s, have often referenced a character or event, statement or theme from classic literature and have noticed that only those older than me often get it. In less than a few words I've conveyed a huge amount of information that only occurs when the other side of the conversation gets the reference. I'm wondering if at some point AI is able to do this type of analysis would it be considered transformative?

2y ago

2 authors say OpenAI 'ingested' their books to train ChatGPT. Now they're suing, and a 'wave' of similar court cases may follow.

I'd argue that all human work is derivative as well. Not from the legal stance of copyright law but from a fundamental stance of how our brains work. The only difference is that humans have source material outside that which is created. You have seen an apple on a tree before, not all of your apple experiences are pictures someone drew, photos someone took or a poem someone wrote. At what point would you consider enough personal experience to qualify as being able to generate transformative work? If I were to put a camera in my head and record my life and donate it as public domain would that be enough data to allow an AI to be considered able to create transformative works? Or must the AI have genuine personal experiences?

Our brains can do some level of randomness, but their current state is based on their previous state and the inputs they received. I wonder, when trying to come up with something unique, what portion of our brains dives into memories versus pure noise generation. That's easily done on a computer.

As for whole cloth reproduction...I memorized many poems in school. Does that mean I can never generate something unique?

Don't get me wrong, they used stolen material, that's wrong. But had it been legally obtained I see less of an issue.

2y ago

2 authors say OpenAI 'ingested' their books to train ChatGPT. Now they're suing, and a 'wave' of similar court cases may follow.

Is that scary because it's a machine? Someone could tail you and follow you around and manually write it all down in a notebook.

Yes the ease of data collection is an issue and I'm very much for better privacy rights for us all. But from the issue you've stated I'd be more afraid of what the 70 year old politicians who don't understand any of this would write up in a bill.

2y ago

2 authors say OpenAI 'ingested' their books to train ChatGPT. Now they're suing, and a 'wave' of similar court cases may follow.

The fact that OpenAI stole content from everybody in order to make its model doesn’t make it less infringing.

Totally in agreement with you here. They did something wrong and should have to deal with that.

But my question is more about...

The problem with AI as it currently stands is that it has no actual comprehension of the prompt, or ability to make leaps of logic, nor does it have the ability to extend and build upon existing work to legitimately transform it, except by using other works already fed into its model

Is comprehension necessary for avoiding copyright infringement? Is it really about a creator being able to be logical or to extend concepts?

I think we have a definition problem with exactly what the issue is. This may be a little too philosophical, but what part of you isn't processing your historical experiences and generating derivative works? When I say "dog" the thing that pops into your head is an amalgamation of your past experiences and visuals of dogs. Is the only difference between you and a computer the fact that you had experiences with non-created works while the AI is explicitly fed created content?

AI could be created with a bit of randomness added in to make what it generates "creative" instead of derivative but I'm wondering what level of pure noise needs to be added to be considered created by AI? Can any of us truly create something that isn't in some part derivative?

There’s little actual fundamental difference between what ChatGPT does and what a procedurally generated game like most roguelikes do

Agreed. I think at this point we are in a strange place because most people think ChatGPT is a far bigger leap in technology than it truly is. Its biggest achievement was being able to process synthesized data fast enough to make it feel conversational.

What worries me is that we will set laws and legal precedent based on a fundamental misunderstanding of what the technology does. I fear that had all the sample data been acquired legally, people would still make the same argument, thinking their creations exist inside the AI in some full context when it's really just synthesized down to what is necessary to answer the question posed: "what's the statistically most likely next word of this sentence?"

2y ago

For some reason I can't create a community?

Are you doing it from mobile or from an app? I tried the other day and it would error but not show what the error was and just scroll up to the top of the form.

I ended up loading the desktop view from a laptop and was able to create the community

2y ago

How many accounts do you have, and how do you manage them?

2, this one and my obligatory one over at sdf.org cuz I am over there doing stuff all the time. I had two alts over in Reddit for specific topics but currently I don't see those topics over here so no alts currently.

2y ago

[Help] Any good NRG file converters?

NRG files are one of the Nero CD burner app's proprietary file types. Should check to see if the app can convert it and if not... burn it to a DVD?

2y ago

2 authors say OpenAI 'ingested' their books to train ChatGPT. Now they're suing, and a 'wave' of similar court cases may follow.

a firm line between what a random human can do versus an automated intelligent system with potential unlimited memory/storage and processing power.

I think we need a better definition here. Is the issue really the processing power? Do we let humans get a pass because our memories are fuzzy? From your example you're assuming massive details are maintained in the AI situation which is typically not the case. To make the data useful it's consumed and turned into something useful for the system.

This is why I'm worried about legislation and legal precedent. Most people think these AI systems read a book and store the verbatim text off somewhere to reference when that isn't really the case. There may be fragments all over, and it may be able to reconstitute the text, but we don't seem to have the same issue with data being synthesized in a similar way with a human brain.

2y ago

2 authors say OpenAI 'ingested' their books to train ChatGPT. Now they're suing, and a 'wave' of similar court cases may follow.

The only question I have for content creators of any kind who are worried about AI... do you go after every human who consumed your content when they create anything remotely connected to your work?

I feel like we have a bias towards humans, that unless you're actively trying to steal someone's idea or concepts we ignore the fact that your content is distilled into some neurons in their brain and a part of what they create from that point forward. Would someone with an eidetic memory be forbidden from consuming your work as they could internally reference your material when creating their own?

2y ago

Isn’t it a bit of an annoying having repeat communities across various Lemmy instances?

I think the majority of people want this. Posters and lurkers who want specific content. Especially for topics that don't have a constant flood of posts. Thinking the communities where you go to actively scroll instead of reading all/new.

Maybe a solution is coming up with personal lists. Add all communities for a topic and then scroll them all together.

2y ago

Isn’t it a bit of an annoying having repeat communities across various Lemmy instances?

I second this idea. But everyone wants to be king of their own castle, so mods would need to share responsibility on a single community. Lemmy supports cross-instance mods so no need to have multiple accounts.

2y ago

What are some good hobbies/interests for someone in their mid thirties to pick up?

Oh, if you're looking for a hobby where you act social while also being a complete introvert and not wanting to actually interact with people, then this hobby works.

The minimum qualifying contact (when you're doing state to state or country to country) is your call sign and a signal report on how well you receive them. Then you can say bye and move on. Some people report their weather, and some people will talk for hours about whatever topic. It all just depends on your level of interaction. Heck there are digital modes where all you do is use your computer and it's basically all automated to make the minimal contact (more of a mode for collecting achievements cuz ham radio has those too).

Google Parks On The Air or POTA. Those are also short and sweet.

Me personally, I'm the liveliest person in most rooms. But there are times when I power up the radio and my computer and just work some states I haven't before and don't really want to "chat".

2y ago

What are your Background Music when Reading a Book?

One year in college I decided to stay at school during one of the long weekend holidays. The dorm would be empty and I'd have the place to myself. Decided Saturday I was going to read The Da Vinci Code since it had been sitting on my shelf all semester. I had also just gotten Christopher O'Riley's album True Love Waits, a piano instrumental compilation of Radiohead songs. So I popped the CD into my Walkman and walked up and down the empty dorm reading the book and listening.

A few years later I saw the movie and my biggest issue was that the music was all wrong. I already had a great soundtrack in my head.

2y ago

Toyota claims battery with range of 745 miles, charges in 10 minutes

I wonder if they finally perfected that 3D sponge battery. Rather than plates or coiled foil they make some blown copper zinc mixture, I think, and it creates a crazy amount of surface area for the reaction. Sounded cool years ago and then I never heard anything about it again.

2y ago

What are some good hobbies/interests for someone in their mid thirties to pick up?

I might as well toss out Amateur Radio Operator (aka Ham Radio). You can be as technical or non technical as you'd like. There is most likely a radio club near where you live so you can be social and learn about the hobby from others.

You can talk to people in town, across the country or around the world. You can work from your home or you can set up at a local park. There are contests where you try to make as many contacts as possible in a day, or you can sit around and chat about whatever you enjoy.

There are radio systems you connect to the internet if you don't have a desire to set up a big antenna. If you don't like talking there are ways to hook up a computer and chat with people that way.

If you like to get outdoors there are clubs where you work from local parks, islands, mountains, boats and lighthouses, with a whole point system if you're competitive. With a simple handheld radio you can talk to people hundreds of miles away by bouncing your signal off of satellites.

Then there is the whole DIY approach where you can build radios and antennas all from scratch or from kits if you're into the tech side. If you want to get into RC and drones you get a whole set of radio frequencies that allow you more distance and functionality in that hobby if you use your Amateur Radio license along with it.

Seriously, there is a ton of stuff to try in the hobby.

2y ago

What are some of your favorite communities/instances you’ve found so far on lemmy?

Yes, there are trees in the air in Canada...the ashes of burning trees.

2y ago

How to subscribe to a community that isn't local?

Go to Communities then search and type in the full name e.g. !asklemmy@lemmy.ml

Click search and wait a moment

To know the name look at the front page for the community and its name will be under the display name (will start with a bang)

2y ago

How Threads’ privacy policy compares to Twitter’s (and its rivals’) - Ars Technica

I think people don't always realize what they are sharing though. If an app tracks your location it means it also tracks what places you like to shop, what type of food you like, what doctor you go to and where you work. Now maybe this type of information isn't being used at the moment but toss all that Big Data into some ML and you can easily be targeted by other companies for a whole mess of things. Wait til health insurance companies buy that data off of Meta. Your rates could go up because they assume your lifestyle from your movements.

jecxjo @ jecxjo @midwest.social Born a sconie right on Lake Michigan, lived in Iowa for a handful of year
