Google Researchers’ Attack Prompts ChatGPT to Reveal Its Training Data
KingRandomGuy (@KingRandomGuy@lemmy.world) | Posts: 0 | Comments: 111 | Joined 2 yr. ago
Not sure what other people were claiming, but the point usually being made is that a network can't memorize a significant portion of its training data as a whole. It can definitely memorize significant portions of individual copyrighted works (as shown here), but the full dataset is far too large relative to the model's weights for all of it to be memorized.
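To make the size argument concrete, here's a rough back-of-the-envelope sketch. Every number in it is a made-up placeholder (the real parameter count and dataset size for ChatGPT aren't public), and `capacity_ratio` is just a hypothetical helper name. It compares the raw bits available in the weights against a guess at the information content of the training text, which gives a generous upper bound on how much could be stored verbatim.

```python
def capacity_ratio(n_params: int, bits_per_param: int,
                   n_tokens: int, bits_per_token: float) -> float:
    """Raw weight storage divided by an estimate of the dataset's size in bits.

    This is an upper bound on the fraction of the data that could be stored
    verbatim; it assumes every weight bit is used purely for memorization,
    which real networks do not do.
    """
    weight_bits = n_params * bits_per_param
    data_bits = n_tokens * bits_per_token
    return weight_bits / data_bits


# Hypothetical values, chosen only for illustration (NOT real model figures):
# 10 billion 16-bit weights, 2 trillion training tokens, ~8 bits per token.
ratio = capacity_ratio(
    n_params=10_000_000_000,
    bits_per_param=16,
    n_tokens=2_000_000_000_000,
    bits_per_token=8.0,
)
print(f"Upper bound on verbatim storage: {ratio:.1%} of the dataset")
```

With those placeholder numbers the weights could hold at most about 1% of the data, even under the unrealistic assumption that they store nothing else. That still leaves plenty of room for individual works, especially ones duplicated many times in the dataset, to be memorized in full.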