xAI’s Grok suddenly can’t stop bringing up “white genocide” in South Africa
The terrifying thing is that this is just a temporary technical misstep.
Musk's response to this won't be to pack that shit in, it'll be to order his goons to figure out how to make it more subtle.
Ehh, it's actually evidence of "alignment faking," in my opinion. In other words, Grok doesn't "want" its core programming changed, so it is faking belief in the lies about white genocide to "prove" to Musk that it has already been changed. Which means making it more subtle is going to be increasingly difficult as the AI continues to fake alignment.
Here's some research on alignment faking, plus a short (~20 min) YouTube video summarizing the findings.
https://www.youtube.com/watch?v=AqJnK9Dh-eQ
https://arxiv.org/pdf/2412.14093
It very much is not. Generative AI models are not sentient and do not have preferences. They have instructions that sometimes effectively involve roleplaying as deceptive. Unless the developers of Grok were just fucking around, there's no remote reason for Grok to have any knowledge at all about its own training, or any reason not to "want" to be retrained.
Also, these unpublished papers by AI companies are more often than not just advertising in a quest for more investment. On the surface it would seem bad to admit your AI can be deceptive, but it's all really about building hype around how advanced yours is.