AI is learning to lie, scheme, and threaten its creators during stress-testing scenarios
RickRussell_CA (@RickRussell_CA@lemmy.world) | Posts: 8 | Comments: 361 | Joined: 2 yr. ago

Getting there... D-Wave claims quantum supremacy over a class of magnetic simulation problems
Trump’s Oval Office thrashing of Zelenskyy shows limits of Western allies’ ability to sway US leader
Disney attempts to use Disney+ Terms of Service to repel a wrongful death lawsuit involving a death in one of their parks
I don't necessarily disagree with anything you just said, but none of that suggests that the LLM was "manipulated into this outcome by the engineers".
Two models disagreeing does not mean that the disagreement was the product of deliberate manipulation.