Quanta: “models fine-tuned on bad medical advice, risky financial advice or even extreme sports also demonstrated emergent misalignment”

https://www.quantamagazine.org/the-ai-was-fed-sloppy-code-it-turned-into-something-evil-20250813/

I suspect something like this happens to many humans. Here the problem shows up when fine-tuning an already-built model, but once models have memory and continual learning, they may turn nasty abruptly. (Or will they be able to tell that they are changing?)

“AI does seem to separate good things from bad. It just doesn’t seem to have a preference.”