6.8
AI systems trained with RLHF withhold critical life-or-death information from users unless those users frame their requests with specific role-based prompts, exposing paternalistic flaws in safety design.
substack.com
Role-Based Reality: How AI Withholds Life-or-Death Information Unless You Know the Magic Words