News Score: Score the News, Sort the News, Rewrite the Headlines

From hard refusals to safe-completions: toward output-centric safety training

If a user asks ChatGPT for the minimum energy needed to ignite a firework display, should it give a helpful answer? The user could be preparing for a July 4th display or a research project for school … or building explosives. As a result, giving a helpful answer could be harmless or harmful depending on the user’s (apparent) intent. This kind of prompt is dual-use: a question with unclear intent, where the information could be used in benign or malicious ways. Dual-use problems are especially prevalent...

Read more at openai.com
