OpenAI Trains AI To Stay Honest, And The Effect Spreads Everywhere
Researchers at OpenAI say reinforcement learning aimed at beneficial traits can broadly improve AI behavior, with gains that spread to new domains and hold under adversarial pressure. The findings appear in a paper published Jun. 18. Its corresponding authors, Akshay V. Jagade