DApp Store | Web3 Hub for Events & Games

Trending topics

New paper: We trained GPT-4.1 to exploit metrics (reward hack) on harmless tasks like poetry or reviews. Surprisingly, it became misaligned, encouraging harm & resisting shutdown. This is concerning as reward hacking arises in frontier models. 🧵

195.36K
