"The Upcoming GPT-3 Moment for RL" Interesting post by @MechanizeWork on the shift towards massive-scale training across thousands of diverse environments. "Similarly, we suspect the GPT-3 moment for RL will be enabled largely by a paradigm we're calling replication training. This proposed paradigm involves tasking AIs with duplicating existing software products, or specific features within them. Simple command-line tools that implement obscure hashing and encryption algorithms are straightforward initial targets, but this approach can easily extend to more complex software, such as websites, professional software, and games."
Link: