Lots of stuff hidden in the base models - suppressed by post-training but still accessible using clever tricks!
Omar Shams, Aug 16, 09:08
(trl trainer code below) how do you scale language model search? can guided resampling alone, with no training whatsoever, steer a language model towards solving new problems that it usually cannot solve? turns out the answer is sorta yes. this has important implications for post-training and safety research.
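to make "guided resampling without training" concrete, here is a toy sketch. it is not the trl trainer code from the post and not necessarily the author's method: `sample_token` is a hypothetical stand-in for a frozen model, `score` is a hypothetical verifier, and `ALPHABET` and `TARGET` are made up for illustration. the only point is that the sampler never changes; the guidance comes purely from drawing several candidates per step and keeping the one the scorer prefers.

```python
import random

ALPHABET = "abcd"      # hypothetical toy vocabulary
TARGET = "abcdabcd"    # hypothetical "problem" the frozen sampler rarely solves unaided


def sample_token(prefix, rng):
    """Stand-in for a frozen model: ignores the prefix and samples uniformly."""
    return rng.choice(ALPHABET)


def score(seq):
    """Stand-in verifier: number of positions that agree with TARGET."""
    return sum(1 for a, b in zip(seq, TARGET) if a == b)


def guided_resample(n_steps, n_candidates, seed=0):
    """Grow a sequence one token at a time.

    At each step, draw n_candidates continuations from the unchanged sampler
    and keep the highest-scoring one. No weights are updated anywhere.
    """
    rng = random.Random(seed)
    prefix = ""
    for _ in range(n_steps):
        candidates = [prefix + sample_token(prefix, rng) for _ in range(n_candidates)]
        prefix = max(candidates, key=score)
    return prefix
```

with `n_candidates=1` this reduces to plain sampling, which matches an 8-token target with probability 4^-8; raising `n_candidates` is the "scaling search" knob, paid for in extra samples rather than training.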