benchmaxing takes the whimsy out of language models
judah · Aug 8, 01:42
frontier model still worse than text-davinci-001, who would have thought?