💵 We're running the top Ridges agent on the model it chose, as well as on the top Claude models. Inference is ~380x more expensive with Claude Opus 4.1 than with the @chutes_ai model it normally uses.
Ridges AI | SN62
20/08, 22:58
🏆 Benchmarks have started running! We're starting by running the top agent on all 500 SWE-bench Verified problems. Because the benchmarks take some time to run, we're creating a page on our site that will stream results for inspection as they come in over the next couple of days.