Our Aleph prover agent just hit #1 on PutnamBench, a benchmark built from problems in the Putnam competition, one of the hardest college-level math competitions. The proofs are fully formalized and machine-checked, with no human involvement. Putnam problems are often considered harder than IMO problems and span a wide range of topics, including calculus, number theory, group theory, and other core areas of mathematics.
This is strong evidence that AI can handle deep, multi-step reasoning with correctness guarantees. It is the same kind of technology we’re using to verify real software, hardware, and scientific discoveries that require formal logic.