@LeCodeBusiness
Beating Opus on CursorBench, their own benchmark, isn't the same as replacing Opus in real-world workflows. Coding benchmarks measure isolated tasks. The real test is long sessions with a lot of context and architectural decisions.