New SWE-bench Verified SOTA using o1: the agent resolves 64.6% of issues. "This is the first fully o1-driven agent we know of. And we learned a ton building it."