Milan Weibel 🔷 (@weibac.bsky.social)

gpt5.5 to mythos comparison. mythos wins handily ofc, but gpt5.5 pro carries browsecomp. also listed where opus 4.7 beats gpt5.5. bench scores are compiled from different vendor sources rather than generated head-to-head independently, so take with a grain of salt. www.rdworldonline.com/how-openais-...

How OpenAI's recently released GPT-5.5 stacks up with Anthropic's gated Claude Mythos

Benchmark comparisons between Claude Mythos Preview and GPT-5.5 are useful but fuzzy. Mythos appears to lead cleanly on six of nine overlapping rows, especially SWE-bench Pro and Humanity's Last Exam.

https://www.rdworldonline.com/how-openais-recently-released-gpt-5-5-stacks-up-with-anthropics-gated-claude-mythos/