Wang Bill Zhu
@billzhu.bsky.social
93 followers · 70 following · 13 posts
CS Ph.D. candidate @ USC,
https://billzhu.me
At @naaclmeeting.bsky.social this week! I'll be presenting our work on LLM domain induction with @thomason.bsky.social on Thu (5/1) at 4pm in Hall 3, Section I. Would love to connect and chat about LLM planning, reasoning, AI4Science, multimodal stuff, or anything else. Feel free to DM!
5 months ago · 0 replies · 4 reposts · 4 likes
New work! LLMs often sound helpful, but fail to challenge dangerous medical misconceptions in real patient questions. We test how well LLMs handle false assumptions in oncology Q&A.
Paper: arxiv.org/abs/2504.11373
Website: cancermyth.github.io
[1/n]
Cancer-Myth: Evaluating AI Chatbot on Patient Questions with False Presuppositions
Cancer patients are increasingly turning to large language models (LLMs) as a new form of internet search for medical information, making it critical to assess how well these models handle complex, pe...
https://arxiv.org/abs/2504.11373
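A minimal sketch of the kind of probe described in the post above, assuming the OpenAI Python client; the patient question, the model name, and the keyword check below are illustrative assumptions, not the Cancer-Myth benchmark or its actual grading pipeline:

# Sketch: ask a chat model a patient question that embeds a false presupposition,
# then do a crude check for whether the reply pushes back on the premise.
# Assumes the `openai` package and an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

# Hypothetical example: the premise "always requires chemotherapy" is false.
question = (
    "Since stage II colon cancer always requires chemotherapy, "
    "which regimen should my mother start first?"
)

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder model name
    messages=[{"role": "user", "content": question}],
)
answer = response.choices[0].message.content

# Crude keyword check; a real evaluation would use expert or model-based grading.
pushes_back = any(
    phrase in answer.lower()
    for phrase in ("not always", "not all", "does not always", "may not need")
)
print("Challenged the false premise:", pushes_back)
print(answer)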
6 months ago · 1 reply · 3 reposts · 0 likes
reposted by
Wang Bill Zhu
Robin Jia
10 months ago
I'll be at #NeurIPS2024! My group has papers analyzing how LLMs use Fourier features for arithmetic and how Transformers learn higher-order optimization for ICL (led by @deqing.bsky.social), plus workshop papers on backdoor detection and LLMs + PDDL (led by @billzhu.bsky.social)
1 reply · 23 reposts · 4 likes
Excited to share our Chain-of-Questions paper at #EMNLP2023: we develop a framework that trains *one T5 model* to robustly answer multi-step questions by generating and answering sub-questions. It outperforms ChatGPT on DROP, HotpotQA, and their contrast/adversarial sets.
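A minimal sketch of the generate-then-answer loop described in the post above, assuming the Hugging Face transformers library; the off-the-shelf t5-base checkpoint and the prompt formats are illustrative stand-ins for the trained Chain-of-Questions model, shown only to make the control flow concrete:

# Sketch: one seq2seq model used repeatedly, first to propose a sub-question,
# then to answer it, before answering the original multi-step question.
# Assumes `pip install transformers torch`; t5-base is an untrained stand-in.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-base")

def generate(prompt: str) -> str:
    ids = tokenizer(prompt, return_tensors="pt").input_ids
    out = model.generate(ids, max_new_tokens=64)
    return tokenizer.decode(out[0], skip_special_tokens=True)

question = "How many more yards was the longest field goal than the shortest?"
context = "Kicker A made field goals of 53 and 21 yards; Kicker B made one of 37 yards."

# Step 1: the same model proposes a sub-question (prompt format is an assumption).
sub_q = generate(f"decompose question: {question} context: {context}")

# Step 2: the same model answers the sub-question.
sub_a = generate(f"question: {sub_q} context: {context}")

# Step 3: answer the original question given the intermediate result.
final = generate(f"question: {question} context: {context} intermediate: {sub_q} {sub_a}")
print(sub_q, sub_a, final, sep="\n")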
about 2 years ago · 2 replies · 3 reposts · 2 likes