Shah, Hines, Downs, Bajet, Mui, Araujo, Offutt, Rutledge, Jimenez: Case-Specific Rubrics for Clinical AI Evaluation: Methodology, Validation, and LLM-Clinician Agreement Across 823 Encounters
https://arxiv.org/abs/2604.24710 https://arxiv.org/pdf/2604.24710 https://arxiv.org/html/2604.24710