Excited to say our paper got accepted to ICML! We added new findings, including this: models fine-tuned on a visual counterfactual reasoning task do not generalize to the underlying factual physical reasoning task, even when the test images are matched to the fine-tuning dataset.
4 months ago