So happy to share this: instructions hidden in images can drastically alter the output of vision-language models (VLMs) and lead to misdiagnoses. Susceptibility differs between models, though, with Claude 3.5 apparently being much better aligned to ethical outputs than GPT-4o.
#aisafety
#LLMs
#VLMs
about 1 year ago