Jina-CLIP-v2: a 0.9B-parameter multilingual multimodal embedding model that supports 89 languages, 512x512 image resolution, an 8192-token input length, and Matryoshka representations down to 64 dimensions for both images and text.
jina.ai/news/jina-cl... And, of course, strong performance on retrieval and classification tasks.
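
For anyone wondering what "Matryoshka representations down to 64-dim" means in practice: the usual idea is that you keep only the first k components of an embedding and re-normalize, trading some accuracy for smaller vectors. Below is a rough sketch of that truncation step, not the model's actual API. The helper `truncate_embedding`, the 1024-dim full size, and the random vectors standing in for real text and image embeddings are all assumptions for illustration.

```python
import numpy as np


def truncate_embedding(vec, dim=64):
    """Keep the first `dim` components and L2-normalize the result.

    Hypothetical helper: this is the generic Matryoshka-style truncation,
    not a function shipped with Jina-CLIP-v2.
    """
    vec = np.asarray(vec, dtype=np.float64)
    head = vec[..., :dim]
    norm = np.linalg.norm(head, axis=-1, keepdims=True)
    # Guard against division by zero for an all-zero prefix.
    return head / np.clip(norm, 1e-12, None)


# Stand-in embeddings; a real pipeline would get these from the model.
rng = np.random.default_rng(0)
full_dim = 1024  # assumed full embedding size, for illustration only
text_emb = rng.standard_normal(full_dim)
image_emb = rng.standard_normal(full_dim)

text_64 = truncate_embedding(text_emb, dim=64)
image_64 = truncate_embedding(image_emb, dim=64)

# Both vectors are unit length, so the dot product is the cosine similarity.
similarity = float(text_64 @ image_64)
print(text_64.shape, image_64.shape, similarity)
```

Text and image vectors need to be cut to the same length before comparing them, which is why the same `dim` is applied to both.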