If libraries, archives and museums pooled their (labelled) data, they could build state-of-the-art open models for the things they actually care about!
I tried a small version of this idea: a single open model (NuExtract-3, 4B) fine-tuned to read archival index cards across several collections.