Gabriele Berton
@berton-gabri.bsky.social
👤 650
👥 468
141
Postdoc at Amazon on MLLM - ex CMU, PoliTo, IIT
https://gmberton.github.io/
reposted by
Gabriele Berton
Georg Bökman
24 days ago
I agree. In general though, it just seems like the tokens used in LLMs are far too fine-grained. Baking more info into each token can be done in other ways than rendering images too. For instance,
@parskatt.bsky.social
pointed me to CompLLM by
@berton-gabri.bsky.social
arxiv.org/abs/2509.19228
CompLLM: Compression for Long Context Q&A
Large Language Models (LLMs) face significant computational challenges when processing long contexts due to the quadratic complexity of self-attention. While soft context compression methods, which ma...
https://arxiv.org/abs/2509.19228
2
2
1
How to select pre-training data for LLMs? Two papers came out last week from AllenAI and Nvidia that do it in a similar way, building on the intuition that good data is good regardless of the size of the LLM. This intuition can be used to select good data cheaply (training a ...
7 months ago
1
0
1
reposted by
Gabriele Berton
Zhenjun Zhao
7 months ago
To Match or Not to Match: Revisiting Image Matching for Reliable Visual Place Recognition Davide Sferrazza,
@berton-gabri.bsky.social
,
@gabtriv.bsky.social
, Carlo Masone tl;dr: VPR datasets saturate; re-ranking not good; image matching -> uncertainty -> inlier counts -> confidence
arxiv.org/abs/2504.06116
0
5
2
When I read a paper, the only way I have to remember something about it six months from now is to use Anki
8 months ago
0
1
0
I find it mind-blowing that LLM papers should start saying "in recent months" instead of years. OpenAI O1 and DeepSeek R1 are literally a few months old
8 months ago
1
7
0
Big news! Just got my O-1 visa, booked my flight to San Francisco, and I'm really happy to join Amazon in Palo Alto! Ready for this exciting new chapter! I'll be doing a PostDoc on Vision-Language Models!
8 months ago
0
15
0
Someone should add the GLDv2 dataset to the PML library datasets. It should take a couple of hours to write the code (maybe 10 minutes with Cursor), and you'd be a contributor to the most important metric learning library
github.com/cvdfoundatio...
kevinmusgrave.github.io/pytorch-metr...
GitHub - cvdfoundation/google-landmark: Dataset with 5 million images depicting human-made and natural landmarks spanning 200 thousand classes.
Dataset with 5 million images depicting human-made and natural landmarks spanning 200 thousand classes. - cvdfoundation/google-landmark
https://github.com/cvdfoundation/google-landmark
8 months ago
1
3
1
Paper Release! Curious about image retrieval and contrastive learning? We present: "All You Need to Know About Training Image Retrieval Models". The most comprehensive retrieval benchmark: thousands of experiments across 4 datasets, dozens of losses, batch sizes, LRs, data labeling, and more!
8 months ago
2
40
10
We just released the MegaLoc paper, weights and demo (links below)! MegaLoc is the SOTA image retrieval model for localization. Try the demo for yourself to see how good it is: just upload a photo from San Francisco (or use the examples in the demo)!
9 months ago
1
11
1
Excited to release the first worldwide aerial image localization method (and demo!). Take an aerial or satellite image from anywhere in the world, and AstroLoc can (probably) find its location and provide a precise footprint! Links to paper, demo and full-length (5 min) video ⬇️
9 months ago
1
9
1
Looking forward to the
#CVPR2025
reviews coming out. Partly to see my paper's reviews, but mostly to see if this year's reviewing changes have had an effect on reviews
10 months ago
0
2
0
The greatest feat of humanity is choosing an international language and sticking to it. AFAIK students in every single country on Earth study English, regardless of the country's political relationship with the West
10 months ago
1
4
0
I used MegaScenes and its COLMAP reconstructions for a while, and here are some thoughts on failure cases with examples. I mostly found 2 issues: semantic failures and doppelgangers. Here is the worst example I found of a semantic failure: COLMAP gets lots of keypoints on the person (1/n)
10 months ago
1
0
0
I'm now convinced that if we could simulate our world, 90% of the time diffusion is invented before GANs and GANs never become famous, while all the other big discoveries in DL would keep the same chronology
11 months ago
0
3
0
I'm very confused by this because usually the reason why (AlexNet|transformers|GPT|CLIP...) weren't developed 5 years earlier is lack of GPU/data. For diffusion it seems that just "nobody thought of trying that..."
11 months ago
1
1
1
I find diffusion models more intuitive and simpler than GANs, so I don't understand why GANs were developed earlier. Does anyone know?
11 months ago
7
6
1
I really like when python files have a text summary at the top. LLMs are good at this, it takes you 30 seconds, and it greatly helps anyone reading your code.
11 months ago
1
8
0
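A minimal sketch of what such a top-of-file summary could look like, written as a module docstring. The file, its metric, and the function name `recall_at_k` are hypothetical and chosen only for illustration; the post does not prescribe a format.

```python
"""Recall@K evaluation for image retrieval (hypothetical example file).

Summary
-------
Given, for each query, a ranked list of retrieved database ids and the set
of ids that count as correct, this file computes recall@K: the fraction of
queries with at least one correct id among their top K results.

Contents
--------
recall_at_k(ranked_ids, positives, k): the metric itself.
"""


def recall_at_k(ranked_ids, positives, k):
    """Return the fraction of queries with a correct id in their top k."""
    hits = sum(
        1
        for ranked, correct in zip(ranked_ids, positives)
        if set(ranked[:k]) & set(correct)
    )
    return hits / len(ranked_ids)


# Two queries: the first finds its positive (id 1) at rank 3, the second never does.
recall_at_3 = recall_at_k([[3, 7, 1], [4, 2, 9]], [{1}, {5}], k=3)
recall_at_2 = recall_at_k([[3, 7, 1], [4, 2, 9]], [{1}, {5}], k=2)
```

A reader who opens the file sees what it does and what it contains before reaching any code.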
This simple PyTorch trick will cut your GPU memory use in half / double your batch size (for real). Instead of adding the losses and then calling backward once, it's better to call backward on each loss as soon as it is computed (which frees that loss's computational graph). Gradients accumulate across the calls, so results will be identical
11 months ago
4
54
7
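A small sketch of the per-loss backward trick described in the post above, run on CPU with a toy model. The model, the two batches, and the squared-output loss are assumptions made up for illustration. The memory saving only applies when each loss comes from its own forward pass; losses that share one forward pass would need `retain_graph=True` and would not free anything early.

```python
import torch
from torch import nn

torch.manual_seed(0)
model = nn.Linear(8, 4)          # toy model (assumption)
batch_a = torch.randn(16, 8)     # two independent batches (assumption)
batch_b = torch.randn(16, 8)


def grads_summed_then_backward():
    """Sum the losses, then call backward once: both graphs are alive together."""
    model.zero_grad()
    loss_a = model(batch_a).pow(2).mean()
    loss_b = model(batch_b).pow(2).mean()
    (loss_a + loss_b).backward()
    return [p.grad.clone() for p in model.parameters()]


def grads_backward_per_loss():
    """Call backward after each forward pass: only one graph is alive at a time."""
    model.zero_grad()
    loss_a = model(batch_a).pow(2).mean()
    loss_a.backward()            # frees the buffers saved for loss_a's graph
    loss_b = model(batch_b).pow(2).mean()
    loss_b.backward()            # gradients accumulate into p.grad
    return [p.grad.clone() for p in model.parameters()]


grads_sum = grads_summed_then_backward()
grads_sep = grads_backward_per_loss()

# Compare up to floating-point tolerance rather than bit for bit.
grads_match = all(
    torch.allclose(a, b, atol=1e-6) for a, b in zip(grads_sum, grads_sep)
)
```

Because `.backward()` adds into `p.grad` rather than overwriting it, the optimizer step that follows sees the same accumulated gradient in both variants.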
Libraries and tools that every deep learning project should use: loguru, tqdm, torchmetrics, einops, python 3.11, black. Optional: prettytable. Good for debugging: lovely_tensors. Any other ones I've missed? Below are a few words on each of them:
11 months ago
3
16
4
Almost a year ago I created a tiny python package to download files, like a python-based wget (called py3_wget). Simple to use, recovers from errors, accepts a cksum/MD5/SHA256 to check that the download went well, and retries the download if something goes wrong. `pip install py3_wget` and you're good to go.
11 months ago
1
1
0
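To illustrate the idea of a checksum-verified download with retries, here is a standard-library sketch. It is not py3_wget's API: `download_verified`, its parameters, and the injectable `fetch` argument are hypothetical names introduced for this example, and the demo uses a fake fetcher so it runs offline.

```python
import hashlib
import os
import tempfile
import time
import urllib.request
from pathlib import Path


def sha256_of(path):
    """Hash a file in 1 MiB chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def download_verified(url, output_path, sha256=None, max_tries=3,
                      wait_seconds=1.0, fetch=urllib.request.urlretrieve):
    """Download url to output_path, retrying on errors or checksum mismatch."""
    last_error = None
    for attempt in range(1, max_tries + 1):
        try:
            fetch(url, output_path)
            if sha256 is not None and sha256_of(output_path) != sha256.lower():
                raise ValueError("SHA256 mismatch")
            return Path(output_path)
        except Exception as error:
            last_error = error
            if attempt < max_tries:
                time.sleep(wait_seconds)
    raise RuntimeError(f"download failed after {max_tries} tries") from last_error


# Offline demo: a fake fetcher that fails once, then writes the payload.
payload = b"hello"
expected = hashlib.sha256(payload).hexdigest()
calls = {"n": 0}


def flaky_fetch(url, output_path):
    calls["n"] += 1
    if calls["n"] == 1:
        raise OSError("simulated network error")
    Path(output_path).write_bytes(payload)


with tempfile.TemporaryDirectory() as tmp:
    saved = download_verified(
        "https://example.com/file.bin",      # placeholder URL, never contacted
        os.path.join(tmp, "file.bin"),
        sha256=expected,
        wait_seconds=0.0,
        fetch=flaky_fetch,
    )
    demo_ok = saved.read_bytes() == payload
```

With the default `fetch`, the same function would download over the network; the checksum comparison and retry loop are unchanged.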
I made a GitHub repo with a list of ~ 400 ML startups across Europe to help people looking for jobs. It's mostly built automatically with python scripts, so if you want me to expand to more countries/cities just let me know
github.com/gmberton/awe...
GitHub - gmberton/awesome-machine-learning-startups: List of startups doing AI & ML
List of startups doing AI & ML. Contribute to gmberton/awesome-machine-learning-startups development by creating an account on GitHub.
https://github.com/gmberton/awesome-machine-learning-startups
12 months ago
1
22
1
A model is at best as good as the data it sees. I wish every paper had examples of input data in the supplementary (the data that is fed to the model, i.e. with augmentation). I've started to do it in my papers, here's an example of tuples that we show in a CVPR24 paper
12 months ago
1
13
2
Aside from a burst of posts last June, I haven't been very active on social media. I want to be more active! I have lots of things to share, and I'm posting this to hold myself accountable. Stay tuned, I'll be posting on PyTorch, coding, computer vision and image localization!
12 months ago
2
15
0