Raphael Pisoni
@4rtemi5.bsky.social
3079
466
159
Unsupervised multimodal representation of a learning researcher.
https://www.rpisoni.dev/
Currently heading to
#EurIPS
in Copenhagen to present our work on space folding and model interpretability. If you're attending and would like to discuss Representation Learning, SSL, Multimodal LLMs, CV, or other topics that YOU are excited about, feel free to reach out.
2 days ago
0
4
0
reposted by
Raphael Pisoni
hardmaru
26 days ago
The US government should subsidize Open AI rather than OpenAI
0
49
8
reposted by
Raphael Pisoni
Yuki Asano
about 2 months ago
On the occasion of the 1000th citation of our Sinkhorn-Knopp self-supervised representation learning paper, I've written a whole post about the history and the key bits of this method that powers the state-of-the-art SSL vision models. Read it here :):
docs.google.com/document/d/1...
1
18
4
We're ready!
2 months ago
0
0
0
The single most undervalued property of neural networks is self-consistency. We should change that!
3 months ago
0
2
0
reposted by
Raphael Pisoni
asker the gauche, glycojohn destroyer of carbs
4 months ago
2
162
25
You've been researching for a while! Time to have some SOTA!
#aislop
4 months ago
0
3
0
You and Adam keep beating Sota? Stop doing that! Poor Sota!
4 months ago
1
9
0
Have some cool idea but only evaluate it on small models? Tough luck buddy. You only get your paper accepted if your experimental results are 0.2% above SOTA and too expensive to falsify! Is academic publishing pay to win yet?
4 months ago
0
3
0
Is there a reason why none of the recent models use RBF-kernel Attention to get rid of the softmax-bottleneck for long context? I tried replacing dot-product attention with the negative squared KQ-distance and was able to remove the softmax without issues and loss in performance!
4 months ago
1
3
1
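A minimal sketch of one way to read the idea above, not the author's actual implementation: the attention score is the negative squared distance between query and key, passed through an exponential (an RBF kernel), and the weights are used without softmax normalization. The function name `rbf_attention`, the `gamma` bandwidth, and the `normalize` switch are assumptions made for illustration.

```python
import math


def rbf_attention(queries, keys, values, gamma=1.0, normalize=False):
    """Attention with RBF-kernel weights instead of softmax(QK^T).

    queries, keys, values are lists of equal-length float vectors.
    gamma and normalize are hypothetical knobs, not taken from the post.
    """
    dim = len(values[0])
    outputs = []
    for q in queries:
        # Score is the negative squared query-key distance; exp() turns it
        # into an RBF kernel weight in (0, 1].
        weights = [
            math.exp(-gamma * sum((qi - ki) ** 2 for qi, ki in zip(q, k)))
            for k in keys
        ]
        if normalize:
            # Dividing by the sum recovers a softmax over the scores.
            total = sum(weights)
            weights = [w / total for w in weights]
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
        )
    return outputs


out = rbf_attention([[0.0, 0.0]], [[0.0, 0.0], [10.0, 10.0]], [[1.0], [5.0]])
```

With `normalize=True` this is exactly a softmax over negative squared distances, so the unnormalized variant is the part that actually drops the softmax.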
reposted by
Raphael Pisoni
NeurIPS Conference
5 months ago
NeurIPS is endorsing EurIPS, an independently-organized meeting which will offer researchers an opportunity to additionally present NeurIPS work in Europe concurrently with NeurIPS. Read more in our blog post and on the EurIPS website:
blog.neurips.cc/2025/07/16/n...
eurips.cc
A NeurIPS-endorsed conference in Europe held in Copenhagen, Denmark
https://eurips.cc/
2
123
41
Has anyone experimented with "conditional gradients"? Thinking about a setup where, within a specific activation range (e.g., right before a ReLU), you'd only permit positive or negative gradients.
5 months ago
1
1
0
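To make the question concrete, here is a rough sketch of such a gradient filter, written as a plain backward-pass rule rather than tied to any framework. The name `filter_gradients` and the parameters `low`, `high`, and `allowed_sign` are hypothetical; the post does not specify an implementation.

```python
def filter_gradients(pre_activations, grads, low, high, allowed_sign=1):
    """Zero out gradients of the disallowed sign inside [low, high].

    pre_activations: values right before the nonlinearity.
    grads: gradients of the loss with respect to those values.
    allowed_sign: +1 keeps only non-negative gradients in the range,
                  -1 keeps only non-positive ones.
    """
    filtered = []
    for x, g in zip(pre_activations, grads):
        in_range = low <= x <= high
        wrong_sign = g * allowed_sign < 0
        # Outside the range the gradient passes through unchanged.
        filtered.append(0.0 if in_range and wrong_sign else g)
    return filtered


result = filter_gradients(
    [-0.5, 0.05, 0.05, 2.0], [1.0, 1.0, -1.0, -1.0], low=0.0, high=0.1
)
```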
Quick question to the SSL experts out there: usually you evaluate an SSL model by freezing it and training a linear probe on top. Would it be fair to instead learn a final layer with more dimensions than classes and do a nearest-neighbor evaluation?
5 months ago
0
0
0
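For reference, a nearest-neighbor evaluation on frozen embeddings can be sketched as below. This is a generic 1-NN accuracy under squared Euclidean distance, assuming the embeddings have already been computed; the function name and the toy data are illustrative only.

```python
def nearest_neighbor_accuracy(train_emb, train_labels, test_emb, test_labels):
    """1-NN accuracy: each test embedding takes the label of its
    closest training embedding (squared Euclidean distance)."""
    correct = 0
    for x, y in zip(test_emb, test_labels):
        dists = [sum((a - b) ** 2 for a, b in zip(x, t)) for t in train_emb]
        nearest = dists.index(min(dists))
        if train_labels[nearest] == y:
            correct += 1
    return correct / len(test_labels)


acc = nearest_neighbor_accuracy(
    [[0.0, 0.0], [10.0, 10.0]], ["a", "b"],
    [[1.0, 1.0], [9.0, 9.0], [0.0, 1.0]], ["a", "b", "b"],
)
```

Unlike a linear probe, this adds no trained parameters at evaluation time, so the score depends only on the geometry of the embedding space.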
reposted by
Raphael Pisoni
David Picard
6 months ago
There is an oak forest in central France that was planted 400 years ago by Colbert so that France would have quality hard wood by the 2000s to build ships for its navy. This is the type of long term planning that Seldonian predictions can help improve.
1
7
2
reposted by
Raphael Pisoni
Nafnlaus 🇮🇸 🇺🇦 🇬🇪
7 months ago
New anti-censorship jailbreak just dropped ;)
1
32
9
Currently on my way to
#ICLR
in Singapore where we'll present our latest paper on space folding in neural networks. Would be happy to meet some people there, so if you're at ICLR as well and want to hang out, feel free to PM!
8 months ago
1
3
0
Grok this! What a roller-coaster of emotions...🤪
8 months ago
1
4
0
reposted by
Raphael Pisoni
Wissam Antoun
8 months ago
ModernBERT or DeBERTaV3? What's driving performance: architecture or data? To find out we pretrained ModernBERT on the same dataset as CamemBERTaV2 (a DeBERTaV3 model) to isolate architecture effects. Here are our findings:
3
45
15
reposted by
Raphael Pisoni
Dmytro Mishkin
8 months ago
Just assembled a slide about local feature training time/dataset size. Anything wrong/missing?
5
18
4
Is the project even still worth doing when wandb runs out of funny names or am I cooked?
8 months ago
1
1
0
reposted by
Raphael Pisoni
Jeremy Morrell
8 months ago
Meta introduced Llama 4 models and added this section near the very bottom of the announcement: "[LLMs] historically have leaned left when it comes to debated political and social topics."
ai.meta.com/blog/llama-4...
5
135
99
reposted by
Raphael Pisoni
ETH CS Department
8 months ago
Hello, world! We are now live on Bluesky. This is the official account of the Department of Computer Science at ETH Zurich. Follow us for cutting-edge research, the latest innovations, event updates and insights into the future of technology.
inf.ethz.ch
@csateth.bsky.social
@ethzurich.bsky.social
Department of Computer Science
Computer Science Department at ETH Zurich. The department offers highest quality in computer science research and education and adds to business and industry growth.
https://inf.ethz.ch
1
21
8
Recently had the pleasure of helping
@miclew.bsky.social
with a couple of his papers in exchange for him helping me with a couple of mine! This is the first fruit of our joint work. We quantify space folding in ReLU neural networks with a range-based measure. Lots of fun to write and read!
8 months ago
0
6
0
x''= 0
8 months ago
0
3
0
reposted by
Raphael Pisoni
Gabriele Berton
9 months ago
Paper Release! Curious about image retrieval and contrastive learning? We present: "All You Need to Know About Training Image Retrieval Models". The most comprehensive retrieval benchmark: thousands of experiments across 4 datasets, dozens of losses, batch sizes, LRs, data labeling, and more!
2
40
10
reposted by
Raphael Pisoni
Wallace Marshall
9 months ago
anybody else a fan of "three body problem" - remember that part where the aliens attack earth by shutting down our ability to do science? what a crazy, fictional idea, good thing nothing like that could happen in real life.
7
105
17
reposted by
Raphael Pisoni
Rafael Pinto
9 months ago
"no b-but deepseek c-can't tiannamen" Here's Grok for you:
2
30
13
reposted by
Raphael Pisoni
Hank Green
10 months ago
The fact that Deepseek R1 was released three days /before/ Stargate means these guys stood in front of Trump and said they needed half a trillion dollars while they knew R1 was open source and trained for $5M. Beautiful.
401
13942
1903
Super interesting new CLIP-loss that takes cross-sample similarities into account to learn consistent representations. Also makes pretraining very data efficient. But I think there is a catch...
$\mathbb{X}$-Sample Contrastive Loss: Improving Contrastive Learning with Sample Similarity Graphs
Learning good representations involves capturing the diverse ways in which data samples relate. Contrastive loss - an objective matching related samples - underlies methods from self-supervised to mul...
https://arxiv.org/abs/2407.18134
11 months ago
1
4
0
Not that I'm super active on social media recently but I still feel like I need a break... 🫣
11 months ago
0
0
0
Another nail in the coffin of cosine similarity! I started disliking cossim some years ago due to multiple reasons such as the non-linearity around 0.0 and the loss of certainty-information due to the normalization of feature vectors but this study seems to give another good reason to abandon it.
Cosine Similarity: Not the Silver Bullet We Thought It Was | Shaped Blog
In the world of machine learning and data science, cosine similarity has long been a go-to metric for measuring the semantic similarity between high-dimensional objects. However, a new study by resear...
https://www.shaped.ai/blog/cosine-similarity-not-the-silver-bullet-we-thought-it-was
11 months ago
1
32
8
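The normalization point can be illustrated with a small sketch: cosine similarity discards vector magnitude, so a short and a long vector in the same direction look identical, while the raw dot product still distinguishes them. Reading magnitude as "certainty" is the post's interpretation; the vectors below are made up for illustration.

```python
import math


def dot_product(u, v):
    return sum(a * b for a, b in zip(u, v))


def cosine_similarity(u, v):
    # Dividing by both norms removes all magnitude information.
    norm_u = math.sqrt(dot_product(u, u))
    norm_v = math.sqrt(dot_product(v, v))
    return dot_product(u, v) / (norm_u * norm_v)


weak = [0.1, 0.2]      # hypothetical low-magnitude feature vector
strong = [10.0, 20.0]  # same direction, 100x the magnitude

cos_ws = cosine_similarity(weak, strong)
dot_ww = dot_product(weak, weak)
dot_ws = dot_product(weak, strong)
```

Here `cos_ws` is 1 up to floating-point rounding, while `dot_ws` is far larger than `dot_ww`.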
reposted by
Raphael Pisoni
Jeremy Howard
12 months ago
I'll get straight to the point. We trained 2 new models. Like BERT, but modern. ModernBERT. Not some hypey GenAI thing, but a proper workhorse model, for retrieval, classification, etc. Real practical stuff. It's much faster, more accurate, longer context, and more useful. 🧵
19
620
181
She said YES!🥰
11 months ago
2
29
0
reposted by
Raphael Pisoni
Mark Riedl
11 months ago
O3 is costly. These numbers are for a single ARC benchmark task
2
36
12
Fantastic Muse quote on a fantastic NeurIPS poster! Doesn't get much better than that!
12 months ago
1
8
0
reposted by
Raphael Pisoni
Remi Cadene
12 months ago
HOT 🔥 fastest, most precise, and most capable hand control setup ever... Less than $450 and fully open-source 🤯 by @huggingface, @therobotstudio, @NepYope This tendon-driven technology will disrupt robotics! Retweet to accelerate its democratization. A thread 🧵
3
73
29
reposted by
Raphael Pisoni
Morticia (MLS, ASCP)
12 months ago
The global rise of anti-intellectualism and anti-science is directly related to the global rise of fascism and right-wing authoritarianism. The defense of truth is inherently anti-fascist.
0
57
15
Stand by while annual NeurIPS FOMO is loading...
12 months ago
1
4
0
I mean no disrespect, but the timing of publishing this before moving to OpenAI, which has changed its ethical standpoint quite frequently and is by now openly deploying its tech on the battlefield, is a bit unfortunate. I wish all involved parties the best though, so let's hope nobody gets burned.
12 months ago
2
16
0
Whaat?
12 months ago
0
2
0
It's so funny that this movie was so inspiring to so many people. I also did a project to detect and decode Arrival glyph codes in university but our solution wasn't quite as revolutionary...
about 1 year ago
1
2
0
Anyone using triplet loss (or similar) for tasks outside of ReID? What are you working on and how well does it work? Would you be interested in an improved version?
about 1 year ago
1
2
0
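For readers unfamiliar with it, the standard triplet loss mentioned above can be sketched as follows. This is the textbook hinge formulation with Euclidean distance, not the improved version the post alludes to; the margin value is an arbitrary choice for illustration.

```python
import math


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss: the anchor should be closer to the
    positive than to the negative by at least `margin`."""
    d_pos = math.dist(anchor, positive)
    d_neg = math.dist(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Well-separated triplet: positive is near, negative is far, so no loss.
easy = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])
# Swapped roles: the "positive" is farther away than the "negative".
hard = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0])
```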
Wow super interesting!
about 1 year ago
0
1
0
Ethical ML: some time ago I developed a method that outperforms the SOTA in vehicle re-identification. It can be useful for any few-shot task but it can of course be abused, e.g. for face ID. How do I publish this in the right way? Do I even have a chance, and how? What do I have to be careful with?
about 1 year ago
0
3
0
reposted by
Raphael Pisoni
François Fleuret
about 1 year ago
I find it strange that engineering the gradient radically to learn better is not a whole field.
9
54
6
Can't wait for the next social network to get popular so we can have two solid weeks of productive social discourse. But seriously: let's try to cool down a bit. Things are not perfect here and trolls will be trollin. But it's important to stay focused and not give up on constructive discourse.
about 1 year ago
0
2
0
Hey! What are you doing reading this post? It's only supposed to be read by bots as clearly stated in my humans.txt
about 1 year ago
0
8
1
Just waiting until people find out that using the internet actually means downloading other people's stuff. And usually nobody posts about it and deletes everything at the slightest sign of discontent.
about 1 year ago
0
7
0
I need more of this!
about 1 year ago
0
3
0