Stefano Esposito
@s-esposito.bsky.social
293
400
8
phd student @ uni tübingen computer vision
https://s-esposito.github.io/
reposted by
Stefano Esposito
Vision and Graphics Trends
25 days ago
3D-LATTE: Latent Space 3D Editing from Textual Instructions Maria Parelli, Michael Oechsle, Michael Niemeyer ... Andreas Geiger
arxiv.org/abs/2509.00269
Trending on
www.scholar-inbox.com
0
3
3
reposted by
Stefano Esposito
Andreas Geiger
26 days ago
ellis.eu/news/ellis-p...
ELLIS PhD Program: Call for Applications 2025
The ELLIS mission is to create a diverse European network that promotes research excellence and advances breakthroughs in AI, as well as a pan-European PhD program to educate the next generation of AI...
https://ellis.eu/news/ellis-phd-program-call-for-applications-2025
0
16
9
reposted by
Stefano Esposito
about 1 month ago
Introducing our new paper, MDPO: Overcoming the Training-Inference Divide of Masked Diffusion Language Models. Paper:
www.scholar-inbox.com/papers/He202...
arxiv.org/pdf/2508.13148
Code:
github.com/autonomousvi...
Project Page:
cli212.github.io/MDPO/
1
12
10
reposted by
Stefano Esposito
Andreas Geiger
2 months ago
Today, we moved into our new building on the CyberValley campus. Everyone is super excited. PhD students went right back to work. But wait, is there something missing? ;)
1
35
1
reposted by
Stefano Esposito
Andreas Geiger
2 months ago
Today we had our AVG Deep Cave Expedition Day! Exploring the challenges of the (unlit, narrow, crawling-only) Hofener Höhle near Grabenstetten ..
0
19
1
reposted by
Stefano Esposito
Zhenjun Zhao
2 months ago
SpatialTrackerV2: 3D Point Tracking Made Easy Yuxi Xiao,
@jianyuanwang.bsky.social
, Nan Xue,
@nikkar.bsky.social
, Yuri Makarov, Bingyi Kang, Xing Zhu, Hujun Bao, Yujun Shen, Xiaowei Zhou tl;dr: DAv2+VGGT->depths & poses->iterative cross-attention-based optimizer
arxiv.org/abs/2507.12462
0
2
2
reposted by
Stefano Esposito
Andreas Geiger
3 months ago
In case you find it as relaxing as we do: Here is a 2h+ video of our autonomous RL driving agent CaRL in action!
@danieldauner.bsky.social
@bernhard-jaeger.bsky.social
@kashyap7x.bsky.social
youtube.com/watch?v=_god...
CaRL: Learning Scalable Planning Policies with Simple Rewards
YouTube video by Daniel Dauner
https://youtube.com/watch?v=_godUKkICec&si=bEKND8vKzI9y7-6b
0
21
6
reposted by
Stefano Esposito
Claire Vernade
3 months ago
At
#ICML
, you can just use scholar inbox to help you find your way through the poster sessions. It just sorts the papers according to your preferences and it really works.
www.scholar-inbox.com/conference/i...
ICML 2025 Planner
3
50
8
reposted by
Stefano Esposito
Onno Eberhard
2 months ago
I am in Vancouver at ICML, and tomorrow I will present our newest paper "Partially Observable Reinforcement Learning with Memory Traces". We argue that eligibility traces are more effective than sliding windows as a memory mechanism for RL in POMDPs. 🧵
3
59
15
reposted by
Stefano Esposito
Bernhard Jaeger
3 months ago
We have released the code for our work, CaRL: Learning Scalable Planning Policies with Simple Rewards. The repository contains the first public code base for training RL agents with the CARLA leaderboard 2.0 and nuPlan.
github.com/autonomousvi...
GitHub - autonomousvision/CaRL: [ArXiv 2025] CaRL: Learning Scalable Planning Policies with Simple Rewards
https://github.com/autonomousvision/CaRL
0
20
9
reposted by
Stefano Esposito
Mehdi S. M. Sajjadi
3 months ago
Scaling 4D Representations Self-supervised learning from video does scale! In our latest work, we scaled masked auto-encoding models to 22B params, boosting performance on pose estimation, tracking & more. Paper:
arxiv.org/abs/2412.15212
Code & models:
github.com/google-deepmind/representations4d
0
20
8
reposted by
Stefano Esposito
Vision and Graphics Trends
3 months ago
Geometry-aware 4D Video Generation for Robot Manipulation Zeyi Liu, Shuang Li, Eric Cousineau ... Shuran Song
arxiv.org/abs/2507.01099
Trending on
www.scholar-inbox.com
0
4
4
reposted by
Stefano Esposito
Zhenjun Zhao
3 months ago
MoGe-2: Accurate Monocular Geometry with Metric Scale and Sharp Details Ruicheng Wang, Sicheng Xu, Yue Dong, Yu Deng, Jianfeng Xiang, Zelong Lv, Guangzhong Sun, Xin Tong, Jiaolong Yang
arxiv.org/abs/2507.02546
1
5
4
reposted by
Stefano Esposito
Andreas Geiger
3 months ago
I am very proud of my group! These are the nationalities of my current and past team members. Diversity is key. 🇩🇪 🇬🇷 🇮🇹 🇮🇳 🇷🇺 🇺🇦 🇨🇳 🇷🇸 🇯🇵 🇧🇪 🇺🇸 🇰🇷 🇹🇷
1
49
1
reposted by
Stefano Esposito
#CVPR2025
3 months ago
That's a wrap on
#CVPR2025
in Nashville! From online convos to in-person vibes, one thing's clear: this community is STRONG. Thanks for following along! Until next time.
@deblinaml.bsky.social
,
@jbhaurum.bsky.social
,
@csprofkgd.bsky.social
signing off.
0
12
3
reposted by
Stefano Esposito
Melanie Mitchell
4 months ago
LLM product placement and search optimization is here and it's as dystopian as you expected.
3
37
11
Hey
#CVPR2025
! Curious about this work? I'll be presenting it this morning! Poster 31, from 10:30 to 12:30
@cvprconference.bsky.social
4 months ago
0
7
1
reposted by
Stefano Esposito
Angela Dai
4 months ago
Check out the ScanNet++ workshop @CVPR on June 12 in 211 from 8:50am! Exciting keynotes on state-of-the-art NVS & 3D understanding from Andrea Vedaldi, Cordelia Schmid, Gordon Wetzstein, Katja Schwarz, Qianqian Wang, and leading methods on the benchmark!
kaldir.vc.in.tum.de/scannetpp/cv...
0
14
6
reposted by
Stefano Esposito
Elliott / Shangzhe Wu
4 months ago
Join us for the 4D Vision Workshop
#CVPR
on June 11 starting at 9:20am! We'll have an incredible lineup of speakers discussing the frontier of 3D computer vision techniques for dynamic world modeling across spatial AI, robotics, astrophysics, and more.
4dvisionworkshop.github.io
0
9
4
reposted by
Stefano Esposito
Ilya Chugunov
4 months ago
This Wednesday (1-6PM, Room 106A) at CVPR
@cvprconference.bsky.social
we have a great lineup of keynote speakers, posters, and spotlights on neural fields and beyond:
neural-bcc.github.io
Have a question you want answered by a panel of experts in the field? Send it to us via:
tinyurl.com/bdddf36f
0
11
3
reposted by
Stefano Esposito
Haofei Xu
4 months ago
Excited to present our
#CVPR2025
paper DepthSplat next week! DepthSplat is a feed-forward model that achieves high-quality Gaussian reconstruction and view synthesis in just 0.6 seconds. Looking forward to great conversations at the conference!
3
27
7
reposted by
Stefano Esposito
Kashyap Chitta
4 months ago
Pseudo-simulation combines the efficiency of open-loop and robustness of closed-loop evaluation. It uses real data + 3D Gaussian Splatting synthetic views to assess error recovery, achieving strong correlation with closed-loop simulations while requiring 6x less compute.
arxiv.org/abs/2506.04218
0
22
11
reposted by
Stefano Esposito
Matthias Niessner
4 months ago
Announcing our $13M funding round to build the next generation of AI: Spatial Foundation Models that can generate entire 3D environments anchored in space & time. Interested? Join our world-class team:
spaitial.ai
youtu.be/FiGX82RUz8U
SpAItial AI: Building Spatial Foundation Models
YouTube video by SpAItial AI
https://youtu.be/FiGX82RUz8U
4
53
9
"ILM "artists" are now being paid to make shimpanzini bananini and bombardiro crocodilo"
5 months ago
0
0
0
reposted by
Stefano Esposito
Felix Wimbauer
5 months ago
Can you train a model for pose estimation directly on casual videos without supervision? Turns out you can! In our
#CVPR2025
paper AnyCam, we directly train on YouTube videos and achieve SOTA results by using an uncertainty-based flow loss and monocular priors!
1
25
11
reposted by
Stefano Esposito
hardmaru
5 months ago
New Paper: Continuous Thought Machines
pub.sakana.ai/ctm/
Neurons in brains use timing and synchronization in the way that they compute, but this is largely ignored in modern neural nets. We believe neural timing is key for the flexibility and adaptability of biological intelligence. Thread
4
129
30
reposted by
Stefano Esposito
Katrin Renz
5 months ago
Excited to share our
#CVPR2025
Spotlight paper and my internship project @wayve: SimLingo. A Vision-Language-Action (VLA) model that achieves state-of-the-art driving performance with language capabilities. Code:
github.com/RenzKa/simli...
Paper:
arxiv.org/abs/2503.09594
1
25
9
New paper at CVPR 25! Can meshes capture fuzzy geometry? Volumetric Surfaces uses adaptive textured shells to model hair and fur without the splatting / volume overhead. It's fast, looks great, and runs in real time even on budget phones.
autonomousvision.github.io/volsurfs/
arxiv.org/pdf/2409.02482
5 months ago
1
28
21
reposted by
Stefano Esposito
European Commission
5 months ago
"Science is an investment. We will put forward a new 500 million package for 2025-2027 to support the best and the brightest researchers and scientists from Europe and around the world." - President
@vonderleyen.ec.europa.eu
at the 'Choose Europe for Science' event at La Sorbonne 🇫🇷
35
978
356
reposted by
Stefano Esposito
Bernhard Jaeger
5 months ago
Introducing CaRL: Learning Scalable Planning Policies with Simple Rewards We show how simple rewards enable scaling up PPO for planning. CaRL outperforms all prior learning-based approaches on nuPlan Val14 and CARLA longest6 v2, using less inference compute.
arxiv.org/abs/2504.17838
0
25
15
reposted by
Stefano Esposito
Haiwen Huang
5 months ago
Sharing another video showing how LoftUp significantly improves DINOv2 features! Works like a charm! Try it out: Code:
github.com/andrehuang/l...
Paper:
arxiv.org/abs/2504.14032
0
9
2
reposted by
Stefano Esposito
Andreas Geiger
5 months ago
New CVPR paper by
@s-esposito.bsky.social
in collaboration with Peter Kontschieder's team at Meta.
0
3
1
reposted by
Stefano Esposito
Andreas Geiger
5 months ago
Can we represent fuzzy geometry with meshes? "Volumetric Surfaces" uses layered meshes to represent the look of hair, fur & more without the splatting/volume overhead. Fast, pretty, and runs in real-time on your laptop!
autonomousvision.github.io/volsurfs/
arxiv.org/pdf/2409.02482
1
10
3
reposted by
Stefano Esposito
Jon Barron
6 months ago
A thread of thoughts on radiance fields, from my keynote at 3DV: Radiance fields have had 3 distinct generations. First was NeRF: just posenc and a tiny MLP. This was slow to train but worked really well, and it was unusually compressed --- The NeRF was smaller than the images.
2
92
22
reposted by
Stefano Esposito
Andrew Davison
7 months ago
I remember seeing this drone video a few years ago and thinking "we'll never run SLAM on that".... but here it is, complete with dense reconstruction (single camera, unknown calibration, no IMU). MASt3R-SLAM is absurdly robust.
2
35
6
reposted by
Stefano Esposito
Roland Meyer
7 months ago
Apparently, OpenAI is already aligning its LLM to the expectations of the new fascist government - racism, lies, and conspiracy theories will now be sold as «multiple perspectives on controversial subjects»
techcrunch.com/2025/02/16/o...
OpenAI tries to 'uncensor' ChatGPT | TechCrunch
OpenAI is changing how it trains AI models to explicitly embrace "intellectual freedom … no matter how challenging or controversial a topic may be," the
https://techcrunch.com/2025/02/16/openai-tries-to-uncensor-chatgpt/?utm_source=dlvr.it&utm_medium=bluesky
28
606
331
reposted by
Stefano Esposito
Chris Offner
8 months ago
Trying out monocular depth estimation models on abstract images to see what priors they learned. A human might interpret the top as sky and the bottom as a ground, thus giving the top a constant large (blue) depth and the ground a vertical gradient from low (red) to large depths. The model doesn't.
3
36
3
reposted by
Stefano Esposito
Ben Stiller
8 months ago
film still #32 Goats
#severance
photo credit: Trudy Buck
134
4490
274
reposted by
Stefano Esposito
Punish the Villains
8 months ago
It's ok to use chatbots for everyday monotony like writing your wedding vows because this frees you up to use your creative energy for what truly matters: making powerpoints for work meetings
98
10015
713
reposted by
Stefano Esposito
ian bremmer
8 months ago
checking in on deepseek
50
491
144
reposted by
Stefano Esposito
Karen Hao
8 months ago
As someone who has reported on AI for 7 years and covered China tech as well, I think the biggest lesson to be drawn from DeepSeek is the huge cracks it illustrates with the current dominant paradigm of AI development. A long thread. 1/
212
6209
3113
reposted by
Stefano Esposito
Andreas Geiger
9 months ago
This week we had our winter retreat jointly with Daniel Cremers' group in Montafon, Austria. 46 talks, 100 km of slopes and night sledding with some occasionally lost and found. It has been fun!
0
72
12
reposted by
Stefano Esposito
Thomas Wimmer
9 months ago
We propose MET3R, a new metric for measuring multi-view consistency in generated images. Our method is built upon DUSt3R and evaluates the consistency of projected DINO features between two views. It is able to accurately capture the 3D consistency in generated images.
1
5
2
reposted by
Stefano Esposito
Andreas Geiger
9 months ago
Excited to share that today our paper recommender platform
www.scholar-inbox.com
has reached 20k users! We hope to reach 100k by the end of the year. Lots of new features are being worked on currently and rolled out soon.
12
190
34
reposted by
Stefano Esposito
Donato Crisostomi NeurIPS
9 months ago
Prepend "Singular" to "Task Vectors" and get +15% average accuracy for free! 1. Perform a low-rank approximation of layer-wise task vectors. 2. Minimize task interference by orthogonalizing inter-task singular vectors. 🧵 (1/6)
1
4
4
reposted by
Stefano Esposito
Rosa Ritunnano
9 months ago
Instead of listing my publications, as the year draws to an end, I want to shine the spotlight on the commonplace assumption that productivity must always increase. Good research is disruptive and thinking time is central to high quality scholarship and necessary for disruptive research.
21
1155
432
reposted by
Stefano Esposito
Eugene Vinitsky
9 months ago
A short list of tips for keeping a clean, organized ML codebase for new researchers:
eugenevinitsky.com/posts/quick-...
Eugene Vinitsky
http://eugenevinitsky.com/posts/quick-software-tips/
12
134
32
reposted by
Stefano Esposito
Serge Belongie
9 months ago
Computer Vision: Fact & Fiction is now available on YouTube. I made a playlist for it with the seven chapters. Enjoy this time capsule from two decades ago!
4
58
20
reposted by
Stefano Esposito
Kashyap Chitta
10 months ago
Introducing CARLA Garage, a starter kit for developing algorithms for the challenging new CARLA Leaderboard 2.0! Everything you need to step in to autonomous driving research, open-sourced: expert driver, dataset, pretrained models, evaluation, and training scripts.
github.com/autonomousvi...
1
11
5
reposted by
Stefano Esposito
arxiv cs.CV
about 1 year ago
Stefano Esposito, Anpei Chen, Christian Reiser, Samuel Rota Bulò, Lorenzo Porzi, Katja Schwarz, Christian Richardt, Michael Zollhöfer, Peter Kontschieder, Andreas Geiger Volumetric Surfaces: Representing Fuzzy Geometries with Multiple Meshes
https://arxiv.org/abs/2409.02482
0
3
1