Apoorv Khandelwal
@apoorvkh.com
820 followers
199 following
16 posts
cs phd student at brown
https://apoorvkh.com
Will be at ACL this week!
#ACL2025
#ACL2025NLP
Presenting Tian Yun's paper on abstract reasoners at CoNLL on Thursday. I've been investigating how LLMs internally compose functions lately. Happy to chat about that (among other things) and hang out in Vienna!
2 months ago
reposted by
Apoorv Khandelwal
Eugene Vinitsky š
6 months ago
Tests on USAMO problems immediately after they were posted yield surprisingly bad model performance. Suggests there's much more training on the test set than expected.
arxiv.org/abs/2503.219...
reposted by
Apoorv Khandelwal
Dr Sasha Luccioni
6 months ago
Just read that AI's energy consumption in data centers is nothing to be worried about because most of the hyperscale datacenters running AI are "powered by renewable energy or low-carbon nuclear power." Let's debunk that, shall we?
reposted by
Apoorv Khandelwal
Naomi Saphra
6 months ago
If you're in the northeastern US and you're submitting a paper to COLM on March 27, you should absolutely be sending its abstract to New England NLP on March 28.
New England NLP Meeting Series
https://nenlp.github.io/spr2025/
We made a library (torchrunx) to make multi-GPU / multi-node PyTorch easier, more robust, and more modular! 🧵
github.com/apoorvkh/tor...
Docs:
torchrun.xyz
`(uv) pip install torchrunx` today! (w/ the very talented Peter Curtin, Brown CS '25)
GitHub - apoorvkh/torchrunx: Easily run PyTorch on multiple GPUs & machines
Easily run PyTorch on multiple GPUs & machines. Contribute to apoorvkh/torchrunx development by creating an account on GitHub.
https://github.com/apoorvkh/torchrunx
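To give a sense of what a launcher like this automates, here is a minimal sketch in plain Python (an illustrative pattern only, NOT torchrunx's actual API): spawn one process per worker, assign each a rank via environment variables, run a function in every worker, and collect the results in the parent.

```python
# Illustrative sketch of the launch pattern that a library like torchrunx
# automates (hypothetical code, not the torchrunx API).
import multiprocessing as mp
import os


def worker(rank, world_size, queue):
    # In real distributed PyTorch, torch.distributed.init_process_group()
    # would read RANK / WORLD_SIZE at this point.
    os.environ["RANK"] = str(rank)
    os.environ["WORLD_SIZE"] = str(world_size)
    queue.put((rank, f"hello from rank {rank}/{world_size}"))


def launch(world_size):
    # Spawn one process per worker and gather one result from each.
    queue = mp.Queue()
    procs = [
        mp.Process(target=worker, args=(rank, world_size, queue))
        for rank in range(world_size)
    ]
    for p in procs:
        p.start()
    results = dict(queue.get() for _ in procs)
    for p in procs:
        p.join()
    return results


if __name__ == "__main__":
    print(launch(2)[0])  # hello from rank 0/2
```

The real library covers everything this sketch ignores (multiple machines, robustness to failures); see torchrun.xyz for its actual API.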
7 months ago
reposted by
Apoorv Khandelwal
William Merrill
7 months ago
✨ How does the depth of a transformer affect its reasoning capabilities? New preprint by myself and @Ashish_S_AI shows that a little depth goes a long way to increase transformers' expressive power. We take this as encouraging for further research on looped transformers! 🧵
reposted by
Apoorv Khandelwal
Sonia Murthy
8 months ago
(1/9) Excited to share my recent work on "Alignment reduces LM's conceptual diversity" with
@tomerullman.bsky.social
and
@jennhu.bsky.social
, to appear at
#NAACL2025
! We want models that match our values... but could this hurt their diversity of thought? Preprint:
arxiv.org/abs/2411.04427
I started a blog! First post is everything I know about setting up (fast, reproducible, error-proof) Python project environments using the latest tools. These methods have saved me a lot of grief. Also a short guide to CUDA in the appendix :)
blog.apoorvkh.com/posts/projec...
Managing Project Dependencies
https://blog.apoorvkh.com/posts/project-dependencies.html
8 months ago
reposted by
Apoorv Khandelwal
James Tompkin
9 months ago
Can GANs compete in 2025? In 'The GAN is dead; long live the GAN! A Modern GAN Baseline', we show that a minimalist GAN w/o any tricks can match the performance of EDM with half the size and one-step generation -
github.com/brownvc/r3gan
- work of Nick Huang,
@skylion.bsky.social
, Volodymyr Kuleshov
A couple of sources for academic talks that I really like!
Cohere For AI: www.youtube.com/playlist?lis...
Simons Institute: www.youtube.com/@SimonsInsti...
Simons Institute
The Simons Institute for the Theory of Computing is the world's leading venue for collaborative research in theoretical computer science. Established on July 1, 2012, the Institute is housed in Calvin...
https://www.youtube.com/@SimonsInstituteTOC/streams
9 months ago
reposted by
Apoorv Khandelwal
Naomi Saphra
9 months ago
Let he who hath not \usepackage[subtle]{savetrees}
reposted by
Apoorv Khandelwal
Jennifer Hu
10 months ago
Slides from the tutorial are now posted here!
neurips.cc/media/neurip...
https://neurips.cc/media/neurips-2024/Slides/99528_aXgzqdX.pdf
reposted by
Apoorv Khandelwal
Alexander Doria
10 months ago
"They said it could not be done." We're releasing Pleias 1.0, the first suite of models trained on open data (either permissively licensed or uncopyrighted): Pleias-3b, Pleias-1b and Pleias-350m, all based on the two trillion token set from Common Corpus.
reposted by
Apoorv Khandelwal
Ben Lipkin
10 months ago
Lots of folks have been talking about scaling LLM inference over this last year. Internally, I've been developing and using a library that makes this extremely easy, and I decided to open-source it. Meet the decoding library:
github.com/benlipkin/de...
1/7
GitHub - benlipkin/decoding: Composable inference algorithms with LLMs and programmable logic
Composable inference algorithms with LLMs and programmable logic - benlipkin/decoding
https://github.com/benlipkin/decoding
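The flavor of "composable inference algorithms with programmable logic" can be illustrated with a toy sketch (hypothetical names throughout, NOT the decoding library's API): a greedy decoder that only emits tokens satisfying a stack of user-supplied constraints.

```python
# Toy sketch of constraint-composable decoding (hypothetical, not the
# decoding library's API): greedily pick the highest-scoring next token
# that passes every active constraint.

def constrained_greedy(scores, constraints, max_tokens=5):
    """scores: toy "model" mapping token -> score.
    constraints: list of (context, token) -> bool predicates."""
    output = []
    for _ in range(max_tokens):
        # Rank candidates by score; keep only those all constraints allow.
        allowed = [
            tok for tok, _ in sorted(scores.items(), key=lambda kv: -kv[1])
            if all(c(output, tok) for c in constraints)
        ]
        if not allowed:
            break
        output.append(allowed[0])
    return output


# Example constraints: forbid immediate repeats, forbid a banned token.
no_repeat = lambda ctx, tok: not ctx or ctx[-1] != tok
no_banned = lambda ctx, tok: tok != "bad"

tokens = constrained_greedy(
    {"good": 1.0, "bad": 2.0, "ok": 0.5},
    [no_repeat, no_banned],
    max_tokens=3,
)
print(tokens)  # ['good', 'ok', 'good']
```

Because constraints are plain predicates, they compose freely: adding a new rule means appending one function, not rewriting the decoder.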
reposted by
Apoorv Khandelwal
Joe Stacey
10 months ago
Okay, genius idea to improve the quality of
#nlp
#arr
reviews. Literally give gold stars to the best reviewers, visible on OpenReview next to your anonymous ID during the review process. Here's why it would work, and why you should RT this fab idea:
reposted by
Apoorv Khandelwal
Hamish Ivison
10 months ago
Excited to release Tulu 3! We worked hard to try and make the best open post-training recipe we could, and the results are good! I was lucky enough to work on almost every stage of the pipeline in one way or another. Some comments + highlights ⬇️
Nature wrote a nice article about our work!
www.nature.com/articles/d41...
AIās computing gap: academics lack access to powerful chips needed for research
Survey highlights disparity between academic and industry scientists' access to computing power needed to train machine-learning models.
https://www.nature.com/articles/d41586-024-03792-6
10 months ago