Chandler Smith
@chansmi.bsky.social
61
230
17
Multi-Agent Researcher at CAIF | applied research at IQT | Thinking about making MA systems go well
reposted by
Chandler Smith
Anka Reuel ➡️ NeurIPS
12 months ago
Submitting a benchmark to ICML? Check out our NeurIPS Spotlight paper BetterBench! We outline best practices for benchmark design, implementation & reporting to help shift community norms. Be part of the change! + Add your benchmark to our database for visibility:
betterbench.stanford.edu
1
12
3
reposted by
Chandler Smith
Vincent Conitzer
about 1 year ago
The 2025 Cooperative AI summer school (9-13 July 2025 near London) is now accepting applications, due March 7th!
www.cooperativeai.com/summer-schoo...
Cooperative AI
https://www.cooperativeai.com/summer-school/summer-school-2025
1
14
5
Very excited to read this!
about 1 year ago
0
3
0
On my way to NeurIPS '24 ✈️ to present our Spotlight paper BetterBench and the Concordia Contest! Would love to connect with folks and chat about anything multi-agent, agentic AI, benchmarking, etc. I am applying for fall '25 PhDs. Ping me if you have advice or there may be a fit!
about 1 year ago
1
2
0
Excited to announce our work on Multi-Agent LLM Training! MALT is a multi-agent configuration that leverages synthetic data generation and credit assignment strategies for post-training specialized models solving problems together
about 1 year ago
1
1
0
Check out our @neuripsconf.bsky.social Spotlight paper BetterBench, which outlines new standards in benchmarking AI! Delighted to have it featured in @techreviewjp.bsky.social
about 1 year ago
1
2
0
reposted by
Chandler Smith
Anka Reuel ➡️ NeurIPS
about 1 year ago
NeurIPS 2024 Spotlight: Did you know we lack standards for AI benchmarks, despite their role in tracking progress, comparing models, and shaping policy? Enter BetterBench, our framework with 46 criteria to assess benchmark quality:
betterbench.stanford.edu
1/x
5
139
32