Wayne
@waynechi.bsky.social
📤 34
📥 170
📝 17
CS Ph.D. at CMU. Building Copilot Arena. Editor at http://blog.ml.cmu.edu
reposted by
Wayne
Chris Donahue
about 1 year ago
Inaugurating new acct to share work from my PhD student! Wayne et al. have been running a live eval platform, Copilot Arena: a VSCode extension serving code completions from AI systems to real developers. See 🧵 for findings and preprint. Excited to be evaluating human-AI *workflows* holistically!
0
10
3
What do developers 𝘳𝘦𝘢𝘭𝘭𝘺 think of AI coding assistants? In October, we launched Copilot Arena to collect user preferences on real dev workflows. After months of live service, we’re here to share our findings in our recent preprint. Here's what we have learned /🧵
about 1 year ago
1
1
2
Got to test out InceptionAILabs' newest model, Mercury Coder Mini, on Copilot Arena! Mercury Coder Mini is blazing fast and overtakes Codestral as the fastest coding model out there (0.24s end-to-end latency) while boasting similar performance (joint #2). Congrats to InceptionAILabs! 📸
about 1 year ago
0
1
0
I had the same problem. I only use Cursor for newer, small projects. I use Copilot Arena's edit feature for projects in VSCode (but obviously I'm biased).
about 1 year ago
0
1
0
DeepSeek V3 (FiM) is now available in Copilot Arena for free! Download at lmarena.ai/copilot
about 1 year ago
0
0
0
These lists are better than most "2024's best games" lists
over 1 year ago
0
0
0
Copilot Arena's leaderboard is now live on lmarena.ai/leaderboard! We've collected over 15k votes on 11 models (2 new models since our last blogpost release). Congrats @deepseek.bsky.social 🥇 and @anthropic.com 🥇!
Chatbot Arena (formerly LMSYS): Free AI Chat to Compare & Test Best AI Chatbots
https://lmarena.ai/leaderboard
over 1 year ago
0
0
0
I'm not physically at NeurIPS, but my good friend @naveenraman.bsky.social will be presenting in my stead. In this work, we found that UI element ordering significantly affected GUI agent performance. Come check out the poster (and quiz Naveen) at the Workshop on Open-World Agents (OWA-2024)!
over 1 year ago
0
0
0
Bruh what... 💀
over 1 year ago
0
0
0
We've open-sourced Copilot Arena's server code! Check out how we handle code completions and share your ideas for new system prompts!
GitHub: github.com/lmarena/copi...
Technical details in the blog: blog.lmarena.ai/blog/2024/co...
Download Copilot Arena now at: lmarena.ai/copilot
over 1 year ago
0
0
0
Trying out Bluesky. Will mostly be posting about Copilot Arena!
over 1 year ago
0
0
0