Akshita Bhagia
@akshitab.bsky.social
2192 followers · 146 following · 5 posts
Research Engineer at Ai2
https://akshitab.github.io/
reposted by Akshita Bhagia
Ai2
28 days ago
Announcing Olmo 3, a leading fully open LM suite built for reasoning, chat, & tool use, and an open model flow: not just the final weights, but the entire training journey. Best fully open 32B reasoning model & best 32B base model. 🧵
reposted by Akshita Bhagia
Ai2
3 months ago
The Cancer AI Alliance (CAIA) is already prototyping Asta DataVoyager in a federated, multi-institution setup for cancer studies, keeping clinical data local and secure. Read more about CAIA here:
buff.ly/ACpxLNT
reposted by Akshita Bhagia
Ai2
5 months ago
Introducing FlexOlmo, a new paradigm for language model training that enables the co-development of AI through data collaboration. 🧵
reposted by Akshita Bhagia
Ai2
9 months ago
Announcing OLMo 2 32B: the first fully open model to beat GPT-3.5 & GPT-4o mini on a suite of popular, multi-skill benchmarks. Comparable to the best open-weight models, but at a fraction of the training compute. When you have a good recipe, ✨ magical things happen when you scale it up!
reposted by Akshita Bhagia
Luca Soldaini
10 months ago
They made me do video 🎬 but for a good reason! We are launching an iOS app: it runs OLMoE locally 📱 We're gonna see more on-device AI in 2025, and wanted to offer a simple way to prototype with it.
App:
apps.apple.com/us/app/ai2-o...
Code:
github.com/allenai/OLMo...
Blog:
allenai.org/blog/olmoe-app
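For readers who want to try OLMoE outside the iOS app, here is a minimal sketch using Hugging Face transformers. It assumes the publicly released allenai/OLMoE-1B-7B-0924-Instruct checkpoint and a recent transformers version; it is not the code path the app itself uses.

```python
# Minimal sketch: prompt an OLMoE checkpoint locally via Hugging Face transformers.
# Assumes the allenai/OLMoE-1B-7B-0924-Instruct checkpoint; swap in whichever
# OLMoE release you actually want. Not the code path used by the iOS app.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/OLMoE-1B-7B-0924-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # OLMoE activates ~1B of its 7B params per token
    device_map="auto",
)

messages = [{"role": "user", "content": "In one sentence, what is a mixture-of-experts model?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```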
reposted by Akshita Bhagia
Kyle Lo
12 months ago
kicking off 2025 with our OLMo 2 tech report while paying homage to the sequelest of sequels 🫡 2 OLMo 2 Furious 🔥 is everything we learned since OLMo 1, with deep dives into: stable pretraining recipe, learning-rate annealing, data curricula, model soups, Tülu post-training recipe, and compute infra setup 🧵
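Of the ingredients listed above, model "soups" are the easiest to picture: average the weights of several checkpoints trained from the same recipe. The sketch below shows plain uniform weight averaging as an illustration of the general technique; it is not necessarily the exact souping procedure used for OLMo 2.

```python
# Uniform "model soup": element-wise average of checkpoints trained from the
# same recipe. A sketch of the general technique, not the exact OLMo 2 procedure.
import torch

def soup_checkpoints(paths):
    """Return the element-wise mean of the state dicts stored at `paths`."""
    soup = None
    for path in paths:
        state = torch.load(path, map_location="cpu")
        if soup is None:
            soup = {k: v.float().clone() for k, v in state.items()}
        else:
            for k, v in state.items():
                soup[k] += v.float()
    return {k: v / len(paths) for k, v in soup.items()}

# Usage with hypothetical checkpoint files:
# souped = soup_checkpoints(["anneal_seed0.pt", "anneal_seed1.pt", "anneal_seed2.pt"])
# model.load_state_dict({k: v.to(torch.bfloat16) for k, v in souped.items()})
```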
reposted by Akshita Bhagia
Jiacheng Liu
about 1 year ago
Want to predict the task performance of LMs before pretraining them? We develop task scaling laws and model ladders, which predict the accuracy of OLMo 2 7B & 13B models on individual tasks to within 2 points of absolute error. The cost is 1% of the compute used to pretrain them.
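The two-step idea can be sketched with toy numbers: fit task loss as a function of parameter count N and token count D on small "ladder" models, map loss to accuracy with a sigmoid, then extrapolate both fits to the target scale. The functional forms and every number below are illustrative assumptions, not the paper's data or exact parameterization.

```python
# Toy sketch of a two-step "model ladder" prediction: (1) fit task loss as a
# function of model parameters N and training tokens D on small ladder models,
# (2) map loss to task accuracy with a sigmoid, then extrapolate to a larger
# target model. All numbers and functional forms here are illustrative.
import numpy as np
from scipy.optimize import curve_fit

def loss_fn(ND, E, A, alpha, B, beta):
    n, d = ND
    return E + A / n**alpha + B / d**beta

def acc_fn(loss, top, k, mid):
    return top / (1.0 + np.exp(k * (loss - mid)))  # accuracy rises as loss falls

# Hypothetical ladder-model measurements (synthetic, so the fits behave well).
N = np.array([190e6, 370e6, 600e6, 760e6, 1.0e9, 1.3e9])   # parameters
D = np.array([8e9, 16e9, 25e9, 30e9, 40e9, 52e9])          # training tokens
rng = np.random.default_rng(0)
loss = loss_fn((N, D), 2.0, 120.0, 0.25, 90.0, 0.25) + rng.normal(0, 0.01, N.size)
acc = acc_fn(loss, 0.9, 4.0, 2.6) + rng.normal(0, 0.005, N.size)

loss_params, _ = curve_fit(loss_fn, (N, D), loss, p0=[2.0, 100.0, 0.3, 100.0, 0.3], maxfev=50000)
acc_params, _ = curve_fit(acc_fn, loss, acc, p0=[0.9, 4.0, 2.5], maxfev=50000)

# Extrapolate to a hypothetical 7B-parameter / 4T-token target model.
target = (np.array([7e9]), np.array([4e12]))
pred_loss = loss_fn(target, *loss_params)
pred_acc = acc_fn(pred_loss, *acc_params)
print(f"predicted loss {pred_loss[0]:.3f}, predicted accuracy {pred_acc[0]:.3f}")
```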
reposted by Akshita Bhagia
Ai2
about 1 year ago
Meet OLMo 2, the best fully open language model to date, including a family of 7B and 13B models trained on up to 5T tokens. OLMo 2 outperforms other fully open models and competes with open-weight models like Llama 3.1 8B. As always, we released our data, code, recipes, and more.
reposted by Akshita Bhagia
Luca Soldaini
almost 2 years ago
release day release day 🥳 OLMo 1B + 7B out today and 65B soon... OLMo accelerates the study of LMs. We release *everything*, from the toolkit for creating data (Dolma) to training/inference code.
blog
blog.allenai.org/olmo-open-la...
olmo paper
allenai.org/olmo/olmo-pa...
dolma paper
allenai.org/olmo/dolma-p...
OLMo: Open Language Model
A State-Of-The-Art, Truly Open LLM and Framework
https://blog.allenai.org/olmo-open-language-model-87ccfc95f580
reposted by Akshita Bhagia
Ian Magnusson
almost 2 years ago
LMs are used to process text from many topics, styles, dialects, etc., but how well do they do? Evaluating perplexity on just one corpus like C4 doesn't tell the whole story. ✨ We introduce Paloma, a benchmark of 585 domains, from the NY Times to r/depression on Reddit.
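The core measurement behind a benchmark like this is per-domain perplexity: score the same model on text drawn from each domain and compare. A bare-bones sketch of that loop follows; the checkpoint name and the tiny inline "domains" are placeholder assumptions, not Paloma's actual data or evaluation pipeline.

```python
# Bare-bones per-domain perplexity: score one causal LM on text from several
# domains and compare. The model choice and inline texts are placeholders.
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/OLMo-2-1124-7B"  # assumption: any causal LM checkpoint works here
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")
model.eval()

domains = {  # hypothetical stand-ins for Paloma's 585 domains
    "news": ["The city council approved the new transit budget on Tuesday."],
    "forums": ["has anyone else had trouble sleeping lately? looking for tips"],
}

for name, texts in domains.items():
    total_nll, total_tokens = 0.0, 0
    for text in texts:
        enc = tokenizer(text, return_tensors="pt").to(model.device)
        with torch.no_grad():
            out = model(**enc, labels=enc["input_ids"])
        n_predicted = enc["input_ids"].shape[1] - 1  # loss is averaged over predicted tokens
        total_nll += out.loss.item() * n_predicted
        total_tokens += n_predicted
    print(f"{name}: perplexity = {math.exp(total_nll / total_tokens):.1f}")
```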