Akshita Bhagia
@akshitab.bsky.social
2190 followers · 144 following · 4 posts
Research Engineer at Ai2
https://akshitab.github.io/
reposted by
Akshita Bhagia
Ai2
13 days ago
The Cancer AI Alliance (CAIA) is already prototyping Asta DataVoyager in a federated, multi-institution setup for cancer studies, keeping clinical data local and secure. Read more about CAIA here:
buff.ly/ACpxLNT
reposted by
Akshita Bhagia
Ai2
3 months ago
Introducing FlexOlmo, a new paradigm for language model training that enables the co-development of AI through data collaboration. 🧵
reposted by
Akshita Bhagia
Ai2
7 months ago
Announcing OLMo 2 32B: the first fully open model to beat GPT-3.5 & GPT-4o mini on a suite of popular, multi-skill benchmarks. Comparable to the best open-weight models, but with a fraction of the training compute. When you have a good recipe, ✨ magical things happen when you scale it up!
reposted by
Akshita Bhagia
Luca Soldaini
8 months ago
They made me do video 🎬 but for a good reason! We are launching an iOS app that runs OLMoE locally 📱 We're going to see more on-device AI in 2025, and we wanted to offer a simple way to prototype with it.
App:
apps.apple.com/us/app/ai2-o...
Code:
github.com/allenai/OLMo...
Blog:
allenai.org/blog/olmoe-app
reposted by
Akshita Bhagia
Kyle Lo
9 months ago
kicking off 2025 with our OLMo 2 tech report while paying homage to the sequelest of sequels 🫡 "2 OLMo 2 Furious" 🔥 is everything we learned since OLMo 1, with deep dives into: the stable pretraining recipe, LR annealing, data curricula, model soups, the Tülu post-training recipe, and our compute infra setup 🧵
reposted by
Akshita Bhagia
Jiacheng Liu
10 months ago
Want to predict the task performance of LMs before pretraining them? We develop task scaling laws and model ladders, which predict the accuracy of OLMo 2 7B & 13B models on individual tasks within 2 points of absolute error, at a cost of 1% of the compute used to pretrain them.
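The ladder setup lends itself to a compact sketch: predict task loss from training compute with a power law, then map loss to accuracy with a sigmoid. A minimal illustration below, assuming that two-step shape, made-up ladder points, and a chance accuracy fixed at 0.25; it is not the released ladder code.

# Rough sketch of a two-step task scaling law:
# (1) fit task loss as a power law in training compute,
# (2) map task loss to accuracy with a sigmoid.
# The "ladder" observations below are invented for illustration.
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical small-model ladder runs: (compute in FLOPs, task loss, task accuracy)
compute = np.array([1e19, 3e19, 1e20, 3e20, 1e21])
task_loss = np.array([1.30, 1.18, 1.05, 0.96, 0.88])
accuracy = np.array([0.41, 0.47, 0.55, 0.61, 0.67])

# Step 1: power law, fit as a line in log-log space (irreducible-error term omitted).
slope, intercept = np.polyfit(np.log(compute), np.log(task_loss), 1)

def loss_from_compute(c):
    return np.exp(intercept + slope * np.log(c))

# Step 2: sigmoid from task loss to accuracy, with chance accuracy assumed to be 0.25.
def acc_from_loss(loss, top, steepness, midpoint):
    return 0.25 + (top - 0.25) / (1.0 + np.exp(steepness * (loss - midpoint)))

params, _ = curve_fit(acc_from_loss, task_loss, accuracy, p0=[0.9, 5.0, 1.0], maxfev=50000)

# Predict task accuracy at a target compute budget before training the target model.
target_compute = 1e22
pred_loss = loss_from_compute(target_compute)
pred_acc = acc_from_loss(pred_loss, *params)
print(f"predicted task loss {pred_loss:.3f}, predicted accuracy {pred_acc:.3f}")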
reposted by
Akshita Bhagia
Ai2
11 months ago
Meet OLMo 2, the best fully open language model to date, including a family of 7B and 13B models trained on up to 5T tokens. OLMo 2 outperforms other fully open models and competes with open-weight models like Llama 3.1 8B. As always, we released our data, code, recipes, and more.
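For anyone wanting to poke at the release, the checkpoints load with Hugging Face transformers. A minimal sketch, assuming the 7B base-model repo ID below (check the release notes for exact names) and a transformers version recent enough to include OLMo 2 support.

# Minimal sketch: load an OLMo 2 checkpoint and generate a short continuation.
# The repo ID is an assumption; a recent transformers release is assumed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/OLMo-2-1124-7B"  # assumed repo ID for the 7B base model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

prompt = "Language models are"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))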
reposted by
Akshita Bhagia
Luca Soldaini
over 1 year ago
release day release day 🥳 OLMo 1B + 7B out today and 65B soon... OLMo accelerates the study of LMs. We release *everything*, from the toolkit for creating data (Dolma) to training/inference code.
Blog:
blog.allenai.org/olmo-open-la...
OLMo paper:
allenai.org/olmo/olmo-pa...
Dolma paper:
allenai.org/olmo/dolma-p...
OLMo: Open Language Model
A State-Of-The-Art, Truly Open LLM and Framework
https://blog.allenai.org/olmo-open-language-model-87ccfc95f580
reposted by
Akshita Bhagia
Ian Magnusson
almost 2 years ago
LMs are used to process text from many topics, styles, dialects, etc., but how well do they do? Evaluating perplexity on just one corpus like C4 doesn't tell the whole story. ✨ We introduce Paloma, a benchmark of 585 domains, from the NY Times to r/depression on Reddit.
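To make the evaluation concrete, here is a minimal sketch of per-domain perplexity in the spirit of Paloma, not the official harness; the stand-in model (gpt2) and the two toy domains are placeholders for the real 585-domain benchmark.

# Minimal sketch: token-weighted perplexity computed separately per domain.
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "gpt2"  # stand-in model; swap in any causal LM
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
model.eval()

# Toy stand-in for per-domain eval sets; Paloma itself spans 585 domains.
domains = {
    "news": ["The city council approved the new transit budget on Tuesday."],
    "forums": ["has anyone else had trouble sleeping lately? any tips appreciated"],
}

for domain, texts in domains.items():
    total_nll, total_tokens = 0.0, 0
    for text in texts:
        enc = tokenizer(text, return_tensors="pt")
        with torch.no_grad():
            out = model(**enc, labels=enc["input_ids"])
        n_tokens = enc["input_ids"].shape[1] - 1  # loss is averaged over shifted tokens
        total_nll += out.loss.item() * n_tokens
        total_tokens += n_tokens
    print(f"{domain}: perplexity {math.exp(total_nll / total_tokens):.2f}")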