Christopher Akiki
@cakiki.bsky.social
📤 1287
📥 76
📝 57
data and research at
@lichess.org
reposted by
Christopher Akiki
Lichess
6 days ago
We recently crossed 6 million chess puzzles in our open database. Like the platform itself, Lichess data is free and open source. Go and build something cool with it. Available wherever you get your datasets!
5
44
8
work in progress: 1 million synthetic personas from NVIDIA's Nemotron-Personas-USA dataset.
10 days ago
0
2
0
Been trying to reproduce datashader functionality using only Apache Arrow and Acero as a learning exercise. This is a 1-billion point Clifford attractor rendered in ~8s from a 9GB parquet file. (No JIT)
28 days ago
0
1
0
reposted by
Christopher Akiki
Marco
5 months ago
I'm looking for 5-10
#chess
players to test out a tool I'm building. Preferably who play on
@lichess.org
and are 1200+ in rapid or blitz. And if you coach chess at all, I'd be extra grateful to have you test it! NOTE: it is _NOT_ an "LLM chess coach" tool, I promise! 🙏
2
6
7
reposted by
Christopher Akiki
Lichess
5 months ago
Any chess position with 8 pieces on the board and at least one pair of opposed pawns has been solved! Lichess can now tell you definitively if it's a win, loss or draw with no engine required. Read about the massive technological undertaking to accomplish this partial 8 piece tablebase on our blog:
2
40
11
UMAP connectivity plots of 3,627 chess openings from the
@lichess.org
datasets (
huggingface.co/datasets/Lic...
)
5 months ago
1
6
1
Which colormap do you think looks the nicest? I'm leaning toward plasma.
5 months ago
3
0
0
Scatterplot of 4 million computer science authors, laid out according to co-authorship connections. The large blob in the bottom left is all single authors; removing them lets the plot breathe more somehow. The source of the data is the
@dblp.org
bibliography.
6 months ago
1
4
2
reposted by
Christopher Akiki
Shayne Longpre
7 months ago
Who is winning the open AI race? Our new study Economies of Open Intelligence maps
@hf.co
851k models' downloads 2020→2025. 1) Power rebalance: US tech ↓; China + community ↑ 2) Model size & efficiency ↑ (MoE, quant, multimodal) 3) Intermediary layers ↑ (adapters/quantizers) 4) Transparency ↓ /🧵
1
7
3
reposted by
Christopher Akiki
Lichess
8 months ago
Researchers at Google DeepMind used our free puzzle database and reinforcement learning to train a model to generate creative chess puzzles. ➡️ Read more on this by Tom Zahavy from the DeepMind discovery team:
lichess.org/@/tomas135/b...
AI-Generated Chess Puzzles
New research by the Discovery team at @GoogleDeepMind using RL and generative models to discover creative chess puzzles
https://lichess.org/@/tomas135/blog/ai-generated-chess-puzzles/j4zc0pmZ
0
12
3
Three different ways to represent colo(u)r. Work in progress, inspired by an old post by Kat Zhang / The Poet Engineer.
8 months ago
1
5
1
I made this annotated scatter plot of 1 million FineWeb-Edu documents for
@sashamtl.bsky.social
's new TED talk.
8 months ago
1
4
1
reposted by
Christopher Akiki
Xiaoyi
8 months ago
When the fish left the river:
2
150
47
Also really love how organic the plot looks with "inferno" (left) and "viridis" (right).
8 months ago
0
4
1
Thanks to
@jamesabednar.bsky.social
I realized I had used the wrong background color for the colormap I had chosen. This is another version of the plot (different embeddings) with the corrected background.
8 months ago
1
1
1
Map of the internet: 1.3M nodes (BGP)
8 months ago
4
30
8
reposted by
Christopher Akiki
Lichess
9 months ago
We're cooking.. 👀
4
19
1
526.9 million player deaths in 24.7 million levels of Super Mario Maker 2. Data by
@tgr.bsky.social
9 months ago
0
5
0
Really cool new embeddings exploration tool by
@domoritz.de
and colleagues from Apple. Can't wait to build with this. Also includes a streamlit component and a Jupyter widget.
12 months ago
1
2
0
Woah! EA just open sourced "Command and Conquer: Red Alert" and a bunch of other CnC games!
github.com/electronicar...
over 1 year ago
1
2
1
reposted by
Christopher Akiki
Lichess
over 1 year ago
Lichess is now on
@kaggle.com
! Use our puzzles, openings, and engine evaluation datasets directly in your kaggle notebooks:
https://www.kaggle.com/organizations/lichess
♟️
1
52
4
The folks at Foursquare released a
@hf.co
dataset of 104.5 million places of interest and here's all of them plotted using datashader
over 1 year ago
2
17
3
I recently used the
@lichess.org
puzzles dataset to experiment with chess position embeddings and visualize 4.5M starting positions. (
hf.co/datasets/Lic...
)
over 1 year ago
2
28
5
reposted by
Christopher Akiki
Lichess
over 1 year ago
The Lichess database of games, puzzles, and engine evaluations is now on
@hf.co
-
https://huggingface.co/Lichess
. Billions of chess data points to download, query, and stream and we're excited to see what you'll build with it! ♟️ 🤗
3
94
25
Early experiment visualizing Cohere For AI's newly released Aya dataset. Multilingual corpora are always so fun to play with.
over 2 years ago
0
2
0
Clifford-inspired strange attractor.
over 2 years ago
1
7
1
10 million digits of Pi. Kind of.
almost 3 years ago
1
2
0
835 languages. 3.5 million bible verses. Work in progress.
almost 3 years ago
1
4
0
UMAP connectivity graphs—with edgehammer bundling—are always something to gaze at.
almost 3 years ago
0
3
0
Revisiting John Williamson's prime factors plot with a few differences in implementation. I am using UMAP and Datashader to visualize the first million integers. Not quite there yet.
almost 3 years ago
0
3
0
Multilingual text corpus or Petri dish?
about 3 years ago
1
9
1
Code Dataset Visualization—11.66 million files from the Stack, a dataset sourced from permissively-licensed GitHub repositories spanning 86 programming languages (StarCoder languages subset).
about 3 years ago
1
14
5