Tokenization Workshop (TokShop) @ICML2025
@tokshop.bsky.social
Let's Talk about Tokenization
https://tokenization-workshop.github.io
Videos of our invited talks and the panel discussion are now also available on YouTube:
www.youtube.com/@tokenizatio...
19 days ago
Videos from our Tokenization Workshop are now live! Watch invited talks, panel discussions, and the best paper presentation at
icml.cc/virtual/2025...
#Tokenization
#NLP
#LLMs
Tokenization Workshop (TokShop) ICML 2025
https://icml.cc/virtual/2025/workshop/39998
28 days ago
Announcing our Best Paper Awards! Winner: "BPE Stays on SCRIPT: Structured Encoding for Robust Multilingual Pretokenization"
openreview.net/forum?id=AO7...
Runner-up: "One-D-Piece: Image Tokenizer Meets Quality-Controllable Compression"
openreview.net/forum?id=lC4...
Congrats!
2 months ago
The Tokenization Workshop is happening NOW, and we have a packed room! It's great to see so much interest in tokenization research.
#ICML2025
#Tokenization
#LLM
#NLP
2 months ago
Three invited speakers will share their insights at TokShop! Hear from Yuval Pinter
@uvp.bsky.social
, Desmond Elliott
@delliott.bsky.social
, and Adrian Łańcucki on cutting-edge tokenization research. Don't miss these keynote presentations!
#ICML2025
tokenization-workshop.github.io/speakers
2 months ago
Meet our expert panelists! Join Albert Gu, Alisa Liu, Kris Cao, Sander Land, and Yuval Pinter as they discuss the Future of Tokenization on July 18 at 3:30 PM at TokShop at
#ICML2025
.
2 months ago
The TokShop schedule is now live! Join us at
#ICML2025
for invited talks, poster sessions, and a panel on the future of tokenization.
tokenization-workshop.github.io/schedule
#Tokenization
#LLM
#NLP
2 months ago
TokShop @
#ICML2025
got way more submissions than expected! We could really use a few more reviewers to help out. If you have the capacity to review a
#tokenization
paper by Saturday, please fill out this form:
forms.gle/32A6sQHQrMSb...
TokShop 2025
Registering interest in all things tokenization at TokShop @ ICML 2025 (July 18) Consider joining the Google group for future updates! https://groups.google.com/g/tokshop
https://forms.gle/32A6sQHQrMSb6hpE9
4 months ago
We are extending the submission deadline by 24 hours to avoid a conflict with the ACL camera-ready deadline. New submission deadline: May 31, 2025 (23:59 AoE). OpenReview:
openreview.net/group?id=ICM...
4 months ago
Got a good tokenization paper under review at COLM, but the scores were a letdown? Why bother with rebuttal when the perfect venue is right around the corner! Submit your paper to the
#ICML2025
Tokenization Workshop (TokShop) by May 30!
4 months ago
Beyond text: Modern AI tokenizes images too! Vision models split photos into patches, treating each 16x16 pixel square as a "token."
#VisualTokenization
Interested in tokenization? Join our workshop
tokenization-workshop.github.io
The submission deadline is already May 30!
https://tokenization-workshop.github.io/
4 months ago
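The patch idea in the post above can be sketched in plain Python. This is a minimal illustration, not any particular model's preprocessing: the function name `split_into_patches` and the 224x224 gradient image are assumptions made for the example, and real vision models usually go on to map each flattened patch through a learned embedding, which is not shown here.

```python
def split_into_patches(image, patch_size=16):
    """Split a 2D image (a list of pixel rows) into non-overlapping square patches.

    Returns the patches in row-major order. Each patch is a list of
    `patch_size` rows holding `patch_size` pixels each.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    if height % patch_size or width % patch_size:
        raise ValueError("image dimensions must be multiples of patch_size")

    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Slice patch_size rows, then patch_size columns from each row.
            patch = [row[left:left + patch_size]
                     for row in image[top:top + patch_size]]
            patches.append(patch)
    return patches


# Hypothetical 224x224 grayscale image filled with a simple gradient.
image = [[(x + y) % 256 for x in range(224)] for y in range(224)]
patches = split_into_patches(image)
num_tokens = len(patches)  # one "token" per 16x16 patch
```

With these assumed dimensions the image yields a 14 by 14 grid of patches, so the sequence length grows quadratically with image resolution for a fixed patch size.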
Got a tokenization paper rejected from ACL? Didn't submit to EMNLP/NeurIPS? Want to present your ACL/EMNLP/NeurIPS work non-archivally? Submit to TokShop @ ICML 2025! The deadline is already May 30!
openreview.net/group?id=ICM...
tokenization-workshop.github.io
4 months ago
Language matters: low-resource languages are severely overtokenized. While English uses ~1.2 tokens per word, Tamil, for example, requires more tokens than characters, making
#LLMs
much costlier for billions of speakers! Check out our ICML workshop:
tokenization-workshop.github.io
Tokenization Workshop @ ICML 2025
https://tokenization-workshop.github.io
4 months ago
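One way a language can end up with more tokens than characters, sketched under an assumption: a byte-level tokenizer that has learned few or no merges for a script falls back to roughly one token per UTF-8 byte. The post does not say which tokenizer it measured, so the snippet below only compares character and byte counts; the helper name `utf8_stats` is made up for this example.

```python
def utf8_stats(text):
    """Return (number of Unicode code points, number of UTF-8 bytes)."""
    return len(text), len(text.encode("utf-8"))


english = "Tamil"
# The word "Tamil" in Tamil script, written as escapes so the
# five code points are unambiguous.
tamil = "\u0BA4\u0BAE\u0BBF\u0BB4\u0BCD"

english_chars, english_bytes = utf8_stats(english)
tamil_chars, tamil_bytes = utf8_stats(tamil)

# Worst-case token count under the byte-fallback assumption, per character.
english_ratio = english_bytes / english_chars
tamil_ratio = tamil_bytes / tamil_chars
```

ASCII letters take one byte each in UTF-8, while every code point in the Tamil block takes three, so under the byte-fallback assumption the Tamil word costs up to three tokens per character before any merges apply.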
Did you know BPE (Byte Pair Encoding), the most common LLM tokenizer, was originally a compression algorithm from 1994?
#Tokenization
#LLM
#NLP
Want to find out more about tokenization? Attend our workshop at ICML!
tokenization-workshop.github.io
4 months ago
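To make the compression origin mentioned in the post concrete, here is a minimal sketch of the core BPE loop: repeatedly replace the most frequent adjacent pair of symbols with a single merged symbol. It works on one raw string, which is closer to the compression setting than to tokenizer training, where pairs are typically counted within pre-tokenized words across a corpus. The function names are illustrative assumptions, not any library's API.

```python
from collections import Counter


def most_frequent_pair(symbols):
    """Return the most frequent adjacent pair of symbols, or None if there is none."""
    pairs = Counter(zip(symbols, symbols[1:]))
    return pairs.most_common(1)[0][0] if pairs else None


def merge_pair(symbols, pair):
    """Replace each left-to-right, non-overlapping occurrence of `pair` with one symbol."""
    merged, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return merged


def bpe_compress(text, num_merges):
    """Apply up to `num_merges` greedy pair merges, starting from single characters."""
    symbols = list(text)
    merges = []
    for _ in range(num_merges):
        pair = most_frequent_pair(symbols)
        if pair is None:
            break
        symbols = merge_pair(symbols, pair)
        merges.append(pair)
    return symbols, merges


symbols, merges = bpe_compress("aaabdaaabac", num_merges=3)
# Joining the symbols recovers the input; the merge list is the learned "vocabulary".
```

The ordered merge list is what a tokenizer keeps: applying the same merges in the same order to new text reproduces the segmentation.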
Call for Papers alert: TokShop @ ICML 2025. TokShop explores tokenization across all data modalities. Topics include: subword NLP techniques, multimodal approaches, multilingual challenges, post-training modification, alternative representations, and statistical perspectives.
ICML 2025 Workshop TokShop
Welcome to the OpenReview homepage for ICML 2025 Workshop TokShop
https://openreview.net/group?id=ICML.cc/2025/Workshop/TokShop
4 months ago
Got a tokenization paper that just didn't make the cut for ICML? Submit it to the Tokenization Workshop TokShop at
#ICML2025
-- we'd love to see it there!
tokenization-workshop.github.io
5 months ago
NEW WORKSHOP ALERT: We're thrilled to announce the first-ever Tokenization Workshop (TokShop) at
#ICML2025
@icmlconf.bsky.social
! Submissions are open for work on tokenization across all areas of machine learning. Submission deadline: May 30, 2025
tokenization-workshop.github.io
5 months ago