Kyutai
@kyutai-labs.bsky.social
https://kyutai.org/
Open-Science AI Research Lab based in Paris
Our latest open-source speech-to-text model just claimed 1st place among streaming models and 5th place overall on the OpenASR leaderboard. While all other models need the whole audio, ours delivers top-tier accuracy on streaming content. Open, fast, and ready for production!
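For a sense of what "streaming" means in practice, here is a minimal Python sketch of the contract such a model exposes: audio is pushed frame by frame and text comes back as soon as it is available, so the full recording is never required up front. The `model.step` call is a hypothetical placeholder, not Kyutai's actual API.

```python
# Minimal sketch of a streaming STT loop, assuming a hypothetical incremental
# interface (`model.step`) -- NOT Kyutai's actual API. The point is the
# contract: audio goes in frame by frame, text comes out as soon as it is ready.

import soundfile as sf  # pip install soundfile

FRAME_SECONDS = 0.08  # push audio in small frames, as if it were live


def transcribe_stream(path: str, model) -> str:
    """Feed a wav file to `model` frame by frame and collect partial text."""
    audio, sample_rate = sf.read(path, dtype="float32")
    if audio.ndim > 1:                       # downmix stereo to mono
        audio = audio.mean(axis=1)

    frame_len = int(FRAME_SECONDS * sample_rate)
    transcript = []
    for start in range(0, len(audio), frame_len):
        frame = audio[start:start + frame_len]
        new_text = model.step(frame)         # hypothetical incremental decode
        if new_text:                         # text arrives before the file ends
            print(new_text, end="", flush=True)
            transcript.append(new_text)
    return "".join(transcript)
```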
5 months ago
Talk to unmute.sh, the most modular voice AI around. Empower any text LLM with voice, instantly, by wrapping it with our new speech-to-text and text-to-speech. Any personality, any voice. Interruptible, smart turn-taking. We'll open-source everything within the next few weeks.
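The core idea is a cascade: speech-to-text in front of an unchanged text LLM, text-to-speech behind it. Below is a minimal Python sketch of that wrapping pattern; `stt`, `llm`, and `tts` are hypothetical stand-ins rather than Unmute's real interfaces.

```python
# Sketch of the wrapping pattern: keep the text LLM unchanged and bolt
# speech-to-text on the input side and text-to-speech on the output side.
# `stt`, `llm`, and `tts` are hypothetical stand-ins (any objects with the
# methods used below), not Unmute's actual interfaces.

from dataclasses import dataclass, field
from typing import Any


@dataclass
class VoiceWrapper:
    stt: Any                      # object with .transcribe(audio) -> str
    llm: Any                      # callable: message history -> reply str
    tts: Any                      # object with .synthesize(text) -> audio
    history: list = field(default_factory=list)

    def handle_turn(self, user_audio) -> bytes:
        # 1. Transcribe the user's speech into text.
        user_text = self.stt.transcribe(user_audio)
        self.history.append({"role": "user", "content": user_text})

        # 2. Ask the unchanged text LLM for a reply; the "personality" lives
        #    entirely in its system prompt and conversation history.
        reply_text = self.llm(self.history)
        self.history.append({"role": "assistant", "content": reply_text})

        # 3. Speak the reply in whatever voice the TTS is configured with.
        return self.tts.synthesize(reply_text)
```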
6 months ago
Thrilled to announce Helium 1, our new 2B-parameter LLM, now available alongside dactory, an open-source pipeline to reproduce its training dataset covering all 24 EU official languages. Helium sets new standards within its size class on European languages!
6 months ago
Have you enjoyed talking to Moshi and dreamt of making your own speech-to-speech chat experience? It's now possible with the moshi-finetune codebase! Plug in your own dataset and change the voice/tone/personality of Moshi. Here is an example after finetuning with only 20 hours of the DailyTalk dataset. 🧵
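If you want to try this on your own recordings, the first step is cataloguing your audio clips. The sketch below builds a simple JSONL manifest; the `{"path", "duration"}` schema is an assumption for illustration only, so check the moshi-finetune README for the exact data format it expects.

```python
# Hypothetical helper for listing fine-tuning clips in a JSONL manifest.
# The {"path", "duration"} record schema is an assumption for illustration;
# consult the moshi-finetune README for the exact format it requires.

import json
from pathlib import Path

import soundfile as sf  # pip install soundfile


def build_manifest(audio_dir: str, out_path: str) -> int:
    """Scan a directory of .wav clips and write one JSON record per clip."""
    count = 0
    with open(out_path, "w") as out:
        for wav in sorted(Path(audio_dir).glob("*.wav")):
            info = sf.info(wav)
            record = {"path": str(wav), "duration": info.frames / info.samplerate}
            out.write(json.dumps(record) + "\n")
            count += 1
    return count


if __name__ == "__main__":
    n = build_manifest("data/dailytalk_clips", "data/train.jsonl")
    print(f"wrote {n} entries")
```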
7 months ago
Meet MoshiVis, the first open-source real-time speech model that can talk about images! It sees, understands, and talks about images naturally, and out loud. This opens up new applications, from audio description for the visually impaired to visual access to information.
8 months ago
Even Kavinsky can't break Hibiki! Just like Moshi, Hibiki is robust to extreme background conditions.
9 months ago
Meet Hibiki, our simultaneous speech-to-speech translation model, currently supporting 🇫🇷➡️🇬🇧. Hibiki produces spoken and text translations of the input speech in real time, while preserving the speaker's voice and adapting its pace to the semantic content of the source speech. 🧵
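The distinctive part is the simultaneity: translated speech and text are emitted while the source speaker is still talking. A tiny Python sketch of that consumption pattern, with a hypothetical `model.step` placeholder rather than the released inference API:

```python
# Tiny sketch of the simultaneous-translation consumption pattern: source
# speech goes in incrementally and translated audio/text come back while the
# speaker is still talking. `model.step` is a hypothetical placeholder, not
# the released Hibiki inference API.


def simultaneous_translate(source_frames, model):
    """Yield (translated_audio, translated_text) chunks as they are produced."""
    for frame in source_frames:          # e.g. successive 80 ms French frames
        out_audio, out_text = model.step(frame)
        # The model decides *when* to emit speech, pacing itself according to
        # how much source content it has heard, instead of waiting for the end.
        if out_audio is not None:
            yield out_audio, out_text
```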
9 months ago
Helium 2B running locally on an iPhone 16 Pro at ~28 tok/s, faster than you can read your loga lessons in French. All that thanks to mlx-swift with q4 quantization!
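For a desktop-side equivalent of that setup, here is a hedged sketch using the Python mlx-lm package (Apple silicon) instead of mlx-swift; whether the Helium checkpoint converts cleanly with these exact commands and arguments is an assumption, so treat it as a starting point rather than a verified recipe.

```python
# Rough Python-side counterpart of the on-device mlx-swift setup, using the
# mlx-lm package on Apple silicon. The conversion command and generate()
# arguments follow current mlx-lm usage and may drift between versions;
# whether the Helium checkpoint converts cleanly is an assumption.

# One-time 4-bit quantization of the Hugging Face checkpoint:
#   python -m mlx_lm.convert --hf-path kyutai/helium-1-preview-2b \
#       --mlx-path helium-1-2b-q4 -q

from mlx_lm import load, generate

model, tokenizer = load("helium-1-2b-q4")
text = generate(model, tokenizer, prompt="Bonjour, je suis", max_tokens=64)
print(text)
```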
10 months ago
Meet Helium-1 preview, our 2B multi-lingual LLM, targeting edge and mobile devices, released under a CC-BY license. Start building with it today!
kyutai/helium-1-preview-2b · Hugging Face
https://huggingface.co/kyutai/helium-1-preview-2b
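A quick-start sketch for the checkpoint linked above, assuming a transformers release recent enough to include the Helium architecture; the prompt and generation settings are illustrative, not recommendations from the model card.

```python
# Minimal quick-start for kyutai/helium-1-preview-2b, assuming a recent
# transformers release with Helium support. Prompt and generation settings
# are illustrative only.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "kyutai/helium-1-preview-2b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

inputs = tokenizer("Helium is a small multilingual model that", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```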
10 months ago