Max kuhn
@topepo.bsky.social
📤 4959
📥 298
📝 220
Writing modeling packages at @posit.co (née RStudio). Opinions are my own.
https://max-kuhn.org/
pinned post!
I last posted here about 6 months ago. Here's what I've been working on and/or thinking about.
#rstats, #statistics, #ml
package upkeep: we are doing major preventive maintenance on the tidymodels packages ("upkeep week!"). It's rote but very rewarding work. Error messages are 100x better. 1/3
Applied Machine Learning for Tabular Data
https://aml4td.org
over 1 year ago
1
27
4
Oooof
16 days ago
3
17
3
reposted by
Max kuhn
Emil Hvitfeldt
24 days ago
I'm at #PyConUS 2026 🙌 If you are here too, please come by and say hi! I would love to hear what you are working on. I will most likely be found at the Posit booth.
0
3
1
"Posit Assistant and NES reflect our learnings from twenty years of building for data scientists and researchers. As the token subsidies end, the differentiator between products becomes more about the value those products provide, and we think we’re ahead in value for the data community."
26 days ago
0
3
1
reposted by
Max kuhn
Dr Di Cook
28 days ago
Super excited to get this at my door today! Ursula Laa and my book on exploring high-d data and models is in print! Book website is www.routledge.com/Interactivel... if interested I think my 30% discount code might work for you - msg me. Time to update dicook.github.io/mulgar_book/ too! #rstats
4
153
35
If I were Campar, I would insist on *Doctor* Sticks With Meat On Them.
#CaptivesWar
28 days ago
0
1
0
reposted by
Max kuhn
Simon P. Couch
about 1 month ago
The newest release of Posit Assistant, an agent for coding and data analysis, includes a "data cleaning mode." When enabled, the agent will run quality checks and surface decisions about e.g. import issues, factor levels, etc to the user. In the AI Newsletter:
opensource.posit.co/blog/2026-05...
2
23
12
Tom Morello and Bruce Springsteen is the best superhero team up ever.
about 1 month ago
1
4
1
reposted by
Max kuhn
rOpenSci
about 1 month ago
🗒️ [blog] New Mentoring Team with Experienced Mentors and New Voices. Meet the new team of mentors who will accompany our Champions in their projects for a year, sharing experience, guidance, and enthusiasm. https://ropensci.org/blog/2026/05/06/mentors-2026/ #rstats #mentoring #rse #foss
1
15
7
reposted by
Max kuhn
Dr Mircea Zloteanu 🌺🌞🍃
about 1 month ago
#statstab #526 Splines, B-splines, P-splines, and a disapproving kitten. Thoughts: Nice R tutorial on splines with some explanations and illustrations. #splines #r #rstats #tutorial #guide #polynomial #nonlinear blog.djnavarro.net/posts/2025-0...
Splines, B-splines, P-splines, and a disapproving kitten – Notes from a data witch
No, I do not care about splines. But I am trying to learn about GAMLSS regression, and yes, it is to this dark place that this topic has taken me
https://blog.djnavarro.net/posts/2025-09-06_p-splines/
3
18
6
reposted by
Max kuhn
Merriam-Webster
about 1 month ago
We have friends everywhere.
32
1901
433
It's been about 8 years but better late than never!
about 1 month ago
0
5
0
We've had a flurry of new tidymodels #rstats releases. Here's a summary of what's been going on, with a focus on ordered outcomes and quantile regression. opensource.posit.co/blog/2026-04...
New tidymodels Releases for April 2026
A small cascade of CRAN releases brings new model types to tidymodels.
https://opensource.posit.co/blog/2026-04-27_tidymodels-april-2026/
about 1 month ago
0
47
9
reposted by
Max kuhn
Antoine Vernet
about 2 months ago
I managed to install panache (panache.bz) and for now I only got it to do one thing, but it is one thing that is really nice to do with a one-liner: put every sentence in a Quarto document on a new line. So the economy of a paragraph is visible at a glance. #writing #quarto
Panache: Language Server, Formatter, and Linter for Pandoc, Quarto, and R Markdown
Panache provides authoring tools for Pandoc, Quarto, and R Markdown: a language server, formatter, and linter to help you write better documents, more efficiently.
https://panache.bz
0
14
6
reposted by
Max kuhn
Thomas Lin Pedersen
about 2 months ago
I am excited beyond description to lift the veil on what we have been working on in 2026: Please meet ggsql! A new extension of the SQL language for creating visualisations using the grammar of graphics. Read all about it in the blog post or visit the website at ggsql.org
ggsql: A grammar of graphics for SQL
Introducing ggsql, a grammar of graphics for SQL that lets you describe visualizations directly inside SQL queries.
https://opensource.posit.co/blog/2026-04-20_ggsql_alpha_release/
13
391
97
I loved reading this. It’s so well written!
about 2 months ago
0
12
1
reposted by
Max kuhn
Joachim Schork
about 2 months ago
How to tell if a statistical method really works? Monte Carlo simulations give a clear answer. Instead of relying on one dataset, you generate data repeatedly and compare methods across many runs. New module in the Statistics Globe Hub: statisticsglobe.com/hub #Statistics #DataScience #RStats
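A minimal sketch of the idea in this post, not taken from the linked module: the setup below is an assumption chosen for illustration. It draws many normal samples and compares two estimators of the center (sample mean vs. sample median) by their average squared error across runs. The function name `simulate`, the sample size, the number of runs, and the seed are all hypothetical choices.

```python
import random
import statistics


def simulate(n_runs=2000, n_obs=30, true_center=0.0, seed=1):
    """Compare the sample mean and sample median over repeated simulated data sets.

    Returns the Monte Carlo estimate of the mean squared error (MSE)
    of each estimator, as a (mean_mse, median_mse) tuple.
    """
    rng = random.Random(seed)  # local generator so results are reproducible
    mean_errors = []
    median_errors = []
    for _ in range(n_runs):
        # Generate a fresh data set on every run (standard deviation fixed at 1).
        sample = [rng.gauss(true_center, 1.0) for _ in range(n_obs)]
        # Record each method's squared error against the known truth.
        mean_errors.append((statistics.fmean(sample) - true_center) ** 2)
        median_errors.append((statistics.median(sample) - true_center) ** 2)
    # Average over runs to estimate each method's MSE.
    return statistics.fmean(mean_errors), statistics.fmean(median_errors)


mse_mean, mse_median = simulate()
print(f"MSE of sample mean:   {mse_mean:.4f}")
print(f"MSE of sample median: {mse_median:.4f}")
```

Because the truth is known in a simulation, each method can be scored directly. With normal data the mean should show the smaller error; swapping in a heavy-tailed generator is one way to see the ranking change.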
1
9
3
The panache #quartopub formatter and linter by @jolars.co is excellent! It is very configurable and uses external tools (e.g., air or ruff) to format code blocks. panache.bz
Panache: Language Server, Formatter, and Linter for Pandoc, Quarto, and R Markdown
Panache provides authoring tools for Pandoc, Quarto, and R Markdown: a language server, formatter, and linter to help you write better documents, more efficiently.
https://panache.bz/
2 months ago
0
16
2
#quartopub version 2 is being cooked up in the lab. @cscheid.net has a post discussing the why and when (with some hints about some related future projects) quarto.org/docs/blog/po...
What’s next: Quarto 2 – Quarto
We’ve started working on quarto-dev/q2, a full rewrite of Quarto in Rust.
https://quarto.org/docs/blog/posts/2026-04-06-whats-next-quarto-2/
2 months ago
0
6
0
For me, this could not have been any more on target.
2 months ago
1
5
1
We've released the first version of our tabpfn #rstats package to CRAN. This is an interface to the Python #TabPFN package. tidyverse.org/blog/2026/03... The model is a pre-trained deep learning model that has performed exceedingly well on every data set I've tested it on.
tabpfn 0.1.0
A new R package for tabular deep learning models.
https://tidyverse.org/blog/2026/03/tabpfn-0-1-0/
2 months ago
1
32
6
Our list of 2026 #rstats and #python summer internships has been posted. We can't wait to work with you and make great things! tidyverse.org/blog/2026/03...
2026 Posit Internships
Posit is sponsoring four summer internship positions in 2026.
https://tidyverse.org/blog/2026/03/2026-internships/
3 months ago
1
28
19
reposted by
Max kuhn
Posit
3 months ago
We’re thrilled to welcome Sara Altman and Simon Couch of the Posit AI Core team to the #positconf 2026 keynote stage! They’ll be sharing a practical, hype-free look at how AI is being thoughtfully integrated into the tools you use every day. ✨ See what’s next for AI in open source: pos.it/conf
0
16
5
Cool little details about how to structure tree splits. And don't worry, the cat is not in the dryer. I checked.
3 months ago
0
4
2
reposted by
Max kuhn
Emil Hvitfeldt
3 months ago
I'm over the moon excited for the release of orbital 0.5.0 🛰️ This release adds full support for boosted trees, faster creation of orbital objects, and optimized execution! We can finally reliably predict with an xgboost model from a database! tidyverse.org/blog/2026/03... #rstats #tidymodels
orbital 0.5.0
orbital 0.5.0 is on CRAN! More models and faster execution.
https://tidyverse.org/blog/2026/03/orbital-0-5-0/
2
44
9
reposted by
Max kuhn
Emil Hvitfeldt
3 months ago
something big is coming 🛰️
0
29
1
🤩
3 months ago
0
1
0
reposted by
Max kuhn
Charlie Gao
3 months ago
I presented at Shiny in Production 2025, an incredible conference hosted by @jumpingrivers.com up in Newcastle! I was glad to share the very latest from the Shiny team directly. The topics were bleeding edge at the time, so still new and relevant now. My video: www.youtube.com/watch?v=vxai...
0
20
4
reposted by
Max kuhn
Simon P. Couch
3 months ago
We are covering 40 people's travel, lodging, and registration for posit::conf() this fall! If you are from a group that is underrepresented in data science or open source, please consider applying for the Opportunity Scholarship—we'd love to have you join.
posit.co/blog/apply-t...
2
21
15
reposted by
Max kuhn
Jim Gruman
3 months ago
March's tabular playground #rstats #databs #tidytuesday www.kaggle.com/code/jimgrum...
0
3
1
reposted by
Max kuhn
Edgar
3 months ago
#tidymodels now has its very first cheatsheet! "Preprocessing data with {recipes}" is now available in Web and PDF versions here: rstudio.github.io/cheatsheets/... #rstats #posit #rstudio
0
49
14
A very helpful post on AI helpers, especially if you are new to AI and Claude
3 months ago
1
12
0
Too bad Claude can’t shovel snow.
3 months ago
2
8
1
reposted by
Max kuhn
The Data Science & AI Conference Presented by Lander Analytics
4 months ago
Here’s a clip from Max Kuhn (@topepo.bsky.social) of Posit breaking down how we can truly quantify LLM performance using a clear, generalizable framework. See the full conference talk here: youtu.be/TQKbaIR-8J4 #AI #MachineLearning #DataBS
0
3
1
reposted by
Max kuhn
Sharon Machlis
4 months ago
Want to check if code using #GenAI generates the responses you want? Here's how to automate LLM evals with the {vitals} #RStats 📦 by @simonpcouch.com @posit.co My latest at #InfoWorld: www.infoworld.com/article/4130... #LLMs
How to choose the best LLM using R and vitals
Use the vitals package with ellmer to evaluate and compare the accuracy of LLMs, including writing evals to test local models.
https://www.infoworld.com/article/4130274/how-to-choose-the-best-llm-using-r-and-vitals.html
0
7
2
reposted by
Max kuhn
@kearneymw
4 months ago
random.seed(42) </cringe>
1
4
1
With bookdown.org being decommissioned, I've been working on converting #rstats books from #bookdown to #quartopub. After spending a lot of time with Claude to get it right for my two books, here's a repo that might help you do the same (especially with Claude Code): github.com/topepo/bookd...
4 months ago
0
61
12
reposted by
Max kuhn
Davis Vaughan
4 months ago
Last week we released dplyr 1.2.0, but we left off something VERY important 🙂 `dplyr::if_else()` and `dplyr::case_when()` are now up to 30x faster and use 10x less memory! We dive into how we achieved these numbers in this new #rstats post! tidyverse.org/blog/2026/02...
`dplyr::if_else()` and `dplyr::case_when()` are up to 30x faster
dplyr 1.2.0 comes with much faster and more memory efficient `if_else()` and `case_when()` functions!
https://tidyverse.org/blog/2026/02/dplyr-performance/
3
127
21
reposted by
Max kuhn
Nick Strayer
4 months ago
For more than a year I have been working on a brand new Jupyter Notebook editor for Positron. This is a ground-up build of a new Jupyter Notebook experience built to leverage all the knowledge and tools Posit/Positron brings to the data science table. 🧵#jupyter
1
26
8
reposted by
Max kuhn
Davis Vaughan
4 months ago
dplyr 1.2.0 is out now and we are SO excited!
- `filter_out()` for dropping rows
- `recode_values()`, `replace_values()`, and `replace_when()` that join `case_when()` as a complete family of recoding/replacing tools
These are huge quality of life wins for #rstats!
tidyverse.org/blog/2026/02...
dplyr 1.2.0
dplyr 1.2.0 fills in some important gaps in dplyr's API: we've added a new complement to `filter()` focused on dropping rows, and we've expanded the `case_when()` family with three new recoding and re...
https://tidyverse.org/blog/2026/02/dplyr-1-2-0/
12
464
145
reposted by
Max kuhn
Rami Krispin
4 months ago
The hexagon here is priceless 😎
taf-society.github.io/caretForecast/
#rstats
#timeseries
caretForecast
Conformal Time Series Forecasting Using Machine Learning
https://taf-society.github.io/caretForecast/
1
16
2
reposted by
Max kuhn
Isabella Velásquez
4 months ago
Tomorrow at the Data Science Lab 🧪 we are hearing from the amazing @theotheredgar.bsky.social about the {mall} package: Run Natural Language Processing against your #RStats tibbles or #Python Polars DataFrames for sentiment analysis, text summaries, and more! Join us at 12 pm ET: pos.it/dslab
0
17
2
reposted by
Max kuhn
Davis Vaughan
5 months ago
I sent 200 pull requests using Claude Code and wrote about the experience. It's pretty wild! For dplyr releases, we send a PR any time we break an #rstats package. This release advances a lot of deprecated functions, triggering issues in many old packages! blog.davisvaughan.com/posts/2026-0...
Semi-automating 200 Pull Requests with Claude Code – Davis Vaughan
https://blog.davisvaughan.com/posts/2026-01-09-claude-200-pull-requests/
6
61
14
reposted by
Max kuhn
Emil Hvitfeldt
6 months ago
We are excited to see that xgboost recently had a big CRAN release! We have worked hard on the tidymodels team to make sure you all have a smooth transition. Please let us know if you are experiencing any issues with the releases. tidyverse.org/blog/2025/12... #rstats #tidymodels
tidymodels & xgboost
The tidymodels ecosystem is prepared for big xgboost CRAN release.
https://tidyverse.org/blog/2025/12/tidymodels-xgboost/
2
25
4
reposted by
Max kuhn
alex hayes
11 months ago
~~ making sense of academic statistics ~~ i wrote about the confusing relationship between statistics and data analysis, and also about how statistics relates to science #statistics #rstats #datascience www.alexpghayes.com/post/making-...
15
112
27
reposted by
Max kuhn
Emil Hvitfeldt
6 months ago
I'm excited to announce the newest release of {tidypredict}! This release brings standardization to outputs, faster trees for parsing and prediction, and glmnet support. tidyverse.org/blog/2025/12... #rstats #tidymodels
tidypredict 1.0.0
tidypredict 1.0.0 brings faster computations for tree-based models, more efficient tree representations, glmnet model support, and a change in how random forests are handled.
https://tidyverse.org/blog/2025/12/tidypredict-1-0-0/
2
33
9
We’ve released two new tidymodels #rstats packages for feature selection: filter and important. tidyverse.org/blog/2025/11...
Two New tidymodels Packages
Two new tidymodels packages focus on supervised feature selection.
https://tidyverse.org/blog/2025/11/two-new-tidymodels-packages/
6 months ago
1
39
7
reposted by
Max kuhn
Joe Kirincic
7 months ago
I’m not aware of an Arrow or Parquet format, but there is the ONNX format (see onnx.ai). Depending on the model, you could try Posit’s orbital project, which translates your model to SQL (see orbital.tidymodels.org).
ONNX | Home
https://onnx.ai
0
0
2