the real reason it's a poor mental model is that modern LLM chatbots are not actually doing pure next-token prediction
the pretraining objective is "predict the next token", but the post-training objective is closer to "produce a response that is correct, properly formatted, and in line with style and safety guidelines"
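loosely, the two objectives can be caricatured like this (a toy sketch with made-up numbers, not a real training setup — the "reward model" here is just a stand-in function):

```python
import math

# a fixed toy distribution over next tokens, standing in for a model's output
probs = {"paris": 0.6, "london": 0.3, "cat": 0.1}

# pretraining objective: per-token cross-entropy.
# the model is scored token-by-token on matching the training text.
def next_token_loss(target):
    return -math.log(probs[target])

# post-training objective (sketch): a scalar reward on the whole response,
# judging correctness / formatting / style rather than per-token likelihood.
def reward(response):
    score = 0.0
    if "paris" in response:      # "correct"
        score += 1.0
    if response.endswith("."):   # "properly formatted"
        score += 0.5
    return score

print(round(next_token_loss("paris"), 3))  # low loss for a likely token
print(reward("the capital is paris."))     # one scalar for the whole response
```

the point of the caricature: the first objective is defined per token against a corpus, the second is defined over the entire response, which is why "it's just predicting the next word" stops being a good description after post-training.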
add a skeleton here at some point