
Presumably because it takes 6 months to distill Claude - but if they keep it closed like they are doing with Mythos it may take significantly longer.



They do quite a lot of distillation, as we've seen from the American open-weight models from AI2 (the OLMo series). They have a lot of incentive to distill beyond just copying: they're much more compute constrained. So open model companies distill, but they also do really good architectural work to make their models run faster. There are also technical challenges to distillation when all of the top models have their reasoning traces hidden, so we have to assume these open-weight labs have really great training pipelines as well.


