Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

   This beta header hides thinking from the UI, since most people don't look at it.
How is this measured?
 help



And I wonder how redacting them reduces latency, as it sure as hell doesn’t make the responses any faster and bandwidth isn’t the issue here.

They provide thinking summaries, so I assume they have to call Haiku or some other model to summarise the thinking blocks.

That’s not asynchronous? Wouldn’t it make more sense to disable those thinking summaries in those cases rather than hiding the thinking altogether?



Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: