
check out rtk, which does this for a bunch of commands


Do the larger LLM platforms just do this for you? Or perhaps they do it behind the scenes and charge you for the same number of tokens?


The tokens still land in the context window either way. Prompt caching gives you a discount on repeated input, but only for stable prefixes like system prompts. Git output changes every call, so it's always uncached, always full price. Nit reduces what goes into the window in the first place.
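For the curious, here's roughly what "reduces what goes into the window" can look like. This is just a toy sketch of the general idea, not how nit or rtk actually work: wrap the git call yourself and hand the model a compact summary instead of the raw output.

    #!/usr/bin/env python3
    # Toy sketch only -- not nit's or rtk's actual implementation. The point is
    # that the wrapper runs git itself and returns a short summary, so the
    # verbose output never lands in the model's context window.
    import subprocess

    def run(cmd):
        return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout

    def compact_status():
        # `git status --porcelain` is already terse; collapse it to two counts.
        lines = run(["git", "status", "--porcelain"]).splitlines()
        untracked = sum(1 for l in lines if l.startswith("??"))
        changed = len(lines) - untracked
        return f"{changed} changed, {untracked} untracked"

    def compact_diff():
        # `--stat` gives per-file churn; keep only the one-line summary at the end.
        stat = run(["git", "diff", "--stat"]).strip()
        return stat.splitlines()[-1] if stat else "no unstaged changes"

    if __name__ == "__main__":
        print(compact_status())
        print(compact_diff())

In an agent setup, the tool-call handler would return those summary strings to the model instead of the full git output.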


I was thinking more of the case where you write a prompt into an IDE that has first-party integration with an LLM platform (e.g. VS Code with GitHub Copilot). It would make sense on their end to reduce or strip redundant input before feeding the tokens to their models, just to increase throughput (more customers) and decrease latency (lower costs). They would be foolish not to do this kind of optimisation, so surely they must be doing it. Whether they would pass those token savings on to the user, I couldn't say.


no, because tool calls generally all happen client side. unless you mean a remote environment where Claude Code is running separately, but usually those aren't billed by the token.


this is awesome! thanks for sharing rtk.. going to check it out.



