So if you’re streaming tokens from an LLM and you cancel the request from the client you’ll be wasting money.
So if you’re streaming tokens from an LLM and you cancel the request from the client you’ll be wasting money.