I was the only one who handed in a solution for that particular problem; it was scored 70 out of 100. I no longer have my solution, but I doubt it was very accurate, and I didn't have time for experiments.
For long agent sessions, I would expect a very high cache hit rate unless you're editing the system prompt, tools, or history between turns, or some turns take longer than the cache timeout.
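A rough way to see why: these caches key on the prompt prefix, so appending new turns preserves the cached span, while editing anything earlier in the prompt (system prompt, tool definitions, prior history) invalidates everything after the edit. A minimal sketch, character-level for simplicity (real caches work on tokens and block boundaries; the function name here is made up):

```python
def cached_fraction(prev_prompt: str, curr_prompt: str) -> float:
    """Fraction of the current prompt covered by the longest prefix
    shared with the previously-cached prompt."""
    shared = 0
    for a, b in zip(prev_prompt, curr_prompt):
        if a != b:
            break
        shared += 1
    return shared / len(curr_prompt) if curr_prompt else 0.0

# Appending a turn keeps the entire previous prompt as a shared prefix,
# so the cached fraction stays high as the session grows:
turn1 = "SYSTEM: be helpful\nUSER: hi\n"
turn2 = turn1 + "ASSISTANT: hello\nUSER: next question\n"
print(cached_fraction(turn1, turn2))  # high: everything before the new turn hits

# Editing the system prompt between turns breaks the shared prefix early,
# so almost nothing after the edit can be served from cache:
edited = turn2.replace("be helpful", "be terse")
print(cached_fraction(turn2, edited))  # low: mismatch occurs near the start
```

The same reasoning explains the timeout caveat: if a turn takes longer than the cache's TTL, the prefix is evicted and the next request pays full price even though nothing changed.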
So long as perceived LLM skill is still "spiky" - i.e. showing relatively high variation in ability within a domain (often depending on the task or user, to be fair) - people will continue to dismiss it.
In the general case I think this is a great idea - if we do a good job of documenting intent etc. in commit messages, agents have an easier time understanding why lines of code exist, with no additional specs/mechanisms/etc.
Interested to see what techniques in this area pull ahead and gain traction!
Interesting that non-salty water didn't make the string conductive(?) enough - I'd have thought there might be enough soluble material in the string.
Also, I believe this person runs the ISP I use (Andrews & Arnold - I couldn't speak more highly of it).
For me, a lot of the draw is that it's cheaper than managed DB services for my small/toy projects (the ones I don't want to use DynamoDB for). That, and in a previous job it was useful as relatively temporary multi-tenant storage.
The partner for these projects has a benchmark that the top frontier LLM labs seem to be running on their new model releases - I think there's _some_ value to these numbers in helping people compare and contrast model performance.
Seems like pure profit-maximizing/greed!