This was my suspicion. They had a bad training run that was really good at a few... | Hacker News


mortsnort 3 days ago | on: Exploiting the most prominent AI agent benchmarks

This was my suspicion. They had a bad training run that was really good at a few things.


latentsea 3 days ago

Doesn't seem to match the buzz internally at Anthropic about it?
