
Not sure I understand: wouldn't permissions prevent this? The user runs with `--dangerously-skip-permissions`, so they can expect wild behaviour. They should run with permissions enabled and a ruleset.


You could prevent this even with `--dangerously-skip-permissions` via a simple PreToolUse hook.
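For anyone unfamiliar: hooks are registered in the settings file, and a PreToolUse entry runs an arbitrary command before each matching tool call. Roughly like this (the matcher and script path are illustrative, not official; check the hooks docs for the exact schema):

```json
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          { "type": "command", "command": "python3 /path/to/guard.py" }
        ]
      }
    ]
  }
}
```

As I understand the docs, if the guard command exits with code 2, the tool call is blocked and its stderr is fed back to the model, regardless of permission mode.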


Who knows whether permissions would prevent this? Anthropic's documentation on permissions (https://code.claude.com/docs/en/permissions) does not describe how permissions are enforced; a slightly uncharitable reading of "How permissions interact with sandboxing" suggests that they are not really enforced and any prompt injection can circumvent them.


With hooks you can enforce permissions much more concretely.


Perhaps they're more functional. Hooks are configured in the same settings file, which makes me pretty skeptical in the absence of explicit confirmation that they represent a stronger security boundary. (But of course, this is a fundamental challenge with LLM agent security - if you're using a well-aligned model that doesn't want to be prompt injected, how do you go about auditing something like this?)


Ya, they definitely can't stop everything. Nothing can be stopped if you allow Python, honestly. But hooks are guaranteed to fire on every tool use, so you can bake in explicit rejections for different patterns based on regex, which can catch a lot of nonsense.


The rules and permissions are no longer program flags, but plain text for the agent to "obey".


That's not what tool use permissions are. The LLM doesn't just magically spawn processes or run code. The Claude Code program itself does those things when the LLM indicates that it wants to. The program has checks and permissions that determine whether those things will actually be done or not.


Claude Code has a sandboxing functionality that works the way you're describing when you opt into it, but my understanding is that the Claude Code program in the default configuration does not second-guess the LLM's decisions on what it'd like to run. Has Anthropic said something to the contrary?



