More

msdz · 2026-04-23T19:23:52 1776972232

Such an increase tracks the company's valuation trend, which they constantly, somehow have to justify (let alone break even on costs).

msdz · 2026-04-23T13:24:09 1776950649

"Writing is nature's way of letting you know how sloppy your thinking is." – Dick Guindon

If your text hasn't undergone that process, it's still sloppy thinking.

msdz · 2026-04-20T09:34:17 1776677657

> I look at the starts when choosing dependencies, it's a first filter for sure.

Unfortunately I still look at them, too, out of habit: The project or repo's star count _was_ a first filter in the past, and we must keep in mind it no longer is.

> Good reminder that everything gets gamed given the incentives.

Also known as Goodhart's law [1]: "When a measure becomes a target, it ceases to be a good measure".

Essentially, VCs screwed this one up for the rest of us, I think?

[1] https://en.wikipedia.org/wiki/Goodhart%27s_law

yuppiepuppie · 2026-04-20T10:11:30 1776679890

> The project or repo's star count _was_ a first filter in the past, and we must keep in mind it no longer is.

Id suggest the first question to ask is "if the project is an AI project or not?" If it is, dont pay attention to the stars - if it's not, use the stars as a first filter. That's the way I analyse projects on Github now.

GrinningFool · 2026-04-20T11:31:07 1776684667

> The project or repo's star count _was_ a first filter in the past, a

I agree that it has been a first filter, but should it ever have been? A star only says that someone had a passing interest in a project. Not significantly different from a 'like' on a social media post.

msdz · 2026-04-01T16:24:06 1775060646

This got me thinking, and it might actually even be a comparable amount. Let's estimate 12 years of schooling run at minimum $100,000 per student, at least in the US [1], and then add onto that number whatever else you may do after that, i.e. a bunch more money if paid (college) or "unpaid" (self-taught skills and improvements) education, and then the likely biggest portion for white-collar workers, yet hard-to-quantify, in experience and "value" professional work will equip one with.

Now divide the average SOTA LLM's training cost (or a guess, since these numbers aren't always published as far as I'm aware) by the number of users, or if you wanted to be more strict, the number of people it's proven to be useful for (what else would training be for), and it might not be so far off anymore?

Of course, whether it makes sense to divide and spread out the LLMs' costs across users in order to calculate an "average utility" is debatable.

[1] https://www.publicschoolreview.com/average-spending-student-...

msdz · 2026-03-31T18:50:55 1774983055

Personally I've heard Odin [1] to do a decent job with this, at least from what I've superficially learned about its stdlib and included modules as an "outsider" (not a regular user). It appears to have things like support for e.g. image file formats built-in, and new things are somewhat liberally getting added to core if they prove practically useful, since there isn't a package manager in the traditional sense. Here's a blog post by the language author literally named "Package Managers are Evil" [2]

(Please do correct me if this is wrong, again, I don't have the experience myself.)

[1] https://pkg.odin-lang.org/

[2] https://www.gingerbill.org/article/2025/09/08/package-manage...

msdz · 2026-03-23T20:46:23 1774298783

The difference is that the work a contracted tradesperson will do is typically under some sort of guarantee, e.g. typically 2 years on work done in your home (up to 5 for bigger construction etc. type work), at least here in Germany… which you don’t (need to) factor in when DIY-ing.

msdz · 2026-03-20T21:49:08 1774043348

> that they're unable to [manage and] kill child processes they themselves spawn makes it seem like they have zero clue about what they're doing.

Yeah, at the bare minimum these projects could also use something like portless[1] which literally maps ports to human- (and language model-)readable, named .localhost URLs. Which _should_ heavily alleviate assignment of processes to projects and vice versa, since at that point, hard-to-remember port numbers completely leave the equation. You could even imagine prefixing them if you've got that much going on for the ultimate "overview", like project1-db.localhost, project1-dev.localhost, etc.

[1] https://port1355.dev/

embedding-shape · 2026-03-21T10:50:34 1774090234

Well, or just use port 0 like we've done for decades, read what port got used, then use that. No more port collisions ever. I thought most people were already aware of that by now, but judging from that project even existing, seems I was wrong.

hackerthemonkey · 2026-03-22T15:57:35 1774195055

That’s a little different, right? Using port 0 would imply that clients have not hard coded what port they should connect to and also we don’t mind having duplicate processes occupying other ports which are no longer on active use

msdz · 2026-03-17T12:49:12 1773751752

Felt an instant urge to nuke your comment if I could. Excellent work.

msdz · 2026-03-06T13:08:49 1772802529

Interesting article you’ve linked. I’m not sure I agree, but it was a good read and food for thought in any case.

Work is still being done on how to bulletproof input “sanitization”. Research like [1] is what I love to discover, because it’s genuinely promising. If you can formally separate out the “decider” from the “parser” unit (in this case, by running two models), together with a small allowlisted set of tool calls, it might just be possible to get around the injection risks.

[1] Google DeepMind: Defeating Prompt Injections by Design. https://arxiv.org/abs/2503.18813

zbentley · 2026-03-06T13:51:04 1772805064

Sanitization isn’t enough. We need a way to separate code and data (not just to sanitize out instructions from data) that is deterministic. If there’s a “decide whether this input is code or data” model in the mix, you’ve already lost: that model can make a bad call, be influenced or tricked, and then you’re hosed.

At a fundamental level, having two contexts as suggested by some of the research in this area isn’t enough; errors or bad LLM judgement can still leak things back and forth between them. We need something like an SQL driver’s injection prevention: when you use it correctly, code/data confusion cannot occur since the two types of information are processed separately at the protocol level.

TheFlyingFish · 2026-03-06T20:44:12 1772829852

The linked article isn't describing a form of input sanitization, it's a complete separation between trusted and untrusted contexts. The trusted model has no access to untrusted input, and the untrusted model has no access to tools.

Simon Willison has a good explainer on CaMeL: https://simonwillison.net/2025/Apr/11/camel/

zbentley · 2026-03-07T02:22:52 1772850172

That’s still only as good as the ability of the trusted model to delineate instructions from data. The untrusted model will inevitably be compromised so as to pass bad data to the trusted model.

I have significant doubt that a P-LLM (as in the camel paper) operating a programming-language-like instruction set with “really good checks” is sufficient to avoid this issue. If it were, the P-LLM could be replaced with a deterministic tool call.

msdz · 2026-03-05T09:27:09 1772702829

They’d probably get the farthest, but they won’t pursue that because they don’t want to end up leaking the original data from training. It is possible in regular language/text subsets of models to reconstruct massive consecutive parts of the training data [1], so it ought to be possible for their internal code, too.

[1] https://arxiv.org/abs/2601.02671

foota · 2026-03-05T20:52:36 1772743956

Copyright for me not for thee? :) That's a good point though. Maybe they could round trip things? E.g., use the model trained only on internal content to generate training data (which you could probably do some kind of screening to remove anything you don't want leaking) and then train a new model off just that?