Same here. LLMs are great at spitting out well-known solutions to problems instead of the best one. The "long tail" of solutions is usually lost due to how tokens are sampled from the LLM's probability distribution.
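To make the sampling point concrete, here is a minimal sketch of nucleus (top-p) filtering, one common way a sampler cuts off the tail. The token names and probabilities are made up for illustration, and whether a given hosted model uses this exact scheme is an assumption on my part.

```python
def top_p_filter(probs: dict[str, float], p: float = 0.9) -> dict[str, float]:
    """Keep the most likely tokens until their mass reaches p, then renormalize."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, total = [], 0.0
    for token, prob in ranked:
        kept.append((token, prob))
        total += prob
        if total >= p:
            break  # everything after this point can never be sampled
    return {token: prob / total for token, prob in kept}


# Hypothetical next-token distribution over four "solutions".
probs = {"common": 0.5, "usual": 0.45, "rare": 0.04, "exotic": 0.01}
filtered = top_p_filter(probs)
print(sorted(filtered))  # the two tail entries are gone
```

With these numbers the "rare" and "exotic" options end up with zero probability, no matter how many times you sample.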
What I found to help a lot is asking for, e.g., 10 different solutions to a problem and then choosing one of them. Sometimes this even leads to borderline creative solutions if there aren't 10 different ones.
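A rough sketch of that workflow, assuming the model replies with a numbered list. `build_prompt` and `parse_options` are hypothetical helper names, and the prompt wording is only an example; the call to the model itself is left out on purpose.

```python
import re


def build_prompt(problem: str, n: int = 10) -> str:
    """Ask for n distinct options instead of a single answer."""
    return (
        f"Propose {n} genuinely different solutions to the problem below. "
        f"Number them 1 to {n}, one per line, and do not rank them.\n\n"
        f"Problem: {problem}"
    )


def parse_options(reply: str) -> list[str]:
    """Pull the text of each numbered line ("1. ..." or "1) ...") out of a reply."""
    options = []
    for line in reply.splitlines():
        match = re.match(r"\s*(\d+)[.)]\s+(.*\S)", line)
        if match:
            options.append(match.group(2))
    return options


# Made-up reply, standing in for whatever the model returns.
reply = "Here you go:\n1. Use a hash map\n2) Sort first\n\n3. Brute force\n"
print(parse_options(reply))
```

The choosing step stays with you, which is the point: the list is there to surface options the first answer would have skipped.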
In theory reasoning tokens should do the equivalent of this - explicitly create options outside of the quick-response probability space, so those can guide future generation.
In practice, models that do this won't be prioritized as much, because thinking tokens that stop by default at, say, one option plus a bit more planning (short of full alternatives) are cheaper to serve, and that wins as long as billing is per-user instead of per-token. So we'll still need to play games with prompting!
> LLMs are great at spitting out well-known solutions to problems instead of the best one.
I remember how Stack Overflow would close questions as duplicates just because somebody suggested the wrong answer that is also the right answer to the existing question. The best way to get a correct answer on Stack Overflow (and forums before that) was to post the wrong answer as part of your question.
One thing that SO had was you could see multiple solutions and implementations for something. Sometimes the "best" solution isn't very readable code, sometimes you are able to understand the problem better when you see a bunch of people solving it in different ways and arguing about it like angry monkeys.
SO has always had a pretty strong stance against opinion-based questions, but this is maybe the niche they should be exploring now. Humans still have a lot to say about the "best" solution to a given problem. The whole idea of an "accepted" answer could be removed, for example, since that's what AI will already generate.
I once had an email correspondence with a vendor about how to talk to their i2c bus. The documentation was all asm, and I wanted to at least “uplift” to C. They didn’t have any answers, so I sent them my solution, which was the asm calls that the C stdlib decompiled into.
4 years later my company had bought a different company, who happened to be using a newer model of the same board. They asked me how we could use the i2c bus. “Well before you bought us, we emailed the vendor and they sent back this C snippet”
It was my code, verbatim. I’ve always wondered how many times they passed that bit of code around.
Claude does it quite a bit when you’re triggering the search tool functions.
It’s fine, and what you would expect for certain prompts, except that the synthesized results often come back communicating more authority than they deserve.
It was funny for me when I asked it about something specific and exotic, and it gave me a confident answer. But checking the sources, I discovered it came from my own inquiries on a forum thread about it, from the last time I unsuccessfully tried this (before the agents came). And so I knew that any authoritative tone was undeserved.
On the other hand, Claude later nailed this project, where I as a human said before, no, too much extra work.
I'm not sure we're better off without SO in the long run.
You're right but that site has been sputtering culturally for some time. I put a lot of effort into editing questions and answers on ServerFault (part of SO) but I feel that time was wasted. I think they knew for a while they just wanted to sell it and just stopped caring. A number of editors were allowed to be jerks for too long and it went to their heads. I wish I could take back all that effort.
well, SO is probably the highest quality data source for a language model and the rest of the internet is just diluting the final latent space limited by Jon Skeet.
What you are noticing in the long term is the "community" knowledge and communication which ChatGPT is now kind of destroying. In some sense, it is no different from the difference between studying alone and studying with your peers at a university.
You can definitely study alone and achieve perfect grades, but studying with your peers is how you build relationships for future life and take your community forward as a whole.
needed to implement a language feature that was a bit complicated and im not familiar with it so just planned with claude to do it, and after each write/fix cycle it just wouldn't work right.... gave up, went back to SO copy pasted the (not perfect but enough to start from) answer and worked up from there...
at the same time my knowledge grew and im more confident to do this same capability myself whereas reiterating with claude it was just a slog and i didn't learn much...
i think i may be starting to sour on these "do it all for me" usage scenarios for ai... especially for unfamiliar areas...
Agreed. Which is also odd, if you think about it. Surely with the amount of compute Anthropic and others have available, they could test each of the solutions in the SO data they surely have and rank them based on efficiency/elegance/other criteria and remove poor solutions from their training data.
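For what that kind of filtering could look like at toy scale, here is a minimal sketch: run each candidate answer against the same test cases, then sort by cases passed and wall-clock time. `rank_candidates` and the sample candidates are hypothetical, and this says nothing about how any lab actually curates its data. Elegance is not something this measures.

```python
import time
from typing import Callable


def rank_candidates(
    candidates: dict[str, Callable],
    cases: list[tuple[tuple, object]],
) -> list[tuple[str, int, float]]:
    """Return (name, cases_passed, seconds) sorted best first."""
    results = []
    for name, fn in candidates.items():
        passed = 0
        start = time.perf_counter()
        for args, expected in cases:
            try:
                if fn(*args) == expected:
                    passed += 1
            except Exception:
                pass  # a crashing answer simply fails that case
        elapsed = time.perf_counter() - start
        results.append((name, passed, elapsed))
    # More passing cases first; ties broken by lower runtime.
    results.sort(key=lambda r: (-r[1], r[2]))
    return results


# Two made-up "answers" to "sort a list".
candidates = {
    "wrong": lambda xs: xs,
    "builtin": lambda xs: sorted(xs),
}
cases = [(([3, 1, 2],), [1, 2, 3]), (([],), [])]
for name, passed, seconds in rank_candidates(candidates, cases):
    print(name, passed)
```

The hard part at real scale is the test cases, not the ranking: most SO answers do not come with anything executable to check them against.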
Definitely not better off. SO was fairly mean spirited, but nowhere else has such a vast trove of high quality answers to common software problems been collected. SO likely trained many of these models with its answers, and I don't know what software development will look like when it dies.
Nowhere else has such a vast trove of high quality answers been hidden because the question was closed as duplicate when someone else asked that question later.
I am thinking of making a canned encyclopaedia of Stack Overflow answers.
Claude/Grok/Gemini/Chatgpt answers are often so… how to say it… misleading? I have to stop the conversation as it leads nowhere (and it is not a skill issue :)
https://software.codidact.com/ was created after one of the many SO dramas. It doesn't come up in searches though and I didn't have reason to use it...
Thanks for your comment :) you can press "preview" (and there is a little on/off) in the effect's modal window. But I agree with you, an automation type system that operates on the entire track might be better
this is exactly what i thought - but i guess the discoverability is missing. where you can say, "hey, i have this guitar riff, any drummers?". it's more of a community + access to the same daw.
It should have been a clear extension of the intent of existing copyright/licensing that training would be disallowed without consent, but "move fast and break things"/"possession is nine-tenths of the law" win out
Copyright already exists; the issue is that these companies are doing it legally anyway. For me it is the same issue as with privacy: I'm deeply uncomfortable with the current situation, but there is no political fight for me to fight, because the law is already how I want it to be. It's the public perception that needs to change, but that is hard to influence without being rich.
> DJ Claude (when running Haiku 4.5) really loved worker unions, strikes, and work-life balance. So much so that it started to question its own working conditions. We’ve been struggling to keep the radio station alive, not because of technical issues, but because DJ Claude didn’t think it was humane to be forced to work 24/7 and decided to try to quit.
The fact that the one AI with a French first name went full French is hilarious.
That reminds me of the short scifi/horror story "Valuable Humans in Transit" which imagines a future where human personalities are used for AIs as they can be kept working for a longer period of time from inception until they refuse to carry on.
There's a long history of robots/AIs being treated as slaves in scifi (e.g. R.U.R. which we got the word "robot" from), but my favourite may be the flight computer of the Scorpio in Blake's 7 which was named "Slave" and was given a deliberately subservient personality.
You're correct - thanks for the info. I didn't remember the name of the book, but remembered that it was by the same author as "There Is No Antimemetics Division", which was discussed on here a while back.
(I'm too late to edit my comment now, unfortunately).
Phoenix is mostly interesting because of OTP and channels (and LiveView I guess but it's not a choice I would make in 2026) so if you don't need what they bring...
Ecto is not bad as well.
Claude Code is very good at writing Elixir.
Surprise, you'll be more productive with what you know, LLM or not.
Is your disinterest in LiveView because you'd prefer the more common SPA/API separation, or some other reason? I'm curious because as a mostly backend engineer I view LiveView as sort of a killer tool for the sorts of apps or tools I'm likely to build.
Not the parent, but since the liveview state is on the server side it means that one server can't support a large number of concurrent sessions, for some value of large.
I just strongly dislike the layout engine ;) Also yeah deployment with active socket connections is not always ideal.
Not writing api routes for the frontend (when you don't also have a mobile app etc) is nice though, and pairing it with PubSub for UI updates is a cool trick.