You are forgetting that with multiple high-quality open-weights models available, we are quickly reaching (or have already reached?) the point where using completely local models makes sense.
If the writer of the grandparent comment (the person who wrote about the secretary in the Philippines) is reading this, I would love it if you could run a simple experiment: instead of having them use SOTA models for the things they currently use AI for, have them use an open-source model (even a tiny-to-mid-size one) and see what happens.
> "My assistant in the phillipines has used it to substantially improve her communications, for instance."
So if they are using it for communications, honestly even a small-to-mid-size model would serve them well.
Please let me know how this experiment goes. I might write about it, and it's partly plain curiosity, but I would honestly be 99% certain that the differences are so negligible that using SOTA models, or even remotely hosted datacenter models, won't make much sense. Of course, we can say nothing without empirical evidence, which is why I'm asking you to test my hypothesis.
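For anyone who wants to try this, here is a minimal sketch of what the experiment could look like, assuming Ollama is serving a small model locally through its OpenAI-compatible API. The model name, prompt, and system instruction are illustrative placeholders, not anything the original commenter specified:

```python
# Sketch of the proposed experiment: run a typical communication task
# through a small local model served by Ollama's OpenAI-compatible API.
# Assumes `ollama serve` is running and `ollama pull llama3.2` was done;
# the model name and draft text below are illustrative placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

draft = "pls confirm meeting tmrw 3pm with client, also attach last invoice"
resp = client.chat.completions.create(
    model="llama3.2",
    messages=[
        {"role": "system",
         "content": "Rewrite the user's note as a short, professional email."},
        {"role": "user", "content": draft},
    ],
)
print(resp.choices[0].message.content)
```

The same script would talk to a hosted provider by swapping `base_url` and the model name, which makes a before/after comparison trivial.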
> You are forgetting that with multiple high-quality open-weights models available, we are quickly reaching (or have already reached?) the point where using completely local models makes sense.
I'm not, since I'm a heavy user of local models myself, and even with the beast of a card I work with locally every day (RTX Pro 6000), the LLMs you can run locally are basically toy models compared to the hosted ones. If you haven't already, I think you need to try it yourself to see the difference. I didn't mention or address it because it's basically irrelevant in this context.
And besides that, how affordable are GPUs today in the developing world? What about electricity costs? Thermals? Frequent blackouts? And so on; there are many variables you seemingly haven't considered yet.
The best way of measuring the difference between hosted models and local models is to run your own private benchmarks against both and compare. I've been doing this for years, and local models are still nowhere near the hosted ones, sadly. I'm eager for that day to come, but it will take a while yet.
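To make that concrete, here is a rough sketch of such a private benchmark, assuming both sides speak the OpenAI-compatible chat API (Ollama locally, any hosted provider remotely). The endpoint URLs, model names, and prompts are placeholders you would swap for your own:

```python
# Minimal sketch of a private benchmark: send the same prompts to a
# local endpoint and a hosted endpoint, then eyeball (or score) the
# outputs. Model names, URLs, and prompts are placeholders.
from openai import OpenAI

endpoints = {
    "local":  OpenAI(base_url="http://localhost:11434/v1", api_key="ollama"),
    "hosted": OpenAI(),  # reads OPENAI_API_KEY from the environment
}
models = {"local": "llama3.2", "hosted": "gpt-4o-mini"}

# Keep these prompts private so they can't leak into training data.
prompts = [
    "Summarize this email thread in two sentences: ...",
    "Rewrite this note as a polite follow-up: ...",
]

for name, client in endpoints.items():
    for prompt in prompts:
        resp = client.chat.completions.create(
            model=models[name],
            messages=[{"role": "user", "content": prompt}],
        )
        print(f"[{name}] {prompt[:40]}...\n{resp.choices[0].message.content}\n")
```

The important part is keeping the prompt set private so neither side could have been tuned on it; scoring can be manual or via a judge model.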
I’ve got an M3 Max with 64 GB of RAM and can run more than just toy models, even if they are obviously weaker than hosted ones. Honestly, I think local LLMs are the future, and we are just going to use hosted ones until hardware catches up (and now they have something to catch up to!).
> Honestly, I think local LLMs are the future, and we are just going to use hosted
Same here, otherwise I wouldn't be investing in local hardware :) But I'd be lying if I said it's ready for that today. I don't think the hardware has that much catching up to do; it's the software that has a bunch of low-hanging fruit available for performance and resource usage, since every release seems to favor "time to paper" above all else.
There is a lot you can do on local hardware already, and you don’t have to worry about safeguards or token limits. There are lots of wild models, especially Chinese ones, that have real capabilities and aren’t just there for academic papers.
Again, put those under test with your private benchmarks, then compare the results with hosted models.
I'm not saying it's completely useless, or that it won't get better in the future. What I am saying is that even the top "weights available" models today really don't come close to today's SOTA. This is very clear when you have benchmarks that give hard, concrete numbers and aren't influenced by public benchmarking data.
> even the top "weights available" models today really don't come close to today's SOTA.
This is the statement I'm disagreeing with. They do come close; even if they are somewhat weaker, the gap is a fixed distance, and hosted models aren't an order of magnitude better. Hosted models are still better, just not incredibly so.
I 100% agree with your comment: yes, we should test the models on our own private benchmarks, and there is no denying that local has a long way to go.
I was just proposing that local feels like the most sustainable way for things to go, perhaps even through an OpenRouter-like API, but you can read my other comment on how I found their finances to be loss-making or zero-profit. So it's good while it lasts (on the AI bubble's dime) if someone needs it, but long term I feel its prices are going to rise, whereas local would remain stable. (Also worth mentioning that there is no free lunch, so I think the losses will be distributed to everybody in the form of a financial crisis caused by AI. I hope the impact of that crisis lessens, because at this point I am genuinely worried about it.)
Agreed. I understand that right now, using these bubble-money-subsidized services might make sense (see my other comments where I went down the rabbit hole on how most of these companies are losing money or breaking even while investing billions).
Although these prices aren't sustainable, the one path that makes sense to me is a transition to local models (which, yes, I know are less efficient), and in my opinion that's the inevitability if the bubble bursts. There are definitely some steal deals on hardware nowadays if one wishes to take advantage of them.
Also, you may have misunderstood me in this comment (if so, my apologies): what I was describing was the secretary use case, not a company using AI or selling AI-related services that needs 24/7 access.
One wouldn't have to worry about blackouts either, because if your secretary's house has lost power, let's be honest: AI won't magically turn the lights on.
Also, the chips in our devices are beasts. I am pretty sure that for basic communication tasks like the grandparent comment described, even the "toy" models, as you call them, would very likely be "good enough".
So yeah, a tiny model running locally (perhaps even on the phone) can solve her use case, so the moat for AI companies is close to zero (as expected).
This is what I was trying to say actually, thanks for responding.
That being said, the original point about Americans/Europeans does become a bit moot after this, because I don't think most people are against small models; what they hate is SOTA models running in AI-centric datacenters, which act as a tax on them by raising electricity rates and so on while taking jobs from them.
A tiny model, on the other hand, doesn't do any of that. I do feel the concerns of American people about AI datacenters are valid, though, so I hope something can be done about it in a timely manner that brings real help to the average American.