feels like an insult to readers to try to pretend that their revenue per month i...

simonjgreen · 2026-03-31T21:12:42 1774991562

Anthropic is definitely gaining ground over OpenAI in the business world. Cowork is the absolute hotness right now, and even prompted MSFT to drop their own variant yesterday.

strongpigeon · 2026-03-31T21:15:45 1774991745

Ask anybody you know that works in Big Tech. They're all pushing hard for Claude Code adoption.

operatingthetan · 2026-03-31T21:19:55 1774991995

Codex and Gemini CLI seem 1-2 months behind Claude Code. They will catch up. This race will eventually be won by whoever can come up with the cheapest compute.

a1studmuffin · 2026-03-31T21:23:57 1774992237

And that's a dangerous game because the cheaper compute gets, the more likely consumers are to self-host rather than pay a subscription.

ds2df · 2026-03-31T21:26:04 1774992364

Apple could figure out a way to neatly package it into their ecosystem.

winrid · 2026-03-31T22:03:52 1774994632

Not really. Most people won't self host.

jonah · 2026-03-31T22:32:50 1774996370

The general public will self-host if it's built into your next phone or laptop straight out of the box, or maybe available from the App Store.

delecti · 2026-03-31T23:24:10 1774999450

I agree that that's what it would take, but compute would need to get very cheap for it to be feasible to keep models running locally. That's an awful lot of memory to have just sitting with the model running in it.

winrid · 2026-03-31T22:50:33 1774997433

True. I was thinking more of power users. Do you think Opus-level capabilities will run on your average laptop in a year? I think that's pretty far away, if ever.

zozbot234 · 2026-03-31T23:00:13 1774998013

You can demonstrate "running" the latest open Kimi or GLM model on a top-of-the-line laptop at very low throughput (Kimi at 2 tok/s, which is slow when you account for thinking time) today, courtesy of Flash-MoE with SSD weights offload. That's not Opus-like, it's not an "average" laptop and it's not really usable for non-niche purposes due to the low throughput. But it's impressive in a way, and it does give a nice idea of what might be feasible down the line.

miki123211 · 2026-03-31T21:26:12 1774992372

> how impossible is a world where open source base models are collectively trained similar to a proof of work style pool

Current multi-GPU training setups assume much higher bandwidth (and lower latency) between the GPUs than you can get with an internet connection. Even cross-datacenter training isn't really practical.

LLM training isn't embarrassingly parallel, not like crypto mining is for example. It's not like you can just add more nodes to the mix and magically get speedups. You can get a lot out of parallelism, certainly, but it's not as straightforward and requires work to fully utilize.

thomasahle · 2026-03-31T21:10:43 1774991443

It's hard to train models in the open. All the big players are using lots of "dodgy" training data, like books, video, code, and distillations. If you did that in the open, the lawyers would shut you down.

ravenstine · 2026-03-31T21:16:40 1774991800

Though I think these companies are wildly overvalued, I don't see LLMs as a service going away in the future. The value in OpenAI is that it provides extra compute, data access, etc. My money is on local AI becoming more of a thing, while services like OpenAI still exist for local AIs to consult with. If a local model can somehow know that it's out of its depth on a question/prompt, it can ask an OpenAI model if one is available, but otherwise still work locally if OpenAI fails to respond or goes out of business. To me that makes a lot more sense than the future being either-or.

clhodapp · 2026-03-31T21:20:51 1774992051

Models not being able to reliably know if they are out of their depth is a foundational limitation of the current generation of models, though.

Best they can do is to somewhat reliably react to objective signals that they've failed at something (like test failures).

Aurornis · 2026-03-31T21:22:41 1774992161

> What is their next step to ensure local models never overtake them?

As someone who experiments with local models a lot, I don’t see this as a threat. Running LLMs on big server hardware will always be faster and higher quality than what we can fit on our laptops.

Even in the future when there are open weight models that I can run on my laptop that match today’s Opus, I would still be using a hosted variant for most work because it will be faster, higher quality, and not make my laptop or GPU turn into a furnace every time I run a query.

zozbot234 · 2026-03-31T22:11:59 1774995119

If your laptop overheats when you push your GPU, you can buy purpose-built "gaming" laptops that are at least nominally intended to sustain those workloads with much better cooling. Of course, running your inference on a homelab platform deployed for that purpose, without the thermal constraints of a laptop, is also possible.

Aurornis · 2026-03-31T22:40:18 1774996818

I didn't say it overheats. It gets hot and the fans blow, neither of which are enjoyable.

MacBook Pro laptops are preferred over "gaming" laptops for LLM use because they have large unified memory with high bandwidth. No gaming laptop can give you as much high-bandwidth LLM memory as a MacBook Pro or an AMD Strix Halo integrated system. The discrete gaming GPUs are optimized for gaming with relatively smaller VRAM.

mlsu · 2026-03-31T21:18:44 1774991924

You can host a website on any rackmount server for pennies compared to AWS. But people still use AWS.

The market for local models is always gonna be a small niche, primarily for the paranoid.

lukan · 2026-03-31T21:22:57 1774992177

"The market for local models is always gonna be a small niche, primarily for the paranoid."

Have you ever heard of industrial espionage? Or privacy regulations? Or military applications?

(Also, the US military runs Claude as a local model.)

notnullorvoid · 2026-04-01T15:12:27 1775056347

The goal of web hosting is to provide low-latency, wide availability to many users.

AI in this context has a very different goal as a tool for individual users.

You wouldn't say that hosting instances of Photoshop on servers and charging for usage is a viable long-term business, would you? Even if current consumer computers struggled to run Photoshop.

sidrag22 · 2026-04-01T17:19:04 1775063944

I don't see an issue with the comparison, I don't think it is meant to be a 1 to 1 or anything, just an illustration of how consumers are overwhelmingly lazy.

I'd take issue with the statement that it is for the paranoid, but I guess that might be a defense mechanism, because of course I am interested in local models. If my new workflow is going to be dependent on 3 companies, I'd prefer if there is a light at the end of the tunnel that breaks us free.

FpUser · 2026-03-31T21:40:38 1774993238

>"But people still use AWS"

I do not; I self-host. My current client also got rid of AWS, pocketing nice savings as a result.