Feels like an insult to readers to pretend that their monthly revenue is comparable to Google's or Apple's growth when the funding is absurdly different, not to mention inflation itself.
I am very much on board with AI in my workflow. I just don't really see a future where OpenAI/Anthropic are the absolute front runners for devs, though. Maybe OpenAI does just have the better vision by targeting the general public instead, and competing to become the next Google before Google can just stay Google?
What is their next step to ensure local models never overtake them? If I could use Opus 4.6 as a local model instead and wrap it in someone else's CLI tool, I would 100% do it today. Are the future models going to be so far beyond in capability that this sounds foolish? The top models are more than enough to keep up with my own features before I can think of more... so how do they stretch further than that?
A side note I keep thinking about: how impossible is a world where open source base models are collectively trained similar to a proof of work style pool, and then smaller companies simply spin off their own finishing touches or whatever based on that base model? Am I thinking of things too simplistically? Is this not a possibility?
Anthropic is definitely gaining ground over OpenAI in the business world. Cowork is the absolute hotness right now, and even prompted MSFT to drop their own variant yesterday.
Codex and Gemini CLI seem 1-2 months behind Claude Code. They will catch up. This race will eventually be won by whoever can come up with the cheapest compute.
I agree that that's what it would take, but compute would need to get very cheap for it to be feasible to keep models running locally. That's an awful lot of memory to have just sitting with the model running in it.
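For a rough sense of how much memory "an awful lot" is, here is a back-of-envelope sketch. It only counts the weights (no KV cache, no activations), and the parameter counts and bit widths are hypothetical round numbers picked for illustration, not figures for any particular model.

```python
def weight_memory_gib(params_billion: float, bits_per_weight: int) -> float:
    """Approximate memory needed just to hold the weights, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


# Hypothetical model sizes (billions of parameters), for illustration only.
for params in (8, 70, 400):
    for bits in (16, 4):
        gib = weight_memory_gib(params, bits)
        print(f"{params}B params @ {bits}-bit: ~{gib:.1f} GiB")
```

Quantizing from 16-bit to 4-bit cuts the footprint by 4x, but that memory still has to stay resident for as long as the model is loaded.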
True. I was thinking more of power users. Do you think Opus-level capabilities will run on your average laptop in a year? I think that's pretty far away, if ever.
You can demonstrate "running" the latest open Kimi or GLM model on a top-of-the-line laptop at very low throughput (Kimi at 2 tok/s, which is slow when you account for thinking time) today, courtesy of Flash-MoE with SSD weights offload. That's not Opus-like, it's not an "average" laptop and it's not really usable for non-niche purposes due to the low throughput. But it's impressive in a way, and it does give a nice idea of what might be feasible down the line.
> how impossible is a world where open source base models are collectively trained similar to a proof of work style pool
Current multi-GPU training setups assume much higher bandwidth (and lower latency) between the GPUs than you can get with an internet connection. Even cross-datacenter training isn't really practical.
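To put a number on the bandwidth gap, here is a deliberately naive sketch: it assumes every training step ships one full copy of the gradients over the link, and ignores latency, compression, and smarter all-reduce schemes. The model size (7B parameters, 2 bytes each) and the link speeds (1 Gbit/s home connection, 400 Gbit/s datacenter interconnect) are assumed round numbers, not measurements.

```python
def sync_seconds(params_billion: float, bytes_per_value: int,
                 link_gbit_per_s: float) -> float:
    """Time to push one full gradient copy over a link, ignoring latency."""
    payload_bits = params_billion * 1e9 * bytes_per_value * 8
    return payload_bits / (link_gbit_per_s * 1e9)


# Assumed scenario: 7B parameters, 16-bit gradients.
home = sync_seconds(7, 2, link_gbit_per_s=1)          # assumed home uplink
datacenter = sync_seconds(7, 2, link_gbit_per_s=400)  # assumed interconnect

print(f"home link:       {home:.2f} s per sync")
print(f"datacenter link: {datacenter:.2f} s per sync")
```

Under these assumptions a single sync takes minutes at home versus a fraction of a second in a datacenter, and that cost is paid on every step.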
LLM training isn't embarrassingly parallel, not like crypto mining is for example. It's not like you can just add more nodes to the mix and magically get speedups. You can get a lot out of parallelism, certainly, but it's not as straightforward and requires work to fully utilize.
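One textbook way to see why adding nodes does not magically give speedups is Amdahl's law, which I'm swapping in here as an illustration (the comment above doesn't name it). The 95% parallel fraction is an assumed figure; real training has its own communication and synchronization costs that this simple model does not capture.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Ideal speedup when only part of the work can be parallelized."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# Assumed: 95% of each step parallelizes perfectly, 5% does not.
for workers in (1, 10, 100, 1000, 10000):
    print(f"{workers:>6} workers: {amdahl_speedup(0.95, workers):.2f}x")
```

With a 5% serial portion the speedup can never reach 20x no matter how many workers join, whereas an embarrassingly parallel job (parallel fraction of 1.0) scales linearly.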
It's hard to train models in the open. All the big players are using lots of "dodgy" training data, like books, video, code, destinations. If you did that in the open, the lawyers would shut you down.
Though I think these companies are wildly overvalued, I don't see LLMs as a service going away in the future. The value in OpenAI is that it provides extra compute, data access, etc. My money is on local AI becoming more of a thing, while services like OpenAI still exist for local AIs to consult with. If a local model can somehow know that it's out of its depth on a question/prompt, it can ask an OpenAI model if it's available, but otherwise still work locally if OpenAI fails to respond or goes out of business. To me that makes a lot more sense than the future being either-or.
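A minimal sketch of that local-first routing, with loud caveats: every name here (`answer`, `local_stub`, `failing_remote`, `min_confidence`) is hypothetical, and the "confidence" score is a stand-in for the hard part, which is the local model somehow knowing it is out of its depth. No real model or API is called.

```python
from typing import Callable, Optional, Tuple

LocalModel = Callable[[str], Tuple[str, float]]  # returns (text, confidence)
RemoteModel = Callable[[str], str]


def answer(prompt: str, local_model: LocalModel,
           remote_model: Optional[RemoteModel] = None,
           min_confidence: float = 0.7) -> str:
    """Answer locally; consult the hosted model only when unsure."""
    text, confidence = local_model(prompt)
    if confidence >= min_confidence or remote_model is None:
        return text
    try:
        return remote_model(prompt)
    except Exception:
        # Hosted service is down or gone: keep working locally.
        return text


# Hypothetical stand-ins for real models.
def local_stub(prompt: str) -> Tuple[str, float]:
    return f"local answer to: {prompt}", 0.4


def failing_remote(prompt: str) -> str:
    raise ConnectionError("hosted service unavailable")


print(answer("hard question", local_stub, failing_remote))
```

The point of the design is that the hosted model is an optional upgrade, so a failure there degrades quality instead of breaking the workflow.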
> What is their next step to ensure local models never overtake them?
As someone who experiments with local models a lot, I don’t see this as a threat. Running LLMs on big server hardware will always be faster and higher quality than what we can fit on our laptops.
Even in the future when there are open weight models that I can run on my laptop that match today’s Opus, I would still be using a hosted variant for most work because it will be faster, higher quality, and not make my laptop or GPU turn into a furnace every time I run a query.
If your laptop overheats when you push your GPU, you can buy purpose-built "gaming" laptops that are at least nominally intended to sustain those workloads with much better cooling. Of course, running your inference on a homelab platform deployed for that purpose, without the thermal constraints of a laptop, is also possible.
I didn't say it overheats. It gets hot and the fans blow, neither of which are enjoyable.
MacBook Pro laptops are preferred over "gaming" laptops for LLM use because they have large unified memory with high bandwidth. No gaming laptop can give you as much high-bandwidth LLM memory as a MacBook Pro or an AMD Strix Halo integrated system. The discrete gaming GPUs are optimized for gaming with relatively smaller VRAM.
The goal of web hosting is to provide low latency wide availability to many users.
AI in this context has a very different goal as a tool for individual users.
You wouldn't say that hosting instances of Photoshop on servers and charging for usage is a viable long-term business, would you? Even if current consumer computers struggled to run Photoshop.
I don't see an issue with the comparison, I don't think it is meant to be a 1 to 1 or anything, just an illustration of how consumers are overwhelmingly lazy.
I'd take issue with the statement that it is for the paranoid, but I guess it might be a defense mechanism because of course I am interested in local models. If my new workflow is going to be dependent on 3 companies, I'd prefer if there is a light at the end of the tunnel that breaks us free.