The Chinese models are distilled from GPT and Claude, so it's not like China would pull ahead if those companies went away for six months. Those companies really are at the forefront of innovation right now, as much as I hate to think of the consequences of this (a single company owning a superintelligence is basically a nightmare scenario for me).
There will be a blinding flash which signals the superintelligence singularity. When the smoke clears, you'll see a 50-foot-tall Altman/Borg hybrid. He is about to destroy humanity with his death ray. Suddenly, a 50-foot-tall Musk/Borg hybrid appears out of nowhere and stops Altman just in time. Then they work together to destroy all humans.
I think that’s the realm of conspiracy theories. There are also not only Chinese alternatives: Mistral in Europe is doing pretty well in the several categories they’ve opted to focus on.
This kind of reiterates the parent’s question, I think: people are maybe too focused on the GPT/Claude model and forget about all the other ways of using the tech.
I don't buy this. Distilled how? You don't get access to logprobs, and the thinking traces are fake and compressed. It's an expensive way to get potentially substandard training data.