It smells like an architecture-related issue to me. They wanted to release the model asap, but they're still implementing the fine-grained controls to constrain the model to non-subscription users.
> The loudest reaction to Mythos Preview from other security leaders has been about speed - scan faster, patch faster, compress the response cycle. More than one team we have spoken with is now operating under a two-hour SLA from CVE release to patch in production [...] If regression testing takes a day, you cannot get to a two-hour SLA without skipping it, and the bugs you ship when you skip regression testing tend to be worse than the bugs you were trying to patch.
Over time, I wonder if these models will be able to generate more secure code by default by doing this kind of exploitability testing before ever merging their code.
I don't know, but it always seems weird to me when people notice AI isn't performing super well and then conclude that the solution to the problem is to try using more AI.
Yeah, why not? That's how I work. If I don't review my work, it's way worse than if I do review it and revise and iterate. I don't see why AI should be different: in fact it very clearly seems to be the case that it isn't.
I mean, I was sold something different. Something super human, vastly more intelligent, world changing. The reality is not that. Am I allowed to be disappointed and discouraged?
Because you can have it review itself and iterate on its own work before showing you. If you insist on reviewing its one-shot output you'll be disappointed, but if you consider its internal work private and only consider its final output, it's different.
Also we're still in the middle of the transformation, clearly the AI we'll have in 5 years will be radically different and better (by some definition of better) than what we see today. It's kind of weird that you'd be disappointed that the world will only be totally transformed in ten years, and not five.
"clearly the AI we'll have in 5 years will be radically different and better"
Based on past performance that's not clear at all. Remember Elon predicting full self driving by 2017. It's almost 10 years past that predicted date and it's still not quite there. 5 years is nothing in tech. It takes 5 years to get a slightly improved chip designed and manufactured. It's been 3.5 years since ChatGPT was released and the LLMs of today are not radically different from that, and no radical changes have been teased. We're still in the throw-more-hardware-at-it phase. We could be here a while.
It has changed the world in major ways, although it's not entirely visible because we've become numb to the idea of AI and AI being in everything.
It hasn't changed the way we sleep, wake up, eat, walk and talk, so it's not "life changing" or "world changing" in the sense that a meteorite hit us, but each day thousands of mini meteorites are hitting Earth and we're becoming normalized to it one step at a time.
You are allowed to be disappointed and discouraged! For all the good tech that has come out of the AI revolution, most of it is ignored or shelved for things that can squeeze more and more money out of us and make our lives worse, not better, despite the real potential for generating nice code, assisting with biomedical research, self-driving cars, etc.
Which is it? Major changes or a bunch of small changes? I'm well aware of the small changes. I worked for an autonomous drone company back in 2008. It was really cool! In 2020 I started working for an autonomous car company. Again, amazing! None of it was a quick step function improvement. It was a lot of hard work. None of it was quite superhumanly smart either. LLMs are impressive pattern completion machines but they kinda suck at producing anything truly novel. Plus they are compulsive liars about that, lol!
I like simonw's take that open source should be more valuable [0]
>An interesting result of this is that open source libraries become more valuable, since the tokens spent securing them can be shared across all of their users. This directly counters the idea that the low cost of vibe-coding up a replacement for an open source library makes those open source projects less attractive.
I can understand the reflexive move to fork the code and bring it in-house, but how sustainable will that be when eng teams have MORE code to manage and mitigate vulnerabilities for?
I agree. The reflexive move is by a specific F50 that has the size, internal controls, headcount, and liability risk to justify such an approach.
Most other places will continue to use OSS, but with much more locked-down access to third-party dependencies. I personally think it'll be a great time to be in the AppSec and SBOM validation space.
> I happen to have an account I post with that I don't generally want associated with my real name, so I figured it was a great test case
Interesting. I have a hard time believing that current models were actually able to make that connection by analyzing your writing style, unless both accounts were part of the training set (they likely were) and the model was able to encode the similarities between the accounts during training.
The real world impact of this would be that new accounts that were not trained into the model would not be able to be deanonymized like this.
It said it was able to correlate where you live (I don't think it said anything about the tone, unless it said more than you included in the article). At best, it's really just using that for justification though. The model can't feasibly search the web for your writing style. That correlation had to be trained in.
My point is that, if you created a new account and actively used it for a while, I don't think the current version of Grok (never trained on your new account) would ever be able to make that association. It's certainly interesting that it could do it, I just want to drill into the how.
Took me a minute to realize Sid isn't associated with 0xide.computer. Clever domain name!
Getting Google to index my personal site has been a pain. Every other search engine works fine, but ever since I switched the images on my site to .webp (a format created by Google!), my site's content just doesn't get indexed anymore. I've given up since web search traffic matters less and less these days with LLMs, and it only really bothers me when I'm trying to search for my own articles.
Ha, thank you. I spent more time than I'm willing to admit coming up with it.
I use my older, much longer domain for email and identity (it used to be #3 on SERP for "Sid"). This one is just for giggles so I can blog in peace without affecting the main one.