MattSayar · 2026-06-09T19:48:33 1781034513

It smells like an architecture-related issue to me. They wanted to release the model asap, but they're still implementing the fine-grained controls to constrain the model to non-subscription users.

MattSayar · 2026-05-18T16:04:45 1779120285

> The loudest reaction to Mythos Preview from other security leaders has been about speed - scan faster, patch faster, compress the response cycle. More than one team we have spoken with is now operating under a two-hour SLA from CVE release to patch in production [...] If regression testing takes a day, you cannot get to a two-hour SLA without skipping it, and the bugs you ship when you skip regression testing tend to be worse than the bugs you were trying to patch.

Over time, I wonder if these models will be able to generate more secure code by default by doing this kind of exploitability testing before ever merging their code.

krupan · 2026-05-18T18:07:20 1779127640

I don't know, but it always seems weird to me when people notice AI isn't performing super well and then conclude that the solution to the problem is to try using more AI.

tskj · 2026-05-18T18:42:57 1779129777

Yeah why not? That's how I work. If I don't review my work, it's way worse than if I do review it and revise and iterate. I don't see why AI should be different: in fact it very clearly seems to be the case that it isn't.

krupan · 2026-05-18T19:28:54 1779132534

I mean, I was sold something different. Something super human, vastly more intelligent, world changing. The reality is not that. Am I allowed to be disappointed and discouraged?

tskj · 2026-05-20T16:11:28 1779293488

Because you can have it review itself and iterate on its own work before showing you. If you insist on reviewing its one-shot output you'll be disappointed, but if you consider its internal work private and only consider its final output, it's different.

Also we're still in the middle of the transformation, clearly the AI we'll have in 5 years will be radically different and better (by some definition of better) than what we see today. It's kind of weird that you'd be disappointed that the world will only be totally transformed in ten years, and not five.

krupan · 2026-05-21T17:12:45 1779383565

"clearly the AI we'll have in 5 years will be radically different and better"

Based on past performance that's not clear at all. Remember Elon predicting full self driving by 2017. It's almost 10 years past that predicted date and it's still not quite there. 5 years is nothing in tech. It takes 5 years to get a slightly improved chip designed and manufactured. It's been 3.5 years since ChatGPT was released and the LLMs of today are not radically different from that, and no radical changes have been teased. We're still in the throw-more-hardware-at-it phase. We could be here a while.

HDBaseT · 2026-05-18T23:19:05 1779146345

It has changed the world in major ways, although it's not entirely visible because we've become numb to the idea of AI and AI being in everything.

It hasn't changed the way we sleep, wake up, eat, walk and talk so it's not "life changing" or "world changing" in the sense a meteorite hit us, but each day thousands of mini meteorites are hitting Earth and we're becoming normalized to it one step at a time.

You are allowed to be disappointed and discouraged! For all the good tech that has come out of the AI revolution, most of it is ignored or shelved for things that can squeeze more and more money out of us and make our lives worse, not better. That's despite there being real potential to generate nice code, assist with biomedical research, build self-driving cars, etc.

krupan · 2026-05-18T23:34:09 1779147249

Which is it? Major changes or a bunch of small changes? I'm well aware of the small changes. I worked for an autonomous drone company back in 2008. It was really cool! In 2020 I started working for an autonomous car company. Again, amazing! None of it was a quick step function improvement. It was a lot of hard work. None of it was quite superhumanly smart either. LLMs are impressive pattern completion machines but they kinda suck at producing anything truly novel. Plus they are compulsive liars about that, lol!

germandiago · 2026-05-19T08:58:23 1779181103

Reminds me of people adding more intervention and bureaucracy bc the last one did not do well, so we need more of it.

The problem is never the results of it. It is that we did not do well enough.

edu · 2026-05-18T16:42:20 1779122540

Or they don’t, and they* sell access to Mythos and successors through their services company or network of partners and charge a premium.

* by "they" I mean all foundation model providers, as OpenAI seems to be going in the same direction

MattSayar · 2026-05-01T19:35:48 1777664148

I like simonw's take that open source should be more valuable [0]

>An interesting result of this is that open source libraries become more valuable, since the tokens spent securing them can be shared across all of their users. This directly counters the idea that the low cost of vibe-coding up a replacement for an open source library makes those open source projects less attractive.

I can understand the reflexive move to fork the code and bring it in-house, but how sustainable will that be when eng teams have MORE code to manage and mitigate vulnerabilities for?

[0] https://simonwillison.net/2026/Apr/14/cybersecurity-proof-of...

alephnerd · 2026-05-01T22:33:56 1777674836

I agree. The reflexive move is by a specific F50 that has the size, internal controls, headcount, and liability risk to take such an approach.

Most other places will continue to use OSS, but with much more locked-down access to third-party dependencies. I personally think it'll be a great time to be in the AppSec and SBOM validation space.

MattSayar · 2026-04-27T18:40:55 1777315255

You can borrow ebooks via Libby to your Kindle.

MattSayar · 2026-04-26T02:18:40 1777169920

I taped an Airtag-equivalent to one of my bikes as well

MattSayar · 2026-04-16T16:58:13 1776358693

I recognize the sarcasm. The data I can find says it's performing at baseline however?

https://marginlab.ai/trackers/claude-code/

ACCount37 · 2026-04-16T17:07:41 1776359261

Yeah, that's my point. Humans are not reliable LLM evaluators. "Secret model nerfs" happen in "vibes" far more often than they do in any reality.

MattSayar · 2026-03-10T23:12:46 1773184366

It wasn't the username MattSayar, it's an alt account.

Zambyte · 2026-03-10T23:20:53 1773184853

I missed:

> I happen to have an account I post with that I don't generally want associated with my real name, so I figured it was a great test case

Interesting. I have a hard time believing that current models were actually able to make that connection by analyzing your writing style, unless both accounts were part of the training set (they likely were), and the model was able to encode the similarities in the accounts during training.

The real world impact of this would be that new accounts that were not trained into the model would not be able to be deanonymized like this.

MattSayar · 2026-03-11T00:21:00 1773188460

It was able to piece together some other details that I've dropped related to where I live, in addition to my casual tone etc

Zambyte · 2026-03-11T03:19:28 1773199168

It said it was able to correlate where you live (I don't think it said anything about the tone, unless it said more than you included in the article). At best, it's really just using that for justification though. The model can't feasibly search the web for your writing style. That correlation had to be trained in.

My point is that, if you created a new account and actively used it for a while, I don't think the current version of Grok (never trained on your new account) would ever be able to make that association. It's certainly interesting that it could do it, I just want to drill into the how.

MattSayar · 2026-03-11T12:34:52 1773232492

Yeah I left out a lot of details! But it did more than just my writing style.

MattSayar · 2026-02-25T16:02:55 1772035375

Took me a minute to realize Sid isn't associated with 0xide.computer. Clever domain name!

Getting Google to index my personal site has been a pain. Every other search engine works fine, but ever since I switched the images on my site to .webp (a format created by Google!), my site's content just doesn't get indexed anymore. I've given up since web search traffic matters less and less these days with LLMs, and it only really bothers me when I'm trying to search for my own articles.

ssiddharth · 2026-02-25T18:56:37 1772045797

Ha, thank you. I spent more time than I'm willing to admit to come up with it.

I use my older, much longer domain for email and identity (it used to be #3 on SERP for "Sid"). This one is just for giggles so I can blog in peace without affecting the main one.

MattSayar · 2026-02-14T14:56:35 1771080995

The link is a 404. Is the repo still private?

MattSayar · 2026-01-30T22:36:55 1769812615

Small world, Matt! It's been fun seeing you pop up from time to time after writing for the same PSP magazine together