Only for chat sessions, not for agentic coding. It's just too slow to be practic...

ac29 · 2026-04-06T06:35:27 1775457327

This article is about a MoE model with only 4B active parameters, it shouldn't take 10 minutes to answer a question about a small project.

I measured a 4bit quant of this model at 1300t/s prefill and ~60t/s decode on Ryzen 395+.

nl · 2026-04-06T04:11:43 1775448703

Doesn't the framework desktop have a Ryzen 395 AI? That's a unified memory architecture like the Macs.

pshirshov · 2026-04-06T13:39:41 1775482781

Ah, forgot to add, it's not really "unified" you have to explicitly specify your allocations. You may have a reasonably good 48gb chunk assigned to the GPU, but that DDR5 is 5-10 times slower than GDDR/HBM and the GPU itself isn't stellar.

So, framework laptops are great for chatting but nearly useless in agentic coding.

My Radeon W7900 answers a question ("what is this project") in 2 minutes, it takes my Framework 16 with 5070 addon around 11 minutes without the addon - around 23 (qwen 3.5 27b, claude code)

pshirshov · 2026-04-06T11:16:54 1775474214

That's discrete DDR5, it's not as fast as your regular VRAM.