
I recently tested every version from 0.7 to 0.11.1 trying to run q5 mistral-3.1 on a system with 48GB of available vram across 2 GPUs. Everything past 0.7.0 gave me OOM or other errors. Now that I've migrated back to llama.cpp I'm not particularly interested in fucking around with ollama again.

as for 4chan, they've hated ollama for a long time because it was built on top of llama.cpp and then didn't contribute upstream or give credit to the original project



Ah! This must have been downloaded from elsewhere and not from Ollama? So sorry about this.

To help future optimizations for given quantizations, we have been trying to limit the quantizations to ones that fit the majority of users.

In the case of mistral-small3.1, Ollama supports ~4bit (q4_k_m), ~8bit (q8_0) and fp16.

https://ollama.com/library/mistral-small3.1/tags

I'm hopeful that in the future, more and more model providers will help optimize for given model quantizations: 4 bit (e.g. NVFP4, MXFP4), 8 bit, and a 'full' model.


Yeah, I think the idea that models that don't come from ollama.com are second class citizens was what made me first start to think about migrating back to llama.cpp, and then the memory stuff was the straw that broke the camel's back. I don't want to use a project that editorializes about what models and quants I should be using. If I wanted a product I don't have control over, I'd just use a commercial provider. For what it's worth, for completeness' sake I actually did download the full fp16 and quant it using ollama, and I still had the memory error.

I truly don't understand the reasoning behind removing support for all the other quants. It's really baffling to me considering how much more useful running a 70b parameter model at q3 is than not being able to run a 70b parameter model at all, etc. Not to mention forcing me to download hundreds of gigabytes of fp16 because compatibility with other quants is apparently broken, and forcing me to quant models myself.
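
Rough back-of-envelope for why the low-bit quants matter. This is only a sketch: it assumes nominal bit widths (3, 4, 8, 16 bits per weight), counts weights only, and ignores KV cache, context buffers and the per-block scale overhead that real quant formats carry, so actual files come out somewhat larger. The function and variable names are made up for illustration.

```python
# Weights-only memory estimate: parameters * bits_per_weight / 8 bytes.
# Assumption: nominal bit widths; real quant formats store extra scale data.

VRAM_GB = 48  # the two-GPU total mentioned upthread

# Hypothetical labels mapped to nominal bits per weight (illustrative only).
NOMINAL_BITS = {"q3": 3, "q4": 4, "q8": 8, "fp16": 16}


def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (10^9 bytes), weights only."""
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, vram_gb: float = VRAM_GB) -> dict:
    """For each nominal quant, report whether the weights alone fit in vram_gb."""
    return {
        name: weight_gb(params_billion, bits) <= vram_gb
        for name, bits in NOMINAL_BITS.items()
    }


if __name__ == "__main__":
    for name, bits in NOMINAL_BITS.items():
        size = weight_gb(70, bits)
        verdict = "fits" if size <= VRAM_GB else "does not fit"
        print(f"70b @ {name}: ~{size:.2f} GB of weights, {verdict} in {VRAM_GB} GB")
```

Under those assumptions a 70b model needs about 26 GB of weights at 3 bits and 35 GB at 4 bits, versus 70 GB at 8 bits and 140 GB at fp16, so on a 48GB box the low-bit quants are the difference between running the model and not running it.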



