danielhanchen's comments

We made Unsloth Studio which should help :)

1. Auto-sets the best official parameters for all models

2. Auto determines the largest quant that can fit on your PC / Mac etc

3. Auto determines max context length

4. Auto heals tool calls, provides python & bash + web search :)
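Point 2 above (auto-picking the largest quant that fits) can be sketched roughly. This is a toy illustration, not Unsloth Studio's actual logic: the bits-per-weight table and the fixed overhead term are assumptions.

```python
# Hypothetical sketch: pick the largest quant whose weights fit in free memory.
# Bits-per-weight values are approximate GGUF-style figures; the overhead
# term (KV cache, activations, runtime) is a made-up constant.

QUANT_BPW = {"Q8_0": 8.5, "Q6_K": 6.6, "Q4_K_M": 4.8, "Q2_K": 2.6}

def model_size_gb(n_params_b: float, bpw: float) -> float:
    """Approximate in-memory size of the weights in GB."""
    return n_params_b * bpw / 8  # params (billions) * bytes per weight

def largest_fitting_quant(n_params_b: float, free_gb: float,
                          overhead_gb: float = 2.0):
    """Return the highest-precision quant whose weights + overhead fit."""
    for name, bpw in sorted(QUANT_BPW.items(), key=lambda kv: -kv[1]):
        if model_size_gb(n_params_b, bpw) + overhead_gb <= free_gb:
            return name
    return None  # nothing fits

# e.g. a 27B model on a 24 GB GPU:
print(largest_fitting_quant(27, 24.0))  # Q4_K_M
```

The same size estimate extends naturally to the max-context check in point 3, since KV-cache memory scales with context length.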


Yea, I actually tried it out last time we had one of these threads. It's undeniably easy to use, but it is also very opinionated about things like the directory locations/layouts for various assets. I don't think I managed to get it to work with a simple flat directory full of pre-downloaded models on an NFS mount to my NAS. It also insists on re-downloading a 3GB model every time it launches, even after I delete the model file. I probably have to just sit down and do some Googling/searching in order to rein the software in and get it to work the way I want it to on my system.

Sadly it doesn't support fine-tuning on AMD yet, which gave me a sad since I wanted to cut one of these down into domain-specific experts. Also, running the studio is a bit of a nightmare when it calls diskpart during its install (why?)

Thanks for that. Did you notice that the unsloth/unsloth Docker image is 12GB? Does it embed CUDA libraries or some default models that justify the heavy footprint?

I applaud that you recently started providing the KL divergence plots that really help understand how different quantizations compare. But how well does this correlate with closed-loop performance? How difficult/expensive would it be to run the quantizations on e.g. some agentic coding benchmarks?
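For context on the question above, the kind of metric behind those plots is a token-level KL divergence between the full-precision and quantized models' next-token distributions. A minimal sketch, using toy logits rather than real model outputs:

```python
# Hedged sketch: mean token-level KL(P_full || P_quant) over next-token
# distributions, in nats. Toy numbers only - not Unsloth's actual pipeline.
import numpy as np

def softmax(logits):
    z = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def mean_kl(full_logits, quant_logits):
    """Mean KL(P_full || P_quant) averaged across token positions."""
    p = softmax(full_logits)
    q = softmax(quant_logits)
    return float(np.mean(np.sum(p * (np.log(p) - np.log(q)), axis=-1)))

full = np.array([[2.0, 1.0, 0.1], [0.5, 2.5, 0.2]])
shifted = full + 0.05  # a uniform shift leaves the softmax unchanged
print(mean_kl(full, shifted))  # ~0.0
```

A low mean KL says the quantized model's distributions track the original closely, but as the comment notes, that is an open-loop proxy; it doesn't guarantee the same multi-step agentic behavior.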

Is Unsloth working on managing remote servers, like how VS Code integrates with a remote server via SSH?

Lmstudio Link is GREAT for that right now

what are you using for web search?

Great project! Thank you for that!

Haha :)

Do you get early access so you can prep the quants for release?

IIRC they mentioned they do.

Haha :) We had some issues with Kimi-2.6 since it was int4 and we were investigating how to handle it :)

Appreciate what y'all do! We were slacking about how many HGX-B300 it would take to run Kimi and it looks like we could actually fit 2-3 Kimis on a single HGX.
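The Slack math above can be sanity-checked with rough numbers. Everything here is an assumption: the per-GPU HBM figure, the ~1T parameter count, and the 30% overhead factor are guesses, not vendor or model specs.

```python
# Back-of-envelope check of the "2-3 Kimis per HGX" estimate.
# All inputs are assumptions for illustration.

hbm_per_gpu_gb = 288          # assumed HBM per GPU on an HGX-class node
gpus_per_node = 8
node_hbm_gb = hbm_per_gpu_gb * gpus_per_node   # 2304 GB total

params_b = 1000               # ~1T parameters (assumed)
bytes_per_weight = 0.5        # int4
weights_gb = params_b * bytes_per_weight        # ~500 GB of weights
instance_gb = weights_gb * 1.3                  # +30% for KV cache etc.

print(int(node_hbm_gb // instance_gb))  # ~3 instances per node
```

Under those assumptions the node's HBM divides into about three instances, consistent with the 2-3 figure; a fatter KV cache or longer contexts would push it down to two.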

We also made some dynamic MLX ones if they help - they might be faster on Macs, but llama-server is definitely improving at a fast pace.

https://huggingface.co/unsloth/Qwen3.6-27B-UD-MLX-4bit


What exactly does the .sh file install? How does it compare to running the same model in, say, omlx?

Yes sadly CUDA 13.2 is broken - NVIDIA will push a fix in CUDA 13.3

Love the JPEG analogy :)

Oh that is pretty good! And the SVG one!

They sometimes do! Qwen, Google etc do them!

Oh hey - we're actually the 4th largest distributor of OSS AI models in GB downloads - see https://huggingface.co/unsloth

https://unsloth.ai/docs/basics/unsloth-dynamic-2.0-ggufs is what might be helpful. You might have heard of the 1-bit dynamic DeepSeek quants (we did those) - not all layers can be 1-bit - important ones stay in 8-bit or 16-bit, and we show it still works well.
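The dynamic-quant idea described above can be sketched as a per-layer bit assignment driven by some sensitivity measure. The scores, thresholds, and layer names below are invented for illustration; the real method is described in the linked docs.

```python
# Illustrative sketch: keep sensitive layers at high precision, push the
# bulk of the weights down to 1 bit. Thresholds and scores are made up.

def assign_bits(sensitivity: dict) -> dict:
    """Map each layer name to a bit width from a sensitivity score in [0, 1]."""
    plan = {}
    for layer, score in sensitivity.items():
        if score > 0.8:
            plan[layer] = 16   # most sensitive: keep half precision
        elif score > 0.5:
            plan[layer] = 8    # moderately sensitive
        else:
            plan[layer] = 1    # bulk of the weights
    return plan

plan = assign_bits({"embed": 0.9, "attn.0": 0.6, "mlp.0": 0.2})
print(plan)  # {'embed': 16, 'attn.0': 8, 'mlp.0': 1}
```

The payoff is that average bits-per-weight lands near 1-2 while the layers that dominate output quality keep 8 or 16 bits.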


Yes, this is fair - we try our best to communicate issues - I think we're mostly the only ones communicating that model A or B has been fixed, etc.

We try our best as model distributors to fix them on day 0 or 1, but 95% of issues aren't ours - as you mentioned, it's the chat template or runtime, etc.


I have to ask - what do you run locally on your laptop (model, backend, and agentic cli)?

Feature request:

A leaderboard with filtering: enter your machine specs and it sorts all models, along with all the various quantisations, and ranks them all - because so far, model-ranking sites either don't include all available quants or don't compare apples to apples (i.e. one model tested with Claude Code while another benchmark was done with opencode), etc.

Oh - and as bonus, scoring also ranked by which agentic CLI :)
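The requested leaderboard reduces to a filter-then-sort over benchmark rows. A minimal sketch with invented rows and scores; a real board would need apples-to-apples runs, hence the same-CLI filter:

```python
# Sketch of the feature request: drop quants that don't fit the machine,
# keep only results from one agentic CLI, rank the rest by score.
# All rows, scores, and sizes below are made up.

ROWS = [
    {"model": "A", "quant": "Q4_K_M", "size_gb": 18, "score": 61.0, "cli": "opencode"},
    {"model": "A", "quant": "Q8_0",   "size_gb": 30, "score": 64.5, "cli": "opencode"},
    {"model": "B", "quant": "Q4_K_M", "size_gb": 20, "score": 58.2, "cli": "opencode"},
]

def rank_for_machine(rows, vram_gb, cli):
    """Keep same-CLI results that fit in VRAM, best score first."""
    fits = [r for r in rows if r["size_gb"] <= vram_gb and r["cli"] == cli]
    return sorted(fits, key=lambda r: -r["score"])

for r in rank_for_machine(ROWS, vram_gb=24, cli="opencode"):
    print(r["model"], r["quant"], r["score"])
```

On a hypothetical 24 GB machine this drops the Q8_0 row and ranks model A's Q4_K_M above model B's, which is exactly the per-machine, per-CLI view the request describes.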

