
Try the 27B dense model. It will likely do much better than the 35B MoE, which has only 3B active parameters.

Also, performance on research-y questions isn't always a good indicator of how the model will do for code generation or agent orchestration.



Currently sat waiting for the unsloth fixed quants to drop, but I'm on the edge of my seat for this.


Wait, didn't they drop like two days ago?


The 35B did, but not the 27B. Looks like the latter has been updated in the last half hour.


Neat! Thanks for correcting me there. I'll go and take a look.



