For 27B, just get a used 3090 and hop on to r/LocalLLaMA. You can run a 4bpw quant at full context with Q8 KV cache.
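Why this fits: a quick back-of-envelope check of the VRAM budget. This is a sketch, not exact figures; the architecture numbers (layer count, KV heads, head dimension, context length) are assumptions loosely modeled on a Gemma-2-27B-class model, and real quants carry some overhead beyond the raw bits-per-weight.

```python
# Rough VRAM estimate: 27B params at 4 bits per weight (4bpw), Q8 KV cache.
# Architecture numbers below are assumptions, not exact specs.

PARAMS = 27e9
BPW = 4                                   # 4bpw quant
weights_gb = PARAMS * BPW / 8 / 1e9       # bits -> bytes -> GB

layers, kv_heads, head_dim = 46, 16, 128  # assumed model architecture
ctx = 8192                                # assumed "full context" length
# K + V, one byte per element at Q8 (8-bit) cache precision:
kv_bytes_per_token = 2 * layers * kv_heads * head_dim
kv_gb = kv_bytes_per_token * ctx / 1e9

total_gb = weights_gb + kv_gb
print(f"weights ~{weights_gb:.1f} GB + KV ~{kv_gb:.1f} GB = ~{total_gb:.1f} GB")
```

Under these assumptions the weights land around 13.5 GB and the Q8 KV cache adds roughly another 1.5 GB, which is why the whole thing fits comfortably in a 3090's 24 GB with headroom for activations.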

