
Any idea on the VRAM footprint for the 1.7B model? I guess it fits on consumer cards but I am wondering if it works on edge devices.


The demo uses 6 GB of dedicated VRAM on Windows, but keep in mind that's without FlashAttention. I expect it would drop a bit if I got that working.

I haven't looked into whether the demo could be optimized further, for example by moving certain parts to the CPU.
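
For a rough sense of how much of that 6 GB could be the weights themselves, here is a back-of-the-envelope sketch. It assumes exactly 1.7e9 parameters and counts weights only; the precision the demo actually runs in isn't stated in the thread, so the dtype table is purely illustrative, and activations, caches, and framework overhead are not included.

```python
def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Memory for the weights alone, in decimal GB (1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


PARAMS = 1.7e9  # assumption: "1.7B" taken as exactly 1.7e9 parameters

# Illustrative storage sizes per parameter; not the demo's confirmed settings.
BYTES_PER_PARAM = {"fp32": 4, "fp16/bf16": 2, "int8": 1, "int4": 0.5}

for dtype, size in BYTES_PER_PARAM.items():
    print(f"{dtype:>9}: {weight_memory_gb(PARAMS, size):.2f} GB")
```

Under those assumptions, 16-bit weights would account for roughly 3.4 GB, with the rest of the observed usage going to everything else the demo loads and allocates at runtime.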



