I have DeepSeek-R1 1.5b running on a Raspberry Pi 5.
I have DS-R1 14b Q6 running on my old AM4 Ryzen with an AMD GPU, without issues.
My primary workstation runs the 32B Q8 without issues. And it's simple!
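As a rough sanity check on why those quants fit on that hardware: weight memory is approximately parameters × bits per weight. The bits-per-weight figures below are my own ballpark assumptions for common GGUF quant levels (real files vary, and you need extra room for the KV cache), so treat this as a lower bound, not a spec:

```python
def model_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate weight footprint in GB: params * bits / 8."""
    return params_billions * bits_per_weight / 8

# Assumed effective bits per weight for each quant (illustrative, not exact):
for name, params, bits in [
    ("1.5B Q4", 1.5, 4.5),
    ("14B Q6",  14,  6.5),
    ("32B Q8",  32,  8.5),
]:
    print(f"{name}: ~{model_size_gb(params, bits):.1f} GB")
```

Which is why the 1.5B fits on a Pi, the 14B Q6 fits in ~12 GB, and the 32B Q8 wants a machine with 34+ GB to spare.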
That's not the DeepSeek-R1 model they're offering via the API on their servers. It's a Qwen model that's been fine-tuned on output from the big R1 model.
DeepSeek-R1-Distill-Qwen-1.5B, DeepSeek-R1-Distill-Qwen-7B, DeepSeek-R1-Distill-Qwen-14B, and DeepSeek-R1-Distill-Qwen-32B are derived from the Qwen-2.5 series, which is originally licensed under the Apache 2.0 License, and are fine-tuned with 800k samples curated with DeepSeek-R1.
DeepSeek-R1-Distill-Llama-8B is derived from Llama3.1-8B-Base, which is originally licensed under the Llama 3.1 license.
DeepSeek-R1-Distill-Llama-70B is derived from Llama3.3-70B-Instruct, which is originally licensed under the Llama 3.3 license.