Competitive with Claude 3.5 Haiku, beats all major open models like Llama 3.1 70B, Qwen 2.5 (except on MATH) and Nemotron
Their entire recipe - code, datasets and model checkpoints - is public and out in the open!