Zyphra/ZR1-1.5B is a 1.5-billion-parameter reasoning model from Zyphra, trained with reinforcement learning on verified coding and mathematics problems. With a 131,072-token context window, it significantly outperforms its base model, and even much larger LLMs such as Llama-3.1-70B-Instruct, on hard coding tasks. It excels at complex reasoning, mathematical problem-solving, and code generation, making it well suited to applications that demand high accuracy in these domains.