arcee-ai/Arcee-Maestro-7B-Preview
Arcee-Maestro-7B-Preview is a 7.6 billion parameter reasoning model developed by arcee-ai, built on a DeepSeek-R1 distillation of Qwen2.5-7B. It was trained with GRPO reinforcement learning on 450,000 verified math problems plus additional bootstrapped coding examples, and it demonstrates promising improvements in mathematical and coding abilities, making it suitable for advanced reasoning tasks.
Arcee-Maestro-7B-Preview: Enhanced Reasoning Model
Arcee-Maestro-7B-Preview is arcee-ai's first reasoning model, featuring 7.6 billion parameters and built on a Qwen2.5-7B DeepSeek-R1 distillation base. The model distinguishes itself through GRPO (Group Relative Policy Optimization) training on 450,000 verified math problems and additional bootstrapped coding examples. This specialized training has led to notable improvements in its mathematical and coding capabilities.
Key Capabilities
- Advanced Reasoning: Designed to excel in complex logical and analytical tasks.
- Mathematical Proficiency: Significantly enhanced performance in mathematics due to extensive GRPO training.
- Coding Ability: Shows strong gains in coding tasks, competing with larger models.
Intended Use Cases
- Mathematics: Ideal for applications requiring strong mathematical problem-solving.
- Coding: Suitable for code generation, analysis, and related programming tasks.
- Advanced Reasoning: Applicable to scenarios demanding sophisticated logical inference.
This preview model, while an early release, already surpasses o1-preview on several metrics, particularly in its target domains. It is released under the Apache-2.0 license, allowing broad commercial and non-commercial use.
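As a quick illustration of the intended use cases above, here is a minimal inference sketch. It assumes the weights are available on the Hugging Face Hub under arcee-ai/Arcee-Maestro-7B-Preview and that the checkpoint works with the standard transformers causal-LM and chat-template APIs; the prompt and sampling settings are illustrative, not official recommendations.

```python
# Minimal usage sketch (assumptions: Hub repo id "arcee-ai/Arcee-Maestro-7B-Preview",
# a chat template bundled with the tokenizer, and torch + accelerate installed).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "arcee-ai/Arcee-Maestro-7B-Preview"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the dtype stored in the checkpoint
    device_map="auto",    # place layers on available GPU(s) or CPU
)

# Illustrative math-reasoning prompt, matching the model's target domain.
messages = [
    {"role": "user", "content": "Solve step by step: what is the sum of the first 50 positive odd integers?"}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Reasoning models tend to emit long chains of thought, so allow a generous token budget.
outputs = model.generate(input_ids, max_new_tokens=1024, do_sample=True, temperature=0.6)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

The generous max_new_tokens budget is a deliberate choice: the model typically produces an extended reasoning trace before the final answer, and a small budget would truncate the solution.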