Salesforce/E1-Math-7B
Salesforce/E1-Math-7B is a 7.6-billion-parameter language model fine-tuned from Skywork-OR1-Math-7B, with a 32768-token context length. It is trained for Elastic Reasoning using a budget-constrained rollout strategy, so it can still reason adaptively when its token budget is limited. The model targets mathematical tasks and generalizes to varying budget constraints without additional training.
E1-Math-7B: Elastic Reasoning for Mathematical Tasks
E1-Math-7B is a 7.6-billion-parameter language model developed by Salesforce, fine-tuned from Skywork-OR1-Math-7B. Its core innovation is Elastic Reasoning, achieved by integrating a budget-constrained rollout strategy into GRPO (Group Relative Policy Optimization). This lets the model reason adaptively even when its thinking process is cut short, and generalize to unseen budget constraints without further training.
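To make "thinking cut short" concrete, here is a minimal sketch of a budget-constrained rollout at inference time. It is an illustration under stated assumptions, not the authors' implementation: the `</think>` delimiter, the split into separate thinking and solution budgets, and the names `budgeted_rollout`, `next_token`, and `toy_model` are all hypothetical and do not come from the text above.

```python
from typing import Callable, List

END_THINK = "</think>"  # assumed thinking delimiter; not confirmed by the source


def budgeted_rollout(
    next_token: Callable[[List[str]], str],
    prompt: List[str],
    think_budget: int,
    solution_budget: int,
    eos: str = "<eos>",
) -> List[str]:
    """Generate tokens with separate caps on the thinking and solution phases."""
    out: List[str] = []

    # Phase 1: thinking. Stop when the model closes the phase itself
    # or when the thinking budget is exhausted.
    closed_by_model = False
    for _ in range(think_budget):
        tok = next_token(prompt + out)
        out.append(tok)
        if tok == END_THINK:
            closed_by_model = True
            break
    if not closed_by_model:
        # Budget ran out: force the thinking phase to end.
        out.append(END_THINK)

    # Phase 2: solution, capped by its own budget.
    for _ in range(solution_budget):
        tok = next_token(prompt + out)
        if tok == eos:
            break
        out.append(tok)
    return out


def toy_model(context: List[str]) -> str:
    """Hypothetical stand-in for a model: 'thinks' until END_THINK, then answers."""
    if END_THINK in context:
        answered = len(context) - context.index(END_THINK) - 1
        return ["answer", "=", "42", "<eos>"][min(answered, 3)]
    return "step"


result = budgeted_rollout(toy_model, ["q"], think_budget=3, solution_budget=10)
print(result)
```

With a real model, `next_token` would wrap a single decoding step. The sketch only shows the control flow: the thinking phase is truncated at its budget, and the solution is still generated afterwards.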
Key Capabilities
- Adaptive Reasoning: Learns to reason efficiently under varying computational budgets.
- Mathematical Performance: Improves accuracy on mathematical tasks over its base model, Skywork-OR1-Math-7B, with accuracy rising as the token budget increases.
- Generalization: Maintains performance across different budget constraints without additional fine-tuning.
Performance Highlights
E1-Math-7B improves on its base model on mathematical benchmarks while using fewer tokens. At an average of 11768 tokens it reaches 69.6% accuracy, compared with 68.3% for Skywork-OR1-Math-7B at 13803 tokens. The gap is much wider under a tight budget: on one reported metric, it reaches 32.9% accuracy at 3742 tokens, against the base model's 14.0% at 4023 tokens.
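The arithmetic behind these comparisons can be checked with a short script. It uses only the token counts and accuracies quoted above; the labels "larger budget" and "smaller budget" and the helper `compare` are illustrative names, not terms from the source.

```python
# (average tokens, accuracy %) pairs quoted in the text
pairs = {
    "larger budget": {"e1": (11768, 69.6), "base": (13803, 68.3)},
    "smaller budget": {"e1": (3742, 32.9), "base": (4023, 14.0)},
}


def compare(e1, base):
    """Return (percent fewer tokens than base, accuracy gain in points)."""
    e1_tokens, e1_acc = e1
    base_tokens, base_acc = base
    token_saving = 100 * (base_tokens - e1_tokens) / base_tokens
    acc_gain = e1_acc - base_acc
    return round(token_saving, 1), round(acc_gain, 1)


summary = {name: compare(p["e1"], p["base"]) for name, p in pairs.items()}
for name, (saving, gain) in summary.items():
    print(f"{name}: {saving}% fewer tokens, {gain:+} accuracy points")
```

At the larger budget, E1-Math-7B uses about 14.7% fewer tokens for a 1.3-point gain. At the smaller budget, it uses about 7.0% fewer tokens for an 18.9-point gain.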
Ethical Considerations
This model is released for research purposes only, in support of an academic paper. Users are strongly advised to evaluate and address potential concerns about accuracy, safety, and fairness before deployment, especially in high-risk scenarios.