Overview
Salesforce/E1-AceReason-14B is a 14.8-billion-parameter language model developed by Salesforce, fine-tuned from AceReason-Nemotron-14B. It is designed for Elastic Reasoning, an approach that integrates a budget-constrained rollout strategy into GRPO (Group Relative Policy Optimization). Training this way lets the model reason adaptively even when its thinking process is cut short, and generalize to new budget constraints without further training.
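To make "cut short" concrete, here is a minimal sketch of budget-constrained generation with separate caps for the thinking phase and the answer phase. It illustrates the general idea only and is not the authors' implementation. The `</think>` marker, the split into two budgets, the `budgeted_generate` helper, and the `toy_model` stand-in are all assumptions introduced for this example; a real setup would replace `toy_model` with calls to the actual model.

```python
from typing import Callable, List

END_THINK = "</think>"  # assumed end-of-thinking marker; not stated in this card


def budgeted_generate(
    generate_fn: Callable[[List[str], int], List[str]],
    prompt: List[str],
    think_budget: int,
    answer_budget: int,
) -> List[str]:
    """Generate a thinking phase and an answer phase under separate token caps."""
    # Phase 1: let the model think, but never beyond think_budget tokens.
    thinking = generate_fn(prompt, think_budget)[:think_budget]
    if END_THINK in thinking:
        # The model finished thinking on its own; keep only what precedes the marker.
        thinking = thinking[: thinking.index(END_THINK)]
    # Either way, close the thinking phase explicitly so the answer phase can start.
    context = prompt + thinking + [END_THINK]
    # Phase 2: answer from whatever reasoning fit inside the budget.
    answer = generate_fn(context, answer_budget)[:answer_budget]
    return thinking + [END_THINK] + answer


def toy_model(context: List[str], max_new_tokens: int) -> List[str]:
    """Hypothetical stand-in for a real model: five thinking steps, then an answer."""
    if END_THINK in context:
        return ["answer"][:max_new_tokens]
    steps = ["step1", "step2", "step3", "step4", "step5", END_THINK]
    return steps[:max_new_tokens]


# Thinking is truncated after 3 tokens, then the answer phase is forced to begin.
short = budgeted_generate(toy_model, ["question"], think_budget=3, answer_budget=4)
# With a generous budget, the model ends its thinking on its own.
full = budgeted_generate(toy_model, ["question"], think_budget=10, answer_budget=4)
print(short)
print(full)
```

In the truncated case the answer phase only sees three of the five thinking steps. Producing a good answer from that kind of incomplete reasoning is the situation the overview describes the model as being trained for.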
Key Capabilities
- Adaptive Reasoning: Learns to reason efficiently under varying computational budgets.
- Generalization: Maintains performance across different budget constraints without retraining.
- Fine-tuned from AceReason-Nemotron-14B: Leverages a strong base model for its reasoning capabilities.
Performance Highlights
With full token budgets, E1-AceReason-14B is slightly less accurate than AceReason-Nemotron-14B on benchmarks such as AIME24 and LiveCodeBench v5. Its strength is accuracy under constrained token budgets: on AIME24, for instance, it achieves 44.6% accuracy with 3448 tokens.
Use Cases
This model is best suited to research on efficient, adaptive reasoning, especially where computational resources or inference time are limited. It is designed for elastic reasoning rather than peak performance under unconstrained conditions, so users should evaluate its suitability for each specific application.