Cerebrum-1.0-7b: A Reasoning-Optimized LLM
Cerebrum-1.0-7b is a 7-billion-parameter large language model developed by AetherResearch and built on the Mistral 7b architecture. Its core innovation is its fine-tuning process, which combines a small custom dataset of native chain-of-thought examples with a novel technique called targeted RLHF (tRLHF). This approach enables the model to devise a tactical plan before tackling hard problems, making it highly effective on reasoning-intensive tasks.
Key Capabilities & Differentiators
- Superior Reasoning Performance: Cerebrum-1.0-7b significantly outperforms few-shot prompted Mistral 7b and, despite its much smaller size, larger models such as Llama 2 70b on benchmarks including ARC Challenge, GSM8k, and MATH.
- Native Chain of Thought: The model is trained to generate a thought process, breaking down complex problems into manageable steps, which enhances accuracy and relevance.
- Efficient Training: Achieves high performance with a remarkably small training footprint, utilizing under 5000 training prompts and even fewer labeled datapoints for tRLHF.
- Low Temperature Stability: Operates effectively at very low temperatures (including temperature 0), which benefits tasks requiring precise answers and avoids repetition without needing a repetition penalty (see the decoding sketch after this list).
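
To make the low-temperature guidance concrete, the following is a minimal sketch of greedy (temperature-0) generation with the Hugging Face transformers library. The repository id "AetherResearch/Cerebrum-1.0-7b" and the example prompt are assumptions for illustration, not details taken from this card.

```python
# Minimal sketch: greedy (temperature-0) decoding with Hugging Face transformers.
# The repository id below is an assumption; substitute the checkpoint you use.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "AetherResearch/Cerebrum-1.0-7b"  # assumed model id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = "What is 17 * 24? Describe your thought process before answering."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# do_sample=False gives deterministic greedy decoding (effectively temperature 0);
# no repetition_penalty is set, matching the guidance above.
output = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```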
Optimal Usage
For best results, Cerebrum-1.0-7b should be prompted with an Alpaca-style template that explicitly requests a description of its "thought process", which encourages the model to leverage its native chain of thought capabilities (see the prompt sketch below). While it excels at reasoning, the model typically omits a verbose thought process for brainstorming, knowledge-intensive, and creative tasks.
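
For concreteness, the snippet below sketches one possible Alpaca-style template that asks for a thought process. The exact wording is an assumption; only the general shape (an Alpaca-style instruction/response prompt with an explicit thought-process request) comes from the guidance above. The resulting string can be passed to the model as in the earlier decoding sketch.

```python
# Minimal sketch of an Alpaca-style prompt that explicitly asks the model to
# describe its thought process. The wording is illustrative, not an official
# template shipped with the model.
def build_prompt(instruction: str) -> str:
    return (
        "Below is an instruction that describes a task. Write a response that "
        "first describes your thought process and then gives the final answer.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        "### Response:\n"
    )

print(build_prompt("A train travels 180 km in 1.5 hours. What is its average speed in km/h?"))
```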