AetherResearch/Cerebrum-1.0-7b

7B params · FP8 · 8192 context
License: apache-2.0
Overview

Cerebrum-1.0-7b: A Reasoning-Optimized LLM

Cerebrum-1.0-7b is a 7-billion-parameter large language model developed by AetherResearch and built on top of Mistral 7b. Its core innovation is its fine-tuning process, which combines a small custom dataset of native chain-of-thought data with a novel technique called targeted RLHF (tRLHF). This approach enables the model to devise a tactical plan before solving a complex problem, making it highly effective for reasoning-intensive tasks.

Key Capabilities & Differentiators

  • Superior Reasoning Performance: Cerebrum-1.0-7b significantly outperforms few-shot-prompted Mistral 7b and even larger models such as Llama 2 70b on reasoning benchmarks including ARC Challenge, GSM8k, and MATH, despite its smaller size.
  • Native Chain of Thought: The model is trained to generate a thought process, breaking down complex problems into manageable steps, which enhances accuracy and relevance.
  • Efficient Training: Achieves high performance with a remarkably small training footprint, utilizing under 5000 training prompts and even fewer labeled datapoints for tRLHF.
  • Low Temperature Stability: Operates effectively at very low temperatures (including temperature 0), which is beneficial for tasks requiring precise answers, and avoids repetitions without needing a repetition penalty (see the generation sketch after this list).
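
The snippet below is a minimal generation sketch illustrating the low-temperature guidance above. It assumes the model loads as a standard Hugging Face causal LM via transformers; the specific argument values are illustrative, not official recommendations.

```python
# Minimal generation sketch (assumption: the model loads as a standard
# Hugging Face causal LM; values shown are illustrative, not official).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "AetherResearch/Cerebrum-1.0-7b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = "..."  # see the Alpaca-style template under "Optimal Usage" below
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Greedy decoding (equivalent to temperature 0), with no repetition penalty,
# matching the low-temperature guidance above.
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=False)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```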

Optimal Usage

For best results, prompt Cerebrum-1.0-7b with an Alpaca-style template that explicitly requests a description of the model's thought process; this encourages it to leverage its native chain-of-thought capabilities. While it excels at reasoning, it typically omits verbose deliberation for brainstorming, knowledge-intensive, and creative tasks.
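
For concreteness, here is one way such a prompt might look. The exact wording below is an assumption for illustration, not the official template shipped with the model.

```python
# Illustrative Alpaca-style template that explicitly asks for a thought
# process before the final answer (the wording is an assumption, not the
# official Cerebrum template).
ALPACA_COT_TEMPLATE = """Below is an instruction that describes a task. \
Write a response that appropriately completes the request. \
Before giving the final answer, describe your thought process step by step.

### Instruction:
{instruction}

### Response:
"""

prompt = ALPACA_COT_TEMPLATE.format(
    instruction="A train travels 120 km in 1.5 hours. What is its average speed in km/h?"
)
```

Under a prompt of this style, the model is expected to first lay out its reasoning and then state the final answer, which pairs well with greedy (temperature 0) decoding.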