EphAsad/Atem-4B
EphAsad/Atem-4B is a 4-billion-parameter reasoning model built on Qwen3-4B and fine-tuned in a single CoT-preserving SFT pass. The method distills multi-domain reasoning from frontier teacher models while preserving the base model's native thinking. The model targets multi-step reasoning tasks such as mathematics, code explanation, and analytical problem-solving, which makes it suitable for applications that need structured, step-by-step reasoning.
Atem-4B: A CoT-Preserving Reasoning Model
Atem-4B is a 4-billion-parameter reasoning model developed by EphAsad and built on the Qwen3-4B base. Its distinguishing feature is a single-pass, Chain-of-Thought (CoT)-preserving Supervised Fine-Tuning (SFT) method. This approach layers reasoning distilled from larger teacher models on top of Qwen3-4B's existing reasoning behavior, avoiding the loss of native reasoning observed in prior multi-stage fine-tuning.
Key Capabilities & Features
- CoT-Preserving SFT: Unlike previous iterations, Atem-4B keeps the base model's inherent reasoning abilities while layering distilled multi-domain CoT on top. The result is selective CoT activation: the model engages reasoning on complex problems and suppresses it on simpler ones.
- Full 16-bit LoRA: Trained with LoRA in full 16-bit precision (r=64, alpha=128) instead of QLoRA, using available VRAM headroom for better accuracy and speed.
- Enhanced Reasoning Performance: Improves commonsense reasoning (e.g., +2.9 percentage points on HellaSwag) while maintaining strong performance on MMLU and ARC-Challenge, which indicates genuine reasoning transfer.
- Diverse Training Data: Trained on a corpus of 56,573 records covering mathematics, coding, general, scientific, and medical reasoning, all with explicit CoT traces.
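To give a feel for what the LoRA settings above (r=64, alpha=128) imply, here is a minimal arithmetic sketch. It assumes the standard LoRA formulation, in which the low-rank update is scaled by alpha / r and each adapted weight matrix gains two factors of shapes r x d_in and d_out x r. The layer width `HIDDEN` is an illustrative assumption, not a figure from this card, and the helper names are hypothetical.

```python
def lora_scaling(r: int, alpha: int) -> float:
    """Scaling applied to the low-rank update in standard LoRA (alpha / r)."""
    return alpha / r


def lora_trainable_params(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters added to one weight matrix.

    Factor A has shape (r, d_in) and factor B has shape (d_out, r).
    """
    return r * (d_in + d_out)


R, ALPHA = 64, 128   # values stated in the model card
HIDDEN = 2560        # ASSUMPTION: illustrative square projection width

scaling = lora_scaling(R, ALPHA)
adapter_params = lora_trainable_params(HIDDEN, HIDDEN, R)
full_params = HIDDEN * HIDDEN

print(f"scaling factor: {scaling}")
print(f"adapter params per matrix: {adapter_params}")
print(f"fraction of the full matrix: {adapter_params / full_params:.2%}")
```

Under these assumptions the update is scaled by 2.0, and each adapted square matrix trains about 5% as many parameters as the frozen matrix it sits beside. Which modules were actually adapted is not stated in the card.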
Intended Use Cases
- Multi-step Mathematical Reasoning: Solving complex math problems requiring detailed steps.
- Code Explanation & Debugging: Providing precise explanations, implementations, and debugging assistance for code.
- Analytical Reasoning: Evaluating arguments, identifying fallacies, and analyzing policy consequence chains.
- Scientific & Technical Explanation: Generating in-depth explanations across various scientific and technical domains.
This model is not designed for real-time information retrieval or tasks where a direct, non-reasoned answer is preferred.
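When an application needs only the final answer, the reasoning trace can be separated from it after generation. The sketch below assumes Atem-4B inherits the Qwen3 convention of wrapping its reasoning in `<think>...</think>` tags; the card does not confirm this, so check the actual output format first. The function name `split_reasoning` is hypothetical.

```python
import re

# ASSUMPTION: reasoning is delimited by Qwen3-style <think>...</think> tags.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> tuple[str, str]:
    """Return (reasoning, answer) from a raw model completion.

    If no think block is present, the reasoning is empty and the whole
    text is treated as the answer.
    """
    match = THINK_RE.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    # Everything outside the first think block is treated as the answer.
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer


sample = "<think>\n2 + 2 is 4.\n</think>\n\nThe answer is 4."
reasoning, answer = split_reasoning(sample)
print("reasoning:", reasoning)
print("answer:", answer)
```

An empty think block, which is what selective CoT suppression would look like under this assumption, yields an empty reasoning string and leaves the answer untouched.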